new xtypes: range, hms, dms, uri, uuid #24

Merged
merged 9 commits into from
Jul 5, 2023
124 changes: 110 additions & 14 deletions DALI.tex
@@ -625,17 +625,18 @@ \subsection{Multiple Values}
Services must respond with an error if the request includes multiple values for
parameters defined to be single-valued.

\subsection{Literal Values}
\subsection{Data Types and Literal Values}
In this section we specify how values are to be expressed. These literal values
are used as input or output from DAL services: as parameter values when
invoking simple services, as data values in response documents (e.g. VOTable),
etc. We define some general purpose values for the \xmlel{xtype} attribute of
the VOTable \xmlel{FIELD} and \xmlel{PARAM} elements for simple
structured values: \emph{timestamp}, \emph{interval},
\emph{point}, \emph{circle}, \emph{polygon}, \emph{moc}, \emph{multipolygon},
and \emph{shape} (see below).
Services may
use non-standard \xmlel{xtype} values for non-standard datatypes, but if they
\emph{hms}, \emph{dms},
\emph{point}, \emph{circle}, \emph{range}, \emph{polygon}, \emph{moc},
\emph{multipolygon}, \emph{shape}, \emph{uri}, and \emph{uuid} (see below).

Services may use non-standard \xmlel{xtype} values for non-standard datatypes, but if they
do so they should include a simple prefix (a string followed by a colon
followed by the non-standard xtype) so client software can easily determine
if a value is standard or not. For example, an \xmlel{xtype} for a
@@ -757,6 +758,25 @@ \subsubsection{Intervals}
\verb|MAX| child elements to describe the (minimum) lower bound and (maximum)
upper bound of interval(s) respectively.

\subsubsection{Sexagesimal Coordinates}
Coordinate values expressed in sexagesimal form can be described using the following
xtypes in both VOTable \xmlel{FIELD} and \xmlel{PARAM} elements:

right ascension: \verb|datatype="char"| \verb|arraysize="*"| \verb|xtype="hms"|

declination: \verb|datatype="char"| \verb|arraysize="*"| \verb|xtype="dms"|
Member

Should these be dot points? They look a bit strange in the PDF as paras


For \verb|xtype="hms"|, the values are serialised as hours:minutes:seconds where hours
and minutes are integer values and seconds is a real value. For \verb|xtype="dms"|, the values
are serialised as degrees:minutes:seconds where degrees and minutes are integer
values and seconds is a real value. All hours must fall within [0,24], degrees
(latitude) must fall within [-90,90], minutes must fall within [0,60), and seconds
must fall within [0,60). Valid values for \verb|xtype="hms"| are from 0:0:0 to 24:0:0.
Valid values for \verb|xtype="dms"| are from -90:0:0 to 90:0:0; an optional + sign at
the start is allowed (e.g. +10:20:30) but not required. Since the upper bound on minutes
Member

Since here makes the sentence feel unfinished.

and seconds is not part of the valid range, a value such as 12:34:60 is not allowed and must
be expressed as 12:35:00 instead.
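
The hms/dms rules above can be illustrated with a small Python sketch. The function name `parse_sexagesimal` is hypothetical, and two choices are assumptions rather than part of the text: a sign is accepted only for `dms`, and seconds must be in plain decimal notation (no exponent).

```python
import re

# Sign, integer first field, integer minutes, real seconds.
# Assumption: plain decimal seconds only (e.g. "30" or "30.5").
_SEXAGESIMAL = re.compile(r"^([+-]?)(\d+):(\d+):(\d+(?:\.\d*)?)$")


def parse_sexagesimal(value, xtype):
    """Return hours (hms) or degrees (dms) as a float; raise ValueError if invalid."""
    if xtype not in ("hms", "dms"):
        raise ValueError(f"unknown xtype: {xtype!r}")
    match = _SEXAGESIMAL.match(value.strip())
    if match is None:
        raise ValueError(f"not a valid {xtype} value: {value!r}")
    sign = match.group(1)
    first = int(match.group(2))
    minutes = int(match.group(3))
    seconds = float(match.group(4))
    if sign and xtype == "hms":
        # Assumption: the text only mentions a sign for dms values.
        raise ValueError("hms values do not take a sign")
    # Minutes and seconds fall within [0,60): the upper bound is excluded.
    if minutes >= 60 or seconds >= 60.0:
        raise ValueError(f"minutes and seconds must be below 60: {value!r}")
    magnitude = first + minutes / 60.0 + seconds / 3600.0
    limit = 24.0 if xtype == "hms" else 90.0
    if magnitude > limit:
        raise ValueError(f"{xtype} value out of range: {value!r}")
    return -magnitude if sign == "-" else magnitude
```

With this sketch, `parse_sexagesimal("12:35:00", "hms")` is accepted while `parse_sexagesimal("12:34:60", "hms")` raises `ValueError`, matching the example in the text.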

\subsubsection{Point}
Geometry values are two-dimensional; although they are usually longitude and
latitude values in spherical coordinates this is specified in the coordinate
@@ -807,13 +827,47 @@ \subsubsection{Circle}
of a table), but specific services may define something that is applicable in a
more limited context.

\subsubsection{Range}
Member

There is no discussion in this section of ranges that span the antimeridian. I think something should be said: either you can't do that using range, or you can do it by providing reversed longitude max/min values (or in some other way).

Collaborator Author

hmmm, SIA and SODA also do not discuss this. with the text as written, I would interpret that
359 1 -1 1 spans the meridian and is small (2x2 deg) and technically 1 359 -1 1 would be a large band 358 deg x 2 deg... that's what you meant by "reversed longitude"? I've rewritten it this way.

that probably means I have to be careful with using the word "interval" since 359,1 would not be a valid interval (it wasn't intended to be, but I'll fix that anyway).

The same thing kind of comes up in polygons, where they look pretty odd to the eye in this case

Member

Thanks, I think the rewrite is OK.

With reference to polygons, STC 1.33 sec 4.5.1.4 resolves this by saying

In order to avoid ambiguities in direction, vertices need to be less than 180° apart in both coordinates.

Should we explicitly defer to STC concerning polygons? But this is probably wider than DALI - see Alberto's recent Interop talk.

Collaborator Author

The intent was that we were referring to the STC definition in its entirety, but the text was too brief and gave a different impression. That's more or less a bug... erratum?

Now that STC-1.33 is really old and more or less obsoleted by Coords and Meas, I wonder if we should fork the complete definition and include it in DALI.

Contributor

Now that STC-1.33 is really old and more or less obsoleted by Coords and Meas, I wonder if we should fork the complete definition and include it in DALI.

You'd have my vote for it.

For Range (if we want it), I think the only sane convention is to let RA be in (-360,360] or in [0, 720] or something comparable. Otherwise, you'll get insane trying to define the behaviour along the stitching line.

Range values serialised in VOTable or service parameters must have the following
metadata in the \xmlel{FIELD} element: \verb|datatype="double"| or \verb|datatype="float"|,
\verb|arraysize="4"|, \verb|xtype="range"|. A range is a coordinate bounding box specified
as two pairs of coordinate values: min-coordinate1 max-coordinate1 min-coordinate2 max-coordinate2.
For example:

\begin{verbatim}
10.0 11.0 20.0 21.0
\end{verbatim}

includes values from 10 to 11 (coordinate1) and from 20 to 21 (coordinate2).

In spherical coordinates, the coordinates are longitude followed by latitude; longitude values must
fall within [0,360] and all latitude values within [-90,90]. This range form is used as part of
the value of the POS parameter in \citep{2015ivoa.spec.1223D} and \citep{2017ivoa.spec.0517B}
(see also "shape" below). A range can span the meridian (longitude 0): 359 1 -1 1 is interpreted
as the small (2x2 degree) coordinate range from 359 across the meridian to 1 degree longitude.
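
One way to read the meridian rule above is that a range whose minimum longitude exceeds its maximum wraps through longitude 0. The following Python sketch implements that reading; the function names `range_contains` and `range_lon_width` are hypothetical, and treating reversed bounds as a wrap is an interpretation of the text, not a normative statement.

```python
def range_contains(rng, lon, lat):
    """Test whether (lon, lat) lies inside a range given in degrees.

    rng is (lon_min, lon_max, lat_min, lat_max), as in xtype="range".
    """
    lon_min, lon_max, lat_min, lat_max = rng
    if not (lat_min <= lat <= lat_max):
        return False
    if lon_min <= lon_max:
        # Ordinary case: the range does not cross longitude 0.
        return lon_min <= lon <= lon_max
    # Reversed bounds (assumed meaning): the range wraps across longitude 0.
    return lon >= lon_min or lon <= lon_max


def range_lon_width(rng):
    """Return the longitude extent of the range in degrees."""
    lon_min, lon_max = rng[0], rng[1]
    if lon_min <= lon_max:
        return lon_max - lon_min
    return lon_max - lon_min + 360.0
```

Under this reading, `359 1 -1 1` is 2 degrees wide in longitude, while `1 359 -1 1` is a 358 degree band.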

Range-valued service parameters may include additional metadata like minimum
and/or maximum values. These are specified using a custom interpretation of the
\verb|MAX| child element with a value that is the largest range that makes sense
for the operation. The value could be a maximum allowed by the service or simply
the range where larger ranges and ranges outside the specified maximum will not yield
useful results.

There is no general purpose definition of a minimum range value for parameters or
a definition of a minimum or maximum range to describe field values (in a column
of a table), but specific services may define something that is applicable in a
more limited context.

\subsubsection{Polygon}
Polygon values serialised in VOTable or service parameters must have the following metadata in the
\xmlel{FIELD} element: \verb|datatype="double"| or \verb|datatype="float"|, \verb|arraysize="*"|, \verb|xtype="polygon"|
(where arraysize may also be fixed length or variable length with limit).
The array holds a sequence of vertices (points) (e.g. longitude latitude longitude
latitude ...) with an even number of values and at least three (3) points (six
(6) numeric values). For example:
(6) numeric values). A polygon is always implicitly closed: there is an implied edge from
the last point back to the first point; explicitly including the first point at the end is
not allowed (highly discouraged?) because it creates an edge of length 0 that has
negative side effects on some polygon computations. For example:

\begin{verbatim}
10.0 10.0 10.2 10.0 10.2 10.2 10.0 10.2
@@ -859,9 +913,6 @@ \subsubsection{Multi-Polygon}
11.0 11.0 11.2 11.0 11.2 11.2 11.0 11.2
\end{verbatim}

NOTE: Prototypes will determine if a single NaN value (as above) or two adjacent NaN values (a NaN point)
is easier to serialise, parse, and validate.

A multi-polygon without a separator is allowed, so all (simple) polygons are also valid multi-polygons. The
component polygons in a multipolygon may touch (vertex of one on an edge of another, including sharing vertices)
but may not have any common area.
@@ -873,19 +924,28 @@ \subsubsection{Shape}
Shape values serialised in VOTable or service parameters must have the following metadata in the
\xmlel{FIELD} element: \verb|datatype="char"|, \verb|arraysize="*"|, \verb|xtype="shape"|
(where arraysize may also be fixed length or variable length with limit).
The value is a polymorphic shape made up of a type label (equivalent to an existing xtype: \verb|circle|,
\verb|polygon|, or \verb|multipolygon|) and the string serialisation of the value
as described above. For example:
The value is a polymorphic shape made up of a type label (equivalent to an existing simple
geometric xtype) and the string serialisation of the value as described above.

The allowed shapes are: \verb|circle|, \verb|range|, \verb|polygon|. For example:
Member

Multipolygon has been removed in this commit. Is the intention that a shape can contain multiple elements (in which case multipolygon can be represented by several polygon entries - but that possibility is not currently documented here) or are we deciding that shape should not be multipolygon-capable?

Collaborator Author

No, shape contains exactly one label and associated numbers. The primary use case is the POS param of SIA (+DAP) and SODA; that does not require multipolygon.

The secondary use case would be values in ObsCore s_region column and as the string value in the ADQL region() function (to replace non-normative adql:REGION from TAP-1.0). In that context including multipolygon in shape is potentially useful.

The concern that Markus raised is that input to a service (upload tables, POS param) could include "shape" and if that includes multipolygon that is more work. I personally don't think it's a huge amount of extra work once you accept shape. Do you think adding it to shape in a later version is going to be painful? We need but don't have a way for services to list xtypes they understand (for input), so right now we would have to rely on good error messages in which case changing the definition of shape wouldn't be much worse... but if we did have a way to list supported xtypes then changing the definition of shape would be a subtle way to break stuff. Still more to discuss; I might post this topic to the DAL list for wider audience.

Member

That secondary s_region case was what I had in mind. Services like ESO TAP_OBS have more complicated s_regions, e.g. UNION(POLYGON..POLYGON...). But I suppose this is not intended to be a drop-in replacement for ADQL:region.

I don't think that later addition of e.g. multipolygon would be particularly painful, but if people are going to want it, it may be better to include it from the start. However since I'm not involved in either implementing parsers or authoring regions, I don't have a strong opinion.

Collaborator Author

I think shape is intended to standardise the minimal part of adql region that are justified by use cases. I'd like to go as far as having ADQL-2.1 (already in RFC) say that the region() function takes a DALI shape as the string argument so that adql region from TAP-1.0 can finally go away. Not exactly a drop-in replacement because we decided that including coordsys in the literal values was a bad idea (it should be field metadata and query writers should do the work, like they have to do with units in every other numeric column), but still a replacement.

If there are use cases for multipolygon being included in shape, then that would justify including it.

Copy link


I have two additional questions here:

  1. coming back on what @mbtaylor said, how ESO would be able to describe their complicated s_regions (e.g. UNION(POLYGON..POLYGON...))? With a polygon? A multipolygon? It seems not very trivial to do. Maybe @almicol has an idea?
  2. Would it be possible to add moc in the list of allowed shapes? I think it would be a useful way to describe the region covered by an observation in ObsCore.

Collaborator Author

This may require an update to ObsCore to permit it, but the idea would be that ObsCore implementations could declare the most appropriate xtype for the s_region column, probably allowing any of the defined geometric types (circle, polygon, shape, multipolygon). Although allowed, range is generally not a good description of observation bounds so I doubt it would get much use (as a column type or within shape columns) but I'd probably not disallow it.

With the current state of the PR, ESO would have to use multipolygon (and simple polygons are valid multipolygons). We (via CAOM model) have circles and polygons so would probably use shape, but I'll note that the polygons are generally outer simple polygons and we also have a more detailed multipolygon that sometimes differs from the simple outer polygon... the latter is in a different column so we'd have a column with xtype="multipolygon" in CAOM TAP services. We would map ObsCore.s_region to the shape column.

Collaborator Author

As for shape also including moc, I have not implemented that and so while it sounds logical I can't say what's involved in doing a good job of that. I do think moc and other shapes are for different purposes so it's not clear if there is a compelling use case at this time.

If shape did include moc then an upload table could include circles, polygons, and mocs and a service would be trying to put those all into a column in the database to support spatial queries... that sounds kind of tricky to me and I don't know how I'd do that. I think it also likely that someone who doesn't support moc would have "partial shape support" and since shape is also the xtype to describe the SIA and SODA POS param that would probably introduce confusion and pain. So I would not want to add moc to shape right now and we'll have to figure out if/how to add things to such a polymorphic xtype and what that means. We need polymorphism, but it's hard and we shouldn't go overboard :-)

Collaborator Author

possibly an aside:

At the top of this thread, @mbtaylor pointed out that this PR removes multipolygon from the list of shapes. This was to lessen the burden of implementing table upload (shape column) and to avoid subtly extending the POS param in SIA and SODA (if/when someone uses xtype="shape" for POS). I have implemented a queryable shape column in the CAOM database so I know how to do it and make it work, but I haven't added support to our tap upload yet.

However, I will point out that if shape did include multipolygon, then ESO could use xtype="shape" for their ObsCore.s_region and not have to choose between multipolygon and polymorphism. I don't know how they would implement that with the spatial support in their database or whether they also need the polymorphism (for some circles), but we do need both polymorphism and multipolygon in CAOM and the model could have been simpler if shape included multipolygon directly. In detail, there are pros and cons to both choices.


\begin{verbatim}
circle 12.3 45.6 0.5
\end{verbatim}

\begin{verbatim}
range 10.0 11.0 20.0 21.0
\end{verbatim}

\begin{verbatim}
polygon 10.0 10.0 10.2 10.0 10.2 10.2 10.0 10.2
\end{verbatim}

The interpretation and constraints on the coordinate values are as specified for the individual xtypes above.
The interpretation and constraints on the coordinate values are as specified
for the individual xtypes above.

The shape xtype provides a compatible description of the POS parameter in
\citep{2015ivoa.spec.1223D} and \citep{2017ivoa.spec.0517B}.
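
A shape value as described above (one label followed by the numbers of the corresponding xtype) could be parsed along the following lines. This is a sketch only: the name `parse_shape` is hypothetical, labels are assumed to be lowercase and whitespace-separated from the numbers, and only the value counts are checked, not the coordinate bounds.

```python
# Number of values required for the fixed-size shapes.
_FIXED_COUNTS = {"circle": 3, "range": 4}


def parse_shape(value):
    """Split a shape string into (label, list of floats); raise ValueError if invalid."""
    tokens = value.split()
    if not tokens:
        raise ValueError("empty shape value")
    label = tokens[0]
    numbers = [float(token) for token in tokens[1:]]  # float() raises ValueError
    if label in _FIXED_COUNTS:
        if len(numbers) != _FIXED_COUNTS[label]:
            raise ValueError(f"{label} requires {_FIXED_COUNTS[label]} values")
    elif label == "polygon":
        # At least three points, and an even number of values.
        if len(numbers) < 6 or len(numbers) % 2 != 0:
            raise ValueError("polygon requires an even number of values, at least 6")
    else:
        raise ValueError(f"not an allowed shape: {label!r}")
    return label, numbers
```

Because `multipolygon` is not in the list of allowed shapes in this revision, the sketch rejects it like any other unknown label.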

Shape-valued service parameters may include additional metadata to describe minimum
and/or maximum values. These are specified using a custom interpretation of the
@@ -922,6 +982,42 @@ \subsubsection{Shape}
of a table), but specific services may define something that is applicable in a
more limited context.

\subsubsection{URI}
URI values \citep{std:RFC3986} serialised in VOTable or service parameters
should have the following metadata in the \xmlel{FIELD} element: \verb|datatype="char"|,
\verb|arraysize="*"|, \verb|xtype="uri"| (where arraysize may also be fixed length or
variable length with limit).

\subsubsection{UUID}
Universally Unique Identifier (UUID) values serialised in VOTable or service parameters
should have the following metadata in the \xmlel{FIELD} element: \verb|datatype="char"|,
\verb|arraysize="36"|, \verb|xtype="uuid"| (where arraysize may also be fixed length or
variable length with limit).

UUID values \citep{std:RFC4122} are serialised using the canonical ASCII (hex)
representation, for example: e0b895ca-2ee4-4f0f-b595-cbd83be40b04.
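
A check for the canonical 36-character form can be sketched with Python's standard `uuid` module. The function name `is_canonical_uuid` is hypothetical, and accepting upper-case hex digits is an assumption; `uuid.UUID` itself also accepts other spellings (braces, a `urn:uuid:` prefix, no hyphens), so the sketch compares the parsed value back against the input.

```python
import uuid


def is_canonical_uuid(value):
    """Return True if value is a UUID in the hyphenated 36-character form."""
    if not isinstance(value, str) or len(value) != 36:
        return False
    try:
        parsed = uuid.UUID(value)
    except ValueError:
        return False
    # str(parsed) is always the lower-case hyphenated form.
    return str(parsed) == value.lower()
```

For example, `is_canonical_uuid("e0b895ca-2ee4-4f0f-b595-cbd83be40b04")` is true, while the same digits without hyphens are rejected.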

\subsubsection{Unsupported Types}

Support for any specific \xmlel{xtype} in implementations (client or service) is specified in
the service standard document. However, support for a specific \xmlel{xtype} as input (params
and uploaded content) should generally be considered optional. Implementations should
be able to read and write the underlying data type without knowing the semantics added
by the \xmlel{xtype}. In cases where understanding the meaning of an \xmlel{xtype} is required (for
example, the POS param in SODA) and a service does not support the serialized value,
the service should issue an error message that starts with the following text with the
most specific \xmlel{xtype} noted:
\begin{verbatim}
unsupported-xtype: {xtype} [optional detail here]
\end{verbatim}
and may include additional detail where noted. For example, the value of the SODA POS parameter
is a \verb|xtype="shape"|, but if the implementation does not support the "range" construct, it
would respond (minimally) with:
\begin{verbatim}
unsupported-xtype: range
\end{verbatim}
This behaviour will allow for new \xmlel{xtype}s to be introduced and for \verb|xtype="shape"|
to be extended to include additional subtypes in the future.
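
The error convention above might be produced by a service as in the following Python sketch. All names here (`unsupported_xtype_message`, `check_pos`, `KNOWN_SHAPES`) are hypothetical, and falling back to `shape` when the label is not recognised at all is an assumption about what "most specific" means in that case.

```python
# Shape labels defined for xtype="shape" in this revision of the text.
KNOWN_SHAPES = {"circle", "range", "polygon"}


def unsupported_xtype_message(xtype, detail=None):
    """Build the error text: 'unsupported-xtype: {xtype} [optional detail]'."""
    message = f"unsupported-xtype: {xtype}"
    return f"{message} {detail}" if detail else message


def check_pos(value, supported):
    """Return None if the shape label is supported, else the error text."""
    tokens = value.split()
    label = tokens[0] if tokens else ""
    if label in supported:
        return None
    if label in KNOWN_SHAPES:
        # A defined shape that this service does not implement.
        return unsupported_xtype_message(label)
    # Assumption: an unrecognised label is reported against "shape" itself.
    return unsupported_xtype_message("shape", f"unrecognised label {label!r}")
```

A service that implements only circles and polygons would call `check_pos(pos, {"circle", "polygon"})` and return the resulting text for a `range` value.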

\subsection{Standard Parameters}

@@ -1402,7 +1498,7 @@ \subsection{PR-DALI-1.2}
\begin{itemize}
\item Clarified that truncation indicated by OVERFLOW can occur independent of
MAXREC
\item added new xtypes: moc, shape, multipolygon
\item added new xtypes: hms, dms, moc, multipolygon, range, shape, uri, uuid
\item changed VOSI-availability to optional
\item changed VOSI-capability so it is only required for registered services
\end{itemize}
2 changes: 1 addition & 1 deletion Makefile
@@ -7,7 +7,7 @@ DOCNAME = DALI
DOCVERSION = 1.2

# Publication date, ISO format; update manually for "releases"
DOCDATE = 2023-04-17
DOCDATE = 2023-05-29

# What is it you're writing: NOTE, WD, PR, or REC
DOCTYPE = WD