-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
new xtypes: range, hms, dms, uri, uuid #24
Conversation
removed multipolygon from shape added range to shape for SIA and SODA compat
still need to get the RFC 4122 citation included |
DALI.tex
Outdated
values and seconds is a real value. All hours must fall within [0,23], degrees | ||
must fall within [0,359], minutes must fall within [0,59], and seconds | ||
must fall within [0,59]; the minimum values are thus 0:0:0 (hms and dms) and the maximum | ||
values are ~23:59:59.9 (hms) and ~359:59:59.9 (dms) (decimal values may have | ||
additional precision). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The limits certainly need some changes: degrees in the range [-90,90] at least have to be permitted. But I don't know whether we should be a bit less prescriptive in general: do people use negative hours in hms sometimes? or dms for longitude values?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, dms should go from -90:0:0 to 90:0:0; good catch.
In other coordinates we limited the range of valid values so that providers were responsible for range reduction and I haven't heard any complaints about that, but here it's more markup for existing widespread use so I'm a lot less sure,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just realised that in the other coordinate validity specs we just said [-90,90] and [0,360] without worrying to much about the endpoint wrap around (which could have been [0,360) if we wanted to be really careful). Do you think the more relaxed spec like [0,360] is enough?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since as you say this is mostly about marking up existing content I'd incline to being more permissive here. But we don't have [0,360], we have [-90,90], and both extrema have to be allowed in that case anyway.
Another thing: I suggest also clarifying that an optional explicit "+" sign on dms is allowed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
optional "+" at the front, because "-" is also a possibility? and we don't need that for hms because they are only positive anyway
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added optional + for dms
@@ -800,6 +816,36 @@ \subsubsection{Circle} | |||
of a table), but specific services may define something that is applicable in a | |||
more limited context. | |||
|
|||
\subsubsection{Range} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no discussion in this section of ranges that span the antimeridian. I think something should be said: either you can't do that using range, or you can do it by providing reversed longitude max/min values (or in some other way).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmmm, SIA and SODA also do not discuss this. with the text as written, I would interpret that
359 1 -1 1
spans the meridian and is small (2x2 deg) and technically 1 359 -1 1
would be a large band 358 deg x 2 deg... that's what you meant by "reversed longitude"? I've rewritten it this way.
that probably means I have to be careful with using the word "interval" since 359,1 would not be a valid interval (it wasn't intended to be, but I'll fix that anyway).
The same thing kind of comes up in polygons, where they look pretty odd to the eye in this case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I think the rewrite is OK.
With reference to polygons, STC 1.33 sec 4.5.1.4 resolves this by saying
In order to avoid ambiguities in direction, vertices need to be less than 180° apart in both coordinates.
Should we explicitly defer to STC concerning polygons? But this is probably wider than DALI - see Alberto's recent Interop talk.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The intent was that we were referring to the STC definition in it's entirety, but the text was too brief and gave a different impression. That's more or less a bug... erratum?
Now that STC-1.33 is really old and more or less obsoleted by Coords and Meas, I wonder if we should fork the complete definition and include it in DALI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that STC-1.33 is really old and more or less obsoleted by Coords and Meas, I wonder if we should fork the complete definition and include it in DALI.
You'd have my vote for it.
For Range (if we want it), I think the only sane convention is to let RA be in (-360,360] or in [0, 720] or something comparable. Otherwise, you'll get insane trying to define the behaviour along the stiching line.
The value is a polymorphic shape made up of a type label (equivalent to an existing simple | ||
geometric xtype and the string serialisation of the value as described above. | ||
|
||
The allowed shapes are: \verb|circle|, \verb|range|, \verb|polygon|. For example: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Multipolygon has been removed in this commit. Is the intention that a shape can contain multiple elements (in which case multipolygon can be represented by several polygon entries - but that possibility is not currently documented here) or are we deciding that shape should not be multipolygon-capable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, shape contains exactly one label and associated numbers. The primary use case is the POS param of SIA (+DAP) and SODA; that does not require multipolygon.
The secondary use case would be values in ObsCore s_region column and as the string value in the ADQL region() function (to replace non-normative adql:REGION from TAP-1.0). In that context including multipolygon in shape is potentially useful.
The concern that Markus raised is that input to a service (upload tables, POS param) could include "shape" and if that includes multipolygon that is more work. I personally don't think it's a huge amount of extra work once you accept shape. Do you think adding it to shape in a later version is going to be painful? We need but don't have a way for services to list xtypes they understand (for input), so right now we would have to rely on good error messages in which case changing the definition of shape wouldn't be much worse... but if we did have a way to list supported xtypes then changing the definition of shape would be a subtle way to break stuff. Still more to discuss; I might post this topic to the DAL list for wider audience.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That secondary s_region case was what I had in mind. Services like ESO TAP_OBS have more complicated s_regions, e.g. UNION(POLYGON..POLYGON...). But I suppose this is not intended to be a drop-in replacement for ADQL:region.
I don't think that later addition of e.g. multipolygon would be particularly painful, but if people are going to want it, it may be better to include it from the start. However since I'm not involved in either implementing parsers or authoring regions, I don't have a strong opinion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think shape is intended to standardise the minimal part of adql region that are justified by use cases. I'd like to go as far as having ADQL-2.1 (already in RFC) say that the region() function takes a DALI shape as the string argument so that adql region from TAP-1.0 can finally go away. Not exactly a drop-in replacement because we decided that including coordsys in the literal values was a bad idea (it should be field metadata and query writers should do the work, like they have to do with units in every other numeric column), but still a replacement.
If there are use cases for multipolygon being included in shape, then that would justify including it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have two additional questions here:
- coming back on what @mbtaylor said, how ESO would be able to describe their complicated
s_regions
(e.g.UNION(POLYGON..POLYGON...)
)? With a polygon? A multipolygon? It seems not very trivial to do. Maybe @almicol has an idea? - Would it be possible to add
moc
in the list of allowed shapes? I think it would be a useful way to describe the region covered by an observation in ObsCore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may require an update to ObsCore to permit it, but the idea would be that ObsCore implementations could declare the most appropriate xtype for the s_region column, probably allowing any of the defined geometric types (circle, polygon, shape, multipolygon). Although allowed, range is generally not a good description of observation bounds so I doubt it would get much use (as a column type or within shape columns) but I'd probably not disallow it.
With the current state of the PR, ESO would have to use multipolygon (and simple polygons are valid multipolygons). We (via CAOM model) have circles and polygons so would probably use shape, but I'll note that the polygons are generally outer simple polygons and we also have a more detailed multipolygon that sometimes differs from the simple outer polygon... the latter is in a different column so we'd have a column with xtype="multipolygon" in CAOM TAP services. We would map ObsCore.s_region to the shape column.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As for shape also including moc
, I have not implemented that and so while it sounds logical I can't say what's involved in doing a good job of that. I do think moc
and other shapes are for different purposes so it's not clear if there is a compelling use case at this time.
If shape did include moc then an upload table could include circles, polygons, and mocs and a service would be trying to put those all into a column in the database to support spatial queries... that sounds kind of tricky to me and I don't know how I'd do that. I think it also likely that someone who doesn't support moc would have "partial shape support" and since shape is also the xtype to describe the SIA and SODA POS param that would probably introduce confusion and pain. So I would not want to add moc to shape right now and we'll have to figure out if/how to add things to such a polymorphic xtype and what that means. We need polymorphism, but it's hard and we shouldn't go overboard :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
possibly an aside:
At the top of this thread, @mbtaylor pointed out that this PR removes multipolygon from the list of shapes. This was to lessen the burden of implementing table upload (shape column) and to avoid subtly extending the POS param in SIA and SODA (if/when someone uses xtype="shape" for POS). I have implemented a queryable shape column in the CAOM database so I know how to di it and make it work, but I haven't added support to our tap upload yet.
However, I will point out that if shape did include multipolygon, then ESO could use xtype="shape" for their ObsCore.s_region and not have to chose between multipolygon and polymorphism. I don't know how they would implement that with the spatial support in their database or whether they also need the polymorphism (for some circles), but we do need both polymorphism and multipolygon in CAOM and the model could have been simpler if shape included multipolygon directly. In detail, there are pros and cons to both choices.
also added some clarification to xtype=polygon about implicit closing of loop
On Tue, Jun 06, 2023 at 02:34:43PM -0700, Patrick Dowler wrote:
@pdowler commented on this pull request.
As for shape also including `moc`, I have not implemented that and
so while it sounds logical I can't say what's involved in doing a
good job of that. I do think `moc` and other shapes are for
It's orders of magnitude simpler than multipolygon, at least if
you're using pgsphere (and I'd so much wish we'd dump multipolygon in
favour of MOCs).
different purposes so it's not clear if there is a compelling use
case at this time.
As soon as I have more complicated shapes in my obscore table, I, for
one, will almost certainly want to move to MOC (right now, I only
have polygons and turn circles (e.g., spectra with aperture) into
approximating polygons).
If shape did include moc then an upload table could include
circles, polygons, and mocs and a service would be trying to put
Frankly, the only way I can see myself implementing anything like
shape in an upload table is through MOCs, in particular if we allow
multipolygon. That's why I'm so after loosening the requirement on
roundtripping geometries (i.e., I want to legally return a MOC when I
get some geometry in).
I'd do that. I think it also likely that someone who doesn't
support moc would have "partial shape support" and since shape is
also the xtype to describe the SIA and SODA POS param that would
probably introduce confusion and pain. So I would not want to add
I'd really be curious how you do uploads of shape columns anyway...
If it's through the famous hidden columns then please let's not do
it. They're an extremely painful kludge that'll break left and right
in dozens of corner cases.
things to such a polymorphic xtype and what that means. We need
polymorphism, but it's hard and we shouldn't go overboard :-)
MOCs FTW. But could you quickly point me to the use case(s) of shape
in upload tables? Perhaps I can think of a less horrible
alternative...
|
use case for shape as input: SIA and SODA POS param. ADQL REGION function string arg should refer to DALI shape. People can upload table of shapes now in TAP, which would just be a string, but once xtype="shape" is defined in DALI they have some expectation that it will behave like a geometry (in query). Presumably we have to add something to one or more registry extensions so services can say which xtypes they support (or require specific error messages for unsupported input data types)... so that applies to TAP at a minimum. It was always the plan to add table upload support to SIA (now DAP), where one could have a column of geometries and I expect refer to the column with a query param (details TBD). The POS param is xtype="shape" (as currently specified in this PR and with the addition of range) so that would immediately lead to an expectation of upload tables with a shape column in there. Right now, with multiple occurrences of POS param, implementations can plausibly just do that inline in the SQL but the whole point of upload table would be to scale up the list such that inline wasn't the right solution (my opinion). In general, not everyone uses postgresql and pgsphere so we can't hang everything on
so clients can reasonably figure out what to look for. |
Hi Gregory, Markus, and Pat,
While I still have to study DALI “multipolygons” and the new ADQL version
(sorry I’m late for that, but I still hope to be able to give comments before June 20th),
I can just say that at ESO we need points, polygons, multipolygons, and circle,
so “shape” is the polymorphism we need.
Regarding MOC, the ESO databases (MS SQLServer, SAP IQ) do not support them.
Regarding use cases to upload shapes, I can see 2 use cases:
1. A user interested in finding counterpart of a gravitational wave could either upload a multipolygon or a MOC describing the GW credible region to find ObsCore products that could have observed the given GW event.
2. A user wants to cross-correlate ObsCore s_region footprints of observed/calibrated datasets (some are circle, some are multipolygon) with a different TAP service offering catalogs, to see which catalog sources match the given footprints.
Off now for almost a week (in between a champions league soccer match to attend), cheers,
Alberto
From: Grégory Mantelet ***@***.***>
Reply to: ivoa-std/DALI ***@***.***>
Date: Friday, 2. June 2023 at 16:00
To: ivoa-std/DALI ***@***.***>
Cc: Alberto Micol ***@***.***>, Mention ***@***.***>
Subject: Re: [ivoa-std/DALI] new xtypes: range, hms, dms, uri, uuid (PR #24)
This email was sent from outside of ESO from the address ***@***.***. If it looks suspicious, please report it to ***@***.***
@gmantele commented on this pull request.
________________________________
In DALI.tex<#24 (comment)>:
@@ -866,19 +909,28 @@ \subsubsection{Shape}
Shape values serialised in VOTable or service parameters must have the following metadata in the
\xmlel{FIELD} element: \verb|datatype="char"|, \verb|arraysize="*"|, \verb|xtype="shape"|
(where arraysize may also be fixed length or variable length with limit).
…-The value is a polymorphic shape made up of a type label (equivalent to an existing xtype: \verb|circle|,
-\verb|polygon|, or \verb|multipolygon|) and the string serialisation of the value
-as described above. For example:
+The value is a polymorphic shape made up of a type label (equivalent to an existing simple
+geometric xtype and the string serialisation of the value as described above.
+
+The allowed shapes are: \verb|circle|, \verb|range|, \verb|polygon|. For example:
I have two additional questions here:
1. coming back on what @mbtaylor<https://github.com/mbtaylor> said, how ESO would be able to describe their complicated s_regions (e.g. UNION(POLYGON..POLYGON...))? With a polygon? A multipolygon? It seems not very trivial to do. Maybe @almicol<https://github.com/almicol> has an idea?
2. Would it be possible to add moc in the list of allowed shapes? I think it would be a useful way to describe the region covered by an observation in ObsCore.
—
Reply to this email directly, view it on GitHub<#24 (comment)>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/ADCLQ452LH3HMKINL56XPKLXJHWWPANCNFSM6AAAAAAYTDTMNY>.
You are receiving this because you were mentioned.Message ID: ***@***.***>
|
On Wed, Jun 07, 2023 at 08:44:35AM -0700, Patrick Dowler wrote:
Presumably we have to add something to one or more registry
extensions so services can say which xtypes they support (or
require specific error messages for unsupported input data
types)... so that applies to TAP at a minimum.
I'm more on the side of specific error messages, but I'm even more on
the side of writing specs in a way that people won't have to issue
too many error messages like this.
It was always the plan to add table upload support to SIA (now
DAP), where one could have a column of geometries and I expect
refer to the column with a query param (details TBD). The POS param
Hmyeahwell... how would you implement that, unless you're on a
database that somehow magically allows circles and polygons (and
possibly even multipolygons) in the same column? Does such a
database even exist?
In general, not everyone uses postgresql and pgsphere so we can't
hang everything on `moc` (or any other specific type) and we have
Oh, but MOC is an IVOA standard in no way bound to pgsphere. It's in
there because GAVO spent the equivalent of the price of a small
server on hiring a postgres consultant to put it in.
If cash-strapped GAVO can do that, I'd say it's not unreasonable to
politely ask one of the adopters of SQLServer to shell out a similar
amount of money just *once* and have a MOC extension for SQLServer
built; actually, it's not *that* hard anyway (I was just busy
elsewhere and couldn't spend the money on paying someone locally).
With a bit of luck you could even get a suitably capable summer
student to do it.
MOCs simply are *so much* cooler than anything else representing
geometries on a sphere that it's absolutely worth it.
Meanwhile, as long as I'm allowed to return MOCs when someone uploads
polygons or multipolygons or STC-S or whatever, I'll shut up.
|
I think this is now "done" and completes what we need for DALI-1.2 The extra section about unsupported-xtype error messages indicates the way I envision implementers (client and service) deal with any xtype(s) we might add in future or how we might change xtype="shape" by adding more subtypes. I could see that happening, and maybe soon if @almicol's ESO use cases require that we add multipolygon now rather than later. Basically, anyone should be able to round-trip values even when they don't grok the xtype (that was always true) and only have to fail with this error message when the usage requires it. I would like to get this to RFC soon as I think the ADQL would benefit from saying that the string value in the REGION function (constructor) are specified by DALI "shape". I still think we can add that in ADQL-2.1 with minimal delay. Right now, REGION has no normative spec so this can finally fix that with minimal actual change in usage. |
On Tue, Jun 13, 2023 at 11:54:43AM -0700, Patrick Dowler wrote:
multipolygon now rather than later. Basically, anyone should be
able to round-trip values even when they don't grok the xtype (that
was always true) and only have to fail with this error message when
the usage requires it.
Just for the record, I'm still very much in favour of letting people
return different geometry types and qualify the round-tripping
requirement accordingly. Yes, that gives the xtypes a bit more
meaning that they previously had, but it *really* helps
implementability a lot. Carrying through hidden columns and then
deciding when they (rather than the value of some expression) should
become visible again is painful and fragile.
|
Well, there is nothing in DALI that says anything about how column types and values with xtypes are handled through an input + output cycle. That would be up to a service to specify (TAP, DAP, etc). Here we just define them. I haven't thought through all the pros and cons, but my very initial instinct is that TAP could potentially have more specific requirements than a more abstract protocol like DAP (SIA), if the latter were to support tabular input. There are details of this sort to work out for WD-TAP-1.2 (because TAP_UPLOAD and the proposed user-created tables do mostly the same work and can use the same code) and that's the place to figure out these details. |
Hi all, It has been silent here but I would like to merge this and have the WD published to the doc repo so we can move forward toward RFC. I have services that will make use of most of the new xtypes that I'll tweak to do so once the WD is published... probably the same is true for other early implementers so we need to take that step and move on to wider community feedback. If someone could check the technical stuff (build, references, etc) and merhe, that would be great. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reserve the right to quibble about the various geometry types (and in particular being allowed to turn them into MOCs when round-tripping) during RFC, but that discussion perhaps isn't for a github issue in the first place; I'd much rather have it somewhere where we can control its preservation for posterity.
So, you have my blessing.
|
||
right ascension: \verb|datatype="char"| \verb|arraysize="*"| \verb|xtype="hms"| | ||
|
||
declination: \verb|datatype="char"| \verb|arraysize="*"| \verb|xtype="dms"| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should these be dot points? They look a bit strange in the PDF as paras
(latitude) must fall within [-90,90], minutes must fall within [0,60), and seconds | ||
must fall within [0,60). Valid values for \verb|xtype="hms"| are from 0:0:0 to 24:0:0. | ||
Valid values for \verb|xtype="dms"| are from -90:0:0 to 90:0:0; an optional + sign at | ||
the start is allowed (e.g. +10:20:30) but not required. Since the upper bound on minutes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since
here makes the sentence feel unfinished.
will fix the above in a separate PR |
removed multipolygon from shape
added range to shape for SIA and SODA compat