Lack of templated endpoints mechanism #27

Open
Bonnarel opened this issue Oct 22, 2019 · 14 comments
Labels
enhancement New feature or request TBD

Comments

@Bonnarel
Contributor

We lack a mechanism to describe variable RESTful endpoints. This could be achieved with a templating mechanism, as proposed in pull request #28.

@lmichel

lmichel commented Nov 13, 2019

The discussion on this topic was moved to the PR panel too early, I guess, so I'm back on the issue thread.

Why URL templates

Here is why I'm a (the?) big fan of introducing URL templating in DataLink. If we look at section 4 (Draft P16), we have a service descriptor with the following PARAM:

<PARAM name="ID" datatype="char" arraysize="*" value="" ref="primaryID"/>

and the following access URL:

<PARAM name="accessURL" datatype="char" arraysize="*" value="http://example.com/mylinks" />

The spec says that the service URL will be built that way:

http://example.com/datalink/mylinks?ID=<obs_publisher_did value>

Now, if my service is REST like, I would like to describe it like this:

<PARAM name="accessURL" datatype="char" arraysize="*" value="http://example.com/mylinks/{$ID/download}" />

To build the following URL:

http://example.com/datalink/mylinks/<obs_publisher_did value>/download

This can be extended to any PARAM of the service descriptor.
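
As a rough illustration of the two URL shapes discussed above, here is a minimal Python sketch. The identifier value is made up, and the path-style layout (accessURL, then the ID, then a fixed "download" segment) is only the form suggested in this comment, not something the current spec defines. Percent-encoding the identifier is also an assumption.

```python
from urllib.parse import quote, urlencode

ACCESS_URL = "http://example.com/mylinks"   # accessURL PARAM from the descriptor
dataset_id = "ivo://example/obs1"           # hypothetical obs_publisher_did value


def query_style(access_url: str, params: dict) -> str:
    """DataLink 1.0 behaviour: inputs are appended as key=value pairs."""
    return access_url + "?" + urlencode(params)


def path_style(access_url: str, ident: str, action: str) -> str:
    """REST-like form (assumed layout): the identifier becomes a path segment."""
    return f"{access_url}/{quote(ident, safe='')}/{action}"


query_url = query_style(ACCESS_URL, {"ID": dataset_id})
path_url = path_style(ACCESS_URL, dataset_id, "download")
print(query_url)
print(path_url)
```

The first form is all a 1.0 service descriptor can express; the second needs some way to place a PARAM value inside the path, which is what a template would provide.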

Why rfc6570 [https://www.rfc-editor.org/rfc/rfc6570.txt]

If we agree on using templates, there is no reason either to reinvent the wheel or to implement needlessly complicated machinery. This is why I propose to take from rfc6570 the basic features we need and to incorporate them into the spec, with a proper reference to rfc6570.

I think the above use case is relevant and reasonably painful for client developers.

@Bonnarel
Contributor Author

Hi all,

When reading this email from Laurent I realized that I didn't create an example for my proposal.
The main difference from what Laurent proposes is to leave the accessURL PARAM unchanged (no templating) and to use special PARAMs for relative URL templating.
This allows us:

1) to keep the accessURL as a root URL to the service
2) to define several templated endpoints in the same service descriptor

See the example below.

"Proposal" and "Config" are column names. The same service provides either "proposals" or "configuration descriptions" in several possible formats.
The templated strings in the two PARAMs of the endpoints group should be appended to the root URL in accessURL.

<RESOURCE type="meta" utype="adhoc:service" name="{links} for Obscore">
  <DESCRIPTION>Links resources to datasets</DESCRIPTION>
  <PARAM name="accessURL" datatype="char" arraysize="*" value="http://herschel.esac.esa.int/Docs/KPOT/KPOT_accepted.html"/>
  <PARAM name="contentType" datatype="char" arraysize="*" value="text/ascii"/>
  <GROUP name="endpoints">
    <PARAM name="Proposal" utype="template" datatype="char" arraysize="*" value="P/{?Proposal*}"/>
    <PARAM name="Configuration Description" utype="template" datatype="char" arraysize="*" value="D/{?Config*}"/>
  </GROUP>
  <GROUP name="inputParams">
    <PARAM name="FORMAT" datatype="char" arraysize="*" value="">
      <VALUES>
        <OPTION value="text"/>
        <OPTION value="pdf"/>
        <OPTION value="html"/>
      </VALUES>
    </PARAM>
  </GROUP>
</RESOURCE>
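
To make the shape of this proposal concrete, here is a minimal Python sketch of how a client might read such a descriptor: it collects the root accessURL and every PARAM flagged with utype="template" in the "endpoints" group. The function name is hypothetical, and the sketch deliberately stops before expansion, since how the relative template is joined to the root URL is exactly what is under discussion here.

```python
import xml.etree.ElementTree as ET

DESCRIPTOR = """
<RESOURCE type="meta" utype="adhoc:service" name="{links} for Obscore">
  <PARAM name="accessURL" datatype="char" arraysize="*"
         value="http://herschel.esac.esa.int/Docs/KPOT/KPOT_accepted.html"/>
  <GROUP name="endpoints">
    <PARAM name="Proposal" utype="template" datatype="char" arraysize="*"
           value="P/{?Proposal*}"/>
    <PARAM name="Configuration Description" utype="template" datatype="char"
           arraysize="*" value="D/{?Config*}"/>
  </GROUP>
</RESOURCE>
"""


def read_endpoints(xml_text):
    """Return (accessURL, {endpoint name: relative template})."""
    resource = ET.fromstring(xml_text)
    access_url = resource.find("PARAM[@name='accessURL']").get("value")
    templates = {
        param.get("name"): param.get("value")
        for param in resource.findall(
            "GROUP[@name='endpoints']/PARAM[@utype='template']"
        )
    }
    return access_url, templates


root_url, endpoints = read_endpoints(DESCRIPTOR)
for name, template in endpoints.items():
    print(f"{name}: root={root_url} template={template}")
```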

@lmichel

lmichel commented Nov 20, 2019 via email

@Bonnarel
Contributor Author

Thanks Pierre - this has a lot of great analysis and use cases that DataLink can/should address. I think we will need to come up with a coherent design to add support for fragments and RESTful path elements to the current query param support - especially since they can in principle all play together (e.g. in VOSpace one uses both path and query params and that is a common pattern, you showed using query params and fragment, and I can easily see using path, query, and fragment making sense in some context).

I will try to condense this into a single GitHub issue with reference to this mail thread. We can continue broad discussion here.

--
Patrick Dowler
Canadian Astronomy Data Centre
Victoria, BC, Canada

On Fri, 15 Nov 2019 at 03:04, Pierre Fernique wrote:

Dear Datalink contributors and Apps members

This is an open discussion concerning DATALINK vs LINK usage.

I looked in detail at the new Datalink document and tried to evaluate its potential impact on the current use of the classic VOTable LINK. Specifically, I tried to write the VOTable corresponding to a basic Simbad result containing two direct links, one to the basic Simbad result page, the other to the list of associated bibliographic references (this is a real example that has been live for 18 years; the expected result in Aladin is provided at the end of this mail).

Currently, I have to admit that I remain somewhat dubious about the complexity of the Datalink solution for implementing one or more simple direct links.
In any case, here are the results of my effort:
1) Method using VOTable LINK (available since v1.0), using the templating convention (${columnName}, described in the VOTable appendices)

We can easily describe in one XML line the 4 essential elements of a link: 1- how the link is built, 2- where it must be placed, 3- what its associated text is, 4- what it will return. The knowledge of the links is placed at the beginning of the VOTable stream, which simplifies life for clients.
    <VOTABLE><RESOURCE><TABLE>
       <FIELD ID="MAIN_ID" name="MAIN_ID" ucd="meta.id;meta.main" datatype="char" arraysize="*" width="22">
          <DESCRIPTION>Main identifier for an object</DESCRIPTION>
          <LINK contentType="text/html" href="http://simbad.u-strasbg.fr/simbad/sim-id?Ident=${MAIN_ID}&amp;NbIdent=1"/>
       </FIELD>
       ...
       <FIELD ID="BIBLIST" name="BIBLIST" ucd="meta.bib" datatype="short" width="4">
           <DESCRIPTION>List of Bibcodes</DESCRIPTION>
           <LINK contentType="text/html" title="${BIBLIST} references for ${MAIN_ID}"
                 href="http://simbad.u-strasbg.fr/simbad/sim-id?bibdisplay=refsum&amp;Ident=${MAIN_ID}#lab_bib"/>
       </FIELD>
    ...
    <DATA><TABLEDATA>
       <TR>
          <TD>[VV98] J084822.3+274553</TD>
          ...
          <TD>19</TD>
       </TR>
       ...
    </TABLEDATA></DATA></TABLE></RESOURCE></VOTABLE>
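
For readers who have not implemented the LINK convention, the ${columnName} substitution described above can be sketched in a few lines of Python. The row values are taken from the example; percent-encoding the substituted value is an assumption of this sketch (the convention itself only says the column value is inserted), and the function name is hypothetical.

```python
import re
from urllib.parse import quote

# href as seen by the client after XML parsing (entities already decoded)
HREF = "http://simbad.u-strasbg.fr/simbad/sim-id?Ident=${MAIN_ID}&NbIdent=1"
row = {"MAIN_ID": "[VV98] J084822.3+274553", "BIBLIST": 19}


def expand_link(href, row):
    """Replace each ${column} with the percent-encoded value from the row."""
    def substitute(match):
        return quote(str(row[match.group(1)]), safe="")
    return re.sub(r"\$\{(\w+)\}", substitute, href)


link_url = expand_link(HREF, row)
print(link_url)
```

Because the template is a plain string, the same function handles path elements, query values and fragments alike.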
2) The same thing expressed in Datalink 1.1 (the templating alternative discussed in the Datalink author group, see the Datalink GitHub, is presented further below)

We can describe one or more links via additional RESOURCEs: 1- we know how to build the link, but with limitations (everything before the ? is constant and only key=value parameters are possible), 2- we do not know where the link should be placed (on which column), 3- we can describe the associated text, 4- we can describe the type of data returned. The knowledge of the links is given at the end of the stream, which can force the client to buffer. Finally, the client must check that these service descriptors do not use a VO standardID or multiple-choice parameters (OPTION).
    <VOTABLE><RESOURCE><RESOURCE type="result"><TABLE>
       <FIELD ID="MAIN_ID" name="MAIN_ID" ucd="meta.id;meta.main" datatype="char" arraysize="*" width="22">
          <DESCRIPTION>Main identifier for an object</DESCRIPTION>
       </FIELD>
       ...
       <FIELD ID="BIBLIST" name="BIBLIST" ucd="meta.bib" datatype="short" width="4">
           <DESCRIPTION>List of Bibcodes</DESCRIPTION>
       </FIELD>
    ...
    <DATA><TABLEDATA>
       <TR>
          <TD>[VV98] J084822.3+274553</TD>
          ...
          <TD>19</TD>
       </TR>
       ...
    </TABLEDATA></DATA></TABLE></RESOURCE>

    <RESOURCE type="meta" utype="adhoc:service" name="SimbadMainPage">
       <DESCRIPTION>Link to Simbad main page</DESCRIPTION>
       <PARAM name="accessURL" datatype="char" arraysize="*" value="http://simbad.u-strasbg.fr/simbad/sim-id"/>
       <PARAM name="contentType" datatype="char" arraysize="*" value="text/html"/>
       <GROUP name="inputParams">
           <PARAM name="Ident" datatype="char" arraysize="*" ref="MAIN_ID"/>
           <PARAM name="NbIdent" datatype="char" arraysize="*" value="1"/>
       </GROUP>
    </RESOURCE>

    <RESOURCE type="meta" utype="adhoc:service" name="SimbadBiblio">
       <DESCRIPTION>Link to Simbad biblio</DESCRIPTION>
       <PARAM name="accessURL" datatype="char" arraysize="*" value="http://simbad.u-strasbg.fr/simbad/sim-id"/>
       <PARAM name="contentType" datatype="char" arraysize="*" value="text/html"/>
        <GROUP name="inputParams">
           <PARAM name="bibdisplay" datatype="char" arraysize="*" value="refsum"/>
           <PARAM name="Ident" datatype="char" arraysize="*" ref="MAIN_ID"/>
           <!-- Not possible to describe the end of the URL : #lab_bib  -->
        </GROUP>
    </RESOURCE>

    </RESOURCE>
    </VOTABLE>
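
For comparison, here is a minimal Python sketch of what a client has to do with the first service descriptor above: read accessURL, walk the inputParams group, resolve each ref against the current row, and append everything as key=value pairs. The function name and the row dictionary are illustrative only. Note that nothing in this procedure can produce the trailing #lab_bib fragment.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

DESCRIPTOR = """
<RESOURCE type="meta" utype="adhoc:service" name="SimbadMainPage">
  <PARAM name="accessURL" datatype="char" arraysize="*"
         value="http://simbad.u-strasbg.fr/simbad/sim-id"/>
  <GROUP name="inputParams">
    <PARAM name="Ident" datatype="char" arraysize="*" ref="MAIN_ID"/>
    <PARAM name="NbIdent" datatype="char" arraysize="*" value="1"/>
  </GROUP>
</RESOURCE>
"""


def build_service_url(xml_text, row):
    """Build accessURL?key=value&... from a service descriptor and one row."""
    resource = ET.fromstring(xml_text)
    access_url = resource.find("PARAM[@name='accessURL']").get("value")
    query = []
    for param in resource.findall("GROUP[@name='inputParams']/PARAM"):
        ref = param.get("ref")
        # A ref points at a FIELD ID; its value comes from the current row.
        value = row[ref] if ref is not None else param.get("value", "")
        query.append((param.get("name"), value))
    return access_url + "?" + urlencode(query)


row = {"MAIN_ID": "[VV98] J084822.3+274553"}
service_url = build_service_url(DESCRIPTOR, row)
print(service_url)
```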
Or via the alternative templating possibility proposed in the Datalink 1.1 discussion (only the last link description is provided - thanks to François Bonnarel for his help)
    ...
    <RESOURCE type="meta" utype="adhoc:service" name="SimbadBiblio">
       <DESCRIPTION>Link to Simbad biblio</DESCRIPTION>
       <PARAM name="accessURL" datatype="char" arraysize="*" value="http://simbad.u-strasbg.fr/simbad/sim-id"/>
       <PARAM name="contentType" datatype="char" arraysize="*" value="text/html"/>
        <GROUP name="inputParams">
           <PARAM name="bibdisplay" datatype="char" arraysize="*" value="refsum"/>
        </GROUP>
        <GROUP name="endpoints" >
           <PARAM name="identifier" utype="template" datatype="char" arraysize="*" ucd="meta.code" value="Ident={?MAIN_ID*}#lab_bib"/>
        </GROUP>
    </RESOURCE>
    ...
In the end, in the context of basic direct links, we went from a LINK solution that takes one line and needs no additional dedicated structure to a solution with additional, nested, structured RESOURCEs that is 10 to 15 times more verbose, yet still cannot do the whole job. Besides the complexity, there are two main issues:

    There is no way in Datalink to say that this link is associated with this value of this row, and that other link with that other value. Datalink is only row oriented.
    Datalink can only describe URLs with key=value parameters, which implies a constant root URL (no REST URLs, for instance).

My reaction is that it will not be easy to sell this Datalink alternative, and the risk of writing/parsing errors will be proportionate to its complexity.

As far as I understand after this exercise, I tend to think that Datalink cannot and should not replace the use of the basic LINK, with or without templating, and that each method will probably keep its niche: the basic LINK for direct links associated with the values of a column, Datalink for lists of links with descriptions, SODA and more for each result row. I'm afraid that if we do not keep the classic LINK usage, we'll just see Aladin Desktop's links disappear (as happened for the NED results, for example).

Feel free to react to this open discussion, preferably on the DAL mailing list only, to avoid duplication.

Best regards
Pierre Fernique

---

PS. FYI, here is the list of clients and data providers that, to my knowledge, currently use the LINK (with or without templating). To my knowledge, none of them has shown any inclination in the last 4 years to move to the Datalink 1.0 alternative for basic direct links (except Aladin Desktop).

Clients (clearly CDS oriented):

       Aladin Desktop (Java)
       CDS portal (JS)
       Simbad Simplay (Flash)

Database services using the VOTable LINK facility with templating (${columnName} variables in the LINK URL template)

       CDS Simbad
       CDS VizieR
       CDS photometric service (VizieR SED)
       IMCCE Skybot
       LEDA Hypercat
       All HiPS providers using the progenitor facility (for access to the original images)

Some examples (in use):

1) Simbad: http://alasky.u-strasbg.fr/cgi/simbad-flat/simbad-cs.py?target=08+47+18.60770+%2B26+53+20.1212&SR=54.63&format=votable-tsv&SRUNIT=arcmin&SORTBY=nbref
    <FIELD ID="BIBLIST" name="BIBLIST" ucd="meta.bib" datatype="short" width="4"><DESCRIPTION>List of Bibcodes</DESCRIPTION><LINK value="ref (${BIBLIST})" href="http://simbad.u-strasbg.fr/simbad/sim-id?bibdisplay=refsum&Ident=${MAIN_ID}#lab_bib"/></FIELD>
2) VizieR SED: http://vizier.u-strasbg.fr/viz-bin/sed?-c=04+59+02.70698+%2B21+44+11.2538&-c.rs=5.0
        <FIELD name="_tabname" ucd="meta.table" datatype="char" arraysize="32*">
          <DESCRIPTION>Table name</DESCRIPTION>
          <LINK href="http://vizier.u-strasbg.fr/viz-bin/VizieR-5?-info=XML&amp;-out.add=.&amp;-source=${_tabname}&amp;${_ID}"/>
        </FIELD>
    ...
        <FIELD ID="sed_filter" name="_sed_filter" ucd="meta.id;instr.filter" unit="" datatype="char" width="32" arraysize="32*">
          <DESCRIPTION>Filter designation, in the form photoSystem:filterName; a designation starting by ':=' is an assumed monochromatic point; this column is empty when the frequency is specified for each data point.</DESCRIPTION>
          <LINK href="http://cdsarc.u-strasbg.fr/viz-bin/metafilter?${_sed_filter}"/>
        </FIELD>
3) VizieR: vizier.u-strasbg.fr/viz-bin/votable?-source=I%2F284%2Fout&-c=05+34+31.93920+%2B22+00+52.2000&-out.add=_RAJ,_DEJ&-oc.form=dm&-out.meta=DhuL&-out.max=999999&-c.rm=23.82&-out=_VizieR,*Mime(image/fits),*&-mime=TSV
        <FIELD name="_V" ucd="meta.ref" datatype="char" arraysize="6">
          <DESCRIPTION>Link to the VizieR record with all details</DESCRIPTION>
          <LINK href="http://vizier.u-strasbg.fr/viz-bin/VizieR-5?-info=XML&amp;-out.add=.&amp;-source=I/284/out&amp;-c=${RAJ2000}${DEJ2000}&amp;-c.eq=J2000.000&amp;-c.rs=0.5"/>
        </FIELD>
4) HiPS - HST B example: http://alasky.u-strasbg.fr/HST-hips/filter_B_hips/HpxFinder/metadata.xml
    <FIELD name="DATASET" datatype="char" arraysize="*"><LINK href="http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/AdvancedSearch/?Observation.observationID=${DATASET}" title="Full metainfo"/></FIELD>
5) IMCCE: http://vo.imcce.fr/webservices/skybot/skybotconesearch_query.php?-ep=2019-11-11&-ra=131.82753208333332&-dec=26.888922555555556&-size=109.26,109.26&-mime=votable&-out=basic&-loc=500&-search=Asteroids+and+Planets&-filter=120+arcsec&-from=Aladin
   <vot:FIELD ID="name" name="Name" ucd="meta.id;meta.main" datatype="char" arraysize="32"><vot:DESCRIPTION>Solar system object name</vot:DESCRIPTION><vot:LINK href="${ExternalLink}"/></vot:FIELD>
6) LEDA: http://leda.univ-lyon1.fr/leda/leda-aladin.cgi?type=astrores&ra=210.80242917&de=54.34875&width=1.4&height=1.4
        <FIELD name="Designation" ucd="IDENT" datatype="A" width="21">
            <DESCRIPTION>LEDA designation</DESCRIPTION>
            <LINK href="http://leda.univ-lyon1.fr/leda/querybyname.cgi?objname=${Designation}&amp;donnee=mean&amp;Submit=Submit">${Designation}</LINK>
        </FIELD>

@pdowler
Collaborator

pdowler commented Nov 20, 2019

Just adding a comment here that we more or less forgot to support a way to add a fragment to a constructed URL (in a service descriptor) in 1.0. Fragments have to come after the query string, so they cannot be in the base URL. In general, a template solution has to support path elements, query params, and fragments in a coherent way.

See this email thread on DAL mailing list for motivating use cases:
http://mail.ivoa.net/pipermail/dal/2019-November/008242.html
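
The ordering constraint mentioned above (path, then query string, then fragment) can be illustrated with a small Python sketch built on the standard urllib.parse module. The compose helper and the identifier "M31" are hypothetical; the base URLs are the ones used elsewhere in this thread.

```python
from urllib.parse import quote, urlencode, urlsplit, urlunsplit


def compose(base_url, path_segments=(), query=None, fragment=""):
    """Append path segments, a query string and a fragment, in that order."""
    scheme, netloc, path, _, _ = urlsplit(base_url)
    for segment in path_segments:
        path = path.rstrip("/") + "/" + quote(segment, safe="")
    return urlunsplit((scheme, netloc, path, urlencode(query or {}), fragment))


biblio_url = compose(
    "http://simbad.u-strasbg.fr/simbad/sim-id",
    query={"bibdisplay": "refsum", "Ident": "M31"},
    fragment="lab_bib",
)
tap_url = compose(
    "http://tapvizier.u-strasbg.fr/TAPVizieR/tap",
    path_segments=["sync"],
    query={"REQUEST": "doQuery"},
)
print(biblio_url)
print(tap_url)
```

Whatever syntax is chosen for the descriptor, it needs to give the client these three pieces separately, or a template that already places them in the right order.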

@msdemlei
Collaborator

msdemlei commented Nov 20, 2019 via email

@Bonnarel
Contributor Author

Bonnarel commented Nov 21, 2019 via email

@pdowler pdowler added TBD enhancement New feature or request labels May 4, 2020
@molinaro-m
Member

RFC6570 link (from DataLink follow up session): https://www.rfc-editor.org/rfc/rfc6570.txt

@lmichel

lmichel commented Sep 2, 2020

RFC 6570 Template Examples

Here are a few examples of templated Vizier URLs.

Using level 3 would be very valuable, since it would allow us to specify many kinds of separators and to deal with empty/undefined values.

If an inputParam

Values taken out from {inputParams}

Basic form type example

Level 1

<RESOURCE type="meta" utype="adhoc:service" name="templated_service">
   <PARAM name="accessURL" datatype="char" arraysize="*"
         value="http://simbad.u-strasbg.fr/simbad/sim-id?bibdisplay={bibdisplay}"/>
   <PARAM name="contentType" datatype="char" arraysize="*" value="text/html"/>
    <GROUP name="inputParams">
       <PARAM name="bibdisplay" datatype="char" arraysize="*" value="refsum"/>
    </GROUP>
</RESOURCE>

Expansion:

http://simbad.u-strasbg.fr/simbad/sim-id?bibdisplay=refsum

Same in level 3

<RESOURCE type="meta" utype="adhoc:service" name="templated_service">
   <PARAM name="accessURL" datatype="char" arraysize="*"
         value="http://simbad.u-strasbg.fr/simbad/sim-id{?bibdisplay}"/>
   <PARAM name="contentType" datatype="char" arraysize="*" value="text/html"/>
    <GROUP name="inputParams">
       <PARAM name="bibdisplay" datatype="char" arraysize="*" value="refsum"/>
    </GROUP>
</RESOURCE>

Expansion:

http://simbad.u-strasbg.fr/simbad/sim-id?bibdisplay=refsum

Level 3 with two variables

<RESOURCE type="meta" utype="adhoc:service" name="templated_service">
   <PARAM name="accessURL" datatype="char" arraysize="*"
         value="http://simbad.u-strasbg.fr/simbad/sim-id{?bibdisplay,author}"/>
   <PARAM name="contentType" datatype="char" arraysize="*" value="text/html"/>
    <GROUP name="inputParams">
       <PARAM name="bibdisplay" datatype="char" arraysize="*" value="refsum"/>
       <PARAM name="author" datatype="char" arraysize="*" value="Einstein"/>
    </GROUP>
</RESOURCE>

Expansion:

http://simbad.u-strasbg.fr/simbad/sim-id?bibdisplay=refsum&author=Einstein

Form type without var name

Level 1

<RESOURCE type="meta" utype="adhoc:service" name="templated_service">
   <PARAM name="accessURL" datatype="char" arraysize="*"
         value="https://vizier.u-strasbg.fr/viz-bin/Cat?{catid}"/>
   <PARAM name="contentType" datatype="char" arraysize="*" value="text/html"/>
    <GROUP name="inputParams">
       <PARAM name="catid" datatype="char" arraysize="*" value="1234"/>
    </GROUP>
</RESOURCE>

Expansion:

https://vizier.u-strasbg.fr/viz-bin/Cat?1234

Simple substitution

Level 1

<RESOURCE type="meta" utype="adhoc:service" name="templated_service">
   <PARAM name="accessURL" datatype="char" arraysize="*"
         value="http://vizier.u-strasbg.fr/viz-bin/VizieR-5?-info=XML&amp;-source=LSBG&amp;LSBG={lsbg}"/>
   <PARAM name="contentType" datatype="char" arraysize="*" value="text/html"/>
    <GROUP name="inputParams">
       <PARAM name="lsbg" datatype="char" arraysize="*" value="1234"/>
    </GROUP>
</RESOURCE>

Expansion:

http://vizier.u-strasbg.fr/viz-bin/VizieR-5?-info=XML&-source=LSBG&LSBG=1234

Path segment

Level 3

<RESOURCE type="meta" utype="adhoc:service" name="templated_service">
   <PARAM name="accessURL" datatype="char" arraysize="*"
         value="http://tapvizier.u-strasbg.fr/TAPVizieR/tap{/synOrasync}?REQUEST=doQuery"/>
   <PARAM name="contentType" datatype="char" arraysize="*" value="text/html"/>
    <GROUP name="inputParams">
       <PARAM name="synOrasync" datatype="char" arraysize="*" value="sync"/>
    </GROUP>
</RESOURCE>

Expansion:

http://tapvizier.u-strasbg.fr/TAPVizieR/tap/sync?REQUEST=doQuery

Values taken out from {Field}

Var names are prefixed with "Fields:".

This is not part of the RFC and would require specific preprocessing.

Level 3

<RESOURCE type="result">
  <TABLE>
    <FIELD ID="RAJ2000" name="RAJ2000"  datatype="double" arraysize="1"/>
    <FIELD ID="DEJ2000" name="DEJ2000"  datatype="double" arraysize="1"/>
     ...

  </TABLE>
</RESOURCE>


<RESOURCE type="meta" utype="adhoc:service" name="templated_service">
   <PARAM name="accessURL" datatype="char" arraysize="*"
         value="http://vizier.u-strasbg.fr/viz-bin/VizieR-5?-info=XML&amp;-source=PFr&amp;-c={Fields:RAJ2000}{Fields:DEJ2000}"/>
   <PARAM name="contentType" datatype="char" arraysize="*" value="text/html"/>
    <GROUP name="inputParams">
       <PARAM name="synOrasync" datatype="char" arraysize="*" value="refsum"/>
    </GROUP>
</RESOURCE>

Expansion:

http://vizier.u-strasbg.fr/viz-bin/VizieR-5?-info=XML&-source=PFr&-c=123.4-56.7
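
The preprocessing step mentioned above could look like the following Python sketch: before any RFC 6570 expansion, each {Fields:NAME} expression is replaced by the value of the FIELD with that ID in the current row. The function name and the row values are illustrative, and percent-encoding the value is an assumption of this sketch.

```python
import re
from urllib.parse import quote

# accessURL value as seen after XML parsing (entities already decoded)
TEMPLATE = (
    "http://vizier.u-strasbg.fr/viz-bin/VizieR-5"
    "?-info=XML&-source=PFr&-c={Fields:RAJ2000}{Fields:DEJ2000}"
)


def resolve_fields(template, row):
    """Replace {Fields:NAME} with the row value of the FIELD whose ID is NAME."""
    def substitute(match):
        return quote(str(row[match.group(1)]), safe="")
    return re.sub(r"\{Fields:(\w+)\}", substitute, template)


row = {"RAJ2000": 123.4, "DEJ2000": -56.7}
resolved = resolve_fields(TEMPLATE, row)
print(resolved)
```

Any expression left in the string after this pass would then go through ordinary RFC 6570 expansion against the inputParams.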

@mbtaylor
Member

mbtaylor commented Sep 2, 2020

It looks to me like client-side code to decode RFC6570 level 3 templating would not be all that hard to do. But I don't currently have an opinion about whether there is a pressing use case for this.

@lmichel

lmichel commented Sep 3, 2020

In my understanding, the main advantage of using level 3 is the management of undefined variables.

This would allow DataLink clients to omit a given parameter without worrying about the validity of the URL.

1- The following template

http://server/service{?p1,p2}

will be transformed into this if both p1 and p2 are defined

http://server/service?p1=v1&p2=v2

and into this if p1 is undefined

http://server/service?p2=v2

2- The following template

http://server/service{/p1}?catid

will be transformed into this if p1 is undefined

http://server/service?catid

whereas, at level 1,

http://server/service/{p1}?catid

will be transformed into this if p1 is undefined

http://server/service/?catid

Which may be misinterpreted
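
To check the behaviour described above, here is a minimal Python sketch of an expander for a small subset of RFC 6570: simple {var} expansion plus the level 2/3 operators "/", "?", "&" and "#", with single-valued variables only. It is not a complete implementation (no "+", "." or ";" operators, no level 4 modifiers, no list values), and the function name is hypothetical.

```python
import re
from urllib.parse import quote

# operator -> (prefix, separator, emit "name=" before each value?)
OPERATORS = {
    "":  ("",  ",", False),   # {var}   simple string expansion
    "/": ("/", "/", False),   # {/var}  path segments
    "?": ("?", "&", True),    # {?var}  form-style query
    "&": ("&", "&", True),    # {&var}  form-style query continuation
    "#": ("#", ",", False),   # {#var}  fragment
}
RESERVED = ":/?#[]@!$&'()*+,;="


def expand(template, variables):
    """Expand a template; undefined variables are dropped with their separator."""
    def expand_expression(match):
        body = match.group(1)
        operator = body[0] if body[0] in "/?&#" else ""
        names = body[len(operator):].split(",")
        prefix, separator, named = OPERATORS[operator]
        # Fragment expansion keeps reserved characters; the others encode them.
        safe = RESERVED if operator == "#" else ""
        parts = []
        for name in names:
            value = variables.get(name)
            if value is None:          # undefined variable: skip it entirely
                continue
            encoded = quote(str(value), safe=safe)
            parts.append(f"{name}={encoded}" if named else encoded)
        # If no variable is defined, the whole expression expands to nothing.
        return prefix + separator.join(parts) if parts else ""
    return re.sub(r"\{([^{}]+)\}", expand_expression, template)


print(expand("http://server/service{?p1,p2}", {"p1": "v1", "p2": "v2"}))
print(expand("http://server/service{?p1,p2}", {"p2": "v2"}))
print(expand("http://server/service{/p1}?catid", {}))
print(expand("http://server/service/{p1}?catid", {}))
```

The last two calls reproduce the difference pointed out above: with the level 3 path operator the slash disappears together with the undefined variable, whereas the level 1 form leaves a dangling "/".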

@Bonnarel
Contributor Author

Bonnarel commented Sep 4, 2020 via email

@lmichel

lmichel commented Sep 4, 2020

fixed

@Bonnarel Bonnarel mentioned this issue Nov 19, 2020
@Bonnarel
Contributor Author

Bonnarel commented Nov 26, 2020

See new PR #54 for that!


6 participants