
Support for Ariadne portal and/or ADS #34

Open
dietervu opened this issue Jun 25, 2024 · 4 comments
Comments

@dietervu
Contributor

dietervu commented Jun 25, 2024

For the Ariadne portal, an example link:

https://portal.ariadne-infrastructure.eu/resource/d37baec6fe87dcdb28108a90f0f4ea010dd6d758eef7e232dab8517e818179b9

There is an API available that exposes the metadata as JSON or XML (see the landing page links).
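For illustration, pulling that JSON down could look roughly like the sketch below. The download URL is a placeholder on my part; the real link should be taken from the record's landing page.

```python
# Minimal sketch: fetch and inspect the JSON metadata exposed for an ARIADNE record.
# The URL below is hypothetical; copy the actual "JSON" link from the landing page.
import json

import requests

JSON_URL = "https://portal.ariadne-infrastructure.eu/..."  # placeholder, see landing page


def fetch_ariadne_metadata(json_url: str) -> dict:
    """Download the JSON metadata for a single portal record."""
    response = requests.get(json_url, timeout=30)
    response.raise_for_status()
    return response.json()


if __name__ == "__main__":
    record = fetch_ariadne_metadata(JSON_URL)
    # Print a readable excerpt so the available metadata fields can be inspected.
    print(json.dumps(record, indent=2)[:1000])
```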

In fact, the part that DOG should be able to access is the PDF, which seems to be available only via the original location:

https://archaeologydataservice.ac.uk/library/browse/issue.xhtml?recordId=1178177&recordType=GreyLitSeries
(DOI: https://doi.org/10.5284/1076468)

The missing part is the ADS API (or possibly the OASIS API), which would provide access to all metadata fields in a machine-readable manner.

Also to be investigated: which sections of the ADS site are relevant?

@dietervu
Contributor Author

Preliminary answer from Julian:

It would probably help us answer your question by knowing the context and what you want to do, but here are some initial pointers.

The ARIADNE portal is an aggregator and only holds what we regard as metadata (although in some cases we ingest the whole of a dataset at record level, so it tells the full story). But generally the fuller datasets (and certainly all downloadable files) are all held by repositories such as ADS. As well as the JSON and XML downloads, there is a public SPARQL endpoint for the whole ARIADNE knowledge base. I can send you the link if you are interested.

For ADS we have a few APIs, including an OASIS one, which Tim knows all about, and we are planning to update some others. But we are also in the process of migrating our three separate search interfaces to a single search using the ARIADNE portal framework and the ARIADNE triple store, so all our metadata would then be interrogable via the ARIADNE SPARQL endpoint. (As you might imagine, these three search interfaces are confusing to users who may not know if they want an archive, a brief site record, a journal article or a report!) This will happen within the next two years, starting with ArchSearch and Archives, so within the ATRIUM timescale.

Does that help at all? If you can explain what underlies the question, we may be able to give more information.

@dietervu
Contributor Author

Adding some relevant pointers for the ARIADNE SPARQL endpoint:
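Once the endpoint URL is known, querying it should follow the standard SPARQL 1.1 protocol. A minimal sketch (the endpoint URL below is a placeholder, not one of the actual pointers):

```python
# Minimal sketch of querying a SPARQL endpoint over HTTP (SPARQL 1.1 protocol).
# ENDPOINT is a placeholder; substitute the actual ARIADNE endpoint URL.
import requests

ENDPOINT = "https://example.org/sparql"  # placeholder, not the real ARIADNE endpoint

QUERY = """
SELECT ?s ?p ?o
WHERE { ?s ?p ?o }
LIMIT 10
"""


def run_query(endpoint: str, query: str) -> dict:
    """Send a SELECT query and return the standard JSON results document."""
    response = requests.get(
        endpoint,
        params={"query": query},
        headers={"Accept": "application/sparql-results+json"},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()


if __name__ == "__main__":
    results = run_query(ENDPOINT, QUERY)
    for binding in results["results"]["bindings"]:
        print(binding["s"]["value"], binding["p"]["value"], binding["o"]["value"])
```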

@dietervu
Contributor Author

dietervu commented Sep 2, 2024

Follow-up to Tim and Julian:

Thanks a lot for the information. For us and for the short term, I think it makes the most sense to set up a concrete implementation for a simple case, where we try to process the text in a PDF report that has a DOI, such as the example I mentioned in my earlier mail: https://archaeologydataservice.ac.uk/library/browse/issue.xhtml?recordId=1178177&recordType=GreyLitSeries

Now our concrete question is: what would be the best way to extract the link to the PDF (https://archaeologydataservice.ac.uk/catalogue/adsdata/arch-882-1/dissemination/pdf/acarchae2-347153_1.pdf) when you start from the DOI (https://doi.org/10.5284/1076468)?

Worst case, we could try to parse the HTML, but maybe the OASIS API has some specific call for this?

Answer from Tim:

I'm afraid the OASIS API won't be of use here, primarily as it only returns the DOI and not the link to the specific file(s). The DOI is registered using a UID for the metadata landing page. It's the page itself that then queries another underlying database to pull out the relevant file(s). At the moment there's nothing practical we have to hand that allows an external user to extract this data. Could you try the parsing approach for now? If this proves too difficult then let me know and I'll have a think.

@MichalGawor is currently working on a plugin that is based on HTML parsing.
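As a rough sketch of that parsing approach (assuming PDF links on the landing page can simply be recognised by their ".pdf" suffix, which may not hold for every record and is not necessarily what the plugin does):

```python
# Rough sketch of the HTML-parsing fallback: resolve the DOI to the ADS landing
# page and collect links that appear to point at PDF files. The ".pdf" suffix
# heuristic is an assumption.
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup


def pdf_links_from_doi(doi_url: str) -> list[str]:
    """Follow the DOI redirect to the landing page and return PDF links found there."""
    landing = requests.get(doi_url, timeout=30)
    landing.raise_for_status()

    soup = BeautifulSoup(landing.text, "html.parser")
    links = []
    for anchor in soup.find_all("a", href=True):
        href = anchor["href"]
        if href.lower().endswith(".pdf"):
            # Resolve relative hrefs against the final landing page URL.
            links.append(urljoin(landing.url, href))
    return links


if __name__ == "__main__":
    for link in pdf_links_from_doi("https://doi.org/10.5284/1076468"):
        print(link)
```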

@dietervu
Contributor Author

dietervu commented Oct 7, 2024

First implementation available at https://alpha-dog.clarin.eu/
