Fixes the numbering scheme in the XOAI resumption token cursor #81
Closed
- Move from org.dspace to io.gdcc group
- Make version a variable
- Make Maven use up-to-date plugins via plugin management
- Add release profile with necessary plugins for releases to Maven Central
This adds the code as-is with the exception of the packages being renamed to the io.gdcc namespace.
- Add README note
- Pull extractors from the library into our codebase
- Replace custom made Hamcrest XPathMatcher with XMLUnit one
- Add missing package-info.java with hints about origin
…ing release only) This is due to being able to install the JARs locally without a published version of the main/parent POM, as this might fail due to broken submodules. Might be reverted later.
The necessary bits of the lyncode/xml-io library were moved to our submodule xoai-xmlio to decouple us from the non-maintained upstream XML library.
- The code had some explicit generics written out that have been removed as unnecessary.
- Some class vars were assigned via constructor but not set final when never changed after the fact.
- A varargs function has been updated with a compiler hint about its safe usage.
- Move stax2-api to newer version via parent POM
- Update StAX2 Parser Woodstox to latest version from FasterXML
- Make the parser scope runtime and optional to allow swapping for different version (appserver provided etc) or even switch to other implementation (like Aalto)
- Make class variables final where possible
- Remove explicit, unnecessary generics
- Remove some minor explicit and unnecessary modifiers like public for interface etc
Adapt xmlio.XmlWriter to implement AutoCloseable and make use of it in xml.XmlWriter by using a try-with-resources to avoid missing close() calls.
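A minimal sketch of the pattern this commit describes, not the actual xmlio.XmlWriter code: `SketchXmlWriter` and `SketchXmlWriterDemo` are hypothetical names made up for illustration. It only shows how implementing AutoCloseable lets try-with-resources call close() on every exit path.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical stand-in for a writer class; not the real xmlio.XmlWriter API.
class SketchXmlWriter implements AutoCloseable {
    private final Writer out;
    private boolean closed = false;

    SketchXmlWriter(Writer out) {
        this.out = out;
    }

    // Writes a single element; no escaping is done in this sketch.
    void writeElement(String name, String text) throws IOException {
        out.write("<" + name + ">" + text + "</" + name + ">");
    }

    boolean isClosed() {
        return closed;
    }

    @Override
    public void close() throws IOException {
        closed = true;
        out.close();
    }
}

class SketchXmlWriterDemo {
    // close() is invoked automatically when the try block exits,
    // whether normally or via an exception.
    static String render(String name, String text) {
        StringWriter buffer = new StringWriter();
        try (SketchXmlWriter writer = new SketchXmlWriter(buffer)) {
            writer.writeElement(name, text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toString();
    }
}
```

Callers no longer need a finally block with an explicit close() call, which is the kind of omission the commit is trying to rule out.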
Replace custom made Hamcrest XPathMatcher with XMLUnit one.
- Remove the usages of Commons Lang3 from xoai-data-provider
- Replace usages of random String generation with custom random generator living inside xoai-common util package io.gdcc.xoai.util.Randoms
…nextEvent() This is necessary to use these basic routines within the EchoElement, as we will not only read from a String there, but make it capable of reading from an InputStream, too.
Before, the XML string sent to the EchoElement was stuffed into an XmlReader with a ByteArrayInputStream. Now we extend this to be capable of reading from an arbitrary InputStream when given on object creation.
This commit also adds benchmarking tests using JMH to learn about the speed decrease that parsing the XML from the input stream causes. It gets compared to the "native" copy of input to output as seen in Dataverse. Run it via `mvn -Pbenchmark clean test`
Another small change happened, too: instead of the legacy Stack class we now use the replacement "Deque".
This implementation of FilterInputStream has been copied from Apache POI or, more precisely, its origin at Inbot. Currently, the Dataverse OAI-PMH data provider uses this filter to remove the XML declaration from the pregenerated XML metadata files. It is being added here, but flagged as deprecated, purely for benchmarking reasons.
This is a fast alternative to EchoElement, which does not do any XML parsing before it copies XML data from an InputStream into the XmlWriter. It takes care of tricking the writer into accepting the data without further ado, but REQUIRES that the writer already contains a wrapping element (you cannot write at root with this).
…esent Dataverse usage
…XOAIMetadata All of these share the common interface XmlWritable, so we can store the element to write out as metadata as this type (no switch or if necessary). This also moves the data handling to using classes - they create the data and the XML writables, this class is just used for the modeling of a fluent API. It is linked to the creation of items, which are generated by the application using this library via the repository interfaces. The application may decide how to fill in metadata when creating items.
When adding new transformers, ignore null ones and proceed. Simplifies adding transformers from context and metadata format in data-provider.
…fault Instead of having to override the function unnecessarily, provide a default that returns an empty list
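As a rough illustration of the idea only: the commit title is truncated, so the interface and method below (`SketchFilterProvider`, `getFilters()`) are hypothetical placeholders and not the ones the commit touched. The sketch shows how a Java default method spares implementers an override that would only return an empty list.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical interface; the real interface and method names are not known from the commit title.
interface SketchFilterProvider {
    // Implementers that have nothing to contribute simply inherit this.
    default List<String> getFilters() {
        return Collections.emptyList();
    }
}

// Needs no override at all.
class MinimalProvider implements SketchFilterProvider {
}

// Overrides only because it actually has something to return.
class CustomProvider implements SketchFilterProvider {
    @Override
    public List<String> getFilters() {
        return Collections.singletonList("only-public-items");
    }
}
```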
Instead of an Item, make the methods that send only an identifier return an object of type ItemIdentifier, to make it clearer that this does not carry metadata: this interface does not expose a getMetadata(), which is done with the Item interface only. The getItem() method is changed to send along a metadata format, so the application can expose pregenerated or cached metadata within the specific format. See also classes CopyElement and Metadata.copyFromStream()
- Shell out the XSL pipeline handling to a new MetadataHelper to avoid duplicated code in GetRecordHandler and ListRecordsHandler
- Send metadata format via ItemRepository.getItem(identifier, format) to retrieve an item filled with metadata
- The refactored Metadata class makes it possible to distinguish whether the underlying data needs processing or not. Reusing this here to skip the XSL pipeline when unnecessary. It's up to the application to provide valid metadata that creates a validatable OAI-PMH response!
- Skipping the processing allows for pregeneration/caching of potentially large metadata like DDI codebooks
- Refactor InMemoryItem and InMemoryItemRepo with a more concise API
- Extend the GetRecordHandler and ListRecordsHandler tests:
  - Add explicit test example cases that include non-deleted, random metadata items
  - Add explicit test example cases that include non-deleted items associated with a CopyElement and InputStream
  - Verify the correct existence, but do not validate the OAI-PMH response (yet?)
No one should use this FilterInputStream, so we exclude it from being shipped with the xoai-common JAR. As it's only being used for EchoElementBenchmark, it can happily live within src/test
20 & 21 - skip metadata pipeline
- java.util.Date has many flaws and should not be used anymore
- Changing all necessary classes and tests to use Instants
- Simplified the implementation of UTCDateProvider
- Added lots of tests for the date provider class to ensure compatibility
See also: https://stackoverflow.com/a/59940399
- java.util.Date has many flaws and should not be used anymore
- Changing all necessary classes and tests to use Instants
See also: https://stackoverflow.com/a/59940399
- java.util.Date has many flaws and should not be used anymore
- Changing all necessary classes and tests to use Instants
See also: https://stackoverflow.com/a/59940399
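For readers unfamiliar with the migration, here is a small sketch of what moving from java.util.Date to java.time.Instant can look like for UTC datestamps at second granularity. `InstantSketch` and its methods are hypothetical and are not the library's UTCDateProvider; only the java.time calls are real.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.Date;

// Hypothetical helper for illustration; not the library's UTCDateProvider.
class InstantSketch {
    // Formats as a UTC timestamp with second granularity, e.g. 2022-01-31T12:00:00Z.
    static String format(Instant instant) {
        return instant.truncatedTo(ChronoUnit.SECONDS).toString();
    }

    // Parses an ISO-8601 UTC timestamp such as 2022-01-31T12:00:00Z.
    static Instant parse(String text) {
        return Instant.parse(text);
    }

    // Bridge for callers that still hold a legacy Date.
    static Instant fromLegacy(Date date) {
        return date.toInstant();
    }
}
```

Instant is immutable and always expressed in UTC, so the time zone and mutability pitfalls of Date do not arise in the first place.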
…rovider interface #19 Instead of creating an instance every time, let's just use static methods. Deleting the implementation and sticking with the interface makes it still changeable.
This is a change originally done by @mike-podolskiy90 in commit f0445e0. It's slightly extended with: 1) do this for ListSets and ListIdentifiers, too, and 2) only add the number if there are results by checking in the ResumptionTokenHelper
19 replace time and 8 GBIF change with totalResults
… spec says the first position must be 0, not 1. (#30)
copy-and-pasting from #30:
The OAI spec says the cursor position should start with 0; the XOAI implementation starts with 1.
I.e., currently the resumption token under the 1st page of results looks like this:
cursor="1"
It should instead say
cursor="0"
etc.
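To make the numbering concrete: in OAI-PMH the cursor counts the elements of the complete list returned so far, so it is 0 on the first page. The sketch below assumes a fixed page size; `CursorSketch`, `cursorForPage` and `cursorAttribute` are hypothetical names and are not the XOAI ResumptionTokenHelper.

```java
// Hypothetical helper for illustration; not the XOAI ResumptionTokenHelper.
class CursorSketch {
    // Zero-based cursor: number of items returned before the given page.
    // pageIndex is itself zero-based (0 = first page); a fixed page size is assumed.
    static long cursorForPage(int pageIndex, int pageSize) {
        if (pageIndex < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageIndex must be >= 0 and pageSize > 0");
        }
        return (long) pageIndex * pageSize;
    }

    // Renders just the cursor attribute as it would appear on the resumptionToken element.
    static String cursorAttribute(int pageIndex, int pageSize) {
        return "cursor=\"" + cursorForPage(pageIndex, pageSize) + "\"";
    }
}
```

With a page size of 100, the first page yields cursor="0", the second cursor="100", and so on.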