Pagination for temporal instances does not allow retrieving more than 100 instances with the same timestamp #178

Open
gregorylevilain opened this issue Jan 25, 2024 · 6 comments

Comments

@gregorylevilain

Tested with Mintaka 0.5.40 and 0.6.0

1/ Post 200 instances of a temporal entity at once.
The modifiedAt value is defined by Orion-LD. Let's say that we have modifiedAt=2024-01-25T14:46:47.158Z for all 200 instances.

2/ Use a temporal query to retrieve the instances:
/temporal/entities?id=urn:MyEntity&timeproperty=modifiedAt&timerel=after&timeAt=2024-01-25T14:46:47.157Z

Result: 100 instances are returned, with the following header:
content-range: date-time 2024-01-25T14:46:47.157-2024-01-25T14:46:47.158/*

Issue: there is no way to retrieve the next 100 instances.

  • Using timeAt=2024-01-25T14:46:47.157Z again will return the same page again
  • Using timeAt=2024-01-25T14:46:47.158Z will return nothing.
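
The dead end can be reproduced without a running broker. The following is a minimal simulation, not Mintaka's actual code: `query_after` and `PAGE_SIZE` are hypothetical names, and the model assumes only what is observed above (a cap of 100 instances per response and a strict `timerel=after` comparison).

```python
from datetime import datetime, timedelta, timezone

PAGE_SIZE = 100  # assumed cap on instances per response, as observed above


def query_after(instances, time_at, page_size=PAGE_SIZE):
    """Hypothetical model of timerel=after: strictly later than time_at, capped."""
    matching = sorted((i for i in instances if i[0] > time_at), key=lambda i: i[0])
    return matching[:page_size]


ts = datetime(2024, 1, 25, 14, 46, 47, 158000, tzinfo=timezone.utc)
# 200 instances, modelled as (modifiedAt, value) pairs sharing one modifiedAt
instances = [(ts, n) for n in range(200)]

before = ts - timedelta(milliseconds=1)     # 2024-01-25T14:46:47.157Z
first_page = query_after(instances, before)
same_page = query_after(instances, before)  # repeating the query
next_page = query_after(instances, ts)      # moving timeAt to the shared timestamp

print(len(first_page), same_page == first_page, len(next_page))  # 100 True 0
```

In this model, instances 100 to 199 are never reachable with a time-based cursor alone.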
@wistefan
Collaborator

wistefan commented Feb 9, 2024

Hi,
what would be the use-case for that? How could you get 200 instances of an entity within the same millisecond?
However, if you really need that, you would have to provide a higher limit on the request (e.g. query parameter limit=200).
I would rather suggest reviewing the use-case, since increasing the limit will also increase the memory consumption of Mintaka.
Best,
Stefan

@gregorylevilain
Author

Hi Stefan, thank you for your reply.
The use-case is the following (and is actually a real use-case that we are delivering in production): some system provides us with a batch of temporal data (let's say the electricity consumption of a building for one month). We use a temporal query to store it in Orion.
Then we read the data with Mintaka, using timeproperty=modifiedAt, because we want to retrieve only the newly added data and deal with it.
As only one query was made to Orion to ingest, let's say, 5000 temporal values for the same property of the Building entity, each temporal instance of this property has the same date in its modifiedAt attribute. Hence the described issue.

The "limit" query parameter that you mention limits the number of entities returned by Mintaka, therefore it does not help in this situation. What we need to increase is the number of returned temporal instances of a particular property of a particular entity. We have actually already forked Mintaka in order to increase this limit, but that is not a viable solution, as we would need to increase it again and again depending on the use-case. It would be better to provide a means of pagination.

Best Regards,
Greg

@wistefan
Collaborator

wistefan commented Feb 9, 2024

Hi,
I see that this does not work. However, it's somewhat a "misuse" of the temporal API. The temporal instances of a property should represent different points in time of that property. The case of multiple modifications at the same timestamp is not really specified, thus pagination through one property and its instances within the same timestamp is not defined in NGSI-LD. Is it possible to use another timeproperty like observedAt for that?
Best,
Stefan

@greglevilain

Hi wistefan, don't you think that any data that can be stored with Orion-LD should be readable through Mintaka?
The temporal instances of my property actually represent different points in time of the property, and for this they each have a different observedAt value. I have just stored several of them at the same time in Orion, therefore they all have the same modifiedAt.
There is actually another way to fix this issue (again, from my perspective, any data that can be stored with Orion-LD should be readable with Mintaka): Orion uses the same modifiedAt value regardless of the number of temporal instances inserted at once. Changing this behaviour to use the "now()" function of the database, with sufficient precision (up to the microsecond), would fix the issue.
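
To illustrate why distinct timestamps would help, here is a minimal simulation (not Orion-LD or Mintaka code). It assumes a cap of 100 instances per response and a strict `after` comparison; `query_after`, `fetch_all` and `PAGE_SIZE` are hypothetical names. The distinct timestamps are generated 1 microsecond apart to stand in for per-instance clock reads; a real clock would not strictly guarantee uniqueness.

```python
from datetime import datetime, timedelta, timezone

PAGE_SIZE = 100  # assumed cap on instances per response


def query_after(instances, time_at, page_size=PAGE_SIZE):
    """Hypothetical model of timerel=after: strictly later than time_at, capped."""
    matching = sorted((i for i in instances if i[0] > time_at), key=lambda i: i[0])
    return matching[:page_size]


def fetch_all(instances, start):
    """Page through results, using the last modifiedAt of each page as the next timeAt."""
    collected, cursor = [], start
    while True:
        page = query_after(instances, cursor)
        if not page:
            return collected
        collected.extend(page)
        cursor = page[-1][0]


base = datetime(2024, 1, 25, 14, 46, 47, 158000, tzinfo=timezone.utc)
start = base - timedelta(milliseconds=1)

# One shared modifiedAt for the whole batch (current behaviour)
shared = [(base, n) for n in range(200)]
# One modifiedAt per instance, 1 microsecond apart (assumed behaviour after the change)
distinct = [(base + timedelta(microseconds=n), n) for n in range(200)]

print(len(fetch_all(shared, start)))    # 100
print(len(fetch_all(distinct, start)))  # 200
```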

@wistefan
Collaborator

Hi,
if the observedAt is used, you can use that property to retrieve them. I think changing the modifiedAt to be set by the database is not really a good option, since from an API perspective, they are all modified at the same time. What would be your idea to paginate in your case?

@gregorylevilain
Author

Hi wistefan,
I can imagine the following solutions for this.

Let's take my previous example.

  • timeproperty=modifiedAt
  • A temporal entity that has 200 temporal instances with the same modifiedAt=2024-01-25T14:46:47.158Z, and 50 temporal instances with modifiedAt=2024-01-25T14:46:47.159Z

As a result of the first query, Mintaka should return the first 200 temporal instances (with modifiedAt=2024-01-25T14:46:47.158Z) because they have the same temporal value, instead of limiting to 100.

  • the returned content-range header is still the following: date-time 2024-01-25T14:46:47.157-2024-01-25T14:46:47.158/*
  • the returned HTTP status code is 206

This allows querying the next "page" of results without missing any value with the following query:
/temporal/entities?id=urn:MyEntity&timeproperty=modifiedAt&timerel=after&timeAt=2024-01-25T14:46:47.158Z
This query will then return the 50 remaining temporal instances.
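
A minimal sketch of this proposal, under the same assumptions as the example (a nominal page size of 100 and a strict `after` comparison). `query_after_grouped` and `PAGE_SIZE` are hypothetical names, not Mintaka code: the page is cut at the nominal size, then extended with every remaining instance that shares the last timestamp of the page.

```python
from datetime import datetime, timedelta, timezone

PAGE_SIZE = 100  # assumed nominal page size


def query_after_grouped(instances, time_at, page_size=PAGE_SIZE):
    """Hypothetical paging rule: never split a group of equal timestamps across pages."""
    matching = sorted((i for i in instances if i[0] > time_at), key=lambda i: i[0])
    page = matching[:page_size]
    if page:
        last_ts = page[-1][0]
        # Extend the page with the remaining instances that share its last timestamp
        page += [i for i in matching[page_size:] if i[0] == last_ts]
    return page


t158 = datetime(2024, 1, 25, 14, 46, 47, 158000, tzinfo=timezone.utc)
t159 = t158 + timedelta(milliseconds=1)
t157 = t158 - timedelta(milliseconds=1)

# 200 instances at .158 and 50 at .159, modelled as (modifiedAt, value) pairs
instances = [(t158, n) for n in range(200)] + [(t159, n) for n in range(200, 250)]

page1 = query_after_grouped(instances, t157)          # timeAt=...47.157Z
page2 = query_after_grouped(instances, page1[-1][0])  # timeAt=...47.158Z
page3 = query_after_grouped(instances, page2[-1][0])  # timeAt=...47.159Z

print(len(page1), len(page2), len(page3))  # 200 50 0
```

The trade-off is that a page can grow beyond the nominal size, so memory use depends on the largest group of instances sharing a timestamp.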

(Note: for the time being, to work around this issue, we have forked both Orion-LD and Mintaka to allow timestamp precision up to the microsecond, and we use the Postgres "clock_timestamp()" function to set the value of the "modifiedAt" attribute at modification time. This conveniently fixes the issue in our case.)
