-
In the Weather API client we first implemented pagination by iterating over individual items. We then changed this to iterate over pages of data, for better efficiency, since the iterator is asynchronous. Both solutions can be found in the history of this file: https://github.com/frequenz-floss/frequenz-api-weather/blob/v0.x.x/py/frequenz/client/weather/_historical_forecast_iterator.py
-
For the Reporting API client we decided to use a private method to iterate over pages. It returns a dedicated page object, which provides an iterator over individual samples. For now we only expose a public method to iterate over samples (which calls the method that iterates over pages), but it would be trivial to expose both. As long as there is no need to access the pages directly, there are no plans to expose the page iterator though.
-
Just as a reminder, the main point of iterating over pages is performance: iterating over individual items asynchronously is ~3x slower in Python 3.11:

    from typing import AsyncIterator

    # Source of data: 10 pages with 1000 items each.
    async def pages() -> AsyncIterator[list[int]]:
        for i in range(10):
            yield [i] * 1000

    # Flattens the pages into an async stream of individual items.
    async def items() -> AsyncIterator[int]:
        async for page in pages():
            for j in page:
                yield j

    # One async step per page, then a plain loop over the page.
    async def iter_pages():
        total = 0
        async for page in pages():
            for item in page:
                total += item

    # One async step per item.
    async def iter_items():
        total = 0
        async for item in items():
            total += item

Run:
And more than 4x slower in 3.12:
Iterating over pages adds only a very minor inconvenience, so if we provide only one way to iterate, I'm more inclined to provide pages, as it is the safest way to keep users from running into performance issues. I would only provide iteration over individual items as a convenience for small scripts, but it is probably not worth the extra complexity. It might also be that the performance penalty is negligible in practice; once we have an example working with a lot of data, we'll run some tests using both iteration methods to see if there is any real-world difference in performance.
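One way to run that kind of comparison is a small timing harness. The sketch below repeats the generators from the snippet above (with `total` returned so both variants can be checked against each other) and adds two hypothetical helpers, `time_it()` and `compare()`. The ratio depends on the interpreter version, page size and machine, so no numbers are claimed here.

```python
import asyncio
import time
from typing import AsyncIterator, Awaitable, Callable


async def pages() -> AsyncIterator[list[int]]:
    for i in range(10):
        yield [i] * 1000


async def items() -> AsyncIterator[int]:
    async for page in pages():
        for j in page:
            yield j


async def iter_pages() -> int:
    total = 0
    async for page in pages():
        for item in page:
            total += item
    return total


async def iter_items() -> int:
    total = 0
    async for item in items():
        total += item
    return total


async def time_it(func: Callable[[], Awaitable[int]], number: int) -> float:
    """Return the wall-clock seconds taken by `number` awaited calls of `func`."""
    start = time.perf_counter()
    for _ in range(number):
        await func()
    return time.perf_counter() - start


async def compare(number: int = 50) -> tuple[float, float]:
    # Both variants run inside the same event loop, so loop start-up
    # cost is not part of either measurement.
    t_pages = await time_it(iter_pages, number)
    t_items = await time_it(iter_items, number)
    return t_pages, t_items


if __name__ == "__main__":
    t_pages, t_items = asyncio.run(compare())
    print(f"pages: {t_pages:.4f}s  items: {t_items:.4f}s  "
          f"ratio: {t_items / t_pages:.1f}x")
```

With real data the network round trip per page would be included in both timings, which is why the gap may shrink or disappear in practice.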
-
Coming from:
I suggest making gRPC methods that retrieve pages return a wrapper object with 2 ways to iterate over the results:
For example, for the reporting API:
So list_microgrid_components_data() could return a Response object with properties like (pseudo-code using the current method names):
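A rough Python sketch of such a wrapper follows. It only illustrates the idea: the `pages` and `items` properties are assumed names, the two iteration modes (by page and by item) are inferred from the rest of this thread, and `fake_list_microgrid_components_data()` is a local stub rather than the real client method.

```python
import asyncio
from typing import AsyncIterator, Callable, Generic, TypeVar

T = TypeVar("T")


class Response(Generic[T]):
    """Hypothetical wrapper returned by a paginated client method."""

    def __init__(self, page_source: Callable[[], AsyncIterator[list[T]]]) -> None:
        # `page_source` stands in for the paginated gRPC calls.
        self._page_source = page_source

    @property
    def pages(self) -> AsyncIterator[list[T]]:
        """Iterate page by page: one asynchronous step per page."""
        return self._page_source()

    @property
    def items(self) -> AsyncIterator[T]:
        """Iterate item by item: convenient, one asynchronous step per item."""
        return self._iter_items()

    async def _iter_items(self) -> AsyncIterator[T]:
        async for page in self._page_source():
            for item in page:
                yield item


def fake_list_microgrid_components_data() -> Response[int]:
    # Stub: three pages of two items each instead of a real request.
    async def page_source() -> AsyncIterator[list[int]]:
        for i in range(3):
            yield [i] * 2

    return Response(page_source)


async def demo() -> tuple[list[list[int]], list[int]]:
    by_page = [page async for page in fake_list_microgrid_components_data().pages]
    by_item = [item async for item in fake_list_microgrid_components_data().items]
    return by_page, by_item


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Callers who care about throughput would use `pages`, while small scripts could use `items` without knowing about pagination at all.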