Trace API plugin response times vary by block height #1219
Comments
This is not unexpected, although the actual times are larger than I would have expected. The |
Yeah, the addition of something like 200ms is a lot more than I expected would occur, especially on faster drives like NVMe.

Would it be possible to use an index in the

I'm not opposed to dropping the

FWIW - The reason why we'd like this improvement is that our history solution (Roborovski) processes data sequentially from the trace_api like this. In real time, as it's keeping up with head/lib, this isn't too much of a problem, but when replaying the entire chain to create a new instance, this bottleneck significantly slows down processing time (e.g. going from 3 days to 1 month to replay). |
I do think using the block number from the file name would provide quick determination of the correct file. Or we could create a meta index of block ranges to files. |
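To make the two ideas in this comment concrete, here is a minimal sketch in Python (not the plugin's actual C++ code). It shows a "meta index" of block ranges searched with bisection, and a direct computation of the slice file from the block number and stride. The file name pattern, the `SliceRange` tuple, and all function names are assumptions for illustration only; the real on-disk naming may differ.

```python
import bisect
from typing import List, Optional, Tuple

# (first_block, end_block_exclusive, file_name); hypothetical layout
SliceRange = Tuple[int, int, str]


def build_meta_index(ranges: List[SliceRange]) -> Tuple[List[int], List[SliceRange]]:
    """Sort the ranges once and keep a parallel list of start blocks."""
    ordered = sorted(ranges)
    starts = [r[0] for r in ordered]
    return starts, ordered


def find_file(starts: List[int], ordered: List[SliceRange], block_num: int) -> Optional[str]:
    """Binary-search the meta index for the file that covers block_num."""
    i = bisect.bisect_right(starts, block_num) - 1
    if i < 0:
        return None
    first, end, name = ordered[i]
    return name if first <= block_num < end else None


def file_from_stride(block_num: int, stride: int = 10_000) -> str:
    """Derive the slice file name directly from the block number.

    The name pattern here is an assumption, not taken from this thread.
    """
    first = (block_num // stride) * stride
    return f"trace_{first:010d}-{first + stride:010d}.log"
```

Either approach only picks the right file quickly; it does not by itself remove any per-block scan inside that file.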
Options to address this:
However, given the limited audience that this issue impacts, and the available workaround of setting a smaller stride size, we're going to close this issue for now, pending additional reports. |
During some history data processing on Jungle4, I came across some pretty consistent API performance differences while using the v1/trace_api/get_block API call. It appears that the block height being queried significantly affects the amount of time/processing required to return the data for the call.

I wrote a script to run some additional tests to reproduce my findings, and below is a chart that illustrates the performance degradation I'm seeing consistently based on block height.
https://docs.google.com/spreadsheets/d/1DRaY4NsE4nU_nZ1Al8x926aVeYYthuq4qP3RumUc9xA/edit?usp=sharing
The red line is the trace_api and the blue line is the chain_api; both are measured in the milliseconds it took to respond to the API call. The test I ran was to sequentially query 100,000 blocks, using v1/chain/get_block as a baseline, followed by the same sequence of blocks against the v1/trace_api/get_block endpoint. The test starts at height 39890000 and ends at 39990000.

It's worth noting that any sequence will yield the same results; this was just a random sampling.
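For readers who want to repeat this kind of measurement without the original script, here is a rough Python sketch of the same idea: time each call for a sequence of blocks, then average the timings by position within the stride. The base URL is a placeholder, the request payloads (`block_num_or_id` and `block_num`) are my assumption about what each endpoint expects, and all function names are made up for this example.

```python
import json
import time
import urllib.request
from typing import Callable, Dict, Iterable, List

BASE_URL = "http://127.0.0.1:8888"  # placeholder; point at your own node


def post_json(path: str, payload: dict, base_url: str = BASE_URL) -> bytes:
    """POST a JSON body and return the raw response bytes."""
    req = urllib.request.Request(
        base_url + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.read()


def fetch_chain_block(block_num: int) -> bytes:
    # Payload shape is an assumption
    return post_json("/v1/chain/get_block", {"block_num_or_id": block_num})


def fetch_trace_block(block_num: int) -> bytes:
    # Payload shape is an assumption
    return post_json("/v1/trace_api/get_block", {"block_num": block_num})


def time_blocks(fetch: Callable[[int], object], block_nums: Iterable[int]) -> Dict[int, float]:
    """Call fetch for each block sequentially; return elapsed milliseconds per block."""
    timings: Dict[int, float] = {}
    for n in block_nums:
        start = time.perf_counter()
        fetch(n)
        timings[n] = (time.perf_counter() - start) * 1000.0
    return timings


def mean_by_stride_bucket(timings: Dict[int, float], stride: int = 10_000, buckets: int = 10) -> List[float]:
    """Average latency grouped by where the block falls within its stride."""
    sums = [0.0] * buckets
    counts = [0] * buckets
    for n, ms in timings.items():
        b = (n % stride) * buckets // stride
        sums[b] += ms
        counts[b] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]
```

Calling `time_blocks(fetch_trace_block, range(39890000, 39990000))` and passing the result to `mean_by_stride_bucket` should show whether latency climbs from the first bucket to the last, as it does in the chart.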
You can see the results from the chain_api are consistently very fast (<10ms), while the results from the trace_api seem to range from very fast (<10ms) to very slow (200ms). The performance seems to be correlated with the block number, cycling every 10,000 blocks. At block 10,000 the performance is at its best and at block 19,999 it is at its worst, then the cycle repeats.

A few things to note about these results:
trace-no-abis = true.

Theories
After talking about this with the team at length and trying to rule out anything in our own setup/configuration, one thing we noticed was that the trace_api stride size matches exactly with which blocks perform the best and which perform the worst. Each file the trace_api outputs has 10,000 blocks in it by default.

Our working theory at this point is that block 10,000 is fast because it's at the beginning of the file, while block 19,999 is at the end of the file, and that seeking through the file could be the reason for the delay in API response times. It's a guess at this point though, since we haven't looked into how these files and their pointers work and had no previous understanding of them.
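The arithmetic behind this theory can be sketched in a few lines of Python. This is only a model of the guess above: it assumes, without confirmation from the plugin's source, that finding a block means scanning past every earlier entry in the same slice file. The function names are made up for the example.

```python
STRIDE = 10_000  # default number of blocks per trace file, as noted above


def position_in_stride(block_num: int, stride: int = STRIDE) -> int:
    """Zero-based position of the block inside its slice file."""
    return block_num % stride


def entries_scanned_linear(block_num: int, stride: int = STRIDE) -> int:
    """Entries touched by a hypothetical linear scan from the start of the slice."""
    return position_in_stride(block_num, stride) + 1


for n in (10_000, 15_000, 19_999, 20_000):
    print(n, position_in_stride(n), entries_scanned_linear(n))
```

Under this model, block 10,000 touches one entry and block 19,999 touches 10,000, which matches the sawtooth in the chart. It would also explain why a smaller stride lowers the worst case.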
Expected results
In this situation, with 99% empty blocks, our expectation of how the trace_api should function is that regardless of the block number, the time to return results should be consistent (much like the chain_api response times).

Reproduction
After discovering this issue with our history services, I then reproduced the results using this script:
https://github.com/aaroncox/getblocktest
This script has a hardcoded IP address for the Jungle4 server I was using, which is running the trace_api plugin, so it can be run by anyone looking to see the results in real time. The APIClient defined on line 3 can also be edited to point at any other server running a similar configuration. Using Node.js v18+, these tests can be run by: