Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix the ingest rate displayed in the CLI. #4682

Merged
merged 3 commits into from
May 10, 2024
Merged

Fix the ingest rate displayed in the CLI. #4682

merged 3 commits into from
May 10, 2024

Conversation

fulmicoton
Copy link
Contributor

The CLI was showing the average rate since the CLI was started.

This PR changes it to an estimation over the last 2 seconds.

@fulmicoton fulmicoton force-pushed the cli-ingest-rate branch 3 times, most recently from 4493938 to aa56186 Compare March 6, 2024 04:47
@trinity-1686a
Copy link
Contributor

I tried it and it seems somehow broken. After a while, I get this:

⠋ [00:20:00] 20.82 GiB/20.82 GiB (91.4 MiB/s)

Ingestion rate should be in the order of 17MiB/s, but the number I got never seemed to go below 40MiB/s

@fulmicoton
Copy link
Contributor Author

oh really? let me check! Thanks for testing it!

@trinity-1686a
Copy link
Contributor

if that helps, it seems to only take a few discrete values, a few of which are 51.4, 45.7, 40, 62.8, 57.1, 74.2, that seems to be increments of 5.7 from each other

@fulmicoton fulmicoton force-pushed the cli-ingest-rate branch 6 times, most recently from 29ec0bc to 9235400 Compare March 8, 2024 07:28
@fulmicoton fulmicoton requested a review from guilload March 8, 2024 07:36
@fulmicoton
Copy link
Contributor Author

@trinity-1686a There was a bug in the rate estimator indeed. I added @guilload to the review.

@trinity-1686a
Copy link
Contributor

the estimator always seems to give a multiple of 5.71 for this workload. This is because it essentially count batches of (quasi-)fixed size, and there aren't that many batches processed over a 2s window. In my case, there are almost always 3 batches processed, so I get 17.1MiB most of the time, and 11.4 or 22.8 when 2 or 4 batches got processed.
Maybe for that estimator, it would be better to count the time it took to send the last X batches, rather than to count how many batches were sent in X seconds.

fulmicoton and others added 3 commits May 10, 2024 11:14
The CLI was showing the average rate since the CLI was started.

This PR changes it to an estimation over the last 2 seconds.
The rate estimator was returning inflated results whenever
we did not record any work during a bucket period.

The fix does two things:
- It replace the 1-bit color logic `(bucket_ord / num_buckets) % 2`` to a more generic and more accurate 8 bit hash `bucket_ord % 243`.

Collision may only happen if no write targetted this specific bucket for `num_buckets x 243 x bucket_period`.

Upon read, it checks for the hash and dismisses buckets that are not matching the hash.
@fulmicoton
Copy link
Contributor Author

I think there are different ways to improve the estimation, but I'll merge this for the bugfix. @trinity-1686a feel free to revisit it.

@fulmicoton fulmicoton merged commit fab785c into main May 10, 2024
5 checks passed
@fulmicoton fulmicoton deleted the cli-ingest-rate branch May 10, 2024 02:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants