-
Notifications
You must be signed in to change notification settings - Fork 111
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: message count read in forwarder #1030
Conversation
Signed-off-by: veds-g <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we separately track control messages?
perhaps let's publish total read metrics and also data messages, the difference should be ctrl mesgs. |
Should we enable multi-partition for the rater e2e test pipeline so that we catch similar issues in future? https://github.com/numaproj/numaflow/blob/main/test/e2e/functional_test.go#L129 |
I think the ctrl message is the implementation details of Numaflow so we probably shouldn't expose it to users... Now that we exclude the ctrl messages from the read total metric, it's accurate from the user's standpoint that the pipeline read N data messages. An issue with the current approach could be the pipeline gets throttled because of high load and user can't see the real load from the metric because ctrl msgs are excluded but that won't happen because ctrl msgs are very small portion of a high throughput pipeline. |
We should expose metrics even though it is not useful to users. It is quite important for us to debug if we are seeing higher than normal ctrl messages. We expose a lot of metrics that are not valid to users, e.g., WAL count, etc. |
Signed-off-by: veds-g <[email protected]>
@veds-g why not just expose total read messages rather than just ctrl-messages? |
no particular reason, I thought we wanted control separately. Should I go for the total instead? |
go for the total because it represents everything. |
Signed-off-by: veds-g <[email protected]>
Signed-off-by: veds-g <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. @KeranYang please review
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. TY for fixing this issue. There is one comment that I want to address but shouldn't be a blocker for merging this PR. I will work on it myself when I have time.
Signed-off-by: veds-g <[email protected]> Signed-off-by: Dillen Padhiar <[email protected]>
Signed-off-by: veds-g <[email protected]>
Fixes #1026
readMessagesCount
inforward.go
was adding both data messages as well as higher control messages(for low throughput pipelines with multiple partitions) leading to a higher processing rate calculated and shown for the sink vertexcount of messages read now is just based on data messages