
asynchronous close(batchPipeline) in kafka/main.go can lead to panic #706

Open
ibukanov opened this issue May 20, 2024 · 0 comments
Labels
bug Something isn't working

Comments

@ibukanov
Collaborator

StartConsumers() in kafka/main.go calls close(batchPipeline) when readAndCommitBatchPipelineResults() returns an error. But this triggers a panic and terminates the process if processMessagesIntoBatchPipeline() then posts a new message to the batchPipeline channel.

In general, the logic for handling transient errors is broken: processMessagesIntoBatchPipeline() can only terminate via panic, which kills the process. If this is the intended behavior, then the code that calls StartConsumers() as a recovery should be removed.
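A minimal sketch of the failure mode (the message type and helper name are hypothetical, not the actual pipeline code): a goroutine that keeps sending to batchPipeline panics with "send on closed channel" as soon as the consumer side closes the channel, and without recover that panic takes down the whole process.

```go
package main

import "fmt"

// sendUntilClosed keeps writing to ch, the way the message-processing
// goroutine keeps posting to batchPipeline. Here it recovers the panic and
// returns its message; the real code has no recover, so the panic is fatal.
func sendUntilClosed(ch chan string) (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	for {
		ch <- "message" // panics once ch is closed
	}
}

func main() {
	batchPipeline := make(chan string)
	result := make(chan string)

	go func() { result <- sendUntilClosed(batchPipeline) }()

	<-batchPipeline       // consume one message; the producer loops for more
	close(batchPipeline)  // what the error path does to "restart" the readers
	fmt.Println(<-result) // "send on closed channel"
}
```

Closing a channel while a sender is still active (or blocked in a send) always panics the sender; there is no safe asynchronous close from the receiving side.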

@ibukanov ibukanov added the bug Something isn't working label May 20, 2024
ibukanov added a commit that referenced this issue May 21, 2024
The current code closes the batchPipeline channel when it receives transient
errors to restart the readers. But this does not close the Kafka readers, so
they continue to read. When that happens, a message is written to a closed
channel, triggering a panic in processMessagesIntoBatchPipeline. As that does
not use recover, this crashes the whole process.

This change refactors error handling to ensure that errors are properly
propagated to the caller while also supporting context cancellation, which
allowed extended test coverage to be added. The readers are always closed on
errors.

CLOSE #706
ibukanov added a commit that referenced this issue May 22, 2024
The current code closes the batchPipeline channel when it receives transient
errors to restart the readers. But this does not close the Kafka readers, so
they continue to read. When that happens, a message is written to a closed
channel, triggering a panic in processMessagesIntoBatchPipeline. As that does
not use recover, this crashes the whole process.

This change refactors error handling to ensure that errors are properly
propagated to the caller while also supporting context cancellation, which
allowed extended test coverage to be added. Kafka readers/writers are always
closed on errors.

This change still uses an explicit exit on errors and lets the container
runtime restart the process.

CLOSE #706
ibukanov added a commit that referenced this issue May 23, 2024