Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pool: Correct endpoint waitgroup logic. #359

Merged
merged 1 commit into from
Sep 14, 2023

Conversation

davecgh
Copy link
Member

@davecgh davecgh commented Sep 14, 2023

This requires #358.

This corrects the waitgroup logic in endpoint handling to ensure it properly terminates when the context is canceled.

It also enables the wait at the end of the endpoint test to help prove correctness.

Copy link
Member

@jholdstock jholdstock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's an issue here which results in dcrpool never shutting down.

  1. connect is in the process of creating a new client
  2. ctx cancelled
  3. disconnect calls cancel on every client in the list
  4. connect adds new client to the list, it is never canceled.

I triggered this condition by adding a time.Sleep immediately after the call to NewClient and CTRL+C'ing dcrpool whilst the sleep was running.

@davecgh
Copy link
Member Author

davecgh commented Sep 14, 2023

There's an issue here which results in dcrpool never shutting down.
1. connect is in the process of creating a new client
2. ctx cancelled
3. disconnect calls cancel on every client in the list
4. connect adds new client to the list, it is never canceled.

I triggered this condition by adding a time.Sleep immediately after the call to NewClient and CTRL+C'ing dcrpool whilst the sleep was running.

I suspect it's something else, because the new client is created with the context that is already canceled, so it's already canceled.

This corrects the waitgroup logic in endpoint handling to ensure it
properly terminates when the context is canceled.

It also enables the wait at the end of the endpoint test to help prove
correctness.
@davecgh davecgh force-pushed the pool_correct_endpoint_waitgroup branch from aa6fc48 to 413e190 Compare September 14, 2023 10:00
@davecgh
Copy link
Member Author

davecgh commented Sep 14, 2023

For reference later when anyone else is reviewing these PRs. We tracked the issue down offline and @jholdstock submitted #367 to resolve the actual root cause of what he was seeing.

@jholdstock
Copy link
Member

Retested with the changes from #367 and this is working well now.

@jholdstock jholdstock merged commit e0953d0 into decred:master Sep 14, 2023
2 checks passed
@davecgh davecgh deleted the pool_correct_endpoint_waitgroup branch September 14, 2023 10:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants