-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
p2p: resolved deadlock on p2p server shutdown #2183
Conversation
Could you also post the deadlock callstack? |
p2p/server.go
Outdated
@@ -448,18 +445,7 @@ func (srv *Server) Stop() { | |||
} | |||
close(srv.quit) | |||
srv.lock.Unlock() | |||
|
|||
stopChan := make(chan struct{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The previous code tries to skip srv.loopWG.Wait()
if stopTimeout
reach, maybe this is to solve issues before?
Is it ok just remove above case <-cs.handler.stopCh:
, and keep this logic unchanged?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it's ok but I removed because:
- this is hiding potential deadlock in the future
- there is no right timeout value, it was 5s but for graceful shutdown of node with 2k connection is much more
I am ok with returning it but increasing timeout to like 30s or something, but still, geth has nothing like that in their code and they just wait
f08eda2
to
f0d9f61
Compare
Description
This PR removes forceful stop of p2p server that was caused by deadlock on channels. In
protoTracker()
we should be waiting for signals from allhandlerDoneCh
, but with return onstopCh
we returned and protocol handler got stuck on sending signal tohandlerDoneCh
. This changes removes this deadlock and also part of the code from p2p server that forcefully killed the server.