This repository has been archived by the owner on Jan 3, 2023. It is now read-only.

Restart active server causes itself dead occasionally #2230

Open
lipppppp opened this issue Feb 4, 2021 · 3 comments

Comments

@lipppppp
Contributor

lipppppp commented Feb 4, 2021

After restarting the active server on ssm1, the service started normally. However, the node info page shows the status of ssm1 as dead, and cmdlets cannot run on ssm1. The problem is intermittent: it does not happen on every restart, but it appears if the restart is repeated enough times. When ssm1 stopped, there were some error messages in the log.
[5 screenshots]

@lipppppp
Contributor Author

lipppppp commented Feb 4, 2021

In this case, ssm1 is still shown as dead after restarting the service on ssm1, and there is nothing abnormal in the log. It returns to normal only after the active server is restarted.
[1 screenshot]

@PHILO-HE
Member

PHILO-HE commented Feb 5, 2021

I cannot reproduce this issue. You can try to debug it. I don't think the exception reported during shutdown matters. HazelcastExecutorService#addMember adds the newly started SSM server and delivers a message to CmdletDispatcherHelper for further handling, which may be helpful in your debugging.
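
As a debugging aid, here is a minimal sketch of one ordering problem that could, in principle, leave a restarted node marked dead: the "member removed" event for the old instance is processed after the "member added" event for the new instance on the same host. This is only a hypothesis to check against, not a confirmed cause, and it is not SSM or Hazelcast code. The class `MemberRegistrySketch`, its methods, and the host-keyed map are all hypothetical names invented for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model: NOT the real HazelcastExecutorService.
class MemberRegistrySketch {
    // host name -> id of the member instance currently considered live on that host
    private final Map<String, String> liveByHost = new LinkedHashMap<>();

    // Record a newly started instance on the given host.
    void addMember(String host, String instanceId) {
        liveByHost.put(host, instanceId);
    }

    // Naive removal: drops the host entry no matter which instance left.
    void removeMemberNaive(String host, String instanceId) {
        liveByHost.remove(host);
    }

    // Guarded removal: drops the entry only if it still belongs to the departing instance.
    void removeMemberGuarded(String host, String instanceId) {
        liveByHost.remove(host, instanceId);
    }

    boolean isLive(String host) {
        return liveByHost.containsKey(host);
    }

    public static void main(String[] args) {
        // Event order under test: old instance registered, new instance registered,
        // then the late "removed" event for the old instance arrives.
        MemberRegistrySketch naive = new MemberRegistrySketch();
        naive.addMember("ssm1", "old");
        naive.addMember("ssm1", "new");
        naive.removeMemberNaive("ssm1", "old");
        System.out.println("naive:   ssm1 live = " + naive.isLive("ssm1"));

        MemberRegistrySketch guarded = new MemberRegistrySketch();
        guarded.addMember("ssm1", "old");
        guarded.addMember("ssm1", "new");
        guarded.removeMemberGuarded("ssm1", "old");
        System.out.println("guarded: ssm1 live = " + guarded.isLive("ssm1"));
    }
}
```

If logging around the real add and remove handlers shows the events arriving in this order when the node ends up dead, the hypothesis is worth pursuing; if not, it can be ruled out.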

@lipppppp
Contributor Author

lipppppp commented Feb 7, 2021

OK, I will try to debug this process. I also found that sometimes the state of the standby server is shown as normal, but in that case all tasks time out when there is no agent node in the cluster.
[2 screenshots]
