
Cassandra nodes become unreachable to each other #171

Closed
behko opened this issue Dec 9, 2018 · 4 comments

Comments


behko commented Dec 9, 2018

I have 3 Elassandra nodes running in Docker containers.

The containers were created like this:

```console
# Host 10.0.0.1
docker run --name elassandra-node-1 --net=host -e CASSANDRA_SEEDS="10.0.0.1" -e CASSANDRA_CLUSTER_NAME="BD Storage" -e CASSANDRA_DC="DC1" -e CASSANDRA_RACK="r1" -d strapdata/elassandra:latest

# Host 10.0.0.2
docker run --name elassandra-node-2 --net=host -e CASSANDRA_SEEDS="10.0.0.1,10.0.0.2" -e CASSANDRA_CLUSTER_NAME="BD Storage" -e CASSANDRA_DC="DC1" -e CASSANDRA_RACK="r1" -d strapdata/elassandra:latest

# Host 10.0.0.3
docker run --name elassandra-node-3 --net=host -e CASSANDRA_SEEDS="10.0.0.1,10.0.0.2,10.0.0.3" -e CASSANDRA_CLUSTER_NAME="BD Storage" -e CASSANDRA_DC="DC1" -e CASSANDRA_RACK="r1" -d strapdata/elassandra:latest
```
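For a `--net=host` cluster like this, one quick sanity check is whether each host can actually reach the others' inter-node gossip port (7000) and CQL port (9042) — these are Cassandra's defaults, and the target IPs below are the ones from the commands above. A sketch, run from host 10.0.0.1 (`nc` flags vary slightly between netcat variants):

```shell
# From host 10.0.0.1: probe the gossip (7000) and CQL (9042) ports of the
# other two hosts; a "succeeded"/"open" result means the port is reachable.
for ip in 10.0.0.2 10.0.0.3; do
  for port in 7000 9042; do
    nc -vz -w 2 "$ip" "$port"
  done
done
```

If 7000 is blocked between hosts, gossip cannot converge and every peer will show as DN regardless of container state.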

The cluster worked fine for a couple of days after it was created; Elasticsearch and Cassandra were both perfect.

Currently, however, all Cassandra nodes have become unreachable to each other. `nodetool status` on every node looks like this:

```
Datacenter: DC1

Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load       Tokens  Owns (effective)  Host ID                               Rack
DN  10.0.0.3  11.95 GiB  8       100.0%            7652f66e-194e-4886-ac10-0fc21ac8afeb  r1
DN  10.0.0.2  11.92 GiB  8       100.0%            b91fa129-1dd0-4cf8-be96-9c06b23daac6  r1
UN  10.0.0.1  11.9 GiB   8       100.0%            5c1afcff-b0aa-4985-a3cc-7f932056c08f  r1
```

The UN entry is always the current host (here 10.0.0.1); the picture is the same on all other nodes.

`nodetool describecluster` on 10.0.0.1 shows:

```
Cluster Information:
    Name: BD Storage
    Snitch: org.apache.cassandra.locator.GossipingPropertyFileSnitch
    DynamicEndPointSnitch: enabled
    Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
    Schema versions:
        24fa5e55-3935-3c0e-9808-99ce502fe98d: [10.0.0.1]

        UNREACHABLE: [10.0.0.2, 10.0.0.3]
```

When attached to the first node, it just keeps repeating these log lines:

```
2018-12-09 07:47:32,927 WARN  [OptionalTasks:1] org.apache.cassandra.auth.CassandraRoleManager.setupDefaultRole(CassandraRoleManager.java:361) CassandraRoleManager skipped default role setup: some nodes were not ready
2018-12-09 07:47:32,927 INFO  [OptionalTasks:1] org.apache.cassandra.auth.CassandraRoleManager$4.run(CassandraRoleManager.java:400) Setup task failed with error, rescheduling
2018-12-09 07:47:32,980 INFO  [HANDSHAKE-/10.0.0.2] org.apache.cassandra.net.OutboundTcpConnection.lambda$handshakeVersion$1(OutboundTcpConnection.java:561) Handshaking version with /10.0.0.2
2018-12-09 07:47:32,980 INFO  [HANDSHAKE-/10.0.0.3] org.apache.cassandra.net.OutboundTcpConnection.lambda$handshakeVersion$1(OutboundTcpConnection.java:561) Handshaking version with /10.0.0.3
```

After a while, when one of the nodes is restarted:

```
2018-12-09 07:52:21,972 WARN  [MigrationStage:1] org.apache.cassandra.service.MigrationTask.runMayThrow(MigrationTask.java:67) Can't send schema pull request: node /10.0.0.2 is down.
```

Tried so far:

- Restarting all containers at the same time
- Restarting all containers one after another
- Restarting Cassandra inside every container (`service cassandra restart`)
- `nodetool disablegossip`, then enabling it again
- `nodetool repair`, which fails with `Repair command #1 failed with error Endpoint not alive: /10.0.0.2`

It seems the node schemas have diverged, but I still don't understand why the nodes mark each other as down.
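For diagnosing this kind of split, the gossip state itself can be inspected from inside any of the containers. A sketch using standard `nodetool` subcommands; `elassandra-node-1` is the container name from the `docker run` commands above:

```shell
# Dump each peer's gossip state (STATUS, SCHEMA, RELEASE_VERSION) as this
# node currently sees it; a peer stuck in a shutdown or dead generation
# will show up here even while its own process reports itself as up.
docker exec elassandra-node-1 nodetool gossipinfo

# Re-check schema agreement: reachable peers are grouped by schema
# version, everything else is listed under UNREACHABLE.
docker exec elassandra-node-1 nodetool describecluster
```

Comparing `gossipinfo` output across the three hosts shows whether the nodes disagree about liveness, about schema, or both.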


wglambert commented Dec 10, 2018

Could you post your docker-compose.yml?

If that 10.0.0.* address is from the overlay network, this looks like your issue: #168, and also #169.

@wglambert

I think you need a `-e CASSANDRA_BROADCAST_ADDRESS=10.0.0.*`

Under the section "For separate machines (ie, two VMs ..."
https://github.com/docker-library/docs/tree/master/cassandra#make-a-cluster
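Applied to the setup from the issue, that suggestion would look roughly like the sketch below for host 10.0.0.2 (and analogously for each other host with its own IP). The image, cluster name, and `--net=host` flag come from the original commands; setting `CASSANDRA_LISTEN_ADDRESS` alongside the broadcast address follows the linked docker-library docs and is an assumption about what this particular image honors:

```shell
# Sketch only: tell Cassandra which address to advertise to peers.
# Run on host 10.0.0.2; repeat per host, substituting that host's IP.
docker run --name elassandra-node-2 --net=host \
  -e CASSANDRA_SEEDS="10.0.0.1" \
  -e CASSANDRA_BROADCAST_ADDRESS="10.0.0.2" \
  -e CASSANDRA_LISTEN_ADDRESS="10.0.0.2" \
  -e CASSANDRA_CLUSTER_NAME="BD Storage" \
  -e CASSANDRA_DC="DC1" -e CASSANDRA_RACK="r1" \
  -d strapdata/elassandra:latest
```

Without an explicit broadcast address, a node may advertise an address its peers cannot route to, which matches the symptom of every peer showing as DN while each node considers itself UN.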


tianon commented Dec 24, 2018

I don't really see anything we can change in the image to make this easier, unfortunately. The best I can recommend from here is to try the Docker Community Forums, the Docker Community Slack, or Stack Overflow for further help setting up and configuring a cluster.

tianon closed this as completed Dec 24, 2018

tianon commented Dec 24, 2018

(Additionally, `strapdata/elassandra:latest` is not this image.)
