
Cassandra nodes become unreachable to each other #171

Closed
behko opened this issue Dec 9, 2018 · 4 comments

Comments


behko commented Dec 9, 2018

I have 3 Elassandra nodes running in Docker containers.

The containers were created like this:

```console
# Host 10.0.0.1
docker run --name elassandra-node-1 --net=host -e CASSANDRA_SEEDS="10.0.0.1" -e CASSANDRA_CLUSTER_NAME="BD Storage" -e CASSANDRA_DC="DC1" -e CASSANDRA_RACK="r1" -d strapdata/elassandra:latest

# Host 10.0.0.2
docker run --name elassandra-node-2 --net=host -e CASSANDRA_SEEDS="10.0.0.1,10.0.0.2" -e CASSANDRA_CLUSTER_NAME="BD Storage" -e CASSANDRA_DC="DC1" -e CASSANDRA_RACK="r1" -d strapdata/elassandra:latest

# Host 10.0.0.3
docker run --name elassandra-node-3 --net=host -e CASSANDRA_SEEDS="10.0.0.1,10.0.0.2,10.0.0.3" -e CASSANDRA_CLUSTER_NAME="BD Storage" -e CASSANDRA_DC="DC1" -e CASSANDRA_RACK="r1" -d strapdata/elassandra:latest
```
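For a `--net=host` cluster like this, one quick sanity check is whether each host can actually reach the others' inter-node gossip port (7000) and CQL port (9042) — these are Cassandra's defaults, and the target IPs below are the ones from the commands above. A sketch, run from host 10.0.0.1 (`nc` flags vary slightly between netcat variants):

```shell
# From host 10.0.0.1: probe the gossip (7000) and CQL (9042) ports of the
# other two hosts; a "succeeded"/"open" result means the port is reachable.
for ip in 10.0.0.2 10.0.0.3; do
  for port in 7000 9042; do
    nc -vz -w 2 "$ip" "$port"
  done
done
```

If 7000 is blocked between hosts, gossip cannot converge and every peer will show as DN regardless of container state.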

The cluster worked fine for a couple of days after it was created; Elasticsearch and Cassandra were both perfect.

Currently, however, all Cassandra nodes have become unreachable to each other. `nodetool status` on every node looks like this:

```
Datacenter: DC1

Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load       Tokens  Owns (effective)  Host ID                               Rack
DN  10.0.0.3  11.95 GiB  8       100.0%            7652f66e-194e-4886-ac10-0fc21ac8afeb  r1
DN  10.0.0.2  11.92 GiB  8       100.0%            b91fa129-1dd0-4cf8-be96-9c06b23daac6  r1
UN  10.0.0.1  11.9 GiB   8       100.0%            5c1afcff-b0aa-4985-a3cc-7f932056c08f  r1
```

The UN entry is always the current host (here 10.0.0.1); the picture is the same on all other nodes.

`nodetool describecluster` on 10.0.0.1 shows:

```
Cluster Information:
    Name: BD Storage
    Snitch: org.apache.cassandra.locator.GossipingPropertyFileSnitch
    DynamicEndPointSnitch: enabled
    Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
    Schema versions:
        24fa5e55-3935-3c0e-9808-99ce502fe98d: [10.0.0.1]

        UNREACHABLE: [10.0.0.2, 10.0.0.3]
```

When attached to the first node, it just keeps repeating these log lines:

```
2018-12-09 07:47:32,927 WARN  [OptionalTasks:1] org.apache.cassandra.auth.CassandraRoleManager.setupDefaultRole(CassandraRoleManager.java:361) CassandraRoleManager skipped default role setup: some nodes were not ready
2018-12-09 07:47:32,927 INFO  [OptionalTasks:1] org.apache.cassandra.auth.CassandraRoleManager$4.run(CassandraRoleManager.java:400) Setup task failed with error, rescheduling
2018-12-09 07:47:32,980 INFO  [HANDSHAKE-/10.0.0.2] org.apache.cassandra.net.OutboundTcpConnection.lambda$handshakeVersion$1(OutboundTcpConnection.java:561) Handshaking version with /10.0.0.2
2018-12-09 07:47:32,980 INFO  [HANDSHAKE-/10.0.0.3] org.apache.cassandra.net.OutboundTcpConnection.lambda$handshakeVersion$1(OutboundTcpConnection.java:561) Handshaking version with /10.0.0.3
```

After a while, when one of the nodes is restarted:

```
2018-12-09 07:52:21,972 WARN  [MigrationStage:1] org.apache.cassandra.service.MigrationTask.runMayThrow(MigrationTask.java:67) Can't send schema pull request: node /10.0.0.2 is down.
```

Tried so far:

- Restarting all containers at the same time
- Restarting all containers one after another
- Restarting Cassandra inside every container (`service cassandra restart`)
- `nodetool disablegossip`, then enabling it again
- `nodetool repair`, which fails with `Repair command #1 failed with error Endpoint not alive: /10.0.0.2`

It seems the node schemas have diverged, but I still don't understand why the nodes mark each other as down.
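For diagnosing this kind of split, the gossip state itself can be inspected from inside any of the containers. A sketch using standard `nodetool` subcommands; `elassandra-node-1` is the container name from the `docker run` commands above:

```shell
# Dump each peer's gossip state (STATUS, SCHEMA, RELEASE_VERSION) as this
# node currently sees it; a peer stuck in a shutdown or dead generation
# will show up here even while its own process reports itself as up.
docker exec elassandra-node-1 nodetool gossipinfo

# Re-check schema agreement: reachable peers are grouped by schema
# version, everything else is listed under UNREACHABLE.
docker exec elassandra-node-1 nodetool describecluster
```

Comparing `gossipinfo` output across the three hosts shows whether the nodes disagree about liveness, about schema, or both.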


wglambert commented Dec 10, 2018

Could you post your docker-compose.yml?

If that 10.0.0.* address is from the overlay network, this looks like your issue: #168, and also #169.

@wglambert

I think you need a `-e CASSANDRA_BROADCAST_ADDRESS=10.0.0.*`

Under the section "For separate machines (ie, two VMs ..."
https://github.com/docker-library/docs/tree/master/cassandra#make-a-cluster
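Applied to the setup from the issue, that suggestion would look roughly like the sketch below for host 10.0.0.2 (and analogously for each other host with its own IP). The image, cluster name, and `--net=host` flag come from the original commands; setting `CASSANDRA_LISTEN_ADDRESS` alongside the broadcast address follows the linked docker-library docs and is an assumption about what this particular image honors:

```shell
# Sketch only: tell Cassandra which address to advertise to peers.
# Run on host 10.0.0.2; repeat per host, substituting that host's IP.
docker run --name elassandra-node-2 --net=host \
  -e CASSANDRA_SEEDS="10.0.0.1" \
  -e CASSANDRA_BROADCAST_ADDRESS="10.0.0.2" \
  -e CASSANDRA_LISTEN_ADDRESS="10.0.0.2" \
  -e CASSANDRA_CLUSTER_NAME="BD Storage" \
  -e CASSANDRA_DC="DC1" -e CASSANDRA_RACK="r1" \
  -d strapdata/elassandra:latest
```

Without an explicit broadcast address, a node may advertise an address its peers cannot route to, which matches the symptom of every peer showing as DN while each node considers itself UN.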


tianon commented Dec 24, 2018

I don't really see anything we can change in the image to make this easier, unfortunately. The best I can recommend from here is to try the Docker Community Forums, the Docker Community Slack, or Stack Overflow for further help setting up and configuring a cluster.

tianon closed this as completed Dec 24, 2018

tianon commented Dec 24, 2018

(Additionally, `strapdata/elassandra:latest` is not this image.)
