docker stack error: Unable to gossip with any peers #169

Closed
konradmalik opened this issue Nov 8, 2018 · 1 comment

konradmalik commented Nov 8, 2018

Hello,

I cannot run a Cassandra cluster (2 Cassandra nodes) from a yml file in Docker swarm with 1 node (only my local computer).
I want to use Docker swarm for its resource-limiting capability (my test yaml does not use that functionality).

The first node starts OK; the second one restarts indefinitely with the error "Unable to gossip with any peers":
https://pastebin.com/raw/ji2k3Fga

INFO  [main] 2018-11-08 13:57:13,078 IndexSummaryManager.java:85 - Initializing index summary manager with a memory pool size of 195 MB and a resize interval of 60 minutes
INFO  [main] 2018-11-08 13:57:13,089 MessagingService.java:761 - Starting Messaging Service on /10.255.0.100:7000 (eth1)
WARN  [main] 2018-11-08 13:57:13,093 SystemKeyspace.java:1087 - No host ID found, created 9da2bc83-572d-40c8-9737-0794a175deee (Note: This should happen exactly once per node).
INFO  [main] 2018-11-08 13:57:13,124 OutboundTcpConnection.java:108 - OutboundTcpConnection using coalescing strategy DISABLED
INFO  [ScheduledTasks:1] 2018-11-08 13:57:15,038 TokenMetadata.java:498 - Updating topology for all endpoints that have changed
Exception (java.lang.RuntimeException) encountered during startup: Unable to gossip with any peers
java.lang.RuntimeException: Unable to gossip with any peers
        at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1443)
        at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:547)
        at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:804)
        at org.apache.cassandra.service.StorageService.initServer(StorageService.java:664)
        at org.apache.cassandra.service.StorageService.initServer(StorageService.java:613)
        at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:379)
        at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:602)
        at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:691)
ERROR [main] 2018-11-08 13:58:14,143 CassandraDaemon.java:708 - Exception encountered during startup
java.lang.RuntimeException: Unable to gossip with any peers
        at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1443) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:547) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:804) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.service.StorageService.initServer(StorageService.java:664) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.service.StorageService.initServer(StorageService.java:613) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:379) [apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:602) [apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:691) [apache-cassandra-3.11.3.jar:3.11.3]
INFO  [StorageServiceShutdownHook] 2018-11-08 13:58:14,160 HintsService.java:220 - Paused hints dispatch
WARN  [StorageServiceShutdownHook] 2018-11-08 13:58:14,161 Gossiper.java:1567 - No local state, state is in silent shutdown, or node hasn't joined, not announcing shutdown
INFO  [StorageServiceShutdownHook] 2018-11-08 13:58:14,161 MessagingService.java:992 - Waiting for messaging service to quiesce
INFO  [ACCEPT-/10.255.0.100] 2018-11-08 13:58:14,162 MessagingService.java:1346 - MessagingService has terminated the accept() thread
INFO  [StorageServiceShutdownHook] 2018-11-08 13:58:14,618 HintsService.java:220 - Paused hints dispatch

This is my yml file:
https://pastebin.com/raw/8J0FhsxU

version: '3'
services:
  cassandra-1:
    image: cassandra
    hostname: cassandra-1
    deploy:
      restart_policy:
        condition: on-failure
        max_attempts: 3
        window: 120s
    environment:
      CASSANDRA_BROADCAST_ADDRESS: cassandra-1
      CASSANDRA_CLUSTER_NAME: cassandracluster
      CASSANDRA_ENDPOINT_SNITCH: GossipingPropertyFileSnitch
    volumes:
    - volume1:/var/lib/cassandra
    ports:
    - "7000"
    networks:
      default:
  cassandra-2:
    image: cassandra
    hostname: cassandra-2
    deploy:
      restart_policy:
        condition: on-failure
        max_attempts: 3
        window: 120s
    environment:
      CASSANDRA_BROADCAST_ADDRESS: cassandra-2
      CASSANDRA_SEEDS: cassandra-1
      CASSANDRA_CLUSTER_NAME: cassandracluster
      CASSANDRA_ENDPOINT_SNITCH: GossipingPropertyFileSnitch
    depends_on:
      - cassandra-1
    volumes:
    - volume2:/var/lib/cassandra
    ports:
    - "7000"
    networks:
      default:

volumes:
  volume1:
  volume2:

networks:
  default:
    external:
       name: cassandra-net

It's important to say that when I use docker-compose instead of docker stack deploy, this cluster works OK: the nodes see each other and no error is thrown.
Also, "ping" works from both the cassandra-1 and cassandra-2 nodes (ping cassandra-1 as well as ping cassandra-2), so the containers can reach each other without problems.
The error also appears if I use sleep in node 2 to delay its start by 120 seconds.

This error is very frustrating: every tutorial and example on the internet that shows how to run Cassandra on docker stack fails for me, yet it clearly worked for some people. Maybe my system is somehow responsible?
I use Arch Linux with Docker version 18.06.1-ce, build e68fc7a215.

Any suggestions on how to fix that?
Thanks

@wglambert added the question (Usability question, not directly related to an error with the image) and Issue labels and removed the question label Nov 8, 2018
@wglambert

when I use docker-compose instead of docker stack deploy, this cluster works ok

Duplicate of #168
The ports: "7000" line, when used in a Docker stack, adds an ingress interface and address; there is also an additional interface and address for the overlay network from the Docker stack.

in short, our container has no less than three candidate IP addresses, and there's really not any way I can see for us to differentiate them in an automated way
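
One way to act on this could be to stop publishing port 7000, so that the service is not attached to the ingress network at all. The following is only a sketch of the cassandra-2 service from the report with its ports entry removed. It assumes nothing outside the swarm needs to reach the gossip port, and it has not been verified against this setup. The container may still have more than one address, so this might not be sufficient on its own.

```yaml
# Sketch only: the cassandra-2 service from the report, without "ports:".
# Assumption: port 7000 is needed only for node-to-node traffic inside the swarm.
version: '3'
services:
  cassandra-2:
    image: cassandra
    hostname: cassandra-2
    environment:
      CASSANDRA_BROADCAST_ADDRESS: cassandra-2
      CASSANDRA_SEEDS: cassandra-1
      CASSANDRA_CLUSTER_NAME: cassandracluster
      CASSANDRA_ENDPOINT_SNITCH: GossipingPropertyFileSnitch
    volumes:
    - volume2:/var/lib/cassandra
    # No "ports:" entry: services on the same overlay network can reach each
    # other on port 7000 without publishing it, and a service with no
    # published ports is not attached to the ingress network.
    networks:
      default:

volumes:
  volume2:

networks:
  default:
    external:
      name: cassandra-net
```

The same change would apply to cassandra-1. If the wrong address is still picked up, the image's documented CASSANDRA_LISTEN_ADDRESS variable is the place to set it explicitly.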

Closing as a duplicate
