Resilience when required servers fail temporarily #32

Open
manuel-freire opened this issue Oct 16, 2015 · 0 comments

Comments

@manuel-freire
Contributor

When a required server (mongo, kafka, openlrs) is down, we currently abort.

I propose the following changes:

  1. At startup, retry once per second for at least 2 minutes until all required servers are reachable, and fail only if they are still unreachable when that time limit expires.
  2. At runtime, make sure that a server going down temporarily does not bring down the whole system. Server failures are to be expected in long-running installations under high load.
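
The startup behaviour in item 1 could be sketched roughly as follows. This is only an illustration, not code from this project: `wait_for_servers`, its parameters, and the idea of passing one health-check callable per server are all assumptions, and the real checks for mongo, kafka and openlrs would need to be supplied by the caller.

```python
import time
from typing import Callable, Dict, List


def _is_up(check: Callable[[], bool]) -> bool:
    """Run one health check; any exception counts as 'server down'."""
    try:
        return bool(check())
    except Exception:
        return False


def wait_for_servers(
    checks: Dict[str, Callable[[], bool]],  # hypothetical: server name -> health check
    timeout_s: float = 120.0,               # "at least 2m" from the proposal
    interval_s: float = 1.0,                # "once a second" from the proposal
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Poll all checks until every server is up or the timeout expires.

    Returns the sorted names of servers still down (empty list on success).
    """
    deadline = clock() + timeout_s
    while True:
        # Re-check every server each round, so all must be up at the same time.
        down = sorted(name for name, check in checks.items() if not _is_up(check))
        if not down or clock() >= deadline:
            return down
        sleep(interval_s)
```

A caller would abort only when the returned list is non-empty, and could include the names in the error message. The `clock` and `sleep` parameters exist only so the loop can be exercised without real waiting.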