view: hack attempting to mitigate split-brain GRPC failures #3291

Merged
hdevalence merged 1 commit into summonerd-feature-branch from tmp-block-fetch-retry-hack on Nov 8, 2023

Conversation

@hdevalence
Member

This isn't a permanent fix, and it's not the right way to solve the robustness issue (more details TK), but it may help with the immediate problem with summonerd and seems unlikely to add new failure modes.

@cronokirby
Contributor

What's with the protobuf lint failures?

@hdevalence
Member Author

I think it's comparing the state of this branch against main, even though the PR is being made against a different branch. Probably the solution is to change the CI workflow to filter to only apply to PRs against main.

@cronokirby
Contributor

Also, are we sure that one second is enough?

@hdevalence
Member Author

> Also, are we sure that one second is enough?

We are not! But it seems likely, and fits in the middle of the confirmation delay we already put in pcli. If it isn't, we'll see it really clearly in the logs, because there will be a warning before an error. So I think it's probably a reasonable starting point.
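
For illustration only, the behavior described above (one extra attempt after a fixed delay, with a warning logged before any error) could look roughly like the following. This is a minimal sketch, not the code in this PR: `retry_once_after` is a hypothetical helper, the operation is an arbitrary closure rather than the real block fetch, and logging goes to stderr instead of whatever logging the project uses.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical helper (not from the PR): run `op` once; if it fails,
/// log a warning, wait `delay`, and try exactly one more time.
fn retry_once_after<T, E, F>(delay: Duration, mut op: F) -> Result<T, E>
where
    E: std::fmt::Debug,
    F: FnMut() -> Result<T, E>,
{
    match op() {
        Ok(value) => Ok(value),
        Err(first) => {
            // The warning always precedes a final error, so a delay that is
            // too short shows up in the logs as "warning, then error".
            eprintln!("warning: first attempt failed ({first:?}), retrying in {delay:?}");
            sleep(delay);
            op().map_err(|second| {
                eprintln!("error: retry also failed ({second:?})");
                second
            })
        }
    }
}
```

A caller would pass the one-second delay discussed here as `Duration::from_secs(1)`. If the second attempt also fails, the error from that second attempt is returned.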

@cronokirby
Contributor

> > Also, are we sure that one second is enough?
>
> We are not! But it seems likely, and fits in the middle of the confirmation delay we already put in pcli. If it isn't, we'll see it really clearly in the logs, because there will be a warning before an error. So I think it's probably a reasonable starting point.

Yeah, makes sense. If we still run into this, we could do a few retries with exponential backoff, but I'm in favor of merging this as an obviously correct patch for now.
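
As a rough sketch of the follow-up suggested here, a few retries with exponential backoff might be structured as below. Everything in it is an assumption made for illustration: the names `backoff_delay` and `retry_with_backoff`, the doubling factor, and the cap on the delay are not taken from the PR or from the codebase.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical: delay before retry number `attempt` (0-based).
/// Doubles on each attempt and is capped at `max`.
fn backoff_delay(base: Duration, attempt: u32, max: Duration) -> Duration {
    let factor = 2u32.saturating_pow(attempt);
    base.saturating_mul(factor).min(max)
}

/// Hypothetical: call `op` until it succeeds or `max_retries` retries
/// have been used, sleeping with exponential backoff between attempts.
fn retry_with_backoff<T, E, F>(
    max_retries: u32,
    base: Duration,
    max: Duration,
    mut op: F,
) -> Result<T, E>
where
    E: std::fmt::Debug,
    F: FnMut() -> Result<T, E>,
{
    let mut attempt = 0;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Out of retries: surface the last error to the caller.
            Err(err) if attempt >= max_retries => return Err(err),
            Err(err) => {
                let delay = backoff_delay(base, attempt, max);
                eprintln!("warning: attempt {attempt} failed ({err:?}), retrying in {delay:?}");
                sleep(delay);
                attempt += 1;
            }
        }
    }
}
```

With a base of one second and a cap of eight seconds, the successive delays would be 1s, 2s, 4s, 8s, and then 8s for any further retries.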

@hdevalence merged commit 5d3abc8 into summonerd-feature-branch on Nov 8, 2023
7 of 8 checks passed
@hdevalence deleted the tmp-block-fetch-retry-hack branch on November 8, 2023, 16:09