feat: gcrane - Increase GCRBackoff Steps to 5 and decrease Factor to 5.0 #1823

Closed

@klemmari1 klemmari1 commented Oct 24, 2023

ExponentialBackoff breaks out of the loop when backoff.Steps == 1. With only 2 steps in GCRBackoff, the wait period is only ~1 minute, so the gcrane cp command can fail with 429 errors.

For some reason I saw 429 errors from GCR even after a wait of 15 minutes. This PR makes a few changes to remediate the issue:

  • Increase Steps to 5 to get more retries and a longer wait time on the last step.
  • Decrease Factor to 5.0 to get more retries on a tighter schedule. The last wait time will be the Cap, which is 1 hour.
  • Set b.Steps = 1 when b.Duration > b.Cap.

Fix the same issue as this PR.

Fixes #424


google-cla bot commented Oct 24, 2023

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@klemmari1 changed the title from "Gcrane: Increase GCRBackoff steps to 4 and duration to 9 seconds" to "feat: gcrane - Increase GCRBackoff steps to 4 and duration to 9 seconds" on Oct 24, 2023
@klemmari1 changed the title from "feat: gcrane - Increase GCRBackoff steps to 4 and duration to 9 seconds" to "feat: gcrane - Increase GCRBackoff Steps to 6 and decrease Factor to 5.0" on Oct 24, 2023
…go:112)

breaks the loop when backoff.Steps == 1. This might cause the `gcrane cp`
command to face 429 errors when we have only 2 steps to GCRBackoff because
the wait period is only ~1 minute.

For some reason I saw 429 errors from GCR even after a wait of 15 minutes.
Do a couple of improvements to remediate this issue:

- Decrease Factor to 5.0 and increase Steps to 6.
  - This way we get more retries, and the maximum wait time is 1 hour,
    after which the run will fail.
- Update GCRBackoff docstring.
@klemmari1 klemmari1 force-pushed the gcrane/increase-gcrbackoff-steps branch from e1e7829 to 22abbb4 Compare October 24, 2023 10:15
When cap is hit, we want to run the loop one more time before
breaking out of it.
@@ -89,7 +89,7 @@ func (b *Backoff) Step() time.Duration {
 		b.Duration = time.Duration(float64(b.Duration) * b.Factor)
 		if b.Cap > 0 && b.Duration > b.Cap {
 			b.Duration = b.Cap
-			b.Steps = 0
+			b.Steps = 1

Set Steps to 1 so that the loop runs one final time with Duration = Cap. Until now that final run never happened.

@klemmari1 changed the title from "feat: gcrane - Increase GCRBackoff Steps to 6 and decrease Factor to 5.0" to "feat: gcrane - Increase GCRBackoff Steps to 5 and decrease Factor to 5.0" on Nov 3, 2023
github-actions bot commented Feb 2, 2024

This Pull Request is stale because it has been open for 90 days with
no activity. It will automatically close after 30 more days of
inactivity. Keep fresh with the 'lifecycle/frozen' label.

Development

Successfully merging this pull request may close these issues.

Gcrane: unrecognized HTTP status: 429 Too Many Requests