
[Feature Request] Canary Deploy Timeout: Reduce Desired Capacity To Original Value #57

Open
ravingrichie opened this issue Feb 11, 2020 · 1 comment

Comments

@ravingrichie

Hey,

When using your tool in canary mode, if the new instances fail the lifecycle hook and never go into service in the ASG, bouncer just keeps waiting until it hits the default timeout (20 minutes). Is there a way to make bouncer set the desired capacity back to its original value when it hits that timeout?
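For reference, putting the desired capacity back by hand looks roughly like this with the AWS SDK for Go (a minimal sketch only; the ASG name and the original capacity of 3 are placeholder values, not anything bouncer exposes):

```go
package main

import (
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/autoscaling"
)

func main() {
	// Placeholder values: real code would record these from the ASG
	// before the canary instance was added.
	asgName := "my-canary-asg"
	originalDesired := int64(3)

	sess := session.Must(session.NewSession())
	svc := autoscaling.New(sess)

	// Set the desired capacity back to its pre-canary value so AWS
	// stops relaunching the failing new instance.
	_, err := svc.SetDesiredCapacity(&autoscaling.SetDesiredCapacityInput{
		AutoScalingGroupName: aws.String(asgName),
		DesiredCapacity:      aws.Int64(originalDesired),
		HonorCooldown:        aws.Bool(false),
	})
	if err != nil {
		log.Fatalf("failed to reset desired capacity: %v", err)
	}
}
```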

If this isn't possible, could you add it as a feature request?

Cheers
Richard

@holtwilkins
Contributor

Sorry for taking so long to respond here, @ravingrichie. Is this still something you'd like to see?

You're right that this isn't possible currently. The idea is that your next Terraform run resets the ASG to the correct value before invoking bouncer, so the reset happens as part of that workflow.

I was thinking about this a bit today, and I have some concerns.

A reset back to the original value makes sense during the canary phase itself: you've gone from your normal 3 nodes to 4, the 4th (new-AMI) node is somehow broken, so when we hit the 20-minute timeout we set the ASG back to 3. Just as with Terraform, AWS won't immediately destroy the new node, but once it is destroyed, you won't have AWS continually trying to launch a replacement that will also fail.

However, what about a transient failure during the full rollout from 4 -> 6 in my example above? If we implemented this feature as described, you could end up with 5 working nodes (3 on the old AMI, 2 on the new), and then we reset the desired capacity back to 3, letting AWS decide how to scale you back down, probably based on a combination of which node is newest and which nodes are in which AZ. The advantage of bouncer not touching the value is that it forces you to only set it back to 3 at Terraform time (or manually), right before bouncer is invoked again, which ensures "the right thing" is done to get you to 3 new nodes, rather than letting AWS pick the 3 nodes it thinks you should keep.
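For concreteness, if we did implement the canary-phase-only variant, the timeout path could look something like the sketch below. This is hypothetical and not bouncer's actual structure; `revertCanaryOnTimeout`, `preCanaryDesired`, and the `done` channel are names I'm making up for illustration.

```go
package canary

import (
	"context"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/service/autoscaling"
)

// revertCanaryOnTimeout is a hypothetical sketch, not bouncer's real code.
// It waits for the canary to go into service (done) or for the deadline
// (ctx) to expire and, on timeout only, puts the ASG back to its
// pre-canary size.
func revertCanaryOnTimeout(ctx context.Context, svc *autoscaling.AutoScaling,
	asgName string, preCanaryDesired int64, done <-chan struct{}) error {
	select {
	case <-done:
		// Canary went into service before the timeout; nothing to undo.
		return nil
	case <-ctx.Done():
		// Timed out waiting on the canary: set the desired capacity back so
		// AWS stops relaunching the broken node. This would apply only to
		// the canary phase (3 -> 4), not the 4 -> 6 full rollout, where
		// scaling back down would let AWS pick which healthy nodes to kill.
		_, err := svc.SetDesiredCapacity(&autoscaling.SetDesiredCapacityInput{
			AutoScalingGroupName: aws.String(asgName),
			DesiredCapacity:      aws.Int64(preCanaryDesired),
			HonorCooldown:        aws.Bool(false),
		})
		return err
	}
}
```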

Thoughts on this?
