Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent concurrent deploys against a given schema w/ a timeout functionality #111

Closed
shantstepanian opened this issue Nov 14, 2017 · 8 comments

Comments

@shantstepanian
Copy link
Contributor

Expected Behavior

  1. Prevent a deploy from being executed if one of the physical schemas requested for deployment is already being deployed to by another process

  2. Allow a timeout to be specified so that we don't have a case where a process exits and doesn't "unlock" the deployment

  3. Ensure that "unlocking" happens on shutdown hook if needed

  4. Try to check on active processes via some approach, whether by checking the PID of the deploy process or the connection number

  5. The above assume a pessimistic lock. Brainstorm if optimistic lock could work? However, DBMS types like Sybase IQ by default will lock a whole table on a write, making optimistic locking difficult in practice

Actual Behavior

No such lock behavior exists

Obevo Version where this issue was observed

All versions lack this feature

Steps to reproduce

Try in an in-memory db?

@shantstepanian shantstepanian added this to the External Contributions milestone Jan 27, 2018
@agentgt
Copy link

agentgt commented Jun 26, 2019

I was just about to file a bug on this but I found this issue.

This is a showstopper for us. We can't use Obevo with out this support.

I recommend perhaps looking at how Flyway does it since they support a global lock for cluster support.

Perhaps if I have time next week I will look into forking and making the change.

@shantstepanian
Copy link
Contributor Author

Can you elaborate on your use case a bit more (ie how you are using Obevo/Flyway to do deployments) and why this is critical? I'd like to know to ensure that whatever I had in mind originally for this (as listed in the ticket) still satisfies your use case

I'll give the implementation a think in the meantime

@agentgt
Copy link

agentgt commented Jun 27, 2019

See my comments on #242 but basically when an application is deployed into production it will do the migration programmatically.

Since we have a cluster of applications that maybe in the process of rebooting for upgrade or canary testing we absolutely need cluster support and to be honest even Flyway doesn't do this correctly.

This is because the application needs to tell Flyway/Obevo what the current version being deployed/rebooted is of the instance (the application version and not the schema). This requires another table or something akin that has the version of the application stored. Flyway does not support this see flyway/flyway#1999 . For Flyway we sort of have a forked workaround and we will probably do the same for Obevo (this is because we need access to the connection).

If the application initializing is:

  • Newer it updates the application version table with the latest version and lets the migration run. This requires locking and rollback for correct support. Postgresql has this level of support.
  • Older the application boots up with warning and does not run the migration
  • Same we let the migration still run since in theory it should be idempotent but one could argue for it not to run.

@shantstepanian
Copy link
Contributor Author

Thanks for the info

In terms of locking the schema for deploy, we should be able to add a hook into our code (see MainDeployer.kt and the if (this.shouldProceedWithDbChange(artifactsToProcess, deployerArgs)) line), and then each DB platform can plugin its own implementation (Advisory Lock as you said for Postgres). I'll give that a read and see how it can compare against other platforms. I may not be able to analyze this for a few days (I'm happy to take your help with the fork and to help you w/ the code, but no obligation on your end; I'm also happy to take this up myself)

I do see a couple other interesting tidbits in your use case:

  1. You mentioned a cluster of applications that deploy to this schema. Are these the same set of objects being deployed or do you have different applications managing different slices of objects?

If the latter, then we have some thoughts on that (see section C here with the addendum/correction here)

  1. In terms of passing in the "application version" - we have some support with that as it relates to rollbacks. Your use case in that issue doesn't exactly match how we used it on our side, but it may be worth you knowing it just in case. However, the documentation isn't published on Github yet; I will do that in the next few days

@shantstepanian
Copy link
Contributor Author

An update here - I have a POC working locally (code is committed to the "concurrentDeployCheck" branch, but it is a work in progress, a lot of cleanup to do and hacks still in the code). But wanted to talk this through a bit

In terms of how it will work:

  • The lock would be obtained prior to the changeset calculation and released when the deploy is done
  • We can obtain locks around the physical schemas declared in your config, i.e. if your DB server has 10 schemas, and you are only trying a deploy on 2 of them, we can lock around those 2 schemas, and not the other 8
  • If one caller attempts a deploy while another deploy has obtained a lock, then my default stance is to have the caller error out immediately. I am open to providing an option to have the behavior be to wait for the lock to be free, but that would be an add-on to the existing behavior. (your input on whether you'd need this add-on would help decide when to add this)

Before I do further code changes on this, I'll write up the "application version" support that I mentioned in the last post (I will aim for that in the next couple days), so that you can have more info on how to proceed

@shantstepanian
Copy link
Contributor Author

shantstepanian commented Jul 8, 2019

@agentgt See this doc for more information on the application deployment use case.

Note the existing methodology around rollback. We do have the option to change what is done in case you need something different besides the default rollback behavior (e.g. to instead do nothing instead of attempting a deploy, in case a lower version competitor is found). We can also implement this via semantic versioning if easier

Let us know your thoughts

@shantstepanian
Copy link
Contributor Author

shantstepanian commented Dec 4, 2019

@agentgt A bit of a delay on our end, but we finally merged this change into master for PostgreSQL

Let me know if you have any questions on this

@shantstepanian
Copy link
Contributor Author

Note - timeout hasn't been added at this time, and lock is against the DB server itself, vs just an individual schema

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants