Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

statefulset: Use stateful sets for postgres deployments (PROJQUAY-6672) #980

Open
wants to merge 13 commits into
base: master
Choose a base branch
from

Conversation

deshpandevlab
Copy link
Contributor

@deshpandevlab deshpandevlab commented Oct 16, 2024

We're adding a check for Deployment here because during the migration process from Deployment to StatefulSet for the Clair Postgres component, we cannot assume that the StatefulSet will always be present. There may be instances where the old Deployment is still in use, especially if the upgrade process is ongoing or hasn't been initiated yet. This ensures that we can handle both the old Deployment-based setup and the new StatefulSet-based setup, providing a smoother transition and backwards compatibility during the upgrade process.

Testing

  • e2e
  • manually validated the migration
oc get pods -n shubhra
NAME                                              READY   STATUS             RESTARTS       AGE
shubhra-registry-clair-app-84b66cf595-tfm2p       1/1     Running            0              2m12s
shubhra-registry-clair-app-84b66cf595-zr64h       1/1     Running            0              2m12s
shubhra-registry-clair-postgres-0                 1/1     Running            0              2m12s
shubhra-registry-clair-postgres-bbf8d4558-hwd7q   1/1     Terminating        0              10m
shubhra-registry-clair-postgres-upgrade-plmnn     0/1     Completed          0              5m36s
shubhra-registry-quay-app-596b9db6c-6nmsh         1/1     Running            1 (13m ago)    13m
shubhra-registry-quay-app-596b9db6c-j9nzh         1/1     Running            2 (13m ago)    13m
shubhra-registry-quay-app-858546c478-5sxcs        0/1     CrashLoopBackOff   5 (2m4s ago)   5m36s
shubhra-registry-quay-app-upgrade-px67m           0/1     Completed          0              14m
shubhra-registry-quay-database-5b6d4b9897-t4x2w   1/1     Running            0              13m
shubhra-registry-quay-redis-76bc4fd7f4-x8l4s      1/1     Running            0              13m

bcaton85 and others added 11 commits October 1, 2024 10:48
…y for authentication (PROJQUAY-2417) (quay#963) (quay#966)

Missed that we need the certificate handling in the mirror pod when using Postgres SSL authentication

Co-authored-by: Michaela Lang <[email protected]>
…ed clair resources (PROJQUAY-7993) (quay#970)

The scale-down component was never rendered because of the incorrect path.
The clair-postgres and clair-app needed to scale back after the postgres upgrade.

This change ensures following flow in the case where clair is managed:
- Scale down clair-app
- Scale down clair-postgres
- Upgrade clair-postgres
- Scale up clair-postgres
- Scale up clair-app

---------

Co-authored-by: Shubhra Deshpande <shubhrajayant+github.com>
- Swap Postgres and Clair Postgres Deployments to StatefulSets
@ibazulic
Copy link
Member

We're adding a check for Deployment here because during the migration process from Deployment to StatefulSet for the Clair Postgres component, we cannot assume that the StatefulSet will always be present. There may be instances where the old Deployment is still in use, especially if the upgrade process is ongoing or hasn't been initiated yet. This ensures that we can handle both the old Deployment-based setup and the new StatefulSet-based setup, providing a smoother transition and backwards compatibility during the upgrade process.

There should never be a case where we need to manage both deployments and statefulsets at the same time IMHO. Once the operator sees a deployment, it should shut down all Quay instances immediately and begin a migration from a deployment to a statefulset. There should be no instances of Quay or Clair running during the migration period, no jobs, no activity, apart from the migration pod.

@deshpandevlab
Copy link
Contributor Author

There should never be a case where we need to manage both deployments and statefulsets at the same time IMHO. Once the operator sees a deployment, it should shut down all Quay instances immediately and begin a migration from a deployment to a statefulset. There should be no instances of Quay or Clair running during the migration period, no jobs, no activity, apart from the migration pod.

The check for both StatefulSet and Deployment is not intended to be about managing both simultaneously or during database migration. It's specifically to handle the transition period when upgrading the operator itself from a version that used Deployments to one that uses StatefulSets for Clair Postgres.

This allows the operator to correctly identify the current state (Deployment or StatefulSet) and proceed with the appropriate upgrade path, ensuring a smooth transition between operator versions without assuming a specific resource type exists.

The actual migration process from Deployment to StatefulSet would still occur as a separate, controlled step as you described.

@ibazulic
Copy link
Member

ibazulic commented Oct 16, 2024

@deshpandevlab ah, gotcha, so it's only used for identification of the current state. Thanks! I misunderstood what the actual purpose is.

Copy link

@crozzy crozzy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conceptually it LGTM, I can't offer an in-depth code review as I'm not so familiar with the operator.

I assume this PR supersedes https://github.com/quay/quay-operator/pull/927/files?

@deshpandevlab
Copy link
Contributor Author

/test ocp-latest-e2e

test failed for mounting error. retrying the test

Copy link

openshift-ci bot commented Oct 17, 2024

@deshpandevlab: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/ocp-latest-e2e 83038f4 link true /test ocp-latest-e2e

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

6 participants