🚀 Release v1.54.0 #4288

Closed
24 of 25 tasks
matusdrobuliak66 opened this issue May 30, 2023 · 6 comments

@matusdrobuliak66
Contributor

matusdrobuliak66 commented May 30, 2023

Release version

1.54.0

Commit SHA

0f99027f479e67f9c0fc057233bb1d8db6573bea

Previous pre-release

https://github.com/ITISFoundation/osparc-simcore/releases/tag/staging_Watermelon1

Did the commit CI succeed?

  • The commit CI succeeded.

Motivation

  • DevOps will prepare the infrastructure for the new big Sim4Life product
  • Since maintenance will be announced anyway, we will use the same window to release to production (the release includes some important bug fixes)

Changes

Staging PastelDeNata5

Release Issue: #4318

Staging Watermelon1

Release Issue: #4355

Devops check 👷

Tests assessment: e2e testing check 🧪

  • AWS staging
    • sleepers failing (the problems started when we began testing big computations; in theory they should not propagate)
    • timeouts on the e2e-portal check (the problems started when we began testing big computations; in theory they should not propagate)
  • Dalco staging
    • currently, the maintenance page is up after yesterday's pre-release

Test assessment: targeted-testing 🔍️

No response

Test assessment: user-testing 🧐

No response

Summary 📝

  • Prepare release link
make release-prod version=1.54.0 git_sha=0f99027f479e67f9c0fc057233bb1d8db6573bea
  • Draft release changelog
  • Announce maintenance (ANNOUNCE AT LEAST 24 HOURS BEFORE)
  • redis {"start": "2023-06-15T05:30:00.000Z", "end": "2023-06-15T08:30:00.000Z", "reason": "Maintenance & Release v1.54.0"}
    • aws
    • dalco
    • tip
  • status page (https://manage.statuspage.io/)
    • osparc
    • s4l
  • mattermost channels
    • maintenance
    • power users
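The redis announcement entry in the checklist above can be sketched in Python. This is only an illustration: the helper names `build_announcement` and `_to_iso_z` are hypothetical, and the 24-hour check simply encodes the reminder from the checklist. How the resulting JSON string is written to redis is not shown in this issue and is left out here.

```python
import json
from datetime import datetime, timedelta, timezone

# From the checklist: announce at least 24 hours before the window starts.
MIN_NOTICE = timedelta(hours=24)


def _to_iso_z(moment: datetime) -> str:
    """Format an aware datetime as UTC with milliseconds and a 'Z' suffix."""
    utc = moment.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{utc.microsecond // 1000:03d}Z"


def build_announcement(start: datetime, end: datetime, reason: str, now: datetime) -> str:
    """Return the JSON payload (hypothetical helper, not part of the release tooling)."""
    if end <= start:
        raise ValueError("maintenance window must end after it starts")
    if start - now < MIN_NOTICE:
        raise ValueError("maintenance must be announced at least 24 hours before")
    return json.dumps(
        {"start": _to_iso_z(start), "end": _to_iso_z(end), "reason": reason}
    )


if __name__ == "__main__":
    payload = build_announcement(
        start=datetime(2023, 6, 15, 5, 30, tzinfo=timezone.utc),
        end=datetime(2023, 6, 15, 8, 30, tzinfo=timezone.utc),
        reason="Maintenance & Release v1.54.0",
        now=datetime(2023, 6, 13, 12, 0, tzinfo=timezone.utc),  # example "now"
    )
    print(payload)
```

Passing `now` explicitly keeps the notice check easy to test; in practice it would be `datetime.now(timezone.utc)`.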

Releasing 🚀

  • Maintenance page up.
cd /deployment/production/osparc-ops-environments
make up-maintenance
make down-maintenance
  • Check for hanging sidecars. Helper command to run in the director-v2 CLI: simcore-service-director-v2 close-and-save-service <uuid>
  • Release by publishing draft
  • Check release CI
  • Check deployed
    • aws deploy
    • dalco deploy
    • tip deploy
  • Delete announcement
  • Check e2e runs
  • Announce
:tada: https://github.com/ITISFoundation/osparc-simcore/releases/tag/v1.54.0
@matusdrobuliak66 matusdrobuliak66 added t:maintenance Some planned maintenance work release Preparation for pre-release/release labels May 30, 2023
@matusdrobuliak66 matusdrobuliak66 self-assigned this May 30, 2023
@mrnicegyu11
Member

mrnicegyu11 commented May 30, 2023

We have Prometheus cardinality growing without bounds on aws-prod, because container names are random for computational services. @Sylvain Anderegg is kind enough to address this; until then we remove the docker-events-exporter (which alerts on OOM events). Once the PR is propagated to prod, we should re-enable the docker-events-exporter by re-deploying the ops-monitoring stack.

@GitHK
Contributor

GitHK commented Jun 7, 2023

⚠️ WARNING ⚠️

@matusdrobuliak66
Contributor Author

NOTE: The fix in #4316 should help with the problem that was occurring in Dalco staging, which we have also seen in the TIP production deployment.

@matusdrobuliak66
Contributor Author

⚠️ WARNING ⚠️

DYNAMIC_SIDECAR_ENABLE_VOLUME_LIMITS=False
should be set in the configuration osparc-ops-deployment-configuration

@matusdrobuliak66
Contributor Author

matusdrobuliak66 commented Jun 13, 2023

During the last pre-release (Watermelon1), DevOps did the following:

  • Re-provisioned graylog (modified alerts)
  • Restarted and re-provisioned the monitoring stack (grafana dashboards / changes)
  • Restarted the admin panels (new admin panel from Eli)

Does the deployment-agent also need to be redeployed?

@mrnicegyu11 please comment here if there are devops tasks that need to be done also during release, thanks!

@matusdrobuliak66
Contributor Author

  • infrastructure for the new big sim4life product was introduced
  • v1.54.0 of simcore was released
  • new version of deployment agent was introduced
  • 2 services were not able to start because their hostnames were too long (a hostname can be at most 63 characters)
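A check for the hostname length problem noted above could look like the following minimal sketch. The function name `find_overlong_hostnames` and the example names are hypothetical; the 63-character limit is the one quoted in this comment, applied here to the whole hostname string.

```python
from typing import Iterable, List

# Limit quoted in the release notes above.
MAX_HOSTNAME_LENGTH = 63


def find_overlong_hostnames(hostnames: Iterable[str]) -> List[str]:
    """Return the hostnames that exceed the limit, in their original order."""
    return [name for name in hostnames if len(name) > MAX_HOSTNAME_LENGTH]


if __name__ == "__main__":
    # Made-up names for illustration only.
    candidates = ["short-service", "a" * 63, "b" * 64]
    for name in find_overlong_hostnames(candidates):
        print(f"hostname too long ({len(name)} chars): {name}")
```

Running such a check before deployment would flag the affected services before they fail to start.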
