Statefulset OnDelete Limitation and Enhancement #744
Hello! That's a very good case to consider!
From the commit, the operator will take over the rolling update process if
Since both
I think it's a global vmcluster issue. Changing the rolling update strategy or increasing delays between component updates doesn't help much in this case.
Hey @mohammadkhavari,
operator/controllers/factory/k8stools/sts.go Line 194 in e261c37
Instead, the deletion grace period should probably use
While using vmcluster and rolling out updates to vmstorage and vmselect, we can't actually use the OnDelete policy: when updateStrategy is OnDelete, the VictoriaMetrics operator will delete the StatefulSet pods itself. This behaviour was added because of issue #344.
The problem is that when we apply a minor change to the vmstorage configuration, the operator deletes the vmstorage pods in order and the vminsert nodes start rerouting. All storage nodes then accumulate pending indexdb items as new time series arrive, which leads to metric loss in the pipeline. Our vmcluster can tolerate one storage node being down without metric loss, but multiple storage nodes down at once is troublesome.
Is there any way we can actually use the OnDelete strategy on the vmstorage StatefulSets? Even a parameter like stsPodsRollingDelay, a time the operator waits between recreating each vmstorage pod, could save us from downtime.
In fact we can turn off the operator, update the StatefulSet and VMCluster objects ourselves, and delete the pods accordingly. But if our manually edited StatefulSet diverges from the one the VMCluster would generate (causing different revisions), re-enabling the operator may recreate all the pods again.