Operator deletes tigera-system namespace on ApiServer deployment #2912
Comments
Running the operator in one of the namespaces it uses/manages is likely to cause other problems. We recommend running the operator in the tigera-operator namespace; if a different namespace is desired, it should be a dedicated namespace that only the operator runs in. The operator is deleting that namespace because we use that namespace ( |
Investigation
Api Server is deployed to either When Operator is installed in Same situation would happen for an Enterprise variant if, somehow, the Operator would be installed in We can easily avoid the deletion of a namespace (the namespace where Operator is running is available via I would suggest maintaining two lists for namespaced objects (both for Calico and Enterprise) and on a clash append corresponding lists to deleted objects or just the namespace if there's no clash. Any thoughts/comments? I can take care of this. CC: @caseydavenport |
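One way to read the "avoid the deletion of a namespace" idea above is as a filter applied to the list of objects queued for deletion. The following is only a minimal sketch under stated assumptions: the `object` type and the `withoutOperatorNamespace` helper are hypothetical names invented for illustration and are not taken from the operator's code base, and how the operator learns its own namespace is left out.

```go
package main

import "fmt"

// object is a hypothetical, simplified stand-in for a Kubernetes object
// reference; the real operator works with typed client objects.
type object struct {
	Kind      string
	Namespace string // empty for cluster-scoped objects such as Namespace
	Name      string
}

// withoutOperatorNamespace returns toDelete minus any Namespace object
// whose name equals the namespace the operator itself runs in.
// All other objects are kept in their original order.
func withoutOperatorNamespace(toDelete []object, operatorNS string) []object {
	kept := make([]object, 0, len(toDelete))
	for _, o := range toDelete {
		if o.Kind == "Namespace" && o.Name == operatorNS {
			continue // never delete the namespace we are running in
		}
		kept = append(kept, o)
	}
	return kept
}

func main() {
	toDelete := []object{
		{Kind: "Namespace", Name: "tigera-system"},
		{Kind: "Deployment", Namespace: "tigera-system", Name: "example-deployment"},
	}
	for _, o := range withoutOperatorNamespace(toDelete, "tigera-system") {
		fmt.Printf("would delete %s %s/%s\n", o.Kind, o.Namespace, o.Name)
	}
}
```

A filter like this only protects the namespace object itself. As the following comments point out, other resources that share names with the operator's own could still clash.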
@tmjd we've been running Operator in |
To mitigate this issue, it should be very easy to migrate to a different namespace, like Like I mentioned, running the operator in a namespace it manages could cause other problems. If we add any changes to support a non-recommended setup, I'd suggest we add a startup check that the operator isn't being deployed to a namespace it might attempt to manage. What I mean is that at startup the operator would compare its namespace to a list of ones it might use, and if it finds itself running in one of those it would exit/crash. That would prevent the operator from ever getting into a situation where the namespace it is running in is deleted because of its own management. I'm not sure this is a good idea, since the names of other operator resources (like its ServiceAccount, ClusterRole, and ClusterRoleBinding) could still be reused and cause problems. |
Self-destruction is a pretty serious "problem". In that case the Operator should indeed refuse to start if it is deployed to a namespace it is about to manage. That should also be documented in detail. I'll check if we can move the installation easily. I guess we can close this issue and the associated PR in favor of a new one that will block the Operator from starting in one of the "opinionated" namespaces. Cheers |
I agree about failing to start if running in one of these namespaces - @tmjd what do you think about having a change to our main() function that simply exits if we're running in a known namespace managed by the operator? e.g., I think we want the operator to simply die if anyone tried to run in those namespaces, because otherwise unexpected behavior will occur somewhere further along in the operator code (probably more can go wrong than simply deleting its own namespace) |
Yep, that is exactly what I was thinking would be a reasonable course of action. |
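The startup check agreed on above could look roughly like the sketch below. It is an illustration and not the operator's actual `main()`. The `OPERATOR_NAMESPACE` environment variable, the `checkOperatorNamespace` helper, and the contents of `managedNamespaces` are assumptions made for this example; only `tigera-system` and `tigera-operator` are named in this thread.

```go
package main

import (
	"fmt"
	"os"
)

// managedNamespaces is an illustrative list. Only tigera-system is
// mentioned in this issue; a real list would cover every namespace
// the operator may create or delete.
var managedNamespaces = []string{"tigera-system"}

// checkOperatorNamespace returns an error if operatorNS is one of the
// namespaces the operator manages, and nil otherwise.
func checkOperatorNamespace(operatorNS string, managed []string) error {
	for _, ns := range managed {
		if ns == operatorNS {
			return fmt.Errorf("operator must not run in managed namespace %q", ns)
		}
	}
	return nil
}

func main() {
	// Assumption: the operator's namespace is exposed to the process via an
	// environment variable (for example, populated from the Downward API).
	ns := os.Getenv("OPERATOR_NAMESPACE")
	if ns == "" {
		ns = "tigera-operator" // recommended namespace per this thread
	}
	if err := checkOperatorNamespace(ns, managedNamespaces); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1) // refuse to start rather than risk deleting our own namespace
	}
	fmt.Println("namespace check passed:", ns)
}
```

Failing fast at startup keeps the guard in one place, instead of special-casing the operator's namespace in every code path that deletes objects.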
Operator running in tigera-system will perform self-destruction (delete own Namespace) if ApiServer is enabled.
Expected Behavior
Operator should never attempt to delete a namespace in which it is deployed.
Current Behavior
Operator running in tigera-system will perform self-destruction (delete own Namespace) if ApiServer is enabled.
Steps to Reproduce (for bugs)
Deploy Tigera Operator in tigera-system and enable ApiServer.
Your Environment
We are running Tigera Operator 1.29.4 with Calico 3.25.1 and ApiServer disabled. The reason for this setup is that we saw major issues with internal controllers (registering as owners of certain namespaces) breaking when ApiServer was deployed. Now that I have some free cycles, I wanted to return to this issue and investigate it further (in the meantime we've done a number of upgrades of our Tigera/Calico stack, so the issue might already be gone). To my surprise, when I dropped the ApiServer resource, the tigera-system namespace was deleted (that's where we install the Operator). To be precise, it was still hanging in the Terminating state, as there were many references to the Operator that were preventing clean deletion.