Unable to deploy CNF when some metrics apiservices are in False or FailedDiscoveryCheck state #6782
Comments
Hi @ansvu what version of operator-sdk is being used? There were many issues in the |
Thanks @acornett21 for the info. They used the ansible-operator version from the community. So you mean this version, quay.io/operator-framework/ansible-operator:v1.34.3? |
@ansvu Are they just updating the |
@acornett21 This CNF is a little bit special: they combined a Helm chart with ansible-operator, and they used the ansible-operator image straight from quay.io/operator-framework/ansible-operator. No OLM is integrated. It is designed and architected not only for OCP but for other Kubernetes clusters as well. |
@ansvu I understand, but if they have to have some |
Hi @acornett21 as I know that they don't use |
Hi @acornett21, they used this version
kubectl get apiservice | grep False
v1alpha1.example.com try/api False (ServiceNotFound) 27m
The result has the same error as in version
2024-07-24 15:29:13,913 p=3470 u=ansible n=ansible | TASK [cnf_status : Store CNF status and data] **********************************
2024-07-24 15:29:13,913 p=3470 u=ansible n=ansible | fatal: [localhost]: FAILED! => {"changed": false, "msg": "Failed to create object: b'Unable to determine if virtual resource\\n'", "reason": "Internal Server Error"}
2024-07-24 15:29:13,914 p=3470 u=ansible n=ansible | PLAY RECAP |
Hi @acornett21, do you have any suggestions on the above test result? |
Hey folks, just wanted to add more information here. To me, it would seem like #6222 is a potential fix to this problem, given the error comes from that proxy code. Granted, this has since moved to this repo, so the equivalent would be here: https://github.com/operator-framework/ansible-operator-plugins/blob/main/internal/ansible/proxy/inject_owner.go#L86-L96
From what I can tell, #6222 stalled because a proper test case wasn't found. Based on my testing, you can just stand up an APIService with an invalid service reference and it should immediately trigger this issue. E.g.
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1alpha1.example.com
spec:
  caBundle: 'Zm9vCg=='
  group: example.com
  groupPriorityMinimum: 1000
  service:
    name: example-api
    namespace: non-existent
    port: 443
  version: v1alpha1
  versionPriority: 15
The APIServer accepts this, but it immediately becomes unavailable because the underlying service is not found. I'll leave it up to maintainers what they want to do with this information, or if they want to take #6222 and replicate it over in the ansible-operator-plugins repository. |
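For anyone trying to confirm whether a cluster is in the state that triggers this, here is a minimal Python sketch that picks out APIServices whose Available condition is not True, roughly what `kubectl get apiservice | grep False` shows by eye. It is not part of ansible-operator or of #6222. The function name `unavailable_apiservices`, the `NoAvailableCondition` label, and the `SAMPLE` data are hypothetical and only for illustration; the input is assumed to have the shape of `kubectl get apiservice -o json`.

```python
def unavailable_apiservices(apiservice_list):
    """Return (name, reason) pairs for APIServices whose Available
    condition is missing or not "True".

    `apiservice_list` is assumed to be the parsed JSON from
    `kubectl get apiservice -o json` (a List object with "items").
    """
    results = []
    for item in apiservice_list.get("items", []):
        name = item.get("metadata", {}).get("name", "<unnamed>")
        conditions = item.get("status", {}).get("conditions", [])
        # Find the "Available" condition, if the API server has set one.
        available = next(
            (c for c in conditions if c.get("type") == "Available"), None
        )
        if available is None:
            # Hypothetical label used by this sketch, not a Kubernetes reason.
            results.append((name, "NoAvailableCondition"))
        elif available.get("status") != "True":
            results.append((name, available.get("reason", "Unknown")))
    return results


# Hand-written sample, not real cluster output: one broken APIService
# (like the repro manifest above) and one healthy one.
SAMPLE = {
    "items": [
        {
            "metadata": {"name": "v1alpha1.example.com"},
            "status": {
                "conditions": [
                    {
                        "type": "Available",
                        "status": "False",
                        "reason": "ServiceNotFound",
                    }
                ]
            },
        },
        {
            "metadata": {"name": "v1.apps"},
            "status": {
                "conditions": [
                    {"type": "Available", "status": "True", "reason": "Local"}
                ]
            },
        },
    ]
}

if __name__ == "__main__":
    for name, reason in unavailable_apiservices(SAMPLE):
        print(f"{name}\t{reason}")
```

Against a real cluster, the same function could be fed `json.load(sys.stdin)` with `kubectl get apiservice -o json` piped in, instead of the sample data.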
Type of question
Question
What did you do?
There is a partner using ansible operator 1.34-2; when they tried to deploy their CNF, the following error occurred.
2024-07-05 06:25:14,241 p=17 u=ansible n=ansible | fatal: [localhost]: FAILED! => {"changed": false, "msg": "Failed to create object: b'Unable to determine if virtual resource\\n'", "reason": "Internal Server Error"}
This is the API being called from ansible code:
They noticed these two apiservices are in False or FailedDiscoveryCheck state:
If they removed these two apiservices then the CNF deployment worked fine.
They said that they did not observe any error in ansible-operator v1.31 when some apiservices were in False state.
Are there any new changes in ansible-operator v1.34.2 that triggered this issue? Do all apiservices need to be in True state now?
What did you expect to see?
CNF to be deployed without this error:
"Failed to create object: b'Unable to determine if virtual resource\\n'"
What did you see instead? Under which circumstances?
Any Ansible task that the operator runs through the Ansible k8s module throws the error.
Environment
Operator type:
ansible-operator 1.34-2
Kubernetes cluster type:
Google GKE
$ operator-sdk version
ansible-operator 1.34-2
$ go version
(if language is Go) NA
$ kubectl version
v1.29.3
Additional context
Some existing issues have been reported, but there is no solution other than the advice to fix the cluster health or remove the failing apiservices.
https://access.redhat.com/solutions/6813781
https://bugzilla.redhat.com/show_bug.cgi?id=2063774
#5596
#6222