[ansible-operator] Status false when type/reason = Successful #18

Open
PYLochou opened this issue Sep 4, 2023 · 9 comments

@PYLochou

PYLochou commented Sep 4, 2023

Type of question

General operator-related help

Question

What did you do?

I am developing an Ansible operator and let it manage its own status (I prefer it that way, since it will probably manage it better by itself).
However, our testers recently complained about the following result:

    {
      "lastTransitionTime": "2023-07-25T07:14:46Z",
      "message": "Last reconciliation succeeded",
      "reason": "Successful",
      "status": "False",
      "type": "Successful"
    }

They would like to see "True" instead of "False" in the status field, which would be more consistent with the actual results. (They use a script to check all statuses, my operator being just one among many, and I guess it waits for True.)

What did you expect to see?

I'd rather expect something like:

    {
      "lastTransitionTime": "2023-07-25T07:14:46Z",
      "message": "Last reconciliation succeeded",
      "reason": "Successful",
      "status": "True",
      "type": "Successful"
    }

but maybe I missed something there.

What did you see instead? Under which circumstances?

We actually see status: False and don't understand why.

Environment

Operator type:

/language ansible

Kubernetes cluster type:

OpenShift

$ operator-sdk version

operator-sdk version: "v1.31.0"

$ kubectl version

Client Version: v1.28.1

Additional context

@OchiengEd
Contributor

Hi @PYLochou, would you please provide the full status of the custom resource on which you observed the above behavior? From the operator-sdk code, it seems that while a reconcile is ongoing, the Running condition is set to true and the Successful condition is set to false.

I may need to see all the conditions in the status block to be able to aptly determine if we are looking at a bug.
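For anyone reading conditions from a script, the behavior described above can be sketched as a small classifier. This is only an illustration: `summarize` and `index_by_type` are hypothetical names, and the rule that Running=True means "not final yet" is taken from this thread, not from the operator-sdk documentation, so verify it against your operator version.

```python
from typing import Dict, List

Condition = Dict[str, object]


def index_by_type(conditions: List[Condition]) -> Dict[str, Condition]:
    """Map each condition's "type" to the condition itself."""
    return {str(c.get("type")): c for c in conditions}


def summarize(conditions: List[Condition]) -> str:
    """Return "reconciling", "succeeded", "failed", or "unknown"."""
    by_type = index_by_type(conditions)

    def is_true(cond_type: str) -> bool:
        # Condition statuses are the strings "True" / "False", not booleans.
        return by_type.get(cond_type, {}).get("status") == "True"

    # Assumption from this thread: while Running is True, a reconcile is in
    # progress and Successful/Failure do not describe its outcome yet.
    if is_true("Running"):
        return "reconciling"
    if is_true("Successful"):
        return "succeeded"
    if is_true("Failure"):
        return "failed"
    return "unknown"
```

With this reading, a Successful condition whose status is "False" while Running is "True" means "still working", not "failed".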

@PYLochou
Author

PYLochou commented Sep 20, 2023

Thank you for the explanation, Edmund. You're right, a reconciliation was running at that moment:

"conditions": [
    {
      "lastTransitionTime": "2023-07-24T14:02:37Z",
      "message": "",
      "reason": "",
      "status": "False",
      "type": "Failure"
    },
    {
      "lastTransitionTime": "2023-07-25T07:14:46Z",
      "message": "Last reconciliation succeeded",
      "reason": "Successful",
      "status": "False",
      "type": "Successful"
    },
    {
      "lastTransitionTime": "2023-07-24T14:06:40Z",
      "message": "Running reconciliation",
      "reason": "Running",
      "status": "True",
      "type": "Running"
    }
]

I'll pass your explanation on to our testers, thank you!
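A checking script could account for this by waiting until the reconcile has finished before it looks at Successful or Failure. The following is a minimal sketch under that assumption; `wait_until_settled` and `fetch_conditions` are hypothetical names, and the fetcher is left to the caller (for example, something that parses the output of `kubectl get <resource> -o json`).

```python
import time
from typing import Callable, Dict, List

Condition = Dict[str, str]


def wait_until_settled(
    fetch_conditions: Callable[[], List[Condition]],
    timeout_s: float = 600.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[Condition]:
    """Poll until no Running condition has status "True", then return
    the conditions seen at that point. Raises TimeoutError otherwise."""
    deadline = clock() + timeout_s
    while True:
        conditions = fetch_conditions()
        running = any(
            c.get("type") == "Running" and c.get("status") == "True"
            for c in conditions
        )
        if not running:
            return conditions
        if clock() >= deadline:
            raise TimeoutError("reconciliation still running at timeout")
        sleep(interval_s)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real delays.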

@PYLochou
Author

Another closely-related issue, if I may...?

On another run (with buggy input in the CR, hence the errors), they got the following conditions:

"conditions": [
    {
      "lastTransitionTime": "2023-07-24T08:59:32Z",
      "message": "",
      "reason": "",
      "status": "False",
      "type": "Successful"
    },
    {
      "ansibleResult": {
        "changed": 0,
        "completion": "2023-07-24T13:23:28.4525",
        "failures": 1,
        "ok": 96,
        "skipped": 68
      },
      "lastTransitionTime": "2023-07-24T13:23:29Z",
      "message": "Could not find the following secrets ['ibm-cp4ba-db-ssl-secret-for-svl']. Create it or them before installing the cloudpak.",
      "reason": "Failed",
      "status": "False",
      "type": "Failure"
    },
    {
      "lastTransitionTime": "2023-07-24T13:23:34Z",
      "message": "Running reconciliation",
      "reason": "Running",
      "status": "True",
      "type": "Running"
    }
]

This time, for the Failure type, they are waiting for status True (which may seem logical). Is it False again because the reconciliation is running?

@OchiengEd
Contributor

OchiengEd commented Sep 20, 2023

Interesting find. It seems that in the previous reconcile loop, the Failure condition was set to true and the failure reason was captured. On the subsequent reconcile, however, the Failure condition was set to false and the Running condition was set to true.

I think it is a logical argument to leave the Failure condition as previously set, since it is accurate as of the time of recording. Then, when the reconcile is marked as complete or successfully deployed, the previous error could be reset.
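The proposed behavior can be sketched as two transitions on the conditions list. This is an illustrative model, not the operator's actual code: the function names are hypothetical, "reset" is modeled here as dropping the Failure entry (one possible reading), and what happens to Running on completion is deliberately left out.

```python
from typing import Dict, List

Condition = Dict[str, str]


def _set(conditions: List[Condition], cond_type: str, status: str,
         reason: str = "", message: str = "") -> List[Condition]:
    """Return a copy in which the condition of cond_type is replaced or added."""
    kept = [c for c in conditions if c.get("type") != cond_type]
    kept.append({"type": cond_type, "status": status,
                 "reason": reason, "message": message})
    return kept


def on_reconcile_start(conditions: List[Condition]) -> List[Condition]:
    # Only Running changes. A Failure recorded by the previous run is left
    # untouched, because it is still accurate as of its recording time.
    return _set(conditions, "Running", "True", "Running",
                "Running reconciliation")


def on_reconcile_success(conditions: List[Condition]) -> List[Condition]:
    # Assumption: the old error is reset by removing the Failure condition.
    cleared = [c for c in conditions if c.get("type") != "Failure"]
    return _set(cleared, "Successful", "True", "Successful",
                "Last reconciliation succeeded")
```

Under this model, a reader never sees Failure with status "False" next to a stale failure message while a new reconcile is running.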

@PYLochou
Author

Yes, I agree that the Failure condition should be left as previously set, since it was accurate at the time of recording, and removed once the deployment succeeds. In my opinion, a False status should never be set on this condition.

@OchiengEd
Contributor

Roger that. I will raise this at the next meeting and see if there is consensus to make the change. I may explore creating a PR for this in the interim.

@OchiengEd
Contributor

The PR to solve this issue has been moved to the ansible-operator-plugins repository #14

@openshift-ci

openshift-ci bot commented Oct 5, 2023

@PYLochou: The label(s) language/ansible cannot be applied, because the repository doesn't have them.


@everettraven everettraven transferred this issue from operator-framework/operator-sdk Oct 5, 2023
@everettraven everettraven linked a pull request Oct 5, 2023 that will close this issue
2 tasks
@everettraven
Collaborator

@OchiengEd, this was resolved by #35, right?
