follow-up: cancel health checks when a resource is not running #6461

Open
wants to merge 3 commits into base: main

Conversation

adamint
Member

@adamint adamint commented Oct 23, 2024

Description

Currently, we only cancel health checks when the ResourceHealthCheckService is stopped. Instead, we should start running health checks on a resource whenever it enters the Running state and stop running them when it leaves that state.

I created a linked cancellation token to represent the lifecycle of an individual resource's periodic health check runs. The token is created when the resource is determined to be Running, and it is cancelled (and the health reports' statuses are set to null) when the resource is no longer Running.
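
As a rough illustration of the lifecycle described above (not the code in this PR, which is C# and uses linked cancellation tokens), the following Python asyncio sketch starts a polling task when a resource enters Running and cancels it, resetting the status to None, when the resource leaves that state. `ResourceMonitor`, `on_state_changed`, and the string state names are hypothetical and exist only for this example. Task cancellation stands in for cancelling the per-resource token, and `stop()` stands in for the service-wide token that the per-resource token is linked to.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Optional

# Hypothetical names throughout; this is not the Aspire API.
HealthCheck = Callable[[], Awaitable[str]]


class ResourceMonitor:
    """Runs periodic health checks only while a resource is Running."""

    def __init__(self, checks: Dict[str, HealthCheck], interval: float = 0.01) -> None:
        self._checks = checks
        self._interval = interval
        self._tasks: Dict[str, asyncio.Task] = {}
        # Latest status per resource; None means "no health status".
        self.status: Dict[str, Optional[str]] = {}

    def on_state_changed(self, resource: str, state: str) -> None:
        # Must be called from inside a running event loop.
        if state == "Running":
            if resource not in self._tasks:
                self._tasks[resource] = asyncio.create_task(self._poll(resource))
        else:
            # Leaving Running: cancel this resource's loop and clear its status.
            task = self._tasks.pop(resource, None)
            if task is not None:
                task.cancel()
            self.status[resource] = None

    async def _poll(self, resource: str) -> None:
        while True:
            self.status[resource] = await self._checks[resource]()
            await asyncio.sleep(self._interval)

    async def stop(self) -> None:
        # Service shutdown: cancel every per-resource loop and wait for them.
        tasks = list(self._tasks.values())
        self._tasks.clear()
        for task in tasks:
            task.cancel()
        await asyncio.gather(*tasks, return_exceptions=True)
```

A caller would invoke `on_state_changed("db", "Running")` when the resource starts and `on_state_changed("db", "Exited")` when it stops. A later transition back to Running starts a fresh polling task.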

This PR also adds several health check tests covering additional scenarios, including the case where the overall health status does not change but an individual report's status does. It re-enables ResourcesWithHealthCheck_NotHealthyUntilCheckSucceeds, which fixes #6385.

Fixes #6450

Checklist

  • Is this feature complete?
    • Yes. Ready to ship.
    • No. Follow-up changes expected.
  • Are you including unit tests for the changes and scenario tests if relevant?
    • Yes
    • No
  • Did you add public API?
    • Yes
      • If yes, did you have an API Review for it?
        • Yes
        • No
      • Did you add <remarks /> and <code /> elements on your triple slash comments?
        • Yes
        • No
    • No
  • Does the change make any security assumptions or guarantees?
    • Yes
      • If yes, have you done a threat model and had a security review?
        • Yes
        • No
    • No
  • Does the change require an update in our Aspire docs?
    • Yes
      • Link to aspire-docs issue:
    • No

@adamint adamint changed the title from "Cancel health checks when a resource is not running" to "follow-up: cancel health checks when a resource is not running" Oct 23, 2024
@mitchdenny
Member

I'm not sure we want to do this. The relationship between health checks and resources is possibly many-to-many.

It's possibly also worth checking out whether the health check invocation is coming from the DefaultHealthCheckService or the ResourceHealthCheckService.

@adamint
Member Author

adamint commented Oct 24, 2024

> I'm not sure we want to do this. The relationship between health checks and resources is possibly many-to-many.

Yes, that's true, but resource monitoring happens per resource. We can cancel or restart monitoring for one resource at will without cancelling health checks on other resources. It is potentially expensive to keep running these checks in perpetuity. By definition, health status is null for non-Running resources, so it would be nice to avoid making these calls when possible.
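
To make the per-resource point concrete, here is a minimal Python asyncio sketch under stated assumptions (it is not Aspire code). Two hypothetical resources, `api` and `worker`, both reference the same hypothetical check key, `shared_db_check`, and each resource has its own monitoring task. Cancelling the task for `api` stops only the runs made on behalf of `api`.

```python
import asyncio
from collections import Counter

# Hypothetical wiring: two resources both reference the same check key.
CHECKS_BY_RESOURCE = {"api": ["shared_db_check"], "worker": ["shared_db_check"]}
invocations: Counter = Counter()  # (resource, check) -> number of runs


async def run_check(resource: str, check: str) -> str:
    # Stand-in for a real health check; only counts invocations.
    invocations[(resource, check)] += 1
    return "Healthy"


async def monitor(resource: str, interval: float) -> None:
    # One loop per resource; cancelling this task stops only this resource's runs.
    while True:
        for check in CHECKS_BY_RESOURCE[resource]:
            await run_check(resource, check)
        await asyncio.sleep(interval)


async def demo() -> Counter:
    tasks = {name: asyncio.create_task(monitor(name, 0.001)) for name in CHECKS_BY_RESOURCE}
    await asyncio.sleep(0.05)
    tasks["api"].cancel()  # "api" left the Running state
    await asyncio.gather(tasks["api"], return_exceptions=True)
    before = Counter(invocations)
    await asyncio.sleep(0.05)  # "worker" keeps being monitored during this window
    tasks["worker"].cancel()
    await asyncio.gather(tasks["worker"], return_exceptions=True)
    # Runs recorded after "api" was cancelled, per (resource, check) pair.
    return Counter({key: invocations[key] - before[key] for key in invocations})
```

Running `asyncio.run(demo())` returns the number of runs recorded after `api` was cancelled. Under these assumptions the count for `api` stays at zero while `worker` continues to accumulate runs.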

> It's possibly also worth checking out whether the health check invocation is coming from the DefaultHealthCheckService or the ResourceHealthCheckService.

They are coming from ResourceHealthCheckService.

@JamesNK
Member

JamesNK commented Oct 25, 2024

I can't comment on app hosting changes. I'm not familiar with the health check logic.
