Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wait for runtime to be ready in __init__ #3963

Merged
merged 12 commits into from
Sep 20, 2024
Merged

Wait for runtime to be ready in __init__ #3963

merged 12 commits into from
Sep 20, 2024

Conversation

rbren
Copy link
Collaborator

@rbren rbren commented Sep 19, 2024

Short description of the problem this fixes or functionality that this introduces. This may be used for the CHANGELOG

I fixed this a few days ago, but my fix seems to have disappeared.


Give a summary of what the PR does, explaining any non-trivial design decisions

  • Wait for client in init
  • Refactor the wait_for_client function so that log refreshing is done separately, making it easier for other callers to use

Link of any specific issues this addresses

@rbren rbren marked this pull request as ready for review September 19, 2024 18:43
@tofarr
Copy link
Collaborator

tofarr commented Sep 19, 2024

Looks like the call to _wait_until_alive was removed in this commit: a45b20a#diff-a7ff0449d091e3e762a4a8b604fe16a3a5570448751817a5a3e207ab18ec648c

I'd like to know if the removal was accidental before putting this back in. @tobitege - any insights?

@tobitege
Copy link
Collaborator

Looks like the call to _wait_until_alive was removed in this commit: a45b20a#diff-a7ff0449d091e3e762a4a8b604fe16a3a5570448751817a5a3e207ab18ec648c

I'd like to know if the removal was accidental before putting this back in. @tobitege - any insights?

Those 2 commits are not removing your call, but comment out a log refresh in run_action from a previous PR.

@tobitege
Copy link
Collaborator

Line 162 adding back makes sense. If I removed it, then by accident.
But the other changes that process the logs should not just be changed like that, they're there to wait for the client become ready on first call.

@rbren
Copy link
Collaborator Author

rbren commented Sep 19, 2024

But since it's alive in __init__, can't we assume it's alive everywhere else?

@tobitege
Copy link
Collaborator

tobitege commented Sep 19, 2024

But since it's alive in __init__, can't we assume it's alive everywhere else?

Still, the very first call of _wait_until_alive because of potential timing issues should make sure it got'em all and that extra check is only done once.

The websocket becomes alive but we have to wait for the client to be ready (bash/ipython) before we can start sending commands.

@tobitege
Copy link
Collaborator

in #3962 I just tested with minimal reverts, that sudo works again

@rbren
Copy link
Collaborator Author

rbren commented Sep 19, 2024

@tobitege sorry can you explain more? I assume you're correct given that the tests are failing, but once /alive returns 200, can't we assume it's alive from that point forward?

@tobitege
Copy link
Collaborator

@tobitege sorry can you explain more? I assume you're correct given that the tests are failing, but once /alive returns 200, can't we assume it's alive from that point forward?

Oops, I had added this to my previous comment instead of making a new one:
The websocket becomes alive but we have to wait for the client to be ready (bash/ipython) before we can start sending commands.

@rbren
Copy link
Collaborator Author

rbren commented Sep 19, 2024

Thanks for the explanation in Slack Tobi!

I think I have a better fix now--_wait_until_alive will now always wait for log_buffer.client_ready. We also reuse the same retry logic instead of having separate retry logic for the log checker

@tobitege
Copy link
Collaborator

Thanks for the explanation in Slack Tobi!

I think I have a better fix now--_wait_until_alive will now always wait for log_buffer.client_ready. We also reuse the same retry logic instead of having separate retry logic for the log checker

Yes, looks much cleaner!

@rbren rbren enabled auto-merge (squash) September 20, 2024 14:09
@rbren
Copy link
Collaborator Author

rbren commented Sep 20, 2024

OK @tobitege I think the tests should finally be passing

@tobitege
Copy link
Collaborator

tobitege commented Sep 20, 2024

OK @tobitege I think the tests should finally be passing

Ok. But why move the wait before the super.__init__ in the latest commit? That is doing the bash/ipython env initializations.

@tobitege
Copy link
Collaborator

What I read from the logs, the move above the super init isn't helping, other than it will "waste" the first 2 attempts in which guarantueed nothing gets initialized.
It might be more of a time saver to have the retry run with a higher number of retries, but with e.g. a fixed wait time of like 4 seconds, not up to 20.
That would give it a more granular hit-testing, imho.

@rbren
Copy link
Collaborator Author

rbren commented Sep 20, 2024

The super init sets env vars, so it needs the environment to be ready. Otherwise the tests fail

@tobitege
Copy link
Collaborator

The super init sets env vars, so it needs the environment to be ready. Otherwise the tests fail

Hmm ok, due to the @retry "polling" effect now.

Any thoughts on the wait times, though?

@rbren
Copy link
Collaborator Author

rbren commented Sep 20, 2024

OK I refactored a bit so that the env is set later on--hopefully this works

Copy link
Collaborator

@tobitege tobitege left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@rbren rbren merged commit 72ca169 into main Sep 20, 2024
13 checks passed
@rbren rbren deleted the rb/runtime-refactor branch September 20, 2024 15:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants