fix(scheduler): ensure recursive jobs can't be queued twice #11955
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This PR was originally going to be just tests for #11826, which was merged a couple of weeks ago. But while working on the tests I realised there was a small bug in my original PR, which I've fixed here.
In my original fix, I assumed that it was safe to clear the
QUEUED
flag unconditionally after a job had run. However, this isn't always correct for jobs withALLOW_RECURSE
.For a job with
ALLOW_RECURSE
, theQUEUED
flag is already cleared before the job runs. If the job does recurse then theQUEUED
flag will be set again. In that scenario it shouldn't be cleared after the job runs. Mistakenly clearing the flag allows the job to be added to the queue again.In practice, even if the job does get queued twice, the dirtiness checks should significantly limit its impact. The 'Maximum recursive updates' warning could be hit prematurely, but as the bug only allows the job to be queued twice it would still need to recurse several times to hit that warning.
While the bug is low impact, it's also an easy fix.
There is some duplication between
flushJobs
andflushPreFlushCbs
. I may investigate refactoring that later, but I didn't want to complicate this PR with a refactoring.The tests cover various scenarios for both this fix and the original fix in #11826.
This PR only considers the main scheduler queue, not the
post
queue. Thepost
queue does still need tests writing for #11826, but there are various other edge cases I need to think through before I attempt that. It uses aSet
to de-duplicate jobs, so it isn't impacted by the bug I've fixed here.