-
-
Notifications
You must be signed in to change notification settings - Fork 478
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix two tests that fail on a heavily-loaded machine #38825
Conversation
One test in this file sets the timeout to 100ms and expects an easy example to complete within that amount of time. It can still fail, though, if the machine is under load: File "src/sage/parallel/decorate.py", line 519, in sage.parallel.decorate.fork Failed example: g(5, m=5) Expected: 8 Got: Killing subprocess 62130 with input ((5,), {'m': 5}) which took too long 8 This example isn't really the purpose of the doctest -- its intent is to show that there are hard examples that will NOT complete in 100ms. (That is much easier to do.) All things considered, we are better off just dropping this one flaky example.
The test we have for start_workers() is doing a bit more than that. It's starting two threads that each print something, sleep for a bit, and then print something else. To check the expected output, we are basically guessing how long it will take these threads to run, and on a very heavily-loaded machine, we can guess wrong: File "src/sage/parallel/map_reduce.py", line 1149, in sage.parallel.map_reduce.RESetMapReduce.start_workers Failed example: sleep(float(1.5)) Expected: Finished: ... Finished: ... Got: Finished: 2 (We were expecting both threads to finish within 1.5s, but only one did.) To make this a bit more robust, we eliminate the sleep() and subsequent print inside of the threads. As a result, we need only make one guess: how long it will take the threads to start. And for that we now go overkill and wait 5s.
Documentation preview for this PR (built with commit a181883; changes) is ready! 🎉 |
The commit messages explain each change in detail, but tl;dr these two tests fail for me when the machine is under heavy load because they are guessing how long it will take some computation to complete and the guess is wrong. URL: sagemath#38825 Reported by: Michael Orlitzky Reviewer(s): Tobias Diez
....: return [] | ||
sage: S = RESetMapReduce(roots=[1, 2], children=children) | ||
sage: S.setup_workers(2) | ||
sage: S.start_workers(); sleep(float(0.4)) | ||
sage: S.start_workers(); sleep(float(5)) # long time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The # long time
label causes a test failure in the cleanup section, when running tests in non-long mode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The commit messages explain each change in detail, but tl;dr these two tests fail for me when the machine is under heavy load because they are guessing how long it will take some computation to complete and the guess is wrong.