-
Notifications
You must be signed in to change notification settings - Fork 597
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ci: Make more robust timer_test #3953
Conversation
On overloaded VMs, the timer_test is ureliable and fails spuriously too often. Two strategies in this patch, hope some combination does the trick: 1. Increase the absolute error allowed for the timing tests when doing CI runs. 2. Add just a bit more relative error (via a new unit test macro, OIIO_CHECK_EQUAL_THRESH_REL). Hopefully this will cut down on CI failures for unit_timer test case. Signed-off-by: Larry Gritz <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
10% relative error could still be too small since the error is related to the OS's tick size. On Windows this defaults to 15ms, so if a test tries to sleep for 100ms (what interval
is set to I think), 10% error is 10ms and so you might still be wrong some of the time. 15% relative error should work on standard Windows machines if you want to go with this approach.
But I believe users can modify the tick size, so a better approach would be to have OS specific queries for tick size and then verify that the timer was equal to or up to 15ms slower. Since this code is for a unit test, it's possible the quick 15% padding is good enough and we can revisit this in the future if anyone reports fails.
I don't believe the timer can return faster than the requested time, so you might want to get rid of the absolutes.
In these examples, I'm using an absolute error PLUS relative, so I think it's going to be ok. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So abs + relative is giving you 20ms of padding? Yes, that might be OK for most machines. As I said before, this won't work on all machines but for unit tests it's probably not worth worrying about until we hit it.
This takes a stab at improving our hit rate on this test in CI, and I don't see how it hurts anything. If it doesn't fully solve the problem and we have to revisit again, so be it. |
I can at least verify that this patch allows the Also, here's output from a quick .exe I threw together to verify the timer resolution from |
On overloaded VMs, the timer_test is ureliable and fails spuriously too often. Two strategies in this patch, hope some combination does the trick: 1. Increase the absolute error allowed for the timing tests when doing CI runs. 2. Add just a bit more relative error (via a new unit test macro, OIIO_CHECK_EQUAL_THRESH_REL). Hopefully this will cut down on CI failures for unit_timer test case. Signed-off-by: Larry Gritz <[email protected]>
On overloaded VMs, the timer_test is ureliable and fails spuriously too often.
Two strategies in this patch, hope some combination does the trick:
Increase the absolute error allowed for the timing tests when doing CI runs.
Add just a bit more relative error (via a new unit test macro, OIIO_CHECK_EQUAL_THRESH_REL).
Hopefully this will cut down on CI failures for unit_timer test case.