-
Notifications
You must be signed in to change notification settings - Fork 774
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor test apps to use unit-test framework #4014
base: master
Are you sure you want to change the base?
Conversation
…e new framework (reason: because parallel flag is not set, doh!)
…, skip essential tests, list tests
…ce returning value
…use of errors when running on Windows virtual machine
…t wait the previous test to complete)
…ing, and running unit test. This can be used by all test apps. pjlib-util-test has been ported to use this utilities
…sential test because it does not exist in features test (in pjlib-test)
…up from 45m originally to 15m using 10 worker threads
…4:30 minutes with 10 worker threads, from 45:42m originally)
…provements due to exclusive tests
… automatic error reporting) hopefully make it easier to use
…e with unit-test logging (see unittest.md)
…args are set in GitHub action variables
Sorry I'm converting this to draft to resolve issues with GH CI integration. Will request for review later |
There are still some CI errors to solve, but I think this PR is ready to be reviewed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, the time improvement looks amazing!
Looks good to me, only needs to resolve the CI test failures.
…eference counter goes to zero in regc_test
…), presumably because there are other loop transport around when the test is run. Solution are: 1) make the test exclusive for now, 2) fix the cleanup of loop transport in other tests. Then there is failure in transport_rt_test(), because it gets loop transport from other test that is being shutdown. Sneakily add pjsip_tpmgr_get_transport_count_by_type() that's useful for debugging.
…ime) called from BaseClasses
…out/stderr buffering
…ferent sequence on different platform even with the same srand, making it hard to reproduce test sequence
e36208d
to
78c9adf
Compare
…) except in pj_test_suite_shuffle(), 2) unit test PRNG explicitly uses pj_uint32_t instead of int. Also disable windows python tests since it is unreliable
This PR contains modifications to PJSIP test apps (
pjlib-test
,pjlib-util-test
,pjnath-test
,pjmedia-test
, andpjsip-test
) to use the new unit test framework (#4007) with the main objective to make them complete faster.Timing Results
Let's get straight to it. Below are the test time improvements from the original with the new framework using several worker thread settings.
GitHub CI timings:
Notes on timing
Settings with three worker threads (totalling four threads with the main thread) are significant because our GitHub runners mostly use 4 VCPU according to this article.
Some tests cannot be made faster than certain limit with more worker threads, because that is the longest test case duration in that test.
General look and feel
All test apps have common look and feel with uniform command line options, which look something like this:
The test outputs are also uniform, which look something like this:
Running the tests
With Makefile build system, it is easier to run the tests with the
make
command. TheMakefile
accepts two environment variables:CI_ARGS
contains arguments for the test apps, andCI_MODE
to indicate we're running under GitHub CI (#3374). Sample invocation:Otherwise (e.g. on Windows) run each of the app directly. Use
-h
to get help.GitHub CI modifications
CI_LIN_ARGS
,CI_WIN_ARGS
,CI_MAC_ARGS
, andCI_MODE
. UseCI_XXX_ARGS
to control run-time arguments for the test apps, especially the number of worker threads, which should be equal to the number of vCPU of the GitHub action runners minus one (because the main thread also runs the test cases).Tips on troubleshooting errors
When the logging does not convey sufficient info about the error, use
--log-no-cache
to display logs as they are written, most likely with-w 0
to disable worker thread to avoid cluttering the output.But sometimes, problem only arises with specific worker thread number and test orders. In this case, troubleshooting will be challenging indeed. :) Use
-v, --verbose
to display when tests are started/ended. This way you can know what tests were started when the failed test was running. After that, you can try running only these tests rather than all tests to reproduce the problem.Test shuffling (
--shuffle
arg) is used by default on GitHub CI via repository variables (see above). To reproduce the error, make note of the seed value used when running the (failed) test (it is printed in the output), and re-run the test (locally) using--shuffle --seed N
args.The
--stop-err
option is useful to avoid waiting for all tests to complete when debugging an error.Open issues
Reproducibility
As mentioned above, we're supposed to be able to reproduce the test sequence by using
--shuffle
and specific--seed
value. But as it turns out, this is not the case. Even with the same seed, the test sequence can be different on different machine. We already use our own psudo random number generator inunittest.c
, but this doesn't seem to fix the problem.Intermitten crashes
There is an intermitten crashes in pjsip's
regc_test
. This could be related totdata
being dec-ref-ed more than it should (even though the test result is success). The exact cause however is still not known.Test app modifications
General
There is a new utility file in
pjlib/src/pjlib-test/test_util.h
that is shared by all test apps to parse command line arguments, show usage, register tests, and control the unit testing process.The main front-end files (
main.c
) were modified to be more nice as command line apps.The main modification in test body (
test.c
) is to use the unit-test framework.Some test codes were changed, replacing manual checks with
PJ_TEST_XXX()
macros, mainly to test the usage of these macros and to make the test nicer. But since it made the PR very big, I didn't continue the effort, unless when it was necessary for debugging some problems.In general, large tests needed to be split into smaller ones to make them run in parallel. But major problems arose, mainly because the tests share global states or manipulate common objects.
More specific changes are discussed below.
pjlib-test
notespjlib-test
has "special" arrangements intest.c
, because it needs to test the unit-test (UT) framework first, before running the rest of the test using the UT framework. But before testing the UT framework, it needs to test the components needed by the UT framework such as list, fifobuf, and OS. And so on. That's why the test output is different than the rest of the test apps.Other than that, the modifications to the test functions are not too major, at least compared to pjnath-test and pjsip-test, and I think the test time is quite satisfactory.
pjlib-util-test
notesWe couldn't speed up more because tests such as
resolver_test()
andhttp_client_test()
takes about three minutes to complete and they couldn't be split up without major effort due to the use of global states. Since the test time is already quite satisfactory, I didn't pursue further optimizations.pjnath-test
notespjnath-test
requires large modifications to make the tests run in parallel as follows:mem
pool factory since many tests validate the memory leak in the pool factory, therefore having a single pool factory will not workserver.c
so that server can be instantiated multiple times simultaneously (this was the motivation behind API to get DNS server's bound address to allow specifying zero as port number #3999).ice_test
,turn_sock_test
,concur_test
) into individual test for each configuration, making them parallelable.As the result, there are 70 smaller test items in
pjnath-test
, and with 7 worker threads, we can save 40 minutes of test time!pjmedia-test
notespjmedia-test
has the least modifications because it has very few tests. The original duration was 4m18.691s, and has come down a little to 2m8.363s with 1 worker thread.Having said that, some minor modifications were done:
pjmedia_endpt_create()
withpjmedia_endpt_create2()
(similarly..destroy()
with..destroy2()
) inmips_test()
andcodec_test_vectors()
, to avoid inadvertently initializingpjmedia_aud_subsys
which on Ubuntu emits lots of debugging messages during initialization (although the messages should have been suppressed in the code).printf
with log in jbuf test to make the output tidy, and renamedjbuf_main
function name tojbuf_test
to be consistent.pjsip-test
notespjsip-test
has also gone through the biggest and most difficult modifications to make the tests parallelable, which involves:pjsip_tpselector
to bind transaction (andtdata
in case of stateless request) with specific loop transport, otherwise the transaction/tdata may find other instance of loop transporttsx_uac_test
failed because UA layer has now been registered before the test)tsx_basic_test
,tsx_uac_test
,tsx_uas_test
to take the index to parameters rather than the parameter itself to make the test output more informative.Detailed test timings/outputs
Below are detailed test timings/outputs for considerations and future reference. Maybe by looking at the test time you can have some idea to further speed up the tests.
Uncaught logging messages were removed for brevity.
pjlib-test
pjlib-util-test
pjnath-test
pjmedia-test
pjsip-test