Allow grouping and parallelization of test targets within a single module #3478

Draft
wants to merge 23 commits into base: main

@lihaoyi (Member) commented Sep 7, 2024

This PR allows Mill to parallelize test suites on a per-test-class basis. It is roughly a port of SBT's testGrouping feature (https://www.scala-sbt.org/1.x/docs/Testing.html#Forking+tests) and serves the same purpose. It is most useful for codebases with large modules that each contain many test classes, and the flexibility of def testForkGrouping gives the user room to tune exactly how the tests are grouped for maximal performance:

  • If groups are too small, JVM startup overhead dominates
  • If groups are too large, you don't get enough parallelism
  • Some tests may not be amenable to running in the same forked JVM as others, due to conflicting reads/writes to the filesystem or manipulation of JVM-global mutable state
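
The trade-offs above can be sketched with plain Scala collections. This is only an illustration: the class names and the helper names fixedSize and isolating are hypothetical, and the Seq[String] stands in for whatever discoveredTestClasses() returns in a real build.

```scala
object GroupingSketch {
  // Hypothetical test class names, standing in for discoveredTestClasses().
  val discovered: Seq[String] =
    Seq("foo.ATests", "foo.BTests", "foo.CTests", "foo.DbTests", "foo.ETests")

  // Fixed-size groups: a larger n means fewer JVM startups but less parallelism.
  def fixedSize(classes: Seq[String], n: Int): Seq[Seq[String]] =
    classes.grouped(n).toSeq

  // Give classes that cannot share a JVM their own group, and batch the rest.
  def isolating(classes: Seq[String], n: Int)(
      needsOwnJvm: String => Boolean
  ): Seq[Seq[String]] = {
    val (isolated, shared) = classes.partition(needsOwnJvm)
    isolated.map(Seq(_)) ++ shared.grouped(n).toSeq
  }
}
```

For instance, GroupingSketch.isolating(GroupingSketch.discovered, 3)(_.endsWith("DbTests")) puts foo.DbTests in a group of its own and batches the remaining classes three at a time.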

For example, running time ./mill -i scalalib.test with and without test grouping on my 10-core MacBook Pro (after breaking up HelloWorldTests.scala for better granularity), we see about a 3x speedup in wall-clock time.

Without Test Grouping, all test classes in 1 JVM (default)

581.83s user 48.25s system 181% cpu 5:47.11 total

With Test Grouping, 3 test classes per JVM (def testForkGrouping = discoveredTestClasses().grouped(3).toSeq)

656.06s user 40.93s system 577% cpu 2:00.68 total

With Test Grouping, 1 test class per JVM (def testForkGrouping = discoveredTestClasses().grouped(1).toSeq)

707.30s user 45.21s system 509% cpu 2:27.72 total

The limited speedup is likely because Mill's own tests are heavy and already use multiple cores when run sequentially (181% CPU in the baseline run); I would expect a greater speedup for most projects, whose tests are more lightweight. We can also see that 1-test-class-per-JVM is somewhat slower than 3-test-classes-per-JVM in this case, likely because JVM overhead becomes significant.
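
As a quick check of the numbers above, the following sketch recomputes the speedups from the reported timings. The object and helper names are made up for illustration; only the timing values come from the runs shown.

```scala
object TimingSketch {
  // Parse the "m:ss.ff" wall-clock format printed by `time`.
  def seconds(wall: String): Double = {
    val parts = wall.split(":")
    parts(0).toInt * 60 + parts(1).toDouble
  }

  val baseline: Double  = seconds("5:47.11") // all test classes in 1 JVM
  val groupsOf3: Double = seconds("2:00.68") // 3 test classes per JVM
  val groupsOf1: Double = seconds("2:27.72") // 1 test class per JVM

  // Wall-clock speedups relative to the single-JVM baseline.
  val speedup3: Double = baseline / groupsOf3 // about 2.9x
  val speedup1: Double = baseline / groupsOf1 // about 2.35x

  // Total user CPU time grows with the number of JVMs spawned.
  val cpuOverhead3: Double = 656.06 / 581.83 // about 13% more CPU work
  val cpuOverhead1: Double = 707.30 / 581.83 // about 22% more CPU work
}
```

The wall-clock time drops even though total CPU work rises, which is consistent with JVM startup overhead being paid once per group.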

This feature is opt-in via def testForkGrouping, e.g. def testForkGrouping = discoveredTestClasses().map(Seq(_)). The default behavior of running all tests in a single JVM is unchanged.
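
A minimal sketch of what opting in might look like in a build file. The module name foo, the Scala and uTest versions, and the T { ... } wrapper are assumptions made for illustration; the PR text only shows the shorthand def testForkGrouping = ..., and the exact signature may differ in the merged version.

```scala
// build.sc (sketch; module names and versions are hypothetical)
import mill._, scalalib._

object foo extends ScalaModule {
  def scalaVersion = "2.13.14"

  object test extends ScalaTests with TestModule.Utest {
    def ivyDeps = Agg(ivy"com.lihaoyi::utest:0.8.4")

    // Opt in: one forked JVM per group of three discovered test classes.
    // Assumption: the override is written as a task so that
    // discoveredTestClasses() can be evaluated inside it.
    def testForkGrouping = T { discoveredTestClasses().grouped(3).toSeq }
  }
}
```

Leaving testForkGrouping unoverridden keeps the existing single-JVM behavior.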

Implementation Notes

  • We re-use the same ExecutionContext that Mill uses internally for scheduling its targets, allowing the scheduling to be cooperative.

    • For example, this means that test classes running in parallel and other tasks use the same pool of threads, keeping constant the total number of threads on a global basis
  • We convert the default FixedThreadPool into a ForkJoinPool and provide a blocking{...} operation, which allows the ForkJoinPool to spawn an additional thread when an existing thread is blocked waiting.

    • This is necessary when we wait for the Futures spawned for each test class, as the task-level thread is idle and we want to continue making use of the available CPUs.
    • This is basically the same implementation that scala.concurrent.ExecutionContext.global uses, but global's implementation is private and not reusable (link), so I have to duplicate the small amount of code that wires it up
  • Each test class runs in a subprocess in a separate JVM with a separate sandbox folder, and their outputs are then all read and consolidated back into the combined output for the original test task.

    • There is some overhead to spawning JVMs, but from my experience doing the same in Bazel that overhead is manageable and the benefits of class-level parallelism win out
  • I added a flag testParallelizeClasses that can be set to false to fall back to the existing behavior
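
The ForkJoinPool and blocking{...} wiring described above can be sketched roughly as follows. This is not the code from this PR: the names BlockingPoolSketch and BlockAwareThread are hypothetical, and the sketch only shows the general technique of making worker threads implement scala.concurrent.BlockContext so that blocking{...} delegates to ForkJoinPool.managedBlock.

```scala
import java.util.concurrent.{CountDownLatch, ForkJoinPool, ForkJoinWorkerThread, TimeUnit}
import java.util.concurrent.ForkJoinPool.{ForkJoinWorkerThreadFactory, ManagedBlocker}
import scala.concurrent.{Await, BlockContext, CanAwait, ExecutionContext, Future, blocking}
import scala.concurrent.duration._

object BlockingPoolSketch {
  // Worker thread that tells its pool when it is about to block, so the pool
  // may start a spare thread to keep the available CPUs busy.
  final class BlockAwareThread(pool: ForkJoinPool)
      extends ForkJoinWorkerThread(pool) with BlockContext {
    override def blockOn[T](thunk: => T)(implicit permission: CanAwait): T = {
      var result: Option[T] = None
      ForkJoinPool.managedBlock(new ManagedBlocker {
        def block(): Boolean = { result = Some(thunk); true }
        def isReleasable: Boolean = result.isDefined
      })
      result.get
    }
  }

  val factory: ForkJoinWorkerThreadFactory = new ForkJoinWorkerThreadFactory {
    def newThread(pool: ForkJoinPool): ForkJoinWorkerThread = new BlockAwareThread(pool)
  }

  // With parallelism = 1, the outer task can only finish if the pool adds a
  // spare thread to run the inner Future while the outer one is blocked.
  def demo(): Int = {
    val pool = new ForkJoinPool(1, factory, null, false)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)
    try {
      val outer = Future {
        val inner = Future { 41 }
        val done = new CountDownLatch(1)
        inner.onComplete(_ => done.countDown())
        blocking { done.await(10, TimeUnit.SECONDS) } // goes through blockOn above
        inner.value.get.get + 1
      }
      Await.result(outer, 30.seconds)
    } finally pool.shutdown()
  }
}
```

A plain ForkJoinPool without the custom thread factory would treat blocking{...} as a no-op, which is why the thread class has to implement BlockContext itself.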

@lihaoyi changed the title from "Parallelize test targets on a per-test-class basis" to "Allow grouping and parallelization of test targets within a single module" on Sep 17, 2024