To review this PR, I would suggest trying the instructions in the README to run the visualizer locally.
The visualizer is a single-page JavaScript app with one route for each view. Routing is based on window.location.hash. I guess the single-page and single-file structure is debatable, but I do think it's the easiest to work with and the best way to avoid duplication. It's written in vanilla JS 1. to keep things simple (it doesn't require any build tool), 2. because the previous version was done that way, 3. because it's the easiest way to interact with Plotly.js.
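For context, hash-based routing in a vanilla JS single-page app typically looks something like the sketch below. This is a hypothetical illustration, not the actual code in this PR; the route names and render functions are made up.

```javascript
// Illustrative render functions, one per view (names are assumptions).
function renderAggregatedView() { return "aggregated"; }
function renderCompareView() { return "compare"; }

// Map each hash to its view; an unknown or empty hash falls back
// to the aggregated view.
const routes = {
  "": renderAggregatedView,
  "#/aggregated": renderAggregatedView,
  "#/compare": renderCompareView,
};

function route(hash) {
  const handler = routes[hash] || renderAggregatedView;
  return handler();
}

// In the browser this would be wired up as:
// window.addEventListener("hashchange", () => route(window.location.hash));
// route(window.location.hash); // render the initial view
```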
This is an MVP:
This is only a part of the new benchmarks infrastructure. There will also be independent changes in the Dotty repo: benchmarks will be run with GitHub Actions and JMH only. https://github.com/lampepfl/bench/tree will eventually be archived.
Aggregated view
This view is very similar to the current visualizer (https://dotty-bench.epfl.ch) but provides options to customize the graphs. Additionally, min and max values are displayed as error bars, whereas in the current visualizer only min values can be displayed, as a separate trace.
Error bars make it easier to see whether there is a significant change in performance at some point, or whether benchmarks are misconfigured. For example, the screenshot above shows quite clearly that the Dotty benchmark is currently misconfigured: error bars should be overlapping most of the time, but they aren't. This means either that the CPU frequency is not stable enough, or that we don't run enough iterations. This will be fixed in separate PRs.
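In Plotly.js, asymmetric error bars like these are expressed with an `error_y` object holding the distances from the mean up to the max (`array`) and down to the min (`arrayminus`). A minimal sketch, with made-up data and commit hashes:

```javascript
// Illustrative per-commit statistics (not real benchmark numbers).
const means = [10.2, 10.1, 9.5];
const mins  = [10.0,  9.9, 9.3];
const maxs  = [10.5, 10.4, 9.8];

const trace = {
  x: ["abc123", "def456", "789abc"], // commit hashes (hypothetical)
  y: means,
  type: "scatter",
  error_y: {
    type: "data",
    symmetric: false,
    array: maxs.map((m, i) => m - means[i]),      // bar extends up to max
    arrayminus: means.map((m, i) => m - mins[i]), // bar extends down to min
  },
};

// In the browser: Plotly.newPlot("chart", [trace]);
```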
A well configured benchmark should be closer to this for example:
Compare view
This view aims to provide better insights when comparing two or more commits, for example to check whether a PR has a significant impact.
It shows the distribution of runtimes as box plots for all benchmarks in a single graph. There is a group of boxes for each benchmark on the x-axis, with a box for each commit, differentiated by color. Runtimes are normalized to the runtime of the first commit to ease comparison across benchmarks.
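The normalization step could be sketched as below; this is an assumption about the approach (using the first commit's median as the baseline), with hypothetical function names, not the PR's actual code.

```javascript
// Median of an array of numbers (used as the normalization baseline).
function median(xs) {
  const s = [...xs].sort((a, b) => a - b);
  const mid = Math.floor(s.length / 2);
  return s.length % 2 ? s[mid] : (s[mid - 1] + s[mid]) / 2;
}

// runsByCommit: one array of raw runtimes per commit; the first commit
// is the baseline, so its normalized median is 1.
function normalize(runsByCommit) {
  const baseline = median(runsByCommit[0]);
  return runsByCommit.map(runs => runs.map(t => t / baseline));
}

// Each normalized array can then become one Plotly box trace, e.g.
// { y: normalized[i], type: "box", name: commits[i] }, grouped per
// benchmark on the x-axis with layout { boxmode: "group" }.
```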
In the screenshot below, we can easily see, for example, that the second commit (the orange one) improves the runtime of the "re2s" benchmark by ~5%, and probably does not have a significant impact on other benchmarks.
The goal is to point people to this view after benchmarks are run for a specific PR.