Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scala 2 forwardport: -Yprofile-trace #19897

Open
wants to merge 12 commits into
base: main
Choose a base branch
from

Conversation

WojciechMazur
Copy link
Contributor

@WojciechMazur WojciechMazur commented Mar 7, 2024

  • Scala 2 tracing profiler backport from Emit detailed compiler trace under -Yprofile-trace scala#7364 extended with more Scala 3 idiomatic syntax based on inlined methods
  • Fixes the context.profiler which could have been null, now it's initially a NoOp Profiler
  • Check dependencies of -Yprofile-enabled dependent tasks, now we get an error if -Yprofile-trace is set without -Yprofile-enabled

@bishabosha
Copy link
Member

@keynmol petition here

@keynmol
Copy link
Contributor

keynmol commented Apr 5, 2024

YES YES YES PLEASE
This would be so much better than the several formats currently enabled by 3 different flags

@WojciechMazur WojciechMazur added this to the 3.5.0 milestone Apr 5, 2024
@WojciechMazur WojciechMazur force-pushed the backport/Yprofiler-trace branch 2 times, most recently from 880e2a7 to 70523b7 Compare April 10, 2024 12:12
@WojciechMazur WojciechMazur marked this pull request as ready for review April 11, 2024 09:47
@WojciechMazur
Copy link
Contributor Author

@nicolasstucki I think you we're doing the original backport of Profiler from Scala 2. Can you take a look at this and make a review?

@nicolasstucki nicolasstucki self-requested a review April 12, 2024 10:03
@nicolasstucki
Copy link
Contributor

I generated the profile for

enum Foo:
  case A
  case B
  case C

and got the following result

Screenshot 2024-04-18 at 10 52 36

All metrics (except for GC) are missing in the type phase.

@dwijnand dwijnand changed the title Scala 2 backport: -Yprofile-trace Scala 2 forwardport: -Yprofile-trace Apr 18, 2024
@WojciechMazur
Copy link
Contributor Author

The metrics are collected after the CompilationUnit with at least 10ms intervals. Parser used overriden runOn method that didn't run onUnit callback. Becouse there was only 1 compilation unit it seemed like collection of metrics started after the typer. In fact it works the same in Scala 2, but I've fixed to collect initial results in the constructor of the profiler - now we got statistics from the very beginning of the compilation run
image

@WojciechMazur WojciechMazur force-pushed the backport/Yprofiler-trace branch 3 times, most recently from cc05d6a to 72d5e0e Compare May 8, 2024 11:04
@Gedochao Gedochao requested review from sjrd and rochala May 9, 2024 08:32
@Kordyjan Kordyjan removed this from the 3.5.0 milestone May 10, 2024
@Gedochao Gedochao requested a review from mbovel July 1, 2024 09:53
Copy link
Member

@mbovel mbovel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've done a first quite-quick pass by just reading code. My only concern is the potential cost of the new *on operations, but that might be insignificant. Otherwise looks good to me at first sight.

I now need to experiment with the option to be able to provide further feedback.

compiler/src/dotty/tools/dotc/config/ScalaSettings.scala Outdated Show resolved Hide resolved
compiler/src/dotty/tools/dotc/config/ScalaSettings.scala Outdated Show resolved Hide resolved
compiler/src/dotty/tools/dotc/config/ScalaSettings.scala Outdated Show resolved Hide resolved
compiler/src/dotty/tools/dotc/config/ScalaSettings.scala Outdated Show resolved Hide resolved
compiler/src/dotty/tools/dotc/config/ScalaSettings.scala Outdated Show resolved Hide resolved
ctx.profiler.onCompletion(sym, associatedFile)(body)
}

override def complete(root: SymDenotation)(using Context): Unit = profileCompletion(root) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked that the associatedFile callback above doesn't run when profiling is disabled ✅

But I am still not 100% convinced that is zero cost—it seems that calls to default implementations of beforeCompletion and afterCompilation will still exist after inlining? This would just be 2 method calls and a pair allocation, so maybe harmless in terms for performance?

Unfortunately, the benchmarks bot is currently down, so we can't run the benchmarks now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The allocations have been reduced by preallocating empty (default) outputs. This should reduce any overhead when -Yprofile-trace is disabled. The 2 additional calls to before/after completion might still be present, but maybe JVM can optimize these?

Do you maybe know if the benchmarks bot is working now? We could run it now, after the rebase, to ensure the overhead is not significant when profile-tracing is disabled.

def beforePhase(phase: Phase): ProfileSnap

def afterPhase(phase: Phase, profileBefore: ProfileSnap): Unit
inline def onPhase[T](phase: Phase)(inline body: T): T =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above for all on* methods; we now allocate a pair on each call. I don't now if that's significant, but maybe we could just extract (TracedEventId.Empty, Profiler.emptySnap) and so on to constants, in order to avoid the allocations?


import scala.collection.mutable

object ChromeTrace {
Copy link
Member

@mbovel mbovel Jul 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't read this class and FileUtils in details. Should we have a few unit tests for these, or is it not worth it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've ported the FileUtilsTest from Scala 2 repo and added a ChromeTraceTest to check the outputs and structure of tracing files

Copy link
Member

@mbovel mbovel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I took so long to come back to this! I finally tried compiling "hello world" with tracing enabled this morning. Here is what I did:

scalac -Yprofile-enabled -Yprofile-trace trace.json hello.scala

I was then able to visualize its easily using https://ui.perfetto.dev/:

image

Is this the expected workflow? It seems to work well, even if Perfetto warned me that it would prefer a trace in protobuf format.

compiler/src/dotty/tools/dotc/profile/ChromeTrace.scala Outdated Show resolved Hide resolved
else
val completionName= this.completionName(root, associatedFile)
val event = TracedEventId(associatedFile.name)
chromeTrace.traceDurationEventStart(Category.Completion.name, "↯", colour = "thread_state_sleeping")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I correct that this always generate 3 events with the same timing information for each completion? Is this to ease filtering by file and symbol? What was the rational for the events; do we really need them?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, 3 events would be always generated. I'm not sure what the purpose of the events was. If I had to guess I'd say that it only exists to inform about the bounds of IO operations, which might make some sense. All other operations are CPU-bound. The 2 remaining symbols might be used to ease filtering, especially because the associated file might not always be present.
We could remove the and associated file events, but initially I wanted to make the traces compliant with the outputs of Scala 2.

compiler/src/dotty/tools/dotc/config/ScalaSettings.scala Outdated Show resolved Hide resolved
chromeTrace.traceCounterEvent("jitCompilationTime", "jitCompilationTime", initialSnap.totalJITCompilationTime, processWide = true)
chromeTrace.traceCounterEvent("userTime", "userTime", initialSnap.userTimeNanos, processWide = false)
chromeTrace.traceCounterEvent("cpuTime", "cpuTime", initialSnap.cpuTimeNanos, processWide = false)
chromeTrace.traceCounterEvent("idleTime", "idleTime", initialSnap.idleTimeNanos, processWide = false)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we record these metrics only in afterUnit? This means that they are only available after the first unit processed during typer, so we don't know how much memory was used for parsing or for typing for example. Couldn't we sample them also before/after each phase, and before each unit? Or would that be too expensive?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've refactored this part out. Now it's invoked in all 4 before/after phase/unit combinations. This should not impose a significant overhead

@WojciechMazur
Copy link
Contributor Author

WojciechMazur commented Sep 13, 2024

I needed to rebase the PR which is not good news for the reviewing. The improvements after the first review round start with 592a892

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants