Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Client Request Cached Methods List from JIT Server #20129

Merged
merged 1 commit into from
Oct 25, 2024

Conversation

luke-li-2003
Copy link
Contributor

@luke-li-2003 luke-li-2003 commented Sep 6, 2024

Added an option that makes the client request a list of cached methods from the server on bootstrap, and set the count for those methods as scount to improve rampup

Depends on eclipse-omr/omr#7458

@luke-li-2003
Copy link
Contributor Author

@mpirvu

@mpirvu mpirvu added the comp:jitserver Artifacts related to JIT-as-a-Service project label Sep 6, 2024
Copy link

@marius-pirvu marius-pirvu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I provided a partial review of the code.

runtime/compiler/net/StreamExceptions.hpp Outdated Show resolved Hide resolved
runtime/compiler/net/StreamExceptions.hpp Outdated Show resolved Hide resolved
runtime/compiler/net/StreamExceptions.hpp Show resolved Hide resolved
runtime/oti/j9nonbuilder.h Outdated Show resolved Hide resolved
runtime/compiler/control/rossa.cpp Outdated Show resolved Hide resolved
runtime/compiler/control/JITServerCompilationThread.cpp Outdated Show resolved Hide resolved
runtime/compiler/control/JITServerCompilationThread.cpp Outdated Show resolved Hide resolved
runtime/compiler/control/JITServerCompilationThread.cpp Outdated Show resolved Hide resolved
runtime/compiler/control/JITServerCompilationThread.cpp Outdated Show resolved Hide resolved
@luke-li-2003 luke-li-2003 force-pushed the GetServerAOTCacheList branch 4 times, most recently from 1796628 to 0952284 Compare September 17, 2024 14:27
@luke-li-2003
Copy link
Contributor Author

Forgot to force-push, should be fine now

@luke-li-2003 luke-li-2003 force-pushed the GetServerAOTCacheList branch 3 times, most recently from dcc00d3 to cf589c4 Compare September 17, 2024 15:31
runtime/compiler/control/HookedByTheJit.cpp Outdated Show resolved Hide resolved
runtime/compiler/control/HookedByTheJit.cpp Outdated Show resolved Hide resolved
runtime/compiler/control/HookedByTheJit.cpp Outdated Show resolved Hide resolved
runtime/compiler/control/JITServerCompilationThread.cpp Outdated Show resolved Hide resolved
Copy link

@marius-pirvu marius-pirvu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the IDs of the various network messages has changed, we need to increment the minor version of the server.
Look how we do that in PR #19477 in file runtime/compiler/net/CommunicationStream.hpp

runtime/compiler/control/HookedByTheJit.cpp Outdated Show resolved Hide resolved
runtime/compiler/control/HookedByTheJit.cpp Outdated Show resolved Hide resolved
runtime/compiler/control/JITServerCompilationThread.cpp Outdated Show resolved Hide resolved
runtime/compiler/control/JITServerCompilationThread.cpp Outdated Show resolved Hide resolved
@dsouzai dsouzai added the depends:omr Pull request is dependent on a corresponding change in OMR label Sep 19, 2024
@luke-li-2003 luke-li-2003 force-pushed the GetServerAOTCacheList branch 6 times, most recently from 2fa5254 to 4f059f1 Compare September 23, 2024 16:20
Copy link
Contributor

@mpirvu mpirvu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
Could you please obtain some rampup curves to see whether we do better by using this feature. Use AcmeAir, preferably in containers, if not then outside containers without a SCC.

@luke-li-2003
Copy link
Contributor Author

luke-li-2003 commented Oct 17, 2024

Here are the rampup curves using AcmeAirEE8 outside containers (without SCC) for this branch and the master branch.

image

@luke-li-2003
Copy link
Contributor Author

Also the result on my local machine with jdk17 (the previous curves are of jdk11)

image

@mpirvu
Copy link
Contributor

mpirvu commented Oct 19, 2024

jenkins compile all jdk17

@mpirvu
Copy link
Contributor

mpirvu commented Oct 21, 2024

jenkins test sanity xlinux zlinux jdk17

@mpirvu
Copy link
Contributor

mpirvu commented Oct 21, 2024

jenkins test sanity xlinuxjit zlinuxjit jdk17

@mpirvu
Copy link
Contributor

mpirvu commented Oct 21, 2024

jenkins test sanity xlinuxjit,zlinuxjit jdk17

@mpirvu
Copy link
Contributor

mpirvu commented Oct 21, 2024

Tests have passed, but there are conflicts which need to be resolved

Added an option that makes the client request a list of cached
methods from the server on bootstrap, and set the count for
those methods as scount to improve rampup

Signed-off-by: Luke Li <[email protected]>
@mpirvu
Copy link
Contributor

mpirvu commented Oct 22, 2024

jenkins test sanity zlinuxjit jdk21

@luke-li-2003
Copy link
Contributor Author

luke-li-2003 commented Oct 23, 2024

Running without the option Xjit:requestJITServerCachedMethods, the performance difference is negligible compared to the master branch build.

Results for jdk: /team/lukeli/jdk11Master and opts: -Xmx256m -Xms256m -Xjit:verbose={JITServer},verbose={counts},     verbose={compilePerformance},vlog=/team/lukeli/logs/vlog.txt -Xshareclasses:none
First run is a cold run and is not included in the stats
Run 0    3581.3  4683.1 Avg= 4683.1  RSS=    231 MB  CompCPU= 34.1 sec  Startup= 6042 ms
Run 1    7908.7  4771.1 Avg= 4771.1  RSS=    273 MB  CompCPU= 36.3 sec  Startup= 5965 ms
Run 2    3584.2  4713.0 Avg= 4713.0  RSS=    228 MB  CompCPU= 32.6 sec  Startup= 5898 ms
Run 3    6862.2  4757.3 Avg= 4757.3  RSS=    259 MB  CompCPU= 36.9 sec  Startup= 5994 ms
Run 4    3586.6  4716.2 Avg= 4716.2  RSS=    229 MB  CompCPU= 32.9 sec  Startup= 6215 ms
Run 5    6302.6  4734.1 Avg= 4734.1  RSS=    255 MB  CompCPU= 36.4 sec  Startup= 5936 ms
Run 6    6648.8  4758.9 Avg= 4758.9  RSS=    258 MB  CompCPU= 35.3 sec  Startup= 5807 ms
Run 7    5921.6  4729.7 Avg= 4729.7  RSS=    258 MB  CompCPU= 35.2 sec  Startup= 6244 ms 
Run 8    6131.3  4575.7 Avg= 4575.7  RSS=    262 MB  CompCPU= 36.1 sec  Startup= 5915 ms
Run 9    6278.2  4828.4 Avg= 4828.4  RSS=    264 MB  CompCPU= 36.3 sec  Startup= 5920 ms
Avg:     5913.8  4731.6 Thr= 4731.6  RSS=    254 MB  CompCPU= 35.3 sec  Startup= 5988 ms
Throughput stats: Avg= 4731.6  StdDev=   68.2  Min= 4575.7  Max= 4828.4  Max/Min=   6% CI95=    1.1% numSamples=  9  
Footprint stats:  Avg=  254.0  StdDev=   15.4  Min=  228.1  Max=  273.0  Max/Min=  20% CI95=    4.7% numSamples=  9
Comp CPU stats:   Avg=   35.3  StdDev=    1.6  Min=   32.6  Max=   36.9  Max/Min=  13% CI95=    3.4% numSamples=  9   
StartupTime stats:Avg= 5988.2  StdDev=  146.3  Min= 5807.0  Max= 6244.0  Max/Min=   8% CI95=    1.9% numSamples=  9
Results for jdk: /team/lukeli/jdk11GetServer and opts: -Xmx256m -Xms256m -Xjit:verbose={JITServer},verbose={counts},  verbose={compilePerformance},vlog=/team/lukeli/logs/vlog.txt -Xshareclasses:none
First run is a cold run and is not included in the stats
Run 0    6589.8  4673.6 Avg= 4673.6  RSS=    261 MB  CompCPU= 35.8 sec  Startup= 5926 ms
Run 1    3513.3  4698.9 Avg= 4698.9  RSS=    227 MB  CompCPU= 34.4 sec  Startup= 6118 ms
Run 2    7301.8  4756.4 Avg= 4756.4  RSS=    272 MB  CompCPU= 40.6 sec  Startup= 6834 ms
Run 3    6317.3  4791.8 Avg= 4791.8  RSS=    433 MB  CompCPU= 35.9 sec  Startup= 6052 ms
Run 4    3587.1  4726.7 Avg= 4726.7  RSS=    228 MB  CompCPU= 32.0 sec  Startup= 5905 ms
Run 5    6276.0  4824.8 Avg= 4824.8  RSS=    262 MB  CompCPU= 36.3 sec  Startup= 5953 ms
Run 6    7234.2  4737.9 Avg= 4737.9  RSS=    269 MB  CompCPU= 34.7 sec  Startup= 5975 ms
Run 7    3575.9  4647.6 Avg= 4647.6  RSS=    226 MB  CompCPU= 32.9 sec  Startup= 5845 ms
Run 8    3549.0  4556.8 Avg= 4556.8  RSS=    227 MB  CompCPU= 33.7 sec  Startup= 5982 ms
Run 9    3500.8  4549.4 Avg= 4549.4  RSS=    229 MB  CompCPU= 34.6 sec  Startup= 6276 ms
Avg:     4983.9  4698.9 Thr= 4698.9  RSS=    264 MB  CompCPU= 35.0 sec  Startup= 6104 ms
Throughput stats: Avg= 4698.9  StdDev=   97.0  Min= 4549.4  Max= 4824.8  Max/Min=   6% CI95=    1.6% numSamples=  9
Footprint stats:  Avg=  263.7  StdDev=   66.4  Min=  225.9  Max=  432.9  Max/Min=  92% CI95=   19.4% numSamples=  9
Comp CPU stats:   Avg=   35.0  StdDev=    2.5  Min=   32.0  Max=   40.6  Max/Min=  27% CI95=    5.5% numSamples=  9
StartupTime stats:Avg= 6104.4  StdDev=  301.3  Min= 5845.0  Max= 6834.0  Max/Min=  17% CI95=    3.8% numSamples=  9

@mpirvu
Copy link
Contributor

mpirvu commented Oct 23, 2024

Running without the option Xjit:requestJITServerCachedMethods, the performance difference is negligible compared to the master branch build.

Are these runs done in client mode? I don't see -XX:+UseJITServer in the list of options.

Also, the results for jdk11Master are strange in that the the throughput is much larger for the first 5 minute load period than for the second one.

@luke-li-2003
Copy link
Contributor Author

Are these runs done in client mode?

No, that data set was run without the JITServer

@mpirvu
Copy link
Contributor

mpirvu commented Oct 23, 2024

I am willing to merge this code even though peak throughput is affected. Because of this, the new functionality will not be enabled by default. However, I wanted to make sure that the resulting JITServer configuration is not doing worse in the default case than the current master branch. Hence, the comparison should use JITServer and AOT cache, just don't provide the option that enables the new code.

@mpirvu
Copy link
Contributor

mpirvu commented Oct 23, 2024

jenkins test sanity xlinuxjit jdk21

@luke-li-2003
Copy link
Contributor Author

Running as the clients, the performance differences are negligible.

Results for jdk: /team/lukeli/jdk11GetServer and opts: -Xmx256m -Xms256m -Xjit:verbose={JITServer},verbose={counts},verbose={compilePerformance},vlog=/team/lukeli/logs/vlog.txt -XX:+UseJITServer -XX:+JITServerUseAOTCache -Xshareclasses:none
Run 0	 3944.3	 4629.5	Avg= 4629.5  RSS=    239 MB  CompCPU=  7.9 sec  Startup= 5242 ms
Run 1	 3817.4	 4441.8	Avg= 4441.8  RSS=    237 MB  CompCPU=  5.6 sec  Startup= 4419 ms
Run 2	 3997.8	 4571.5	Avg= 4571.5  RSS=    236 MB  CompCPU=  5.7 sec  Startup= 4442 ms
Run 3	 3980.1	 4650.3	Avg= 4650.3  RSS=    231 MB  CompCPU=  5.6 sec  Startup= 4435 ms
Run 4	 3998.5	 4304.3	Avg= 4304.3  RSS=    236 MB  CompCPU=  5.4 sec  Startup= 4392 ms
Run 5	 4143.3	 4634.1	Avg= 4634.1  RSS=    232 MB  CompCPU=  5.4 sec  Startup= 4454 ms
Run 6	 3975.4	 4429.6	Avg= 4429.6  RSS=    234 MB  CompCPU=  5.5 sec  Startup= 4411 ms
Run 7	 3967.6	 4607.4	Avg= 4607.4  RSS=    234 MB  CompCPU=  5.5 sec  Startup= 4446 ms
Run 8	 3956.7	 4270.0	Avg= 4270.0  RSS=    230 MB  CompCPU=  5.4 sec  Startup= 4395 ms
Run 9	 3978.2	 4462.5	Avg= 4462.5  RSS=    233 MB  CompCPU=  5.4 sec  Startup= 4452 ms
Avg:	 3975.9	 4500.1	Thr= 4500.1  RSS=    234 MB  CompCPU=  5.8 sec  Startup= 4509 ms
Throughput stats: Avg= 4500.1  StdDev=  139.4  Min= 4270.0  Max= 4650.3  Max/Min=   9% CI95=    2.2% numSamples= 10
Footprint stats:  Avg=  234.2  StdDev=    2.9  Min=  229.9  Max=  239.5  Max/Min=   4% CI95=    0.9% numSamples= 10
Comp CPU stats:   Avg=    5.8  StdDev=    0.8  Min=    5.4  Max=    7.9  Max/Min=  46% CI95=    9.5% numSamples= 10
StartupTime stats:Avg= 4508.8  StdDev=  258.6  Min= 4392.0  Max= 5242.0  Max/Min=  19% CI95=    4.1% numSamples= 10



Results for jdk: /team/lukeli/jdk11Master and opts: -Xmx256m -Xms256m -Xjit:verbose={JITServer},verbose={counts},verbose={compilePerformance},vlog=/team/lukeli/logs/vlog.txt -XX:+UseJITServer -XX:+JITServerUseAOTCache -Xshareclasses:none
Run 0	 3992.8	 4599.0	Avg= 4599.0  RSS=    236 MB  CompCPU=  8.0 sec  Startup= 5134 ms
Run 1	 3967.3	 4596.7	Avg= 4596.7  RSS=    233 MB  CompCPU=  5.8 sec  Startup= 4492 ms
Run 2	 4006.5	 4610.6	Avg= 4610.6  RSS=    236 MB  CompCPU=  5.7 sec  Startup= 4415 ms
Run 3	 4037.7	 4619.2	Avg= 4619.2  RSS=    237 MB  CompCPU=  5.5 sec  Startup= 4529 ms
Run 4	 3965.2	 4234.9	Avg= 4234.9  RSS=    232 MB  CompCPU=  5.5 sec  Startup= 4457 ms
Run 5	 3983.6	 4584.8	Avg= 4584.8  RSS=    234 MB  CompCPU=  5.5 sec  Startup= 4482 ms
Run 6	 4140.9	 4461.0	Avg= 4461.0  RSS=    232 MB  CompCPU=  5.7 sec  Startup= 4435 ms
Run 7	 3995.1	 4480.8	Avg= 4480.8  RSS=    233 MB  CompCPU=  5.5 sec  Startup= 4391 ms
Run 8	 3950.2	 4475.2	Avg= 4475.2  RSS=    230 MB  CompCPU=  5.5 sec  Startup= 4335 ms
Run 9	 3952.7	 4343.6	Avg= 4343.6  RSS=    232 MB  CompCPU=  5.4 sec  Startup= 4373 ms
Avg:	 3999.2	 4500.6	Thr= 4500.6  RSS=    233 MB  CompCPU=  5.8 sec  Startup= 4504 ms
Throughput stats: Avg= 4500.6  StdDev=  129.1  Min= 4234.9  Max= 4619.2  Max/Min=   9% CI95=    2.1% numSamples= 10
Footprint stats:  Avg=  233.5  StdDev=    2.1  Min=  230.4  Max=  237.2  Max/Min=   3% CI95=    0.6% numSamples= 10
Comp CPU stats:   Avg=    5.8  StdDev=    0.8  Min=    5.4  Max=    8.0  Max/Min=  48% CI95=    9.5% numSamples= 10
StartupTime stats:Avg= 4504.3  StdDev=  228.9  Min= 4335.0  Max= 5134.0  Max/Min=  18% CI95=    3.6% numSamples= 10

@mpirvu
Copy link
Contributor

mpirvu commented Oct 25, 2024

My 50x grinder on the test jdk_lang_j9 that failed on zlinuxjit passed apparently, though I see one timeout in the console log.
However, the vast majority of the new code should not be exercised without the appropriate cmd line option. Thus, I think this PR is safe to be merged.

@mpirvu mpirvu merged commit 35db7ee into eclipse-openj9:master Oct 25, 2024
8 of 9 checks passed
@luke-li-2003 luke-li-2003 deleted the GetServerAOTCacheList branch October 25, 2024 16:29
#if defined(J9VM_OPT_JITSERVER)
void *serverAOTMethodSet;
UDATA serverAOTQueryThread;
#endif // J9VM_OPT_JITSERVER
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is included in C source files where // does not legally begin a comment.
This line should instead be:

#endif /* defined(J9VM_OPT_JITSERVER) */

@mpirvu
Copy link
Contributor

mpirvu commented Oct 28, 2024

@luke-li-2003 Please see the comment from @keithc-ca and address it in a new PR.

@keithc-ca
Copy link
Contributor

keithc-ca commented Oct 28, 2024

I already have a change that fixes that and cleans up other similar misuses: #20423.

mpirvu added a commit to mpirvu/openj9 that referenced this pull request Oct 28, 2024
PR eclipse-openj9#20129 added two includes (for <string> and <includes>) that
are only needed for JITServer specific code. Thus the two includes
should be protected by #if defined(J9VM_OPT_JITSERVER)

Signed-off-by: Marius Pirvu <[email protected]>
mpirvu added a commit to mpirvu/openj9 that referenced this pull request Oct 28, 2024
PR eclipse-openj9#20129 added two includes (for <string> and <vector>) that
are only needed for JITServer specific code. Thus the two includes
should be protected by #if defined(J9VM_OPT_JITSERVER)

Signed-off-by: Marius Pirvu <[email protected]>
zl-wang pushed a commit to zl-wang/openj9 that referenced this pull request Nov 11, 2024
PR eclipse-openj9#20129 added two includes (for <string> and <vector>) that
are only needed for JITServer specific code. Thus the two includes
should be protected by #if defined(J9VM_OPT_JITSERVER)

Signed-off-by: Marius Pirvu <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
comp:jitserver Artifacts related to JIT-as-a-Service project depends:omr Pull request is dependent on a corresponding change in OMR
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

5 participants