Benchmarking #9

Open
wants to merge 2 commits into
base: main
apuignav
Contributor

First try at serious benchmarking.

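We're still quite slow, partly because creating multiple graphs takes a long time. The whole script needs 202 seconds of user CPU time (3:37 wall clock), and the initial run alone takes about 93 seconds, compared with about 70 seconds for the benchmark run itself: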

(zfit36) [10:34]farm-gpu:~/zfit/tfphasespace/benchmark[benchmarks]$ CUDA_VISIBLE_DEVICES= python3 bench_tfphasespace.py
2019-03-12 10:35:18.778705: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2019-03-12 10:35:19.994733: E tensorflow/stream_executor/cuda/cuda_driver.cc:300] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2019-03-12 10:35:19.994797: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:163] retrieving CUDA diagnostic information for host: farm-gpu
2019-03-12 10:35:19.994807: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:170] hostname: farm-gpu
2019-03-12 10:35:19.994851: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:194] libcuda reported version is: 410.78.0
2019-03-12 10:35:19.994891: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:198] kernel reported version is: 410.78.0
2019-03-12 10:35:19.994900: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:305] kernel version seems to match DSO: 410.78.0
Initial run (may takes more time than consequent runs)
Elapsed time: 93251.58056803048 ms
starting benchmark
Total number of generated samples 100000100
Shape of one particle momentum (4, 1000001)
Elapsed time: 70495.63611880876 ms
Time per sample: 7.049556562324313e-07 ms
CUDA_VISIBLE_DEVICES= python3 bench_tfphasespace.py  202.81s user 12.83s system 99% cpu 3:37.76 total

vs

(zfit36) [10:38]farm-gpu:~/zfit/tfphasespace/benchmark[benchmarks]$ root -q bench_tgenphasespace.cxx+

Processing bench_tgenphasespace.cxx+...
(int) 0
root -l -q bench_tgenphasespace.cxx+  26.52s user 0.11s system 97% cpu 27.206 total
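
To make the comparison easier to read, it may help to report the warm-up run (which presumably includes graph creation) separately from the steady-state runs, and to keep the per-sample figure in consistent units. The sketch below is only an illustration: `benchmark` and `per_sample_ms` are hypothetical helpers and are not taken from `bench_tfphasespace.py`. Dividing the elapsed time from the log by the number of samples gives about 7.05e-04 ms per sample, which suggests that the printed `7.05e-07` is in seconds rather than milliseconds.

```python
import time


def per_sample_ms(elapsed_ms, n_samples):
    """Return the time per generated sample, in milliseconds."""
    return elapsed_ms / n_samples


def benchmark(fn, n_samples, n_runs=3):
    """Time one warm-up call of fn, then average n_runs further calls.

    Hypothetical helper: fn stands in for whatever generates n_samples
    events per call (for example, one session run of the generator).
    """
    # Warm-up call: one-off costs such as graph construction land here.
    start = time.perf_counter()
    fn()
    warmup_ms = (time.perf_counter() - start) * 1e3

    # Steady-state calls: averaged so a single slow call matters less.
    start = time.perf_counter()
    for _ in range(n_runs):
        fn()
    steady_ms = (time.perf_counter() - start) * 1e3 / n_runs

    return {
        "warmup_ms": warmup_ms,
        "steady_ms": steady_ms,
        "per_sample_ms": per_sample_ms(steady_ms, n_samples),
    }


if __name__ == "__main__":
    # Numbers taken from the TensorFlow log above.
    print(per_sample_ms(70495.63611880876, 100000100), "ms per sample")

    # Toy workload, used only to show the shape of the returned dict.
    print(benchmark(lambda: sum(range(100000)), n_samples=100000))
```

Reporting `warmup_ms` and `steady_ms` separately would show how much of the gap to `TGenPhaseSpace` comes from one-off setup and how much from the generation itself.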
