Template/expval #489
…ata` to work with devices. M pennylane_lightning/core/src/simulators/lightning_kokkos/StateVectorKokkos.hpp; `applyMatrix` bugfix: use intermediate hostview to copy matrix data; same bugfix for `getDataVector`. M pennylane_lightning/core/src/simulators/lightning_kokkos/algorithms/AdjointJacobianKokkos.hpp; use copy constructor. M pennylane_lightning/core/src/simulators/lightning_kokkos/measurements/MeasurementsKokkos.hpp; use copy constructor. M pennylane_lightning/core/src/simulators/lightning_kokkos/observables/ObservablesKokkos.hpp; use copy constructor. M requirements-dev.txt; add clang-format-14.
… vector data in adjoint-diff.
…calls into two templated methods. Call specialized expval methods when possible. Remove obsolete 'Apply directly' tests.
…alueMultiQubitOpFunctor.
Codecov Report
Patch coverage:
Additional details and impacted files
@@ Coverage Diff @@
## master #489 +/- ##
==========================================
+ Coverage 93.04% 99.09% +6.04%
==========================================
Files 142 142
Lines 16278 16693 +415
==========================================
+ Hits 15146 16542 +1396
+ Misses 1132 151 -981
I left only a few comments for now.
I see that we still have some work to do in terms of coverage.
pennylane_lightning/core/src/simulators/lightning_kokkos/gates/README.md
pennylane_lightning/core/src/simulators/lightning_kokkos/measurements/ExpValFunctors.hpp
I would like to merge #485 first to assess the coverage situation.
Absolutely, I think it is only sensible to do so.
Nothing more to add. Thank you for that!
Nothing more to add --- thanks a bunch @vincentmr
I'm happy with the macro-approach for now, but we can revisit later to see if it can become some compile-time generated parameter-packed solution.
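As a rough sketch of what such a compile-time alternative could look like (hypothetical, not taken from this PR): assuming the macros stamp out one kernel per qubit count, the same set of kernels could be generated from a single template whose qubit count `N` is a template parameter, with a C++17 fold expression over an index pack selecting the right instantiation at run time. The names `blockExpVal`, `dispatchImpl` and `dispatchExpVal` are made up for illustration, and the kernel body is plain C++ rather than a Kokkos functor.

```cpp
#include <complex>
#include <cstddef>
#include <stdexcept>
#include <utility>

using cplx = std::complex<double>;

// Hypothetical kernel body: <v|A|v> for one block of 2^N amplitudes.
// N is a template parameter, so `dim` and both loop bounds are compile-time
// constants that the compiler is free to unroll.
template <std::size_t N>
double blockExpVal(const cplx *A, const cplx *v) {
    constexpr std::size_t dim = std::size_t{1} << N;
    double acc = 0.0;
    for (std::size_t i = 0; i < dim; ++i) {
        cplx row{0.0, 0.0};
        for (std::size_t j = 0; j < dim; ++j) {
            row += A[i * dim + j] * v[j]; // row-major matrix (assumption)
        }
        acc += std::real(std::conj(v[i]) * row);
    }
    return acc;
}

// Is... runs over 0..MaxN-1, so Is + 1 is the candidate qubit count.
// The fold over || stops at the first matching instantiation.
template <std::size_t... Is>
double dispatchImpl(std::size_t n, const cplx *A, const cplx *v,
                    std::index_sequence<Is...>) {
    double result = 0.0;
    const bool matched =
        ((n == Is + 1 && ((result = blockExpVal<Is + 1>(A, v)), true)) || ...);
    if (!matched) {
        throw std::invalid_argument("unsupported number of target qubits");
    }
    return result;
}

// Instantiates kernels for 1..MaxN qubits and picks one at run time.
template <std::size_t MaxN = 5>
double dispatchExpVal(std::size_t n, const cplx *A, const cplx *v) {
    return dispatchImpl(n, A, v, std::make_index_sequence<MaxN>{});
}
```

Calling `dispatchExpVal(3, A, v)` would then run the 3-qubit instantiation; whether this compiles to code as good as the macro-generated kernels on device backends would have to be measured.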
Before submitting
Please complete the following checklist when submitting a PR:
- All new features must include a unit test. If you've fixed a bug or added code that should be tested, add a test to the `tests` directory!
- All new functions and code must be clearly commented and documented. If you do make documentation changes, make sure that the docs build and render correctly by running `make docs`.
- Ensure that the test suite passes, by running `make test`.
- Add a new entry to the `.github/CHANGELOG.md` file, summarizing the change, and including a link back to the PR.
- Ensure that code is properly formatted by running `make format`.

When all the above are checked, delete everything above the dashed line and fill in the pull request template.
Context:
This PR is a follow-up on #481. In the last PR, it appeared that reducing `expval` on the fly is generally faster than using inner products. Another factor is the computation of the observable-statevector product and the parallelization scheme used to do it. The general scheme uses three layers of parallelism with team policies. This introduces several parameters which should be tuned for optimal performance, but are currently left to Kokkos' heuristics to decide. On the other hand, the straightforward range policy-based scheme of the 1- and 2-qubit kernels outperforms the general scheme significantly.

Since this discrepancy does not appear explainable by the flop intensity increase between 2- and 3+-qubit kernels, I introduce specialized 3- to 5-qubit kernels. I draw the following conclusions:

The following figures show timings to get the expectation value of a `Hermitian` observable for OPENMP, CUDA and HIP respectively.

Description of the Change:
Introduce specialized 3- to 5-qubit kernels. Refactor the `getExpValMatrix` wrapper in `MeasurementsKokkos.hpp`. Add a few tests.

Benefits:
Faster `expval` on all platforms, especially for 3+-qubit observables.

Possible Drawbacks:
None
Related GitHub Issues:
#481
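
Illustration (not part of the PR's code): the Context section contrasts reducing `expval` on the fly with the inner-product approach. The following is a minimal serial sketch of the two for a 1-qubit observable, written in plain C++ rather than with Kokkos. The function names, the row-major 2x2 matrix layout and the convention that qubit 0 is the most significant bit are assumptions made for this example. In a Kokkos version, the flat loop over `k` would presumably correspond to a range-policy `parallel_reduce`.

```cpp
#include <array>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Index of the k-th amplitude pair: insert a 0 bit at the target position.
inline std::size_t pairBase(std::size_t k, std::size_t stride) {
    return ((k & ~(stride - 1)) << 1) | (k & (stride - 1));
}

// Inner-product approach: materialize |phi> = A|psi>, then take <psi|phi>.
double expvalInnerProduct(const std::vector<cplx> &psi,
                          const std::array<cplx, 4> &A, std::size_t target,
                          std::size_t numQubits) {
    std::vector<cplx> phi(psi.size()); // extra state-sized buffer
    const std::size_t stride = std::size_t{1} << (numQubits - 1 - target);
    for (std::size_t k = 0; k < psi.size() / 2; ++k) {
        const std::size_t i0 = pairBase(k, stride);
        const std::size_t i1 = i0 | stride;
        phi[i0] = A[0] * psi[i0] + A[1] * psi[i1];
        phi[i1] = A[2] * psi[i0] + A[3] * psi[i1];
    }
    cplx acc{0.0, 0.0};
    for (std::size_t i = 0; i < psi.size(); ++i) {
        acc += std::conj(psi[i]) * phi[i];
    }
    return acc.real();
}

// On-the-fly reduction: accumulate each pair's contribution directly,
// so no temporary state vector and no second pass are needed.
double expvalOnTheFly(const std::vector<cplx> &psi,
                      const std::array<cplx, 4> &A, std::size_t target,
                      std::size_t numQubits) {
    const std::size_t stride = std::size_t{1} << (numQubits - 1 - target);
    double acc = 0.0;
    for (std::size_t k = 0; k < psi.size() / 2; ++k) {
        const std::size_t i0 = pairBase(k, stride);
        const std::size_t i1 = i0 | stride;
        acc += std::real(
            std::conj(psi[i0]) * (A[0] * psi[i0] + A[1] * psi[i1]) +
            std::conj(psi[i1]) * (A[2] * psi[i0] + A[3] * psi[i1]));
    }
    return acc;
}
```

Both functions return the same value for a Hermitian `A`; the second one simply avoids the temporary buffer and the second sweep over memory.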