Extend kernel-info to emit PGO-based FLOP count #110586

Draft
wants to merge 137 commits into
base: main

jdenny-ornl (Collaborator) commented Sep 30, 2024

This is an experiment to combine the capabilities of PGO GPU support (PR #94268) and the kernel-info pass (PR #102944). In particular, it implements an estimate of the number of floating-point operations that GPU code executes, computed as profile counts × static floating-point operation counts. Example usage can be found in llvm/docs/KernelInfo.rst.

The floating-point operation count implementation starts at commit d2847b0. A few questions about the implementation appear there as TODOs. Subsequent commits bring in updates and improvements. Prior commits merge the aforementioned pull requests, which have not yet landed.
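
As a rough illustration of the estimate described above (profile counts × static floating-point op counts), here is a minimal Python sketch. It is not the pass's implementation: the `BlockProfile` type, its field names, the per-basic-block granularity, and the sample numbers are all hypothetical, chosen only to show the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class BlockProfile:
    """Hypothetical record for one basic block of a GPU kernel."""
    name: str
    exec_count: int     # how many times the block ran, per the profile (assumed)
    static_fp_ops: int  # floating-point instructions counted statically in the block


def estimate_flops(blocks):
    """Sum, over all blocks, profile count times static floating-point op count."""
    return sum(b.exec_count * b.static_fp_ops for b in blocks)


# Made-up profile for a kernel with a single hot loop.
kernel = [
    BlockProfile("entry", exec_count=1, static_fp_ops=0),
    BlockProfile("loop.body", exec_count=1000, static_fp_ops=2),
    BlockProfile("exit", exec_count=1, static_fp_ops=1),
]

print(estimate_flops(kernel))  # 1*0 + 1000*2 + 1*1 = 2001
```

Under these assumptions, a block that never executes contributes nothing regardless of how many floating-point instructions it contains, which is what separates this estimate from a purely static count.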

This PR formerly supported only -fprofile-instrument=clang. This commit adds support for -fprofile-instrument=llvm.
Replace getPointerBitCastOrAddrSpaceCast with getAddrSpaceCast, and allow no-op getAddrSpaceCast calls when the types are identical.
TODO: Fix tests

github-actions bot commented Sep 30, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

2 participants