Modal script - benchmarking, profiling and libraries #504

vyom1611 opened this issue May 31, 2024 · 6 comments

@vyom1611
Contributor

Running the CUDA code on Modal with the benchmark_on_modal.py script is very useful, but I was wondering whether there is a way to install cuDNN in that app, because many of the faster kernels cannot run without libraries like it. Is it also possible to profile with Nsight Systems on Modal?

Modal is very convenient for people like me who do not have access to GPUs. If we update the script and add more documentation on how to run profiling and all kinds of kernels on Modal, I bet more people would be able to access and run the codebase.

@gordicaleksa
Contributor

I won't be able to help you with Modal, but I'll just say that our goal is ultimately to have custom CUDA kernels that outperform cuDNN. So you should, ideally, be able to just run our code without external dependencies.

One can dream. :)

karpathy added a commit that referenced this issue Jun 13, 2024
Benchmark modal script fixed - profiling and cuDNN (Issue #504 and PR #510 fixes)
@awayzjj

awayzjj commented Jun 16, 2024

@vyom1611 Hi, I tried to run the demo

GPU_MEM=80 modal run benchmark_on_modal.py \
    --compile-command "nvcc -O3 --use_fast_math attention_forward.cu -o attention_forward -lcublas" \
    --run-command "./attention_forward 1"

but it failed. Could you please give me some suggestions? Did I miss a step? Thank you very much!
[Screenshot 2024-06-16 13 31 55]
[Screenshot 2024-06-16 13 32 11]

@vyom1611
Contributor Author

Hi, try running the compile command with the -lcublasLt option.

@awayzjj

awayzjj commented Jun 16, 2024

@vyom1611 Thank you very much! It worked after adding the -lcublasLt option.

nvcc -O3 --use_fast_math attention_forward.cu -o attention_forward -lcublas -lcublasLt
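For anyone scripting this, here is a minimal sketch of how the working invocation could be assembled so the link flags are not forgotten. The helper name build_modal_command and its parameters are hypothetical and not part of the repository; the only things taken from this thread are the GPU_MEM variable, the --compile-command and --run-command options, and the nvcc flags.

```python
import shlex


def build_modal_command(source, output, libs=("cublas", "cublasLt"), run_args="1"):
    """Assemble the argv for a benchmark_on_modal.py run.

    Hypothetical helper: the option names mirror the command quoted in this
    thread; everything else is an illustration.
    """
    # Link against both cuBLAS and cuBLASLt, as the fix in this thread requires.
    link_flags = " ".join(f"-l{lib}" for lib in libs)
    compile_cmd = f"nvcc -O3 --use_fast_math {source} -o {output} {link_flags}"
    run_cmd = f"./{output} {run_args}"
    return [
        "modal", "run", "benchmark_on_modal.py",
        "--compile-command", compile_cmd,
        "--run-command", run_cmd,
    ]


argv = build_modal_command("attention_forward.cu", "attention_forward")
# GPU_MEM is passed as an environment variable in front of the command.
print("GPU_MEM=80 " + shlex.join(argv))
```

The printed line can be pasted into a shell; shlex.join takes care of quoting the two embedded commands.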

Have you considered adding support for "ncu"? I tried, but encountered an error and no profile file was generated.
[Screenshot 2024-06-16 16 12 13]

@vyom1611
Contributor Author

vyom1611 commented Jul 1, 2024

Using ncu is tricky, because it profiles kernels at a very low level. For nsys, if the Linux kernel paranoid level (perf_event_paranoid) is too high, it cannot profile the CPU and OS parts of the run. For ncu, the problem seems impossible to fix in Modal containers: ERR_NVGPUCTRPERM The user running <tool_name/application_name> does not have permission to access NVIDIA GPU Performance Counters on the target device.

And to fix it you have to:

Enable access permanently

  1. To allow access for any user: create a file with the .conf extension containing options nvidia NVreg_RestrictProfilingToAdminUsers=0 in /etc/modprobe.d.
  2. To restrict access to admin users (CAP_SYS_ADMIN capability set), create a file with the .conf extension containing options nvidia NVreg_RestrictProfilingToAdminUsers=1 in /etc/modprobe.d.

This seems impossible in Modal containers, since you need root privileges to create .conf files in /etc/modprobe.d.

Even if you managed to change it, you would then have to reload the nvidia kernel module with rmmod nvidia, run update-initramfs -u (which again needs sudo access), and reboot the system.

So currently we cannot run ncu on Modal.

This is the link for reference: https://developer.nvidia.com/nvidia-development-tools-solutions-err_nvgpuctrperm-permission-issue-performance-counters
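As a rough illustration of the two checks described above, here is a small preflight sketch that could be run inside a container before trying nsys or ncu. It assumes the driver exposes the current setting as a RmProfilingAdminOnly line in /proc/driver/nvidia/params and that the kernel level is readable from /proc/sys/kernel/perf_event_paranoid; both paths and the function names are assumptions for illustration, not something the Modal script provides.

```python
from pathlib import Path

# Assumed locations; they may be missing or unreadable inside a container.
PARANOID_PATH = "/proc/sys/kernel/perf_event_paranoid"
NVIDIA_PARAMS_PATH = "/proc/driver/nvidia/params"


def read_perf_event_paranoid(path=PARANOID_PATH):
    """Return the kernel paranoid level as an int, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def counters_restricted(params_text):
    """Parse the text of the NVIDIA params file.

    Returns True if GPU performance counters are limited to admin users,
    False if any user may access them, None if the setting is not listed.
    """
    for line in params_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "RmProfilingAdminOnly":
            return value.strip() == "1"
    return None


def read_text_or_none(path):
    """Read a file, returning None instead of raising if it is unavailable."""
    try:
        return Path(path).read_text()
    except OSError:
        return None


level = read_perf_event_paranoid()
print("perf_event_paranoid:", "unknown" if level is None else level)

params = read_text_or_none(NVIDIA_PARAMS_PATH)
restricted = None if params is None else counters_restricted(params)
if restricted is None:
    print("GPU counter restriction: unknown (params file missing or setting not listed)")
elif restricted:
    print("GPU counters are admin-only; ncu will likely fail with ERR_NVGPUCTRPERM")
else:
    print("GPU counters are open to all users")
```

This only reports the current state. Changing either setting still needs the root access discussed above.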

@awayzjj

awayzjj commented Jul 1, 2024

Got it! Thank you! It is a pity, since Modal offers some free A100 quota :(
