CUDA version and updated installation instructions #785

Merged 3 commits on May 4, 2024


sidharthrajaram
Contributor

README edits to the installation and CUDA instructions, based on the discussion in issues #783 and #717:

  • Updated the libraries to install for CUDA 12 (the latest ctranslate2 supports only CUDA 12).
  • Added a note regarding CUDA 11 support.
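
For readers following the updated instructions, a minimal sketch of how one might check which of the CUDA 12 pip packages are present in the current environment. The package names are the two mentioned in this thread (`nvidia-cublas-cu12`, `nvidia-cudnn-cu12`); the helper name `installed_versions` and its `lookup` parameter are hypothetical, introduced only for illustration.

```python
from importlib import metadata

# Package names taken from this thread; adjust if the README lists others.
CUDA12_PACKAGES = ("nvidia-cublas-cu12", "nvidia-cudnn-cu12")


def installed_versions(packages=CUDA12_PACKAGES, lookup=metadata.version):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = lookup(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

This only reports what pip has installed; it does not verify that the shared libraries can actually be loaded at runtime.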

@jimydavis

pip install nvidia-cudnn-cu12 needs to be pinned to either

  • ~=8.9
  • ^=8.9

because in v9, libcudnn_ops_infer.so.8 seems to have been replaced by libcudnn_ops.so.9, among other changes. I am not familiar enough with NVIDIA's versioning practice to say whether it should be a tilde or a caret.

For cublas, at least as of commit 91c8307, I have tested that nvidia-cublas-cu12==12.4.5.8 is fully working.
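
On the tilde-versus-caret question: pip understands the PEP 440 compatible-release operator `~=`, so `~=8.9` means `>=8.9, ==8.*`. The caret form is Poetry syntax (written `^8.9`, not `^=8.9`) and is not a pip operator; for this pin both would exclude v9. The sketch below is a simplified, pure-Python illustration of what `~=8.9` accepts. It assumes plain numeric versions only (no pre-release or local tags), the function names are hypothetical, and the version strings in the usage lines are illustrative.

```python
def parse_release(version):
    """Turn '8.9.7.29' into (8, 9, 7, 29); numeric release segments only."""
    return tuple(int(part) for part in version.split("."))


def is_compatible(version, base):
    """Simplified PEP 440 `~=base` check for plain numeric versions.

    `~=8.9` means `>=8.9, ==8.*`: every segment of `base` except the last
    must match exactly, and the candidate must not be older than `base`.
    """
    candidate = parse_release(version)
    floor = parse_release(base)
    if len(floor) < 2:
        raise ValueError("~= needs at least two release segments")

    # Pad with zeros so that '8.9' and '8.9.0' compare as equal.
    width = max(len(candidate), len(floor))
    candidate = candidate + (0,) * (width - len(candidate))
    floor_padded = floor + (0,) * (width - len(floor))

    prefix = floor[:-1]
    return candidate[: len(prefix)] == prefix and candidate >= floor_padded


# Illustrative version strings: an 8.9.x release passes, a 9.x release does not.
print(is_compatible("8.9.7.29", "8.9"))
print(is_compatible("9.1.0.70", "8.9"))
```

For real dependency resolution, pip's own handling of the specifier is what counts; this only shows why a `~=8.9` pin keeps the install on the v8 series.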

@Purfview
Contributor

Btw, @nguyendc-systran mentioned that they would keep CUDA 11 support for a while; maybe they dropped that idea:

OpenNMT/CTranslate2#1590 (comment)

@minhthuc2502

We tried to support both CUDA 12 and 11, but releasing two versions in parallel is quite complicated to maintain. In the end, we decided to support only CUDA 12, but the CTranslate2 source can always be built with CUDA 11.
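
As a rough illustration of the split described in this comment, the sketch below reads the CUDA major version from an `nvidia-smi` header line and maps it to one of the two routes: prebuilt release for CUDA 12, source build for CUDA 11. The sample header text uses made-up numbers, `install_route` is a hypothetical helper, and the version `nvidia-smi` prints is the highest CUDA version the driver supports, which is an assumption-laden proxy for what is actually usable on a machine.

```python
import re

# Sample header line in the format nvidia-smi prints; the numbers are made up.
SAMPLE_HEADER = (
    "| NVIDIA-SMI 550.54.15    Driver Version: 550.54.15    CUDA Version: 12.4     |"
)


def cuda_major(smi_text):
    """Return the CUDA major version found in nvidia-smi output, or None."""
    match = re.search(r"CUDA Version:\s*(\d+)\.\d+", smi_text)
    return int(match.group(1)) if match else None


def install_route(major):
    """Hypothetical helper: map a CUDA major version to the route in this thread."""
    if major is None:
        return "unknown: no CUDA version detected"
    if major == 12:
        return "wheels: use the prebuilt CTranslate2 release (CUDA 12 only)"
    if major == 11:
        return "source: build CTranslate2 from source against CUDA 11"
    return "uncovered: this CUDA version is not discussed in the thread"


print(install_route(cuda_major(SAMPLE_HEADER)))
```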

Contributor

@Purfview Purfview left a comment


Looks good.

@bil-ash

bil-ash commented Apr 11, 2024

Maybe also update the ctranslate2 dependency to the latest 4.2.0 alongside this, because it supports flash attention as well as performance improvements for quantized models on CPU.

@sidharthrajaram
Contributor Author

> Maybe also update the ctranslate2 dependency to the latest 4.2.0 alongside this, because it supports flash attention as well as performance improvements for quantized models on CPU.

@bil-ash, this PR mainly updates the installation instructions because the latest versions of ctranslate2 lack CUDA 11 support. Upgrading the ctranslate2 dependency would be beyond the scope of this particular PR, I think.

@sidharthrajaram
Contributor Author

Is this good to go, @Purfview ?

@Purfview
Contributor

@sidharthrajaram FYI, I'm not a maintainer of this repo.

@trungkienbkhn trungkienbkhn merged commit 3d1de60 into SYSTRAN:master May 4, 2024
3 checks passed