How do I use GPU for inference? #565

Closed Answered by prnvbn
prnvbn asked this question in Q&A

It seems llama-cpp-python had to be reinstalled with cuBLAS enabled. I had done this for my langchain conda env but forgot to do it for the guidance env. It would be nice if this were documented somewhere :)

The command I used to reinstall:

CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install --upgrade --force-reinstall llama-cpp-python --no-cache-dir

If installing for the first time, use:

CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python
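
Since the root cause here was running the reinstall in one conda env but not the other, here is a rough sketch of wrapping the same commands in Python so that pip is always invoked through the interpreter of the currently active env (`sys.executable -m pip`). The helper names `build_reinstall_command` and `reinstall_with_cublas` are made up for illustration and are not part of llama-cpp-python or guidance. The `-DLLAMA_CUBLAS=on` flag is taken as-is from the commands above; build flag names have changed across llama-cpp-python releases, so check the README for the version you are installing.

```python
import os
import shlex
import subprocess
import sys


def build_reinstall_command(first_install=False):
    """Return (env_overrides, argv) mirroring the commands quoted above.

    Hypothetical helper: the flag values are copied from the post,
    not looked up from llama-cpp-python itself.
    """
    env_overrides = {"CMAKE_ARGS": "-DLLAMA_CUBLAS=on", "FORCE_CMAKE": "1"}
    # sys.executable -m pip targets the env this interpreter belongs to.
    argv = [sys.executable, "-m", "pip", "install"]
    if not first_install:
        # Force a rebuild even if a CPU-only wheel is already installed or cached.
        argv += ["--upgrade", "--force-reinstall", "--no-cache-dir"]
    argv.append("llama-cpp-python")
    return env_overrides, argv


def reinstall_with_cublas(first_install=False, dry_run=True):
    """Print the command (dry_run=True) or actually run it (dry_run=False)."""
    env_overrides, argv = build_reinstall_command(first_install)
    if dry_run:
        prefix = " ".join(f"{k}={shlex.quote(v)}" for k, v in env_overrides.items())
        print(prefix, shlex.join(argv))
        return 0
    # Merge the build flags into the current environment for the child process.
    env = {**os.environ, **env_overrides}
    return subprocess.run(argv, env=env, check=True).returncode
```

Calling `reinstall_with_cublas()` from inside the guidance env only prints the command it would run; pass `dry_run=False` to actually rebuild the package in that env.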

Answer selected by prnvbn