HoloChat local is a chat application that lets users interact with HoloChat, an LLM that can answer a wide range of questions about the Holoscan SDK.
To accomplish this, the app uses LangChain to store the Holoscan SDK's repo and user guide in a vector database. HoloChat then queries this database to retrieve relevant documentation that helps it answer user questions.
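The retrieval flow described above can be illustrated with a small, dependency-free sketch. This is not the code in this PR: `ToyVectorStore`, `embed`, and `build_prompt` are hypothetical names, the bag-of-words embedding is only a stand-in for a real embedding model, and the documents are placeholders rather than actual Holoscan documentation.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words embedding (assumption: a stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Hypothetical in-memory store: keeps each document next to its vector."""

    def __init__(self, docs):
        self.entries = [(doc, embed(doc)) for doc in docs]

    def query(self, question, k=2):
        """Return the k documents most similar to the question."""
        q_vec = embed(question)
        ranked = sorted(self.entries, key=lambda e: cosine(q_vec, e[1]), reverse=True)
        return [doc for doc, _ in ranked[:k]]


def build_prompt(question, context_docs):
    """Prepend the retrieved documents to the question before it reaches the LLM."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return f"Use the documentation below to answer.\n{context}\nQuestion: {question}"


# Placeholder documents, not real SDK documentation.
docs = [
    "operators process data in a pipeline",
    "installation requires a supported gpu",
    "unrelated text about cooking pasta",
]
store = ToyVectorStore(docs)
question = "how do operators process data"
prompt = build_prompt(question, store.query(question, k=1))
```

In the real app the embedding and storage would be handled by LangChain and an embedding model, but the shape is the same: embed the documents once, embed each question, and pass the closest matches to the LLM as context.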
To build this repo, the BGE embedding model is used via LangChain, which uses Hugging Face's transformers library as the back end. This requires torch >= 2.0, and I was unable to find a wheel that supports ARM64 with dGPU, CUDA 11.8, and Python 3.10, so I built the wheel from source and included it in the repo. However, I'm interested in potentially deploying this in a container, such as `nvcr.io/nvidia/cuda:12.2.0-runtime-ubuntu22.04`, which could remove all of the setup steps and allow a user to simply execute `docker run`.
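Whether the setup happens on the host or in a container, it can help to confirm that the environment matches the requirements listed above before loading the embedding model. The following is a minimal sketch under that assumption: `parse_version`, `meets_minimum`, and `environment_report` are hypothetical helpers, not part of this PR.

```python
import importlib.util
import platform
import re


def parse_version(version):
    """Return the leading numeric components of a version string as a tuple."""
    match = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", str(version))
    if not match:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups() if part is not None)


def meets_minimum(version, minimum):
    """True if the version string is at least the given minimum tuple."""
    return parse_version(version) >= minimum


def environment_report():
    """Collect the details relevant to the torch >= 2.0 requirement."""
    report = {
        "machine": platform.machine(),        # e.g. "aarch64" on ARM64 Linux
        "python": platform.python_version(),
        "torch_ok": None,                     # None means torch is not installed
    }
    if importlib.util.find_spec("torch") is not None:
        import torch
        report["torch_ok"] = meets_minimum(torch.__version__, (2, 0))
    return report
```

Calling `environment_report()` returns a small dictionary that can be printed at startup; local build suffixes such as `+cu118` are ignored when comparing versions.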