Welcome to the LLAMA Satellite for Taubyte WebAssembly VM project. This tool extends the functionality of the Taubyte WebAssembly Virtual Machine by introducing Large Language Model (LLAMA) capabilities. It's built upon llama.cpp and employs go-llama-cpp for cgo bindings.
With this plugin, you can augment your applications with advanced language understanding features.
- File Structure
- Example Usage from WebAssembly
- Installation
- Acceleration
- Model Setup
- Plugin Compilation
- WebAssembly
- Testing
- plugin/: Code for the plugin itself
- sdk/: Wrapper around the low-level functions exported by the plugin
- fixtures/build/: Code to be compiled to WebAssembly and run on a Taubyte Virtual Machine during testing
- models/: Helper tool to download models
Using the plugin is straightforward. In just a few lines of code, you can build your own planet-scale ChatGPT clone-API! Here's a simple example:
package lib

import (
	"fmt"
	"io"

	"github.com/samyfodil/taubyte-llama-satellite/sdk"
)

// wapredict is exported to the WebAssembly host.
//
//export wapredict
func wapredict(uint32) uint32 {
	// Start a prediction with custom sampling options.
	p, err := sdk.Predict(
		"How old is the universe?",
		sdk.WithTopK(90),
		sdk.WithTopP(0.86),
		sdk.WithBatch(5),
	)
	if err != nil {
		panic(err)
	}

	// Print tokens as they arrive until Next reports io.EOF.
	for {
		token, err := p.Next()
		if err == io.EOF {
			break
		} else if err != nil {
			panic(err)
		}
		fmt.Print(token)
	}

	return 0
}
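The loop above prints tokens as they arrive. When the full answer is needed as a single string instead, the same loop can be wrapped in a small helper. The following is a minimal sketch, not part of the SDK: `tokenStream`, `sliceStream`, and `collectTokens` are hypothetical names, and the sketch assumes that `Next` returns a string token and signals completion with `io.EOF`, as the example above suggests. A fake in-memory stream stands in for a real prediction so the code runs without a model.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// tokenStream is a hypothetical interface standing in for the value returned
// by sdk.Predict; only the Next method used in the README example is assumed.
type tokenStream interface {
	Next() (string, error)
}

// sliceStream is a fake stream so this sketch runs without a model.
type sliceStream struct {
	tokens []string
	pos    int
}

// Next returns the next token, or io.EOF once all tokens are consumed.
func (s *sliceStream) Next() (string, error) {
	if s.pos >= len(s.tokens) {
		return "", io.EOF
	}
	tok := s.tokens[s.pos]
	s.pos++
	return tok, nil
}

// collectTokens drains the stream and returns the concatenated tokens.
// io.EOF marks a normal end; any other error is returned together with
// the text collected so far.
func collectTokens(p tokenStream) (string, error) {
	var sb strings.Builder
	for {
		tok, err := p.Next()
		if errors.Is(err, io.EOF) {
			return sb.String(), nil
		}
		if err != nil {
			return sb.String(), err
		}
		sb.WriteString(tok)
	}
}

func main() {
	stream := &sliceStream{tokens: []string{"About ", "13.8 ", "billion ", "years"}}
	out, err := collectTokens(stream)
	if err != nil {
		fmt.Println("prediction failed:", err)
		return
	}
	fmt.Println(out)
}
```

Returning the error rather than panicking lets the caller decide how to report a failed prediction; inside an exported WebAssembly function that might mean returning a non-zero status code.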
This project requires some dependencies to function. Use the following command to clone them, including their submodules, locally:
git clone --recurse-submodules deps/go-llama
Then, navigate to the newly cloned directory and build libbinding.a:
cd deps/go-llama
make libbinding.a
You can take advantage of OpenBLAS and CuBLAS for acceleration.
To build with OpenBLAS:
cd deps/go-llama
BUILD_TYPE=openblas make libbinding.a
To build with CuBLAS:
cd deps/go-llama
BUILD_TYPE=cublas make libbinding.a
You need to provide the plugin with a model to load. If you do not have one, you can use the tool in the models folder:
cd models
go run .
Follow the prompts to select and download the model. Then, specify the path to the model in plugin/main.go. For example:
ai, err := New(ctx, "orca-mini", "models/assets/orca-mini-3b.ggmlv3.q4_0.bin")
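A wrong or missing model path is an easy mistake to make at this step. The following is a minimal sketch of a pre-flight check that could run before the model is loaded. `checkModelFile` is a hypothetical helper, not part of the plugin, and it only verifies that the path points to a non-empty regular file; it does not validate the model format.

```go
package main

import (
	"fmt"
	"os"
)

// checkModelFile is a hypothetical helper: it reports an error when the path
// does not exist, is a directory, or points to an empty file.
func checkModelFile(path string) error {
	info, err := os.Stat(path)
	if err != nil {
		return fmt.Errorf("model file %q is not accessible: %w", path, err)
	}
	if info.IsDir() {
		return fmt.Errorf("model path %q is a directory, expected a file", path)
	}
	if info.Size() == 0 {
		return fmt.Errorf("model file %q is empty", path)
	}
	return nil
}

func main() {
	// Path taken from the example above; adjust it to the model you downloaded.
	const modelPath = "models/assets/orca-mini-3b.ggmlv3.q4_0.bin"
	if err := checkModelFile(modelPath); err != nil {
		fmt.Println("cannot load model:", err)
		return
	}
	fmt.Println("model file found:", modelPath)
}
```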
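To compile the plugin without acceleration:
cd plugin
go build .
To compile the plugin with OpenBLAS support:
cd plugin
go build -tags openblas .
To compile the plugin with CuBLAS support:
cd plugin
go build -tags cublas .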
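The three build variants differ only in the `-tags` argument, and the tag names match the `BUILD_TYPE` values used when building libbinding.a. The following is a minimal sketch of how a build script might pick the arguments; `pluginBuildArgs` is a hypothetical helper, and it only knows the variants listed in this README.

```go
package main

import (
	"fmt"
	"strings"
)

// pluginBuildArgs is a hypothetical helper that returns the arguments for
// `go build` given an acceleration backend. An empty string selects the
// default build without acceleration.
func pluginBuildArgs(accel string) ([]string, error) {
	switch accel {
	case "":
		return []string{"build", "."}, nil
	case "openblas", "cublas":
		return []string{"build", "-tags", accel, "."}, nil
	default:
		return nil, fmt.Errorf("unsupported acceleration backend %q", accel)
	}
}

func main() {
	for _, accel := range []string{"", "openblas", "cublas"} {
		args, err := pluginBuildArgs(accel)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		// Print the command that would be run from the plugin directory.
		fmt.Println("go " + strings.Join(args, " "))
	}
}
```

Rejecting unknown backends up front avoids silently producing a default build when a tag name is mistyped.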
The WebAssembly code used to test the plugin is in fixtures/build. If you modify predict.go, the tests will automatically recompile it.
Once you have compiled the bindings, you can run the tests:
go test -v
The following commands install CUDA 12.2 and the NVIDIA HPC SDK 23.5 on Ubuntu 22.04 (x86_64):
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin && \
sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600 && \
wget https://developer.download.nvidia.com/compute/cuda/12.2.0/local_installers/cuda-repo-ubuntu2204-12-2-local_12.2.0-535.54.03-1_amd64.deb && \
sudo dpkg -i cuda-repo-ubuntu2204-12-2-local_12.2.0-535.54.03-1_amd64.deb && \
sudo cp /var/cuda-repo-ubuntu2204-12-2-local/cuda-*-keyring.gpg /usr/share/keyrings/ && \
( curl https://developer.download.nvidia.com/hpc-sdk/ubuntu/DEB-GPG-KEY-NVIDIA-HPC-SDK | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-hpcsdk-archive-keyring.gpg ) && \
( echo 'deb [signed-by=/usr/share/keyrings/nvidia-hpcsdk-archive-keyring.gpg] https://developer.download.nvidia.com/hpc-sdk/ubuntu/amd64 /' | sudo tee /etc/apt/sources.list.d/nvhpc.list ) && \
sudo apt-get update -y && \
sudo apt-get -y install cuda nvhpc-23-5
This project is licensed under the BSD 3-Clause License. For more details, see the LICENSE file.