This repository provides a package for running Large Language Models (LLMs) fully offline and locally. Processing speed varies with your CPU/GPU, but some models run fine on a CPU alone. Because a large language model builds its response one token at a time, there is a noticeable gap between the call and the finished response, so this package uses ROS Actionlib communication to report progress along the way.
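For context, the underlying ollama Python bindings stream the response chunk by chunk, which is what makes progress feedback over Actionlib useful. A minimal non-ROS sketch (assuming the ollama bindings are installed and a llama3 model has already been pulled) looks like this:

```python
# Stream a response from a local model with the ollama Python bindings.
# Assumes the "ollama" package is installed and "llama3" has been pulled.
import ollama

stream = ollama.chat(
    model="llama3",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)
for chunk in stream:
    # Each chunk carries the next piece of the generated text.
    print(chunk["message"]["content"], end="", flush=True)
print()
```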
This section describes how to set up this repository.
First, please prepare the following environment before proceeding to the installation steps.
System | Version |
---|---|
Ubuntu | 20.04 (Focal Fossa) |
ROS | Noetic Ninjemys |
Python | >=3.8 |
Note
If you need to install Ubuntu or ROS, please check our SOBITS Manual.
- Go to the src folder of your ROS workspace.
$ roscd    # Or just use "cd ~/catkin_ws/" and change directory.
$ cd src/
- Clone this repository.
$ git clone https://github.com/TeamSOBITS/ollama_python
- Navigate into the repository.
$ cd ollama_python/
- Install the dependent packages.
$ bash install.sh
- Compile the package.
$ roscd    # Or just use "cd ~/catkin_ws/" and change directory.
$ catkin_make
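If you are working in a new terminal, remember to source the workspace before launching anything (standard catkin practice, assuming the default ~/catkin_ws path):
$ source ~/catkin_ws/devel/setup.bash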
Let's start with the execution process.
- Launch model_download.launch.
$ roslaunch ollama_python model_download.launch
- Download the model you want to use from the GUI: select it and click [download].
Note
Not all models are listed here. For the complete list, see ollama.com/library.
If you want to download a model that is not in the GUI, add it to the list on line 19 of model_downloader.py (see the sketch below). If a model has already been downloaded, you can delete ([delete]), copy ([copy]), or push ([push]) it.
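The exact contents of model_downloader.py are not reproduced here, but the entry on line 19 is presumably a plain Python list of Ollama model tags; adding your own model might look roughly like this (the variable name and existing entries are assumptions for illustration only):

```python
# model_downloader.py (around line 19) -- illustrative only; the actual
# variable name and the models already listed may differ in this repository.
model_list = [
    "llama3",
    "phi3",
    "mistral",          # existing entries (assumed)
    "your-model:tag",   # add the Ollama tag of the model you want in the GUI
]
```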
Warning
Downloading the model will take some time. Please wait until the GUI is updated.
Note
For details and specific usage, please refer to the original Ollama Python bindings and the ollama GitHub repository.
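As a non-GUI alternative, models can also be pulled programmatically with the ollama Python bindings (assuming install.sh sets up the standard ollama package) or with the ollama CLI. For example:

```python
# Pull a model programmatically with the ollama Python bindings.
# Equivalent to running "ollama pull llama3" on the command line.
import ollama

ollama.pull("llama3")
```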
- Set the model_name in ollama.launch to any model you downloaded in the "Download the model" step. The following is an example where llama3 is specified.
<arg name="model_name" default="llama3"/>
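Since model_name is declared with a default value, you can also override it at launch time without editing the file:
$ roslaunch ollama_python ollama.launch model_name:=llama3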
- Launch the server. It uses Actionlib communication so that you can track the progress while the response is being generated.
$ roslaunch ollama_python ollama.launch
- [Optional] Try calling it.
  - Call with Actionlib communication (a mode that reports progress while the response is generated).
$ rosrun ollama_python ollama_action_client.py
  - Call with Service communication (a mode that returns only the final result).
$ rosrun ollama_python ollama_service_client.py
Taking the Actionlib client as an example: you can set room_name to anything, but let's try default for now. Then type something into request; here, as an example, I sent Hello!
Warning
Since processing can be slow on a CPU, it may be better to watch the progress via Actionlib while you wait.
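If you want to call the server from your own node rather than the provided scripts, a minimal Actionlib client sketch might look like the following. The action name, action type, and goal/feedback field names below are assumptions for illustration; check ollama_action_client.py and this package's action definition for the actual names.

```python
#!/usr/bin/env python3
# Minimal sketch of a custom Actionlib client for the Ollama server.
# The action name ("ollama"), action type (OllamaAction), and the field
# names (room_name, request) are assumptions; see ollama_action_client.py
# in this package for the real definitions.
import rospy
import actionlib
from ollama_python.msg import OllamaAction, OllamaGoal  # assumed message names


def feedback_cb(feedback):
    # Print the partial response as it streams in from the model.
    rospy.loginfo("progress: %s", feedback)


def main():
    rospy.init_node("ollama_action_client_example")
    client = actionlib.SimpleActionClient("ollama", OllamaAction)  # assumed action name
    client.wait_for_server()

    goal = OllamaGoal()
    goal.room_name = "default"  # assumed field name
    goal.request = "Hello!"     # assumed field name
    client.send_goal(goal, feedback_cb=feedback_cb)

    client.wait_for_result()
    rospy.loginfo("result: %s", client.get_result())


if __name__ == "__main__":
    main()
```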
Note
Please check here for details about the pre-prompt settings and room_name.
See the open issues for a full list of proposed features (and known issues).