Skip to content

Latest commit

 

History

History
14 lines (10 loc) · 960 Bytes

README.md

File metadata and controls

14 lines (10 loc) · 960 Bytes

Fine-tune GPT 3.5 and CodeLlama 34B on the viggo functional representation dataset

Code Llama fine-tune:

  • Inside the llama directory that is the notebook that will fine-tune Code Llama 34B on the Viggo dataset
  • It uses Huggingface transformers to pull the model and dataset from the huggingface hub

This is a Lora fine-tune quantising the model to 8 bits. Therefore, you'll need just over 40GB of GPU memory. I used an A40. But you could use an 80GB A100 or A6000 too.

OpenAI fine-tune:

Inside the openai folder there are two notebooks and the two dataset files (formatted for OpenAI):

  • openai_train.ipynb loads the datasets and sends them to OpenAI to fine-tune GPT 3.5
  • openai_dataset_analytics.ipynb reads an OpenAI dataset jsonl file and works out whether there are any errors in the file and the total number of tokens in the dataset.

My full post on the results of this is here!