Finetuning#

NOTE: this document is a work in progress!

This document aims to provide a step-by-step guide to fine-tuning a model on conversations from gptme.

The goal of fine-tuning a model for gptme is to:

  • Teach it the tools available in gptme

  • Update out-of-date knowledge and conventions

  • Improve its ability to recover from errors

Step 1: Gather the data#

To fine-tune, we first need data to fine-tune on.

We will fine-tune on our own conversation history, combined with a subset of the [OpenAssistant dataset][oa-dataset] to extend the training data with relevant examples.

We collect our own conversation history by running the following command:

./train/collect.py --model "HuggingFaceH4/zephyr-7b-beta"  # or whatever model you intend to fine-tune

This will create the files train.csv and train.jsonl in the train directory.

TODO: describe how to get the OpenAssistant dataset

TODO: describe how to use exported ChatGPT conversations
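
Until those TODOs are filled in, here is a hedged sketch of fetching the OpenAssistant (oasst1) dataset with the Hugging Face datasets library. The field names (text, role, lang) come from the oasst1 schema; the language filter and any reformatting to match what collect.py emits are assumptions:

    # A sketch: load the oasst1 dataset from the Hugging Face Hub.
    from datasets import load_dataset

    oasst = load_dataset("OpenAssistant/oasst1", split="train")

    # Keep English messages as a starting point. Each row is a single
    # message in a conversation tree, so assembling full threads requires
    # following message_id/parent_id links (omitted here).
    english = oasst.filter(lambda row: row["lang"] == "en")
    print(len(english), english[0]["role"], english[0]["text"][:80])

Note that oasst1 stores individual messages rather than whole conversations, so reconstructing complete threads is a necessary extra step before the data matches gptme's conversation format.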

Step 2: Prepare the data#

We need to prepare the data for fine-tuning. This involves:

  • Extending the data with examples from the OpenAssistant dataset

  • Splitting the data into train and validation sets

    • We likely want the validation set to consist only of examples from gptme, not from the OpenAssistant dataset, so that validation measures performance on the conversations we actually care about (a minimal split sketch follows this list).
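
Below is a minimal split sketch. It assumes the merged data lives in train/train.jsonl and that each example was tagged with a "source" field ("gptme" or "oasst") when the datasets were combined; that field is a hypothetical convention, not something collect.py is known to produce.

    # Minimal train/validation split, keeping the validation set gptme-only.
    # The "source" field is an assumed convention added when merging datasets.
    import json
    import random

    with open("train/train.jsonl") as f:
        examples = [json.loads(line) for line in f]

    gptme = [ex for ex in examples if ex.get("source") == "gptme"]
    oasst = [ex for ex in examples if ex.get("source") == "oasst"]

    random.seed(42)
    random.shuffle(gptme)

    # Hold out ~10% of the gptme examples for validation.
    n_val = max(1, len(gptme) // 10)
    splits = {
        "train/train_split.jsonl": gptme[n_val:] + oasst,  # OA data augments training only
        "train/val.jsonl": gptme[:n_val],                  # validation is gptme-only
    }

    for path, split in splits.items():
        with open(path, "w") as f:
            for ex in split:
                f.write(json.dumps(ex) + "\n")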

TODO…

Step 3: Fine-tune the model#

Options:

  • axolotl

    • It supports Mistral (and by extension Zephyr), so it is a viable option

  • [Hugging Face transformers][hf-transformers] (a minimal sketch follows this list)

  • [OpenPipe][openpipe]?

    • Looks interesting, but not sure if it’s relevant for us.
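
As a concrete starting point for the Hugging Face transformers option, here is a minimal causal-LM fine-tuning sketch. It assumes the JSONL files produced in step 2 and that each example has a single "text" field containing the fully formatted conversation; the batch size, learning rate, and sequence length are placeholder values, and a 7B model will realistically need parameter-efficient methods (e.g. LoRA via peft) or substantial GPU memory, which this sketch omits.

    # A minimal fine-tuning sketch with transformers + datasets.
    # Assumes train/val JSONL files from step 2, each example with a "text" field.
    from datasets import load_dataset
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        DataCollatorForLanguageModeling,
        Trainer,
        TrainingArguments,
    )

    model_name = "HuggingFaceH4/zephyr-7b-beta"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token  # ensure padding works

    data = load_dataset(
        "json",
        data_files={"train": "train/train_split.jsonl", "validation": "train/val.jsonl"},
    )

    def tokenize(example):
        return tokenizer(example["text"], truncation=True, max_length=2048)

    tokenized = data.map(tokenize, remove_columns=data["train"].column_names)

    trainer = Trainer(
        model=model,
        args=TrainingArguments(
            output_dir="train/output",
            per_device_train_batch_size=1,
            gradient_accumulation_steps=8,
            num_train_epochs=3,
            learning_rate=2e-5,
        ),
        train_dataset=tokenized["train"],
        eval_dataset=tokenized["validation"],
        # mlm=False -> standard next-token (causal LM) objective
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()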

TODO…

Model suggestions#

  • HuggingFaceH4/zephyr-7b-beta

  • teknium/Replit-v2-CodeInstruct-3B

    • I had issues with this one on M2, but it would be good to have some 3B model as an example for use in testing/debugging.
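
Before committing to a long fine-tuning run, a quick smoke test can confirm that a candidate model loads and generates on your hardware. A sketch (the prompt and generation parameters are arbitrary):

    # Quick smoke test: does the candidate model load and generate at all?
    from transformers import pipeline

    pipe = pipeline("text-generation", model="HuggingFaceH4/zephyr-7b-beta")
    out = pipe("What tools does a terminal assistant need?", max_new_tokens=64)
    print(out[0]["generated_text"])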