Finetuning#

NOTE: this document is a work in progress!

This document aims to provide a step-by-step guide to fine-tuning a model on conversations from gptme.

The goal of fine-tuning a model for gptme is to:

  • Teach it the tools available in gptme

  • Update out-of-date knowledge and conventions

  • Improve its ability to recover from errors

Step 1: Gather the data#

To fine-tune, we first need data to fine-tune on.

We will fine-tune on our own conversation history, combined with a subset of the [OpenAssistant dataset][oa-dataset] to extend the training data with relevant examples.

We collect our own conversation history by running the following command:

./train/collect.py --model "HuggingFaceH4/zephyr-7b-beta"  # or whatever model you intend to fine-tune

This will create the files train.csv and train.jsonl in the train directory.

TODO: describe how to get the OpenAssistant dataset

TODO: describe how to use exported ChatGPT conversations
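
Until those TODOs are filled in, here is a hedged sketch of fetching the OpenAssistant (oasst1) dataset with the Hugging Face datasets library. The field names (text, role, lang) come from the oasst1 schema; the language filter and any reformatting to match what collect.py emits are assumptions:

    # A sketch: load the oasst1 dataset from the Hugging Face Hub.
    from datasets import load_dataset

    oasst = load_dataset("OpenAssistant/oasst1", split="train")

    # Keep English messages as a starting point. Each row is a single
    # message in a conversation tree, so assembling full threads requires
    # following message_id/parent_id links (omitted here).
    english = oasst.filter(lambda row: row["lang"] == "en")
    print(len(english), english[0]["role"], english[0]["text"][:80])

Note that oasst1 stores individual messages rather than whole conversations, so reconstructing complete threads is a necessary extra step before the data matches gptme's conversation format.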

Step 2: Prepare the data#

We need to prepare the data for fine-tuning. This involves:

  • Extending the data with examples from the OpenAssistant dataset

  • Splitting the data into train and validation sets

    • We likely want the validation set to consist only of examples from gptme, not from the OpenAssistant dataset, so that validation measures performance on the conversations we actually care about (a minimal split sketch follows this list).
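
Below is a minimal split sketch. It assumes the merged data lives in train/train.jsonl and that each example was tagged with a "source" field ("gptme" or "oasst") when the datasets were combined; that field is a hypothetical convention, not something collect.py is known to produce.

    # Minimal train/validation split, keeping the validation set gptme-only.
    # The "source" field is an assumed convention added when merging datasets.
    import json
    import random

    with open("train/train.jsonl") as f:
        examples = [json.loads(line) for line in f]

    gptme = [ex for ex in examples if ex.get("source") == "gptme"]
    oasst = [ex for ex in examples if ex.get("source") == "oasst"]

    random.seed(42)
    random.shuffle(gptme)

    # Hold out ~10% of the gptme examples for validation.
    n_val = max(1, len(gptme) // 10)
    splits = {
        "train/train_split.jsonl": gptme[n_val:] + oasst,  # OA data augments training only
        "train/val.jsonl": gptme[:n_val],                  # validation is gptme-only
    }

    for path, split in splits.items():
        with open(path, "w") as f:
            for ex in split:
                f.write(json.dumps(ex) + "\n")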

TODO…

Step 3: Fine-tune the model#

Options:

  • axolotl

    • It supports Mistral (and by extension Zephyr), so it is a viable option

  • [Hugging Face transformers][hf-transformers] (a minimal sketch follows this list)

  • [OpenPipe][openpipe]?

    • Looks interesting, but not sure if it’s relevant for us.
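
As a concrete starting point for the Hugging Face transformers option, here is a minimal causal-LM fine-tuning sketch. It assumes the JSONL files produced in step 2 and that each example has a single "text" field containing the fully formatted conversation; the batch size, learning rate, and sequence length are placeholder values, and a 7B model will realistically need parameter-efficient methods (e.g. LoRA via peft) or substantial GPU memory, which this sketch omits.

    # A minimal fine-tuning sketch with transformers + datasets.
    # Assumes train/val JSONL files from step 2, each example with a "text" field.
    from datasets import load_dataset
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        DataCollatorForLanguageModeling,
        Trainer,
        TrainingArguments,
    )

    model_name = "HuggingFaceH4/zephyr-7b-beta"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token  # ensure padding works

    data = load_dataset(
        "json",
        data_files={"train": "train/train_split.jsonl", "validation": "train/val.jsonl"},
    )

    def tokenize(example):
        return tokenizer(example["text"], truncation=True, max_length=2048)

    tokenized = data.map(tokenize, remove_columns=data["train"].column_names)

    trainer = Trainer(
        model=model,
        args=TrainingArguments(
            output_dir="train/output",
            per_device_train_batch_size=1,
            gradient_accumulation_steps=8,
            num_train_epochs=3,
            learning_rate=2e-5,
        ),
        train_dataset=tokenized["train"],
        eval_dataset=tokenized["validation"],
        # mlm=False -> standard next-token (causal LM) objective
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()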

TODO…

Model suggestions#

  • HuggingFaceH4/zephyr-7b-beta

  • teknium/Replit-v2-CodeInstruct-3B

    • I had issues with this one on M2, but it would be good to have some 3B model as an example for use in testing/debugging.
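
Before committing to a long fine-tuning run, a quick smoke test can confirm that a candidate model loads and generates on your hardware. A sketch (the prompt and generation parameters are arbitrary):

    # Quick smoke test: does the candidate model load and generate at all?
    from transformers import pipeline

    pipe = pipeline("text-generation", model="HuggingFaceH4/zephyr-7b-beta")
    out = pipe("What tools does a terminal assistant need?", max_new_tokens=64)
    print(out[0]["generated_text"])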