Fine-tune a Better Google Gemma with Unsloth and Distilled DPO

The Zephyr recipe on a budget

Benjamin Marie
Mar 18, 2024

Finding good training hyperparameters for new LLMs is always difficult and time-consuming. With Zephyr Gemma 7B, Hugging Face seems to have found a good recipe for fine-tuning Gemma. They used a combination of distilled supervised fine-tuning (dSFT) and distilled DPO (dDPO), similar to what they did for the original Zephyr based on Mistral 7B:

A Cheap Zephyr 7B Beta: Distilled DPO on Consumer Hardware (Benjamin Marie, November 9, 2023)
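
As a rough illustration of the first stage, here is a minimal sketch of distilled SFT with TRL's SFTTrainer. The dataset (UltraChat, used for the original Mistral-based Zephyr) and all hyperparameters are placeholders, not the exact choices behind Zephyr Gemma 7B:

```python
# Minimal sketch of distilled SFT with TRL's SFTTrainer.
# Dataset and hyperparameters are illustrative, not Hugging Face's exact recipe.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import SFTTrainer

model_name = "google/gemma-7b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

# "Distilled" SFT: the targets are answers generated by a stronger teacher
# model (GPT-4 for UltraChat), not human-written answers.
dataset = load_dataset("HuggingFaceH4/ultrachat_200k", split="train_sft")
# Flatten each conversation into one training string with the chat template
dataset = dataset.map(
    lambda ex: {"text": tokenizer.apply_chat_template(ex["messages"], tokenize=False)}
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        output_dir="./gemma-sft",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,
        learning_rate=2e-5,
        num_train_epochs=1,
        bf16=True,
    ),
)
trainer.train()
```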

We also now know that there are several bugs in the PyTorch version of Gemma initially released on the Hugging Face Hub. These bugs affect the model's precision and performance during training and are currently being fixed.

Unsloth, a framework for fast and memory-efficient fine-tuning, has already implemented several patches that improve Gemma's stability during fine-tuning.

unsloth: Faster and Memory-Efficient QLoRA Fine-tuning (Benjamin Marie, December 28, 2023)
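
To give an idea of the Unsloth side, here is a minimal sketch of loading Gemma 7B in 4-bit with Unsloth and attaching a LoRA adapter. The rank, alpha, and target modules are illustrative values, not the exact settings used in the notebook:

```python
# Minimal sketch: Gemma 7B loaded in 4-bit with Unsloth for QLoRA fine-tuning.
# All hyperparameter values are illustrative.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/gemma-7b-bnb-4bit",  # pre-quantized checkpoint from Unsloth
    max_seq_length=2048,
    load_in_4bit=True,  # QLoRA: the frozen base model stays in 4-bit
)

model = FastLanguageModel.get_peft_model(
    model,
    r=16,              # LoRA rank
    lora_alpha=16,
    lora_dropout=0.0,  # Unsloth's fast path expects a dropout of 0
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing=True,
)
```

The returned model and tokenizer behave like standard Transformers objects, so they can be passed directly to TRL's trainers.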

In this article, I first review the recipe used by Hugging Face to train Zephyr Gemma 7B. Then, I show how to use this recipe with Unsloth. We will see how fast and memory-efficient Unsloth is with Gemma, with a peak memory consumption of 19 GB of VRAM and a total training time of only 8 hours.
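
Finally, here is a minimal sketch of the second stage, distilled DPO with TRL's DPOTrainer, reusing the model and tokenizer from the Unsloth snippet above. The preference dataset (UltraFeedback, used for the original Zephyr) and the hyperparameters are again placeholders:

```python
# Minimal sketch of the distilled DPO stage with TRL's DPOTrainer,
# reusing `model` and `tokenizer` from the Unsloth snippet above.
from datasets import load_dataset
from transformers import TrainingArguments
from trl import DPOTrainer

# Preference pairs: answers ranked "chosen" vs. "rejected" by a teacher model
dataset = load_dataset("HuggingFaceH4/ultrafeedback_binarized", split="train_prefs")

def to_text(example):
    # DPOTrainer expects plain strings for prompt/chosen/rejected;
    # keep only the final assistant answer of each conversation.
    return {
        "prompt": example["prompt"],
        "chosen": example["chosen"][-1]["content"],
        "rejected": example["rejected"][-1]["content"],
    }

dataset = dataset.map(to_text)

trainer = DPOTrainer(
    model=model,
    ref_model=None,  # with a LoRA adapter, the frozen base acts as the reference
    beta=0.1,        # strength of the KL penalty toward the reference model
    train_dataset=dataset,
    tokenizer=tokenizer,
    max_length=1024,
    max_prompt_length=512,
    args=TrainingArguments(
        output_dir="./gemma-dpo",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,
        learning_rate=5e-7,
        num_train_epochs=1,
        bf16=True,
    ),
)
trainer.train()
```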


The notebook implementing the fine-tuning and DPO training of Gemma 7B with Unsloth is available here:

Get the notebook (#53)
