The Kaitchup – AI on a Budget

The Kaitchup – AI on a Budget

Share this post

The Kaitchup – AI on a Budget
The Kaitchup – AI on a Budget
A Cheap Zephyr 7B Beta: Distilled DPO on Consumer Hardware
Copy link
Facebook
Email
Notes
More

A Cheap Zephyr 7B Beta: Distilled DPO on Consumer Hardware

The recipe for training a Zephyr-like model without using A100 GPUs

Benjamin Marie's avatar
Benjamin Marie
Nov 09, 2023
∙ Paid
8

Share this post

The Kaitchup – AI on a Budget
The Kaitchup – AI on a Budget
A Cheap Zephyr 7B Beta: Distilled DPO on Consumer Hardware
Copy link
Facebook
Email
Notes
More
2
1
Share
Image generated by Substack

Hugging Face’s Zephyr 7B Beta is a 7 billion parameter chat model outperforming much larger LLMs. In the previous issue of The Kaitchup, we saw what makes the model so good: knowledge distillation.

Zephyr 7B Beta: A Good Teacher Is All You Need

Zephyr 7B Beta: A Good Teacher Is All You Need

Benjamin Marie, PhD
·
November 6, 2023
Read full story

Hugging Face trained Zephyr with DPO to align it with human preferences. In another article, we also saw why DPO is much simpler than standard reinforcement learning with human feedback (RLHF) while performing as well.

Fine-tune Your Own Instruct Version of Mistral 7B with Direct Preference Optimization (DPO)

Fine-tune Your Own Instruct Version of Mistral 7B with Direct Preference Optimization (DPO)

Benjamin Marie, PhD
·
October 26, 2023
Read full story

While Zephyr 7B Beta was a relatively cheap model to make thanks to DPO and distillation, Hugging Face still needed 16 A100 80 GB GPUs for a few hours to train it. Note: In the cloud, the cost of training Zephyr would be a few hundred dollars.

In this article, we will see how to turn Mistral 7B into a Zephyr 7B Beta on consumer hardware using a parameter-efficient training method and quantization (QLoRA). I also adapt the training data made and used by Hugging Face, along with the training hyperparameters, to speed up training and reduce memory consumption.

My recipe for a cheap Zephyr 7B is implemented in this notebook:

Get the notebook (#26)

The Kaitchup – AI on a Budget is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.

Keep reading with a 7-day free trial

Subscribe to The Kaitchup – AI on a Budget to keep reading this post and get 7 days of free access to the full post archives.

Already a paid subscriber? Sign in
© 2025 The Kaitchup
Privacy ∙ Terms ∙ Collection notice
Start writingGet the app
Substack is the home for great culture

Share

Copy link
Facebook
Email
Notes
More