The Kaitchup – AI on a Budget

Fine-tune Llama 3 on Your Computer
With code to merge QLoRA adapters and quantize the model

Benjamin Marie
Apr 22, 2024

Llama 3 is currently available in two versions: 8B and 70B. The 8B version, which has 8.03 billion parameters, is small enough to run locally on consumer hardware.

With parameter-efficient fine-tuning (PEFT) methods such as LoRA, we don’t need to fully fine-tune the model but instead can fine-tune an adapter on top of it. To further decrease memory consumption, we can even apply this method on top of a quantized Llama 3 with QLoRA.
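
To make this concrete, here is a minimal sketch of a QLoRA setup with transformers, bitsandbytes, and peft. The adapter rank, target modules, and other hyperparameters are illustrative choices, not necessarily the settings used in the article's notebook:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_name = "meta-llama/Meta-Llama-3-8B"  # gated repo; requires access approval

# Quantize the base model to 4-bit NF4 to cut memory consumption
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_name, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# Train a small LoRA adapter on top of the frozen, quantized weights
# (rank and target modules below are illustrative, not the article's settings)
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```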

Related article: QLoRA: Fine-Tune a Large Language Model on Your GPU (May 30, 2023)

In this article, I briefly present Llama 3 and the hardware requirements to fine-tune and run it locally. Then, I show how to fine-tune the model on a chat dataset. The code is fully explained. With LoRA, you need a GPU with 24 GB of VRAM to fine-tune Llama 3. With QLoRA, a GPU with 16 GB of VRAM is enough.
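
As a rough illustration of the training loop, here is a minimal sketch using TRL's SFTTrainer, continuing from the QLoRA setup above. The dataset and hyperparameters are placeholders rather than the article's exact configuration, and note that in recent TRL versions dataset_text_field and max_seq_length move into an SFTConfig:

```python
from datasets import load_dataset
from transformers import AutoTokenizer, TrainingArguments
from trl import SFTTrainer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")
tokenizer.pad_token = tokenizer.eos_token  # Llama 3 has no pad token by default

# Placeholder chat dataset with a single formatted text column
dataset = load_dataset("timdettmers/openassistant-guanaco", split="train")

training_args = TrainingArguments(
    output_dir="./llama3-qlora",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    learning_rate=1e-4,
    num_train_epochs=1,
    logging_steps=10,
    bf16=True,
)

trainer = SFTTrainer(
    model=model,                # PEFT model from the QLoRA sketch above
    tokenizer=tokenizer,
    args=training_args,
    train_dataset=dataset,
    dataset_text_field="text",  # column holding the formatted conversations
    max_seq_length=1024,
)
trainer.train()
trainer.save_model("./llama3-qlora/adapter")  # saves only the adapter weights
```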

After the fine-tuning, I also show:

  • How to merge the fine-tuned adapter into Llama 3 (see the merging sketch after this list).

  • How to quantize the model to 4-bit with AWQ to reduce its size (see the AWQ sketch after this list).

  • In the notebook only: How to fully fine-tune the model, i.e., without using an adapter, with GaLore.
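
For the merging step, the sketch below shows one common approach with peft: reload the base model in 16-bit (merging directly into a 4-bit quantized model is lossy), load the adapter, and fold its weights into the base layers. The paths are placeholders:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_name = "meta-llama/Meta-Llama-3-8B"

# Reload the base model in 16-bit before merging the adapter into it
base = AutoModelForCausalLM.from_pretrained(
    base_name, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, "./llama3-qlora/adapter")
merged = model.merge_and_unload()  # folds adapter weights into the base layers

merged.save_pretrained("./llama3-merged")
AutoTokenizer.from_pretrained(base_name).save_pretrained("./llama3-merged")
```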
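
For the AWQ step, here is a minimal sketch with the AutoAWQ library; the paths are placeholders and the quantization settings are common defaults, not necessarily those used in the article:

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "./llama3-merged"      # merged model from the previous sketch
quant_path = "./llama3-merged-awq"

# Common AWQ defaults: 4-bit weights, group size 128
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Calibrate on a small default dataset and quantize the weights to 4-bit
model.quantize(tokenizer, quant_config=quant_config)
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```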

All the code explained in this article is also implemented in this notebook:

Get the notebook (#62)

The code presented in this article also works for Llama 3.1.

Related article: Llama 3.1: Fine-tuning on Consumer Hardware — LoRA vs. QLoRA (July 29, 2024)
