Llama 3.1: Fine-tuning on Consumer Hardware — LoRA vs. QLoRA

And why you should pad right

Benjamin Marie
Jul 29, 2024
Generated with DALL-E

Along with the new 405B model, Meta also released updated versions of Llama 3 8B and 70B (“Llama 3.1”). You can find them here:

  • Llama 3.1 Collection

The main differences from Llama 3 include official support for German, French, Italian, Portuguese, Hindi, Spanish, and Thai, as well as function calling. These new versions have also been post-trained on very long sequences and can handle contexts of up to 128k tokens without a noticeable accuracy drop.

Fine-tune Llama 3 on Your Computer (April 22, 2024)

How is fine-tuning different for this new version? I found a couple of things that make fine-tuning Llama 3.1 easier and better.


In this article, we will fine-tune Llama 3.1 with LoRA and QLoRA and discuss the changes in the code and learning curves compared to Llama 3. We will see that the padding side chosen for fine-tuning has a significant and unexpected impact on the results.
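To give an idea of what this looks like in practice, here is a minimal sketch of a QLoRA setup with right padding, using Hugging Face Transformers, bitsandbytes, and PEFT. The model ID, LoRA rank, and target modules below are illustrative assumptions, not necessarily the exact settings used in the notebook.

```python
# Minimal sketch of a QLoRA setup for Llama 3.1 (illustrative settings,
# not necessarily those used in the notebook).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

model_id = "meta-llama/Meta-Llama-3.1-8B"  # assumed model ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
# The Llama 3.1 tokenizer defines no pad token by default; reuse the EOS token
# and pad on the right, the side whose impact is discussed in this article.
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"

# QLoRA: load the base model quantized to 4-bit (NF4) before attaching adapters.
# For plain LoRA, drop quantization_config and load the model in bfloat16 instead.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA adapters on the attention and MLP projections (illustrative hyperparameters).
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

From there, the adapter-wrapped model can be passed to a standard trainer such as TRL's SFTTrainer.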

The code for fine-tuning Llama 3.1 with LoRA and QLoRA is implemented in this notebook:

Get the notebook (#90)
