The Kaitchup – AI on a Budget

Fine-tune Llama 2 on Your Computer with QLoRa and TRL

On Guanaco and with the correct padding

Benjamin Marie
Aug 04, 2023

Llama 2 is a state-of-the-art large language model (LLM) released by Meta.

According to the paper presenting the model, Llama 2 performs strongly on public benchmarks for various natural language generation and coding tasks.

Meta also released Chat versions of Llama 2. These chat models can be used as chatbots. They mimic OpenAI’s ChatGPT capabilities and can solve many problems with the right prompts.

Both versions of Llama 2 are currently available in different sizes: 7B, 13B, and 70B parameters. Note: A 34B parameter version is presented in the paper but has not been released yet. 

The 7B and 13B models are especially interesting if you want to run Llama 2 on your computer. With recent advances in quantization, such as GPTQ for inference and QLoRa for fine-tuning, you can fine-tune and run these models on consumer hardware.
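To give a rough idea of why quantization matters here, the sketch below estimates the memory needed just to store the weights of each Llama 2 size at 16-bit and at 4-bit precision. It is a back-of-the-envelope calculation under my own assumptions: nominal parameter counts (7, 13, and 70 billion), 1 GB = 1e9 bytes, and no accounting for activations, optimizer states, or quantization metadata. The function name `weight_memory_gb` is only illustrative.

```python
# Rough weight-only memory estimate. It ignores activations, optimizer
# states, the KV cache, and the extra constants stored by quantization.

# Nominal parameter counts (assumption: rounded figures, not exact counts).
LLAMA2_SIZES = {"7B": 7e9, "13B": 13e9, "70B": 70e9}


def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Return the memory needed to store the weights, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


for name, params in LLAMA2_SIZES.items():
    fp16 = weight_memory_gb(params, 16)
    int4 = weight_memory_gb(params, 4)
    print(f"Llama 2 {name}: {fp16:.1f} GB at 16-bit, {int4:.1f} GB at 4-bit")
```

Under these assumptions, the 7B weights drop from about 14 GB to about 3.5 GB at 4-bit, and the 13B weights from about 26 GB to about 6.5 GB. Fine-tuning needs additional memory on top of the weights, so treat these numbers as a lower bound.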

I have written about Llama 2 and GPTQ here: 

Quantization of Llama 2 with GPTQ for Fast Inference on Your Computer (Benjamin Marie, July 27, 2023)

In this article, I go through all the steps to fine-tune Llama 2 with QLoRa on instruction datasets. I use Hugging Face’s TRL library, which simplifies LLM fine-tuning with instruction datasets. After following this article, you will have your own Llama 2 chat model running on your computer.

Get the notebook (#7)

Note: Llama 2 is distributed with a license allowing commercial use. However, the license explicitly states that you cannot use Llama 2 to improve another LLM that is not Llama 2. I wrote about this limit of the license in this other article:

What You Cannot Do With Llama 2 (Benjamin Marie, July 29, 2023)

How to get Llama 2?

Note: If you already have access to Llama 2 on Hugging Face, you may skip this part.

To get it, you must register with Meta using its access request form. You should receive an email from Meta within one hour.

Then, since I’ll use the Hugging Face Hub, you will also need a Hugging Face account. The email address of this account must be the same one you used to request the Llama 2 weights.

Next, go to a Llama 2 model card and follow the instructions: while logged in to your account, check the checkbox and click the button at the top of the model card. This step takes more time, but you should get access to Llama 2 on the Hugging Face Hub within one day.

You will also need to create an access token from your Hugging Face account. Go to “settings” in your Hugging Face account and generate one.
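Once you have the token, one possible way to use it is sketched below. This is a minimal example under my own assumptions: the token is stored in an environment variable that I named `HF_TOKEN`, and `get_hf_token` and `login_to_hub` are hypothetical helpers written for illustration. The only library call is `login` from `huggingface_hub`, imported lazily so that the token check works even before the library is installed.

```python
import os
from typing import Mapping, Optional


def get_hf_token(env: Optional[Mapping[str, str]] = None,
                 var_name: str = "HF_TOKEN") -> str:
    """Read the access token from an environment variable (hypothetical helper)."""
    env = os.environ if env is None else env
    token = env.get(var_name, "").strip()
    if not token:
        raise RuntimeError(f"Set {var_name} to your Hugging Face access token first.")
    return token


def login_to_hub() -> None:
    """Authenticate this machine with the Hugging Face Hub (hypothetical helper)."""
    # Imported here so get_hf_token() can be used without the library installed.
    from huggingface_hub import login

    login(token=get_hf_token())
```

After exporting the variable in your shell, calling `login_to_hub()` once should be enough for later downloads of gated models such as Llama 2. Reading the token from the environment avoids pasting it directly into a notebook that you may share.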

How does QLoRa work?
