The Kaitchup – AI on a Budget

rsQLoRA: Fine-tune Llama 3 with Higher Ranks and QLoRA

Evaluating the impact of rank-stabilized LoRA on recent LLMs and when using QLoRA

Benjamin Marie
Jul 04, 2024
∙ Paid
Generated with DALL-E

With parameter-efficient fine-tuning methods such as LoRA, we can fine-tune LLMs on consumer hardware. For instance, with LoRA, it is possible to fine-tune Llama 3 8B using a 16 GB GPU.

Related: Fine-tune Llama 3 on Your Computer (Benjamin Marie, April 22, 2024)

However, LoRA only approximates full fine-tuning. Previous work has shown that a LoRA adapter's capacity to learn is limited by the low-rank nature of the update it applies to the model's weights during fine-tuning.

Increasing the rank, which is a hyperparameter of LoRA, sounds like an intuitive way to enlarge the update and potentially improve fine-tuning. In practice, however, it is often ineffective: with LoRA's standard scaling, the update shrinks as the rank grows. Various methods have been proposed to leverage higher LoRA ranks. One of them, rank-stabilized LoRA (rsLoRA), is straightforward enough to be supported by many fine-tuning frameworks.
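The only difference between LoRA and rsLoRA is the scaling factor applied to the adapter's update: standard LoRA scales it by α/r, which vanishes as the rank r grows, while rsLoRA scales it by α/√r. A minimal sketch of the two factors, with illustrative values for α and r:

```python
import math

def lora_scaling(alpha: float, r: int) -> float:
    # Standard LoRA: the low-rank update BA is scaled by alpha / r,
    # so doubling the rank halves the update's magnitude.
    return alpha / r

def rslora_scaling(alpha: float, r: int) -> float:
    # Rank-stabilized LoRA: scaled by alpha / sqrt(r) instead,
    # so the update's magnitude decays far more slowly at high ranks.
    return alpha / math.sqrt(r)

for r in (8, 64, 256):
    print(f"r={r}: LoRA={lora_scaling(16, r):.4f}  rsLoRA={rslora_scaling(16, r):.4f}")
```

At r=256 with α=16, standard LoRA scales the update by 0.0625 while rsLoRA scales it by 1.0, which is why high-rank adapters trained with standard scaling often fail to learn more than low-rank ones.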


In this article, I explain rsLoRA and apply it to Llama 3. We will verify whether rsLoRA has a positive impact when fine-tuning high-rank adapters for Llama 3.

I’ve also implemented a notebook showing how to use rsLoRA, in combination with QLoRA, here:

Get the notebook (#84)
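For orientation before opening the notebook: in Hugging Face's peft library (version 0.7 and later), rsLoRA is a single flag on LoraConfig, and it combines with QLoRA's 4-bit quantization in the usual way. The sketch below shows the general shape of such a setup; the model name, rank, and target modules are illustrative choices, not necessarily the article's exact settings.

```python
# Illustrative sketch: rsLoRA + QLoRA with transformers/peft/bitsandbytes.
# Assumes peft >= 0.7 (use_rslora flag); hyperparameters are example values.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# QLoRA part: load the base model with 4-bit NF4 quantization.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B", quantization_config=bnb_config
)
model = prepare_model_for_kbit_training(model)

# rsLoRA part: a high rank, with the update scaled by alpha/sqrt(r).
lora_config = LoraConfig(
    r=256,                # high rank, where rank-stabilized scaling matters most
    lora_alpha=16,
    use_rslora=True,      # scale by alpha/sqrt(r) instead of alpha/r
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
```

With use_rslora=False (the default), the same configuration would reproduce standard LoRA scaling, which is the comparison the experiments below evaluate.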

This post is for paid subscribers
