LoRA fine-tunes large language models (LLMs) by adding a small adapter on top of the pre-trained model: only the adapter is trainable, while the LLM's original parameters remain frozen. This dramatically reduces the number of trainable parameters and, with it, the size of the optimizer states, so LoRA fine-tuning consumes considerably less memory than standard full fine-tuning.
Nonetheless, depending on its hyperparameters, such as the rank and the number of targeted modules, LoRA can still produce adapters with hundreds of millions of parameters, too large to fine-tune on consumer hardware.
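To make this concrete, here is a minimal sketch (not taken from this article's notebook) of how the rank and the list of targeted modules determine the adapter's size with Hugging Face PEFT. The model name and hyperparameter values are illustrative assumptions, not the configuration used later in the article:

```python
# Illustrative sketch: how LoRA's rank and target modules drive adapter size.
# Model name and hyperparameters are assumptions for demonstration only.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")

# Each targeted weight matrix W (d_out x d_in) gets two trainable low-rank
# factors, B (d_out x r) and A (r x d_in), while W itself stays frozen.
lora_config = LoraConfig(
    r=16,          # the rank: a higher rank means a larger adapter
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],  # more modules, larger adapter
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

peft_model = get_peft_model(model, lora_config)
# Prints the number of trainable (adapter) parameters vs. the frozen base model.
peft_model.print_trainable_parameters()
```

With a configuration like this, targeting all the attention and MLP projections of an 8B model at rank 16 already yields tens of millions of trainable parameters; increasing the rank scales the adapter proportionally.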
Many alternatives have been proposed to reduce the size of adapters.
In this article, I review VeRA: I explain how it works and how it can produce adapters 100x smaller than LoRA's. For demonstration, I fine-tuned Llama 3 with VeRA and compared its performance with LoRA.
The notebook demonstrating VeRA fine-tuning for Llama 3 is available here: