The Kaitchup – AI on a Budget
Fine-tune Tiny Adapters for Llama 3 with VeRA

LoRA but 100x smaller

Benjamin Marie
Jun 06, 2024
(Header image generated with DALL-E)

LoRA fine-tunes large language models (LLMs) by adding an adapter on top of the pre-trained model: only the adapter is trainable, while the LLM’s original parameters remain frozen. This approach dramatically reduces the number of trainable parameters and, with it, the size of the optimizer states, so LoRA fine-tuning consumes far less memory than standard full fine-tuning.

Nonetheless, depending on LoRA’s hyperparameters, such as the rank and the number of targeted modules, LoRA can still produce adapters with hundreds of millions of parameters, too large to fine-tune on consumer hardware.
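To make this concrete, here is a minimal sketch, not the article’s code, that attaches a LoRA adapter to Llama 3 8B with Hugging Face PEFT and prints the number of trainable parameters. The model name, rank, and target modules are illustrative choices.

```python
# Minimal sketch: attach a LoRA adapter to Llama 3 8B and count trainable parameters.
# Model name, rank, and target modules are illustrative, not the article's exact setup.
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B",
    torch_dtype=torch.bfloat16,
)

lora_config = LoraConfig(
    r=64,                      # rank: the main driver of adapter size
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
# Prints the trainable parameter count of the adapter vs. the frozen base model.
model.print_trainable_parameters()
```

With r=64 and all linear projections targeted, the adapter reaches roughly 170M trainable parameters for an 8B model, around 340 MB in bfloat16 before even counting the optimizer states.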

Many alternatives have been proposed to reduce the size of adapters.

Related: DoRA: Better and Faster than LoRA? (Benjamin Marie, March 11, 2024)


In this article, I review VeRA: I explain how it works and how it can produce adapters 100x smaller than LoRA’s. For demonstration, I fine-tuned Llama 3 with VeRA and compared its performance with LoRA.
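As a preview, here is a minimal sketch of what VeRA fine-tuning looks like with Hugging Face PEFT, assuming a release that ships VeraConfig; the rank, target modules, and other hyperparameters are illustrative and not necessarily the settings used in the notebook.

```python
# Minimal sketch, assuming a recent PEFT release that includes VeraConfig.
# Hyperparameters below are illustrative choices, not the notebook's settings.
import torch
from transformers import AutoModelForCausalLM
from peft import VeraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B",
    torch_dtype=torch.bfloat16,
)

vera_config = VeraConfig(
    r=256,                                # VeRA tolerates much higher ranks than LoRA
    target_modules=["q_proj", "o_proj"],  # same-shaped projections in Llama 3 8B;
                                          # depending on the PEFT version, targeted
                                          # layers may need to share the same shape
    vera_dropout=0.05,
    d_initial=0.1,                        # initial value of the trainable scaling vector
    save_projection=True,                 # store the shared random matrices in the checkpoint
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, vera_config)
# Only the small per-layer scaling vectors are trainable; the two random projection
# matrices are frozen and shared across all targeted layers, which is why the saved
# adapter ends up orders of magnitude smaller than a LoRA adapter.
model.print_trainable_parameters()
```

Setting save_projection=True stores the shared random matrices in the checkpoint for exact reproducibility; leaving it off keeps the saved adapter even smaller, since the matrices can be regenerated from the configured PRNG key.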

The notebook demonstrating VeRA fine-tuning for Llama 3 is available here:

Get the notebook (#76)
