Advanced LoRA Fine-Tuning: How to Pick LoRA, QLoRA, DoRA, PiSSA, OLoRA, EVA, and LoftQ for LLMs
A practical guide to parameter-efficient LLM adaptation on 16-bit and 4-bit models
When it’s done well, LoRA can match full fine-tuning while using a fraction of the memory.
LoRA was introduced in 2021, when open LLMs were scarce and relatively small. Today, there are plenty of open models, ranging from a few hundred million to hundreds of billions of parameters. For the largest of them, LoRA (or one of its variants) is often the only practical way to fine-tune without spending $10k+.
Originally, LoRA was meant to train small adapters on top of the attention blocks of LLMs. Since then, the community has proposed many optimizations and extensions, including techniques that work with quantized models.
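To get a feel for why these adapters are "small", here is a minimal sketch of the arithmetic, assuming the standard LoRA formulation in which a frozen weight matrix W of shape (d_out x d_in) is left untouched and only two low-rank matrices, A (r x d_in) and B (d_out x r), are trained. The function names and the layer sizes below are hypothetical and chosen only for illustration; they are not taken from any specific model or library.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Trainable parameters when the whole weight matrix W (d_out x d_in) is updated."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters for a rank-r adapter: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


# Hypothetical example sizes, not taken from any specific model.
d_in, d_out, rank = 4096, 4096, 16

full = full_params(d_in, d_out)
lora = lora_params(d_in, d_out, rank)

print(f"full fine-tuning: {full:,} trainable parameters")
print(f"LoRA (r={rank}): {lora:,} trainable parameters")
print(f"ratio: {lora / full:.4%}")
```

Trainable parameter count is only a proxy for memory use, but gradients and optimizer states scale with it, which is where much of the saving comes from.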
In this article, we’ll look at the most useful, modern approaches to LoRA for adapting LLMs to your task and budget. We’ll review (Q)LoRA, (Q)DoRA, PiSSA, OLoRA, EVA, and LoftQ, compare their performance (with and without a quantized base model, when that’s relevant), and discuss when to pick each method. All of them are implemented in Hugging Face PEFT and can be used through TRL.
You can find my notebook showing how to use these techniques here:


