The Kaitchup – AI on a Budget

Qwen2 vs. Llama 3: QLoRA Learning Curves and Quantization Performance

With code for Qwen2 fine-tuning and quantization on your computer

Benjamin Marie
Jun 13, 2024

Qwen2 is a new family of LLMs by Alibaba Cloud, including pre-trained and instruction-tuned models with 5 sizes available: 0.5B, 1.5B, 7B, 57B-A14B (mixture of experts), and 72B.

According to Alibaba Cloud's own evaluation, Qwen2 significantly outperforms Llama 3 on most tasks. Qwen2 looks like a good alternative to Llama 3 and is well suited to consumer hardware thanks to the availability of small models such as the 0.5B and 1.5B versions.
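To see why the small models matter for consumer hardware, a back-of-the-envelope memory estimate helps: the weights alone take (number of parameters) × (bits per parameter) / 8 bytes. The sketch below computes this for a few Qwen2 sizes; real usage adds activations, the KV cache, and (for fine-tuning) optimizer state on top.

```python
def weight_memory_gb(n_params_billion: float, bits_per_param: float) -> float:
    """Memory for the model weights only, in GB (1 GB = 1e9 bytes)."""
    return n_params_billion * bits_per_param / 8


# Qwen2-7B weights at different precisions:
print(f"fp16:  {weight_memory_gb(7, 16):.1f} GB")   # too large for most consumer GPUs
print(f"4-bit: {weight_memory_gb(7, 4):.1f} GB")    # fits on an 8 GB card

# Qwen2-1.5B is small enough to run in fp16 on modest hardware:
print(f"1.5B fp16: {weight_memory_gb(1.5, 16):.1f} GB")
```

At fp16, Qwen2-7B needs roughly 14 GB for the weights alone, while 4-bit quantization brings that down to about 3.5 GB, which is why quantization robustness is the key question for running these models locally.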

Are the Qwen2 LLMs still better than Llama 3 after quantization? Do they learn faster and better than Llama 3 with QLoRA fine-tuning?

In this article, I review Qwen2 and answer these questions. I go over the main architectural differences between Qwen2 and Llama 3, test Qwen2's robustness to quantization and compare its quantized performance with Llama 3's, and then compare the learning curves of both models under QLoRA fine-tuning. Overall, Qwen2 appears more robust to quantization and is a good alternative to Llama 3.
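As a rough illustration of the QLoRA setup discussed above — not the notebook's actual code — a typical configuration with Hugging Face transformers, peft, and bitsandbytes might look like the following. The checkpoint name and all hyperparameters (rank, alpha, target modules) are illustrative assumptions, not values from the article.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit NF4 quantization with double quantization: the configuration QLoRA uses.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

# Illustrative checkpoint; any Qwen2 size from the Hub would follow the same pattern.
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2-1.5B",
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA adapters on the attention and MLP projections (example hyperparameters).
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
```

The frozen base model stays in 4 bits while only the small LoRA adapters are trained in higher precision, which is what makes fine-tuning a 7B model feasible on a single consumer GPU.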

The code to quantize and fine-tune Qwen2 on your computer is available here:

Get the notebook (#77)

This post is for paid subscribers
