Qwen2 vs. Llama 3: QLoRA Learning Curves and Quantization Performance
With code for Qwen2 fine-tuning and quantization on your computer
Qwen2 is a new family of LLMs by Alibaba Cloud. It includes pre-trained and instruction-tuned models in five sizes: 0.5B, 1.5B, 7B, 57B-A14B (a mixture of experts), and 72B.
According to Alibaba's own evaluation, Qwen2 significantly outperforms Llama 3 on most tasks. Qwen2 looks like a good alternative to Llama 3 and is well suited to consumer hardware thanks to the small 0.5B and 1.5B versions.
Are the Qwen2 LLMs still better than Llama 3 after quantization? Do they learn faster and better than Llama 3 with QLoRA fine-tuning?
In this article, I review Qwen2 and answer these questions. We will first look at the main architectural differences between Qwen2 and Llama 3. Then, I test the robustness of Qwen2 to quantization and compare its post-quantization performance with Llama 3's. Finally, we will compare the learning curves of Llama 3 and Qwen2 during QLoRA fine-tuning. Overall, Qwen2 appears more robust to quantization and is a good alternative to Llama 3.
The code to quantize and fine-tune Qwen2 on your computer is available here:
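To give a rough idea of what QLoRA fine-tuning of Qwen2 involves, here is a minimal sketch using the Hugging Face transformers, peft, bitsandbytes, and datasets libraries: it loads Qwen2-1.5B quantized to 4-bit NF4 and trains LoRA adapters with a plain Trainer. The dataset, hyperparameters, and target modules below are illustrative assumptions, not the exact configuration used for the experiments in this article.

```python
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import (AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_name = "Qwen/Qwen2-1.5B"

# 4-bit NF4 quantization (the "Q" in QLoRA): base weights are stored in 4-bit,
# while computation runs in bfloat16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
if tokenizer.pad_token is None:  # make sure a padding token is defined
    tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(
    model_name, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# LoRA adapters on the attention and MLP projections: only these small matrices
# are trained; the 4-bit base weights stay frozen. r and alpha are illustrative.
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
model = get_peft_model(model, lora_config)

# Illustrative instruction dataset with a "text" column; swap in your own data.
dataset = load_dataset("timdettmers/openassistant-guanaco", split="train")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="./qwen2-qlora",
        per_device_train_batch_size=4,
        gradient_accumulation_steps=4,
        learning_rate=1e-4,
        num_train_epochs=1,
        logging_steps=25,
        bf16=True,  # assumes a GPU with bfloat16 support; use fp16=True otherwise
    ),
    train_dataset=tokenized,
    # Causal LM collator (mlm=False) builds labels from the input IDs.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

Since only the LoRA adapters are trained, this setup fits the 1.5B (and even the 7B) model on a single consumer GPU; the logged training loss is what produces the learning curves compared later in the article.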