Phi-3 “mini” is the newest Phi model released by Microsoft. With 3.8 billion parameters, it is 1.1 billion parameters larger than Phi-2. Even though the parameter count has grown, a 3.8B model can still be fine-tuned on a very affordable GPU.
Despite their small size, the Phi models have always performed very well thanks to their pre-training on synthetic data generated by larger and better models. Microsoft’s own evaluation shows that Phi-3 mini is as good as Llama 3 8B on some benchmarks.
Later, Microsoft will also release Phi-3 7B and Phi-3 14B models, denoted “small” and “medium”, respectively, which should be even better.
In this article, I present Phi-3 and review the technical report published by Microsoft. I show how to fine-tune and quantize Phi-3 mini on consumer hardware using an 8 GB GPU. Phi-3 uses a neural architecture close to Llama’s, which makes the Phi models very easy to fine-tune and compatible with most deep learning frameworks.
I made a notebook for Phi-3 showing how to:
Fine-tune Phi-3 mini with QLoRA and LoRA
Quantize Phi-3 mini with BitsandBytes and GPTQ
Run Phi-3 mini with Transformers