Phi-4: What's New and How to Fine-Tune It on Your Computer (+ quantized version)
The good student of GPT-4
With Phi-4, Microsoft continues the trend of increasing size for the Phi model series. The first Phi model had 1.3 billion parameters, which Microsoft referred to as a "small model." Phi-4 now has 14 billion parameters, yet Microsoft still categorizes it as small, even though it cannot run out of the box on a consumer-grade GPU.
As is typical for Phi models, Microsoft has published remarkable results on public benchmarks, mainly thanks to high-quality synthetic training datasets targeting the domains and tasks found in these benchmarks.
In this article, we will first explore how Microsoft developed Phi-4. Microsoft has disclosed significantly more details for this iteration, particularly regarding the synthetic data that constitutes the majority of its pre-training dataset. Next, we will examine how to use and fine-tune the model and discuss its known limitations.
Additionally, I’ve created a highly accurate quantized version of Phi-4. You can find it here:
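If you would rather quantize on the fly than download a pre-quantized checkpoint, here is a minimal sketch of loading the official Phi-4 weights in 4-bit with bitsandbytes so the model fits on a single consumer-grade GPU. The settings below are illustrative assumptions, not the configuration used for the quantized version mentioned above.

```python
# Minimal sketch: on-the-fly 4-bit quantization of Phi-4 with bitsandbytes.
# These settings are illustrative, not the recipe behind the quantized version above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "microsoft/phi-4"  # official checkpoint on the Hugging Face Hub

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# Quick generation check using the model's chat template
messages = [{"role": "user", "content": "Explain LoRA in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
print(tokenizer.decode(model.generate(inputs, max_new_tokens=64)[0]))
```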
Since Phi-4 uses the same architecture as Phi-3/3.5, you can use the same code that I published and explained in this article:
The code is in this notebook:
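The notebook linked above is the reference. For orientation, here is a minimal sketch of the general QLoRA fine-tuning recipe it follows, assuming recent versions of Transformers, PEFT, and TRL. The dataset and hyperparameters below are placeholder assumptions, not the notebook's actual settings.

```python
# Minimal QLoRA fine-tuning sketch for Phi-4 (placeholder dataset and hyperparameters).
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from trl import SFTConfig, SFTTrainer

model_id = "microsoft/phi-4"

# Load the base model in 4-bit so the 14B parameters fit in consumer GPU memory
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA adapters on the linear projections of the Phi-3/Phi-4 architecture
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["qkv_proj", "o_proj", "gate_up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

# Placeholder instruction dataset with a plain "text" column
dataset = load_dataset("timdettmers/openassistant-guanaco", split="train")

training_args = SFTConfig(
    output_dir="./phi4-qlora",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=1e-4,
    num_train_epochs=1,
    logging_steps=10,
    gradient_checkpointing=True,
)

trainer = SFTTrainer(
    model=model,
    processing_class=tokenizer,
    train_dataset=dataset,
    peft_config=peft_config,
    args=training_args,
)
trainer.train()
```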
Note: The configuration of the tokenizer must be changed. I use this one:
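The configuration linked above is the one I actually use. As a rough illustration of the kind of change involved (an assumption on my part, not that exact configuration): Phi-4's tokenizer, as released, reuses <|endoftext|> as BOS, EOS, and PAD, while its chat template closes each turn with <|im_end|>, so a common adjustment for fine-tuning is to make <|im_end|> the EOS token and keep a distinct token for padding.

```python
# Illustrative sketch only, not the exact tokenizer configuration linked above.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-4")
tokenizer.eos_token = "<|im_end|>"     # end-of-turn marker from the chat template
tokenizer.pad_token = "<|endoftext|>"  # padding token, now distinct from the EOS token
print(tokenizer.eos_token_id, tokenizer.pad_token_id)
```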