The Kaitchup – AI on a Budget

Mistral Small 3: An Excellent 24B-Parameter Wide-Shallow LLM

Fine-tuning, quantization, and evaluation

Benjamin Marie
Feb 17, 2025

With large language models (LLMs), depth often leads to better performance: adding more layers typically improves results more than growing the parameter count by widening the hidden and intermediate (feed-forward) dimensions.

However, deeper models come at a cost: slower inference. With Mistral Small 3, Mistral AI takes a different approach, favoring a wider model with a significantly larger intermediate size while keeping a layer count comparable to that of LLMs with half as many total parameters.

Despite this unconventional architecture, Small 3 performs on par with larger models like Qwen2.5 32B on most benchmarks and closely matches Llama 3.3 70B in performance.
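To make the "wide and shallow" idea concrete, here is a minimal sketch that counts the parameters of a Llama/Mistral-style decoder from its shape. The dimensions below are assumptions chosen to resemble a 24B wide-shallow configuration; they are not taken from this article, so check the model's config.json for the real values. `DecoderShape` and `count_params` are hypothetical helper names.

```python
from dataclasses import dataclass


@dataclass
class DecoderShape:
    hidden: int        # hidden size (model width)
    intermediate: int  # feed-forward (MLP) intermediate size
    layers: int        # number of decoder layers (depth)
    n_heads: int       # attention heads for queries
    n_kv_heads: int    # key/value heads (grouped-query attention)
    head_dim: int      # dimension of each attention head
    vocab: int         # vocabulary size


def mlp_params_per_layer(s: DecoderShape) -> int:
    # Gated MLP: gate, up, and down projections, no biases.
    return 3 * s.hidden * s.intermediate


def attn_params_per_layer(s: DecoderShape) -> int:
    q = s.hidden * s.n_heads * s.head_dim
    kv = 2 * s.hidden * s.n_kv_heads * s.head_dim
    o = s.n_heads * s.head_dim * s.hidden
    return q + kv + o


def count_params(s: DecoderShape, tied_embeddings: bool = False) -> int:
    norms = 2 * s.hidden  # two RMSNorm weight vectors per layer
    per_layer = attn_params_per_layer(s) + mlp_params_per_layer(s) + norms
    embeddings = s.vocab * s.hidden * (1 if tied_embeddings else 2)
    return s.layers * per_layer + embeddings + s.hidden  # + final norm


# Assumed shape, for illustration only (verify against config.json).
wide = DecoderShape(hidden=5120, intermediate=32768, layers=40,
                    n_heads=32, n_kv_heads=8, head_dim=128, vocab=131072)

total = count_params(wide)
mlp_share = wide.layers * mlp_params_per_layer(wide) / total
print(f"total parameters: {total / 1e9:.2f}B")  # total parameters: 23.57B
print(f"share of parameters in MLP blocks: {mlp_share:.1%}")
```

Under these assumed dimensions, most of the parameters sit in the feed-forward blocks, which is what a large intermediate size with a modest layer count implies.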

What’s in this article?

  • A deeper look at Mistral Small 3’s architecture.

  • How to use both the base and instruct models with Mistral AI’s recommended hyperparameters and prompts.

  • How well the model holds up under quantization: since it is unusually wide, I quantized it to 4-bit and measured its accuracy on IFEval and MMLU-PRO.

  • How to fine-tune the model.
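As a rough back-of-the-envelope illustration of why 4-bit quantization matters for a 24B-parameter model, the sketch below estimates the memory needed for the weights alone. It ignores activations and the KV cache. `weight_memory_gb` is a hypothetical helper, and the 0.5 extra bits per weight for quantization metadata (scales, zero points) is an assumed figure that varies by quantization method.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float,
                     overhead_bits: float = 0.0) -> float:
    """Approximate weight memory in GB (1 GB = 1e9 bytes)."""
    return n_params * (bits_per_weight + overhead_bits) / 8 / 1e9


N_PARAMS = 24e9  # "24B" as stated in the title; the exact count differs slightly

bf16 = weight_memory_gb(N_PARAMS, 16)
int4 = weight_memory_gb(N_PARAMS, 4)
# Assumption: about 0.5 extra bits per weight for scales/zero points.
int4_with_meta = weight_memory_gb(N_PARAMS, 4, overhead_bits=0.5)

print(f"16-bit weights: {bf16:.1f} GB")  # 16-bit weights: 48.0 GB
print(f"4-bit weights:  {int4:.1f} GB")  # 4-bit weights:  12.0 GB
print(f"4-bit + assumed metadata: {int4_with_meta:.1f} GB")
```

These are estimates for the weights only; actual memory use depends on the quantization format, context length, and batch size.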

I’ve prepared two notebooks for this article:

  • Quantization and Evaluation: Demonstrating the 4-bit quantization process and accuracy results.

Get the notebook (#145)

  • Fine-Tuning Mistral Small 3: Covering full-tuning, LoRA, and QLoRA techniques.

Get the notebook (#146)

Mistral Small 3: An Unusually “Wide” Model
