SmolLM: Full Fine-tuning and Aligning Tiny LLMs on Your Computer
With supervised fine-tuning and distilled DPO
Recent large language models (LLMs) have billions of parameters. They are usually too large to run efficiently on low-cost hardware or small devices. Among smaller LLMs, we currently have capable models, such as Phi-3 mini (3.8B parameters), which performs well for language generation tasks but is still too large for many use cases.
Apple's OpenELM models are smaller alternatives. The smallest OpenELM has 270M parameters and can be quickly fine-tuned on-device. However, this model struggles to follow instructions, even after fine-tuning.
Hugging Face released even smaller models: SmolLM, a family of LLMs with 135M, 360M, and 1.7B parameters.
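To give a sense of how lightweight these models are to work with, here is a minimal sketch that loads the 135M base model from the Hugging Face Hub and generates text. The Hub ID "HuggingFaceTB/SmolLM-135M" is my assumption for the base checkpoint; swap in the 360M or 1.7B variant as needed.

```python
# Minimal sketch: load SmolLM-135M and generate a short completion.
# "HuggingFaceTB/SmolLM-135M" is an assumed Hub ID for the base model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HuggingFaceTB/SmolLM-135M"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

inputs = tokenizer("Tiny language models are", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Even in bfloat16 with no quantization, the 135M model fits comfortably in the memory of any recent consumer GPU, which is what makes full fine-tuning (rather than parameter-efficient methods) practical here.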
In this article, I review SmolLM and how Hugging Face made these models. I show how to fine-tune SmolLM for chat applications, focusing on the 135M and 360M versions, and how to align the models with human preferences using the DPO technique. The main purpose of this article is to show how to fully fine-tune and align tiny LLMs on consumer hardware. If you have high-quality training data for a specific domain that doesn't require complex reasoning, you can successfully fine-tune SmolLM within a few hours on your computer.
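The full recipe follows later in the article. As a rough preview, the alignment step can be expressed with TRL's DPOTrainer; the sketch below is illustrative only, and the model ID, preference dataset, and hyperparameters are my assumptions, not the article's exact settings.

```python
# Hedged sketch of DPO alignment with TRL; dataset and hyperparameters
# are illustrative assumptions, not the article's recipe.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_id = "HuggingFaceTB/SmolLM-135M-Instruct"  # assumed Hub ID
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# A preference dataset with "prompt", "chosen", and "rejected" columns.
dataset = load_dataset("trl-lib/ultrafeedback_binarized", split="train")

training_args = DPOConfig(
    output_dir="smollm-135m-dpo",
    per_device_train_batch_size=8,
    learning_rate=5e-7,
    num_train_epochs=1,
)

trainer = DPOTrainer(
    model=model,             # the reference model is created automatically
    args=training_args,
    train_dataset=dataset,
    processing_class=tokenizer,  # older TRL versions use tokenizer= instead
)
trainer.train()
```

Supervised fine-tuning for chat follows the same pattern with TRL's SFTTrainer on an instruction dataset; DPO then nudges the fine-tuned model toward the preferred responses in the preference pairs.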
My notebook for fine-tuning and aligning tiny LLMs on consumer GPUs is available here: