The Kaitchup – AI on a Budget

"Thinking" LLMs with Simple Fine-tuning and Budget Forcing

How to activate "reasoning" in LLMs

Benjamin Marie
Feb 13, 2025 ∙ Paid

(Image generated with ChatGPT)

Recent research shows that enhancing the reasoning capabilities of large language models (LLMs) can be surprisingly affordable. Works such as LIMO and s1 demonstrate that fine-tuning on a small but well-curated dataset can be enough to outperform GPT-4 on tasks requiring advanced reasoning.


In this article, we'll train a 7B-parameter model to reason using just 1,000 supervised fine-tuning samples, without any reinforcement learning. At inference time, we'll apply s1's budget-forcing technique, which encourages the model to "think" longer before committing to an answer.
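To make the idea concrete, here is a minimal, model-free sketch of the budget-forcing loop. The `</think>` delimiter, the "Wait" continuation cue, and whitespace-based token counting are simplifying assumptions for illustration; s1 counts real tokens and suppresses the end-of-thinking delimiter at the decoder level.

```python
# Sketch of s1-style budget forcing (simplified, model-free).
# END_THINK, CONTINUE_CUE, and whitespace token counting are
# illustrative assumptions, not the paper's exact implementation.

END_THINK = "</think>"   # assumed end-of-thinking delimiter
CONTINUE_CUE = "Wait"    # cue appended to force more reasoning

def budget_force(generate_step, prompt, min_think_tokens=200, max_forces=2):
    """Keep extending the reasoning trace until a minimum token budget is met.

    generate_step(text) -> newly generated reasoning chunk (str).
    """
    reasoning = ""
    forces = 0
    while True:
        # Strip any early attempt to close the thinking block.
        chunk = generate_step(prompt + reasoning).replace(END_THINK, "")
        reasoning += chunk
        # Naive whitespace "token" count stands in for a real tokenizer.
        if len(reasoning.split()) >= min_think_tokens or forces >= max_forces:
            return reasoning + END_THINK
        # Budget not reached: nudge the model to keep thinking.
        reasoning += " " + CONTINUE_CUE + " "
        forces += 1
```

In practice `generate_step` would call the fine-tuned model with the growing trace as context; the cap `max_forces` prevents an endless loop when the model keeps producing short chunks.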

While previous studies relied on full fine-tuning, which requires multiple high-end GPUs when working with long sequences, we'll take a more cost-effective approach: LoRA fine-tuning, which significantly reduces computational costs while still unlocking strong reasoning capabilities.
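A LoRA run of this kind might be configured roughly as follows with Hugging Face's peft and TRL. The base model name, LoRA rank, target modules, and hyperparameters here are illustrative assumptions, not the notebook's exact settings, and dataset column formatting is omitted for brevity.

```python
# Training-configuration sketch: LoRA fine-tuning on the s1 dataset
# with peft + TRL. All hyperparameters are illustrative assumptions.
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("simplescaling/s1K", split="train")  # s1's ~1,000 samples

peft_config = LoraConfig(
    r=16,                       # low-rank dimension: small memory footprint
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections
    task_type="CAUSAL_LM",
)

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-7B-Instruct",   # assumed 7B base model
    train_dataset=dataset,
    peft_config=peft_config,
    args=SFTConfig(
        output_dir="./s1-lora",
        max_seq_length=8192,            # reasoning traces are long
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,  # effective batch size of 8
        num_train_epochs=3,
        learning_rate=1e-4,
    ),
)
trainer.train()
```

Because only the small adapter matrices are trained, the optimizer states and gradients stay tiny compared to full fine-tuning, which is what makes a single consumer GPU plausible even with long sequences.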

Check out the notebook below for a step-by-step guide on fine-tuning LLMs with LoRA on the s1 dataset, and then using vLLM for inference with an adapter and budget forcing.
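For inference, vLLM can serve the base model with the LoRA adapter loaded on the fly, and budget forcing can be layered on top by stopping at the end-of-thinking delimiter and re-prompting with "Wait". This is a usage sketch under assumed names and paths; the delimiter and prompt format depend on how the model was trained.

```python
# Usage sketch: vLLM inference with a LoRA adapter plus budget forcing.
# Model name, adapter path, and the "</think>" delimiter are assumptions.
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

llm = LLM(model="Qwen/Qwen2.5-7B-Instruct", enable_lora=True)
lora = LoRARequest("s1-lora", 1, "./s1-lora")  # adapter name, id, path

prompt = "Question: How many primes are below 100?\n<think>"
params = SamplingParams(max_tokens=2048, stop=["</think>"])

# First pass: let the model reason until it tries to close its thoughts.
out = llm.generate([prompt], params, lora_request=lora)[0].outputs[0].text

# Budget forcing: suppress the stop and append "Wait" to extend thinking.
extended = prompt + out + " Wait"
final = llm.generate([extended], params, lora_request=lora)[0].outputs[0].text
```

The second `generate` call can be repeated until a desired thinking budget is reached, after which the delimiter is allowed through and the model produces its final answer.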

Get the notebook (#144)

This post is for paid subscribers
