The Kaitchup – AI on a Budget
Tutorials
Eagle 3 Speculators: When To Use Them?
Easier and faster speculative decoding, if you're in the right setting
Dec 9, 2025
•
Benjamin Marie
Accelerate Models with Quantization: Recipes for NVFP4, GPTQ, AWQ, SmoothQuant, AutoRound, and FP8
Focus on 4-bit and 8-bit quantization + vLLM benchmarking with accuracy and inference throughput
Nov 24, 2025
•
Benjamin Marie
Unsloth's Quantization-Aware Training (QAT) vs Post-Training Quantization (PTQ) for Small Models
Can a tiny LLM stay accurate under quantization thanks to QAT?
Nov 10, 2025
•
Benjamin Marie
Advanced LoRA Fine-Tuning: How to Pick LoRA, QLoRA, DoRA, PiSSA, OLoRA, EVA, and LoftQ for LLMs
A practical guide to parameter-efficient LLM adaptation on 16-bit and 4-bit models
Nov 3, 2025
•
Benjamin Marie
Generate Better Synthetic Datasets with a "User" LLM
User LLM + Qwen3 to generate fully synthetic dialogues
Oct 27, 2025
•
Benjamin Marie
Qwen3-VL Fine-Tuning on Your Computer
Model review, GPU requirements, and code explained step by step
Oct 20, 2025
•
Benjamin Marie
Choosing a GGUF Model: K-Quants, IQ Variants, and Legacy Formats
Reviewing the differences between the formats and their impact on accuracy, throughput, and memory.
Oct 13, 2025
•
Benjamin Marie
Why Increasing Batch Size Doesn’t Always Speed Up Training
The 5 most common issues that decrease batch-training efficiency
Oct 7, 2025
•
Benjamin Marie
Serve Multiple LoRA Adapters with vLLM and Custom Chat Templates
Swap adapters per request, reuse your chat template, and run offline or via an OpenAI-compatible server.
Sep 23, 2025
•
Benjamin Marie
DenseMixer: Smarter MoE Routing That Doesn’t Break LoRA and QLoRA
Better MoE training for a slightly higher cost
Sep 8, 2025
•
Benjamin Marie
Gemma 3 270M: Can Tiny Models Learn New Tasks?
A case study with machine translation
Sep 1, 2025
•
Benjamin Marie
NVFP4: Same Accuracy with 2.3x Higher Throughput for 4-Bit LLMs
How to quantize LLMs with NVFP4
Aug 25, 2025
•
Benjamin Marie