Archive - The Kaitchup – AI on a Budget

Accelerate Models with Quantization: Recipes for NVFP4, GPTQ, AWQ, SmoothQuant, AutoRound, and FP8

Focus on 4-bit and 8-bit quantization + vLLM benchmarking with accuracy and inference throughput

Nov 24

Olmo 3 Is Here!

The Weekly Kaitchup #119

Nov 21

Best GPUs Under $1,500 for AI: Should You Upgrade?

Comparing mid-tier consumer GPUs, from RTX 30xx to 50xx, for running and fine-tuning LLMs

Nov 17

The Limits of GRPO-like Methods for Reinforcement Learning

The Weekly Kaitchup #118

Nov 14

Unsloth's Quantization-Aware Training (QAT) vs Post-Training Quantization (PTQ) for Small Models

Can a tiny LLM stay accurate under quantization thanks to QAT?

Nov 10

BF16 vs FP16 for Reinforcement Learning: Where Are We?

The Weekly Kaitchup #117

Nov 7

Advanced LoRA Fine-Tuning: How to Pick LoRA, QLoRA, DoRA, PiSSA, OLoRA, EVA, and LoftQ for LLMs

A practical guide to parameter-efficient LLM adaptation on 16-bit and 4-bit models

Nov 3

October 2025

MiniMax M2 and Kimi-Linear: Why Full Attention Still Wins

The Weekly Kaitchup #116

Oct 31

Generate Better Synthetic Datasets with a "User" LLM

User LLM + Qwen3 to generate fully synthetic dialogues

Oct 27

The Weekly Kaitchup #115

Oct 24

Qwen3-VL Fine-Tuning on Your Computer

Model review, GPU requirements, and code explained step by step

Oct 20

DGX Spark: Use It for Fine-Tuning

The Weekly Kaitchup #114

Oct 17
