The Kaitchup – AI on a Budget
Weekly Kaitchup
Nemotron 3 Super: 1M Tokens, Small KV Cache
The Weekly Kaitchup #134
Mar 13 • Benjamin Marie
More Qwen3.5 GGUF Evals and Speculative Speculative Decoding (SSD)
The Weekly Kaitchup #133
Mar 6 • Benjamin Marie
Lessons from GGUF Evaluations: Ternary Qwen3.5, Bricked Minimax
The Weekly Kaitchup #132
Feb 27 • Benjamin Marie
Taalas HC1: Absurdly Fast, Per-User Inference at 17,000 tokens/second
The Weekly Kaitchup #131
Feb 20 • Benjamin Marie
Nanbeige4.1: Only 3B Parameters, but as Good as Qwen3 32B?
The Weekly Kaitchup #130
Feb 14 • Benjamin Marie
LoRA but with Only 13 Parameters??
The Weekly Kaitchup #129
Feb 6 • Benjamin Marie
This Week: Arcee Trinity and Quantization-Aware Distillation
The Weekly Kaitchup #128
Jan 30 • Benjamin Marie
This Week: GLM 4.7 Flash's Huge KV Cache and LFM2.5 Thinking
The Weekly Kaitchup #127
Jan 23 • Benjamin Marie
MMLU-Pro Has an Answer Leak (and It’s Just Whitespace)
The Weekly Kaitchup #126
Jan 16 • Benjamin Marie
LFM2.5 and Falcon H1R-7B: New Hybrid Models with Strong Benchmark Scores
The Weekly Kaitchup #125
Jan 9 • Benjamin Marie
2026 Predictions: Much Faster Inference, Pre-Training with RL, and FP4 Everywhere
The Weekly Kaitchup #124
Jan 2 • Benjamin Marie
Encoder–Decoders and Byte LLMs: T5Gemma 2 and AI2’s New Models
The Weekly Kaitchup #123
Dec 19, 2025 • Benjamin Marie