Weekly Kaitchup

DFlash for Qwen3.5, EAGLE for Gemma 4, and the MiniMax M2.7 License Debate

The Weekly Kaitchup #138

Apr 17 • Benjamin Marie

GLM 5.1 Is Here, MiniMax M2.7 and Qwen3.6 Are Coming Soon!

The Weekly Kaitchup #137

Apr 10 • Benjamin Marie

Gemma 4 31B and 26B A4B: Architecture and Memory Consumption

The Weekly Kaitchup #136

Apr 3 • Benjamin Marie

TurboQuant: Finally, Fast and Widely Available Low-Bit KV Cache Quantization?

The Weekly Kaitchup #135

Mar 27 • Benjamin Marie

Mistral Small 4: A Good Alternative to Qwen3.5 122B and Nemotron 3 Super?

The Weekly Kaitchup #134

Mar 20 • Benjamin Marie

Nemotron 3 Super: 1M Tokens, Small KV Cache

The Weekly Kaitchup #134

Mar 13 • Benjamin Marie

More Qwen3.5 GGUF Evals and Speculative Speculative Decoding (SSD)

The Weekly Kaitchup #133

Mar 6 • Benjamin Marie

Lessons from GGUF Evaluations: Ternary Qwen3.5, Bricked Minimax

The Weekly Kaitchup #132

Feb 27 • Benjamin Marie

Taalas HC1: Absurdly Fast, Per-User Inference at 17,000 tokens/second

The Weekly Kaitchup #131

Feb 20 • Benjamin Marie

Nanbeige4.1: Only 3B Parameters, but as Good as Qwen3 32B?

The Weekly Kaitchup #130

Feb 14 • Benjamin Marie

LoRA but with Only 13 Parameters??

The Weekly Kaitchup #129

Feb 6 • Benjamin Marie

This Week: Arcee Trinity and Quantization-Aware Distillation

The Weekly Kaitchup #128

Jan 30 • Benjamin Marie
