Llama 4 Scout, Maverick, and Behemoth: MoE, VLMs, and Very Long Context

How to burn GPUs

Apr 05, 2025

∙ Paid

Generated image — Image generated with ChatGPT

Llama 4 is here. After a full year of waiting since Llama 3’s release in April 2024, Meta finally dropped their next-gen models, and the scale is wild: 109B, 400B, and an eye-watering 2 trillion parameters.

Instead of focusing on outperforming smaller models with sub-10B parameter models, a task that’s increasingly challenging and yields diminishing returns, Meta capitalized on its core advantage: large-scale training across massive GPU clusters.

Let’s break down what’s actually inside Llama 4.

The Kaitchup – AI on a Budget

Llama 4 Scout, Maverick, and Behemoth: MoE, VLMs, and Very Long Context

How to burn GPUs

This post is for paid subscribers