Discussion about this post

Max
Apr 3 (edited)

Gemma was impressive for a while but has since been overshadowed by numerous LLMs. Your quote about being “disappointed by the lack of novelty” sums it up, unfortunately.

Google’s trade war with other hyperscalers puts them in a similar mindset to Meta, which intends to rent out its next LLM. It looks like Google opted to stay competitive without investing significant resources in Gemma.

You mentioned that NVIDIA integrated NVFP4 quantization into Gemma. This suggests they were concerned about Gemma4 competing with the Nemotrons, though perhaps not anymore after reviewing it!

Google stated they built Gemma4 from scratch, implying a more robust training dataset. I wonder if it can capture second- or third-order nuances like Gemini does, or if such reasoning is still out of reach at its 31B parameter scale?

These were very helpful: “The Fastest and Cheapest 120B LLM?”, “TurboQuant: Finally, Fast and Widely Available Low-Bit KV Cache Quantization?”, and “Mistral Small 4: A Good Alternative to Qwen3.5 122B and Nemotron 3 Super?”. Keep up the great work!

