This Week in Open Models: Tiny LFM2.5…

Jun 27

The Weekly Kaitchup #148

3 Comments

Any metrics on Ornith-35B MoE? If the 9B is competitive with Qwen-3.5-35B, I'm hoping the 35B MoE is competitive with or better than Qwen-3.6. I still see the industry comparing a lot to Qwen-3.5, but Qwen-3.6 is much better in practice. 3.5 feels not quite good enough for agentic coding (just below the bar), while 3.6 feels good enough (above the bar). So comparisons against 3.5 feel a lot less useful: we have to infer the delta between 3.5 and 3.6 and then apply it to the new competitor.

They published comparisons of the 35B against Qwen3.6.

https://huggingface.co/deepreinforce-ai/Ornith-1.0-35B

It seems better, but one important metric is missing: token efficiency.

I'm very curious to know whether Ornith-1.0 consumes fewer tokens than Qwen3.5/3.6.

Great point about LFM2.5-230M's GPQA-Diamond score, we could've specified it in the text!
