Discussion about this post

Maxime Labonne

Great point about LFM2.5-230M's GPQA-Diamond score. We could've specified it in the text!

Nick Jenkins

Any metrics on Ornith-35B MoE? If the 9B is competitive with Qwen-3.5-35B, I'm hoping the 35B MoE is competitive with or better than Qwen-3.6. I still see the industry comparing a lot to Qwen-3.5, but Qwen-3.6 is much better in practice. 3.5 feels not quite good enough for agentic coding (just below the bar), while 3.6 feels good enough (above the bar). So 3.5 comparisons feel a lot less useful: we have to infer the delta between 3.5 and 3.6 and then apply it to the new competitor.
