Discussion about this post

Trelis Research

Crazy how we save so many logits by default when they're basically never needed unless someone is doing beam search or something. I hadn’t thought about that.

John Saunders

Hmm. If Microsoft thought those benchmarks needed decontamination, when will we see other model results using decontamination, and what methods will be used?
