5 Comments

Crazy how we save so many logits by default when they're basically never needed unless someone is doing beam search or something. I hadn't thought about that.

Hmm. If Microsoft thought those benchmarks needed decontamination, when will we see other model results using decontamination, and what methods will be used?

Last summer, I was using Unsloth on a multi-GPU setup without issue. Did they disable it completely? It was never explicitly supported, but it ran fine for me until I tried again this past week with the same code and parameters.

Actually, I didn't even know that multi-GPU was possible with the free version. I always thought it was only available for the paid version.

My guess is that they don't want to unlock multi-GPU for the free version since this would remove the value of the paid version.

https://unsloth.ai/pricing

Yeah, could be that earlier they were relying more on transformers, which will default to pipeline parallelism.

But maybe parts of that library have since been pulled into Unsloth, which doesn't.
