Discussion about this post

Matt

Have you ever tried quantizing a BERT or RoBERTa classifier model with GGUF? I'm curious how its performance on CPU compares to that of an unquantized classifier on GPU.
