Discussion about this post

Ronan McGovern

Nice piece, as usual!

Am I reading the results correctly? I see that AWQ is only a little bit higher in terms of perplexity.

Matt

vLLM recently optimized its use of AWQ. I wonder if/when they'll do the same for SqueezeLLM. https://github.com/vllm-project/vllm/pull/2566
