Discussion about this post

Tanj

How about using a Mac Studio for the 4-bit quantized version? It has 192GB of LPDDR5 at 800GB/s, and the GPU has direct access to that unified memory. A little pricey, but perhaps effective and simple for this kind of work?
