Discussion about this post

Philip Erdös:

I don't get the memory consumption: for the 8B model, why does "paged AdamW 8-bit" need more memory than "paged AdamW 32-bit"?
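(For context, a minimal sketch of the two bitsandbytes optimizer variants the comment refers to; the toy model and learning rate are placeholders, not from the post. A "paged" optimizer pages its state out to CPU RAM when GPU memory fills; the 8-bit variant additionally quantizes Adam's moment buffers, so in isolation it should need less state memory than the 32-bit one, which is what makes the reported numbers surprising.)

    import torch
    import bitsandbytes as bnb

    model = torch.nn.Linear(4096, 4096).cuda()  # toy stand-in for an 8B model

    # Paged 32-bit AdamW: full-precision moment buffers, paged to CPU on demand.
    opt_32 = bnb.optim.PagedAdamW32bit(model.parameters(), lr=1e-5)

    # Paged 8-bit AdamW: moment buffers quantized to 8 bits, also paged.
    opt_8 = bnb.optim.PagedAdamW8bit(model.parameters(), lr=1e-5)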

Mike:

Thank you, great explanation. I am now curious which other tricks can be used to fit Llama 3.1 8B on a 24 GB GPU for full fine-tuning. (I saw that Torchtune allows it.)
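(For reference, a minimal sketch, not the author's or Torchtune's exact recipe, of the kind of tricks such single-device recipes combine: bf16 weights, activation checkpointing, and a paged 8-bit optimizer; the checkpoint name is assumed for illustration.)

    import torch
    import bitsandbytes as bnb
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained(
        "meta-llama/Llama-3.1-8B",       # assumed checkpoint name
        torch_dtype=torch.bfloat16,      # bf16 halves weight/gradient memory vs fp32
    )
    model.gradient_checkpointing_enable()  # recompute activations instead of storing them
    model.cuda()

    # Paged 8-bit AdamW keeps optimizer state small and spills to CPU RAM if needed.
    optimizer = bnb.optim.PagedAdamW8bit(model.parameters(), lr=1e-5)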
