The Kaitchup – AI on a Budget
Fine-tune Llama 3 70B on Your GPU with AQLM 2-bit
It's possible to fine-tune Llama 3 70B with only 24 GB of GPU RAM
May 13 • Benjamin Marie

The Weekly Kaitchup #40
RWKV-6 - DeepSeek-V2 - Panza
May 10 • Benjamin Marie

Fine-tune Tiny Chat Models with Apple OpenELM and ORPO
Can we make a good chat model with a 270M LLM?
May 9 • Benjamin Marie

Run Llama 3 70B on Your GPU with ExLlamaV2
2.5 bits per weight, on average, is good enough
May 6 • Benjamin Marie

The Weekly Kaitchup #39
GSM1k - Llama 3 Quantized? - Predict Multiple Tokens at Once
May 3 • Benjamin Marie

Phi-3: Fine-tuning and Quantization on Your Computer
Larger and better than Phi-2
May 2 • Benjamin Marie

April 2024
Turn Llama 3 into an Embedding Model with LLM2Vec
RAG with Llama 3 for both generation and retrieval
Apr 29 • Benjamin Marie

The Weekly Kaitchup #38
Phi-3 - OpenELM - Llama 3 with QDoRA - FineWeb
Apr 26 • Benjamin Marie

Estimate the Memory Consumption of LLMs for Inference and Fine-tuning
A close look at the memory consumption of Command-R+, Mixtral-8x22B, and Llama 3 70B
Apr 25 • Benjamin Marie

Fine-tune Llama 3 on Your Computer
With code to merge QLoRA adapters and quantize the model
Apr 22 • Benjamin Marie

The Weekly Kaitchup #37
Llama 3 - Mixtral-8x22B - Megalodon - WizardLM-2
Apr 19 • Benjamin Marie

Training, Loading, and Merging QDoRA, QLoRA, and LoftQ Adapters
And How to Quantize LLMs After a Merge
Apr 18 • Benjamin Marie