The Kaitchup – AI on a Budget
Fine-tune Tiny Chat Models with Apple OpenELM and ORPO
Can we make a good chat model with a 270M LLM?
May 9 • Benjamin Marie
Fine-tune Llama 3 on Your Computer
With code to merge QLoRA adapters and quantize the model
Apr 22 • Benjamin Marie
Fast and Small Llama 3 with Activation-Aware Quantization (AWQ)
Better, faster, and simpler than GPTQ quantization
Oct 5, 2023 • Benjamin Marie
Avoid Quantizing Llama 3 8B
Llama 3 vs. Llama 2 vs. Mistral 7B, quantized
6 hrs ago • Benjamin Marie
Fine-tune Llama 3 70B on Your GPU with AQLM 2-bit
It's possible to fine-tune Llama 3 70B with only 24 GB of GPU RAM
May 13 • Benjamin Marie
Most Popular
Run Llama 2 70B on Your GPU with ExLlamaV2
Sep 27, 2023 • Benjamin Marie
Fine-tune Your Own Instruct Version of Mistral 7B with Direct Preference Optimization (DPO)
Oct 26, 2023 • Benjamin Marie
Falcon 180B: Can It Run on Your Computer?
Sep 11, 2023 • Benjamin Marie
Fine-tune Mixtral-8x7B Quantized with AQLM (2-bit) on Your GPU
Mar 14 • Benjamin Marie
Latest
The Weekly Kaitchup #40
RWKV-6 - DeepSeek-V2 - Panza
May 10 • Benjamin Marie
Run Llama 3 70B on Your GPU with ExLlamaV2
2.5 bits per weight, on average, is good enough
May 6 • Benjamin Marie
The Weekly Kaitchup #39
GSM1k - Llama 3 Quantized? - Predict Multiple Tokens at Once
May 3 • Benjamin Marie
Phi-3: Fine-tuning and Quantization on Your Computer
Larger and better than Phi-2
May 2 • Benjamin Marie
Turn Llama 3 into an Embedding Model with LLM2Vec
RAG with Llama 3 for both generation and retrieval
Apr 29 • Benjamin Marie
The Weekly Kaitchup #38
Phi-3 - OpenELM - Llama 3 with QDoRA - FineWeb
Apr 26 • Benjamin Marie
Estimate the Memory Consumption of LLMs for Inference and Fine-tuning
A close look at the memory consumption of Command-R+, Mixtral-8x22B, and Llama 3 70B
Apr 25 • Benjamin Marie
The Weekly Kaitchup #37
Llama 3 - Mixtral-8x22B - Megalodon - WizardLM-2
Apr 19 • Benjamin Marie
Training, Loading, and Merging QDoRA, QLoRA, and LoftQ Adapters
And How to Quantize LLMs After a Merge
Apr 18 • Benjamin Marie
The Kaitchup – AI on a Budget
Weekly news, tips, and tutorials on fine-tuning, running, and serving large language models on your computer. Each tutorial is published along with a notebook ready to run.
Recommendations
Trelis Research Updates • Trelis Research
AI Tidbits • Sahar Mor
The Salt - Curated AI • Benjamin Marie
The Tech Buffet • Ahmed Besbes
AI Horizon Forecast • Nikos Kafritsas