The Kaitchup – AI on a Budget
Posts tagged: autoround
Fine-Tuning 2-Bit Qwen3 Models on Your Computer
Code and best practices
Jun 9 • Benjamin Marie

Accurate 2-bit Quantization: Run Massive LLMs on a Single Consumer GPU
70B models for consumer hardware
May 5 • Benjamin Marie

How Well Does Qwen3 Handle 4-bit and 2-bit Quantization?
Let's review Qwen3 and check which quantization you should use
May 1 • Benjamin Marie

Mistral Small 3: An Excellent 24B-Parameter Wide-Shallow LLM
Fine-tuning, quantization, and evaluation
Feb 17 • Benjamin Marie

Quantize and Run Llama 3.3 70B Instruct on Your GPU
4-bit 👍, 3-bit 👎, and 2-bit 👎 quantization
Dec 9, 2024 • Benjamin Marie

The Recipe for Extremely Accurate and Cheap Quantization of 70B+ LLMs
Cost and accuracy for quantizing large models to 4-bit and 2-bit
Nov 25, 2024 • Benjamin Marie

The Impact of the Calibration Dataset for AutoRound and AWQ Quantization
Should you choose the calibration dataset?
Oct 31, 2024 • Benjamin Marie

Mistral-NeMo: 4.1x Smaller with Quantized Minitron
How Pruning, Knowledge Distillation, and 4-Bit Quantization Can Make Advanced AI Models More Accessible and Cost-Effective
Aug 26, 2024 • Benjamin Marie

Fine-tuning Phi-3.5 MoE and Mini on Your Computer
With code to quantize the models with bitsandbytes and AutoRound
Aug 22, 2024 • Benjamin Marie

QLoRA with AutoRound: Cheaper and Better LLM Fine-tuning on Your GPU
Bitsandbytes is not your only option
Aug 19, 2024 • Benjamin Marie

Intel AutoRound: Accurate Low-bit Quantization for LLMs
Between quantization-aware training and post-training quantization
Jun 27, 2024 • Benjamin Marie