The Kaitchup – AI on a Budget
Tutorials
Gemma 3n: Fine-Tuning, Inference, and Submodel Extraction
Running Gemma 3n with vLLM and fine-tuning with TRL
Jun 30 • Benjamin Marie
RAG with Qwen3 Embedding and Qwen3 Reranker
How to use embedding and reranker models to efficiently retrieve only the most relevant chunks or documents given a user query
Jun 19 • Benjamin Marie
RTX 6000 Pro vs H100 & A100: Best Single-GPU Choice for Fast, Low-Cost LLM Fine-Tuning
Faster, cheaper single-GPU training
Jun 16 • Benjamin Marie
Fine-Tuning 2-Bit Qwen3 Models on Your Computer
Code and best practices
Jun 9 • Benjamin Marie
Qwulu 3: Fine-Tuning Qwen3 Base with LoRA and TULU 3's Supervised Fine-Tuning Recipe
Can a supervised fine-tuning recipe that works effectively on Llama 3.1 be applied directly to Qwen3?
Jun 5 • Benjamin Marie
Boost 2-Bit LLM Accuracy with EoRA
A training-free solution for extreme LLM compression
May 19 • Benjamin Marie
LoRA at Scale on a Consumer GPU: Does It Work?
Reproducing TULU 3 SFT on Consumer Hardware Using LoRA and Unsloth
May 12 • Benjamin Marie
Fine-Tuning Qwen3: Base vs. Reasoning Models
Is it reasonable to fine-tune a "reasoning" model?
May 8 • Benjamin Marie
Accurate 2-bit Quantization: Run Massive LLMs on a Single Consumer GPU
70B models for consumer hardware
May 5 • Benjamin Marie
Make LLMs Faster and Lighter with W8A8 Quantization
Efficient Weight and Activation Quantization with llm-compressor
Apr 21 • Benjamin Marie
Run Llama 3.3 70B on Your GPU with ExLlamaV3
Fast Llama 3.3 70B at 1.75 bits per weight, using only 19 GB!
Apr 17 • Benjamin Marie
Fast and Memory-Efficient Full Fine-Tuning with Unsloth (single-GPU)
With the best hyperparameters for a cost-effective full fine-tuning
Apr 14 • Benjamin Marie