The Kaitchup – AI on a Budget

Tutorials
RAG with Qwen3 Embedding and Qwen3 Reranker
How to use embedding and reranker models to efficiently retrieve only the most relevant chunks or documents given a user query
Jun 19 • Benjamin Marie

RTX 6000 Pro vs H100 & A100: Best Single-GPU Choice for Fast, Low-Cost LLM Fine-Tuning
Faster, cheaper single-GPU training
Jun 16 • Benjamin Marie

Fine-Tuning 2-Bit Qwen3 Models on Your Computer
Code and best practices
Jun 9 • Benjamin Marie

Qwulu 3: Fine-Tuning Qwen3 Base with LoRA and TULU 3's Supervised Fine-Tuning Recipe
Can a supervised fine-tuning recipe that works effectively on Llama 3.1 be applied directly to Qwen3?
Jun 5 • Benjamin Marie

Boost 2-Bit LLM Accuracy with EoRA
A training-free solution for extreme LLM compression
May 19 • Benjamin Marie

LoRA at Scale on a Consumer GPU: Does It Work?
Reproducing TULU 3 SFT on Consumer Hardware Using LoRA and Unsloth
May 12 • Benjamin Marie

Fine-Tuning Qwen3: Base vs. Reasoning Models
Is it reasonable to fine-tune a "reasoning" model?
May 8 • Benjamin Marie

Accurate 2-bit Quantization: Run Massive LLMs on a Single Consumer GPU
70B models for consumer hardware
May 5 • Benjamin Marie

Make LLMs Faster and Lighter with W8A8 Quantization
Efficient Weight and Activation Quantization with llm-compressor
Apr 21 • Benjamin Marie

Run Llama 3.3 70B on Your GPU with ExLlamaV3
Fast Llama 3.3 70B at 1.75 bits per weight, using only 19 GB!
Apr 17 • Benjamin Marie

Fast and Memory-Efficient Full Fine-Tuning with Unsloth (single-GPU)
With the best hyperparameters for a cost-effective full fine-tuning
Apr 14 • Benjamin Marie

Llama 4 with 10M Tokens: How Much Does It Cost and Is It Worth It?
A KV Cache Story
Apr 8 • Benjamin Marie