AI Toolboxes
The Kaitchup offers specialized AI toolboxes.
Releases for November/December 2024
Qwen2.5 Toolbox: 18 notebooks specially optimized to run, fine-tune, align, quantize, serve, and merge Qwen2.5 models. Already available (see below).
Llama 3.1/3.2/3.3 Toolbox: Notebooks specially optimized to run, fine-tune, quantize, serve, and merge Llama 3.1/3.2/3.3 models.
NotebookLM Toolbox: Scheduled for release by mid-December, this toolbox will reimplement NotebookLM's core features with state-of-the-art open LLMs and TTS models.
All the toolboxes are published as Hugging Face repositories, here.
They are regularly updated to ensure compatibility with the latest versions of frameworks like vLLM, Transformers, and TRL. I test the code once a week. Most of the code in these toolboxes requires a GPU that supports FlashAttention and bfloat16, such as NVIDIA GPUs from the Ampere generation (RTX 3000 series, A series) or newer.
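If you're unsure whether your GPU qualifies, a quick way to check is its CUDA compute capability: Ampere corresponds to 8.0 and above. The snippet below is a minimal sketch; the helper name is hypothetical, and the commented queries assume a recent NVIDIA driver or an installed PyTorch.

```python
# Check whether a GPU meets the toolboxes' requirements (Ampere/SM 8.0 or
# newer for FlashAttention and bfloat16 support). The helper only parses a
# compute-capability string; actually querying it is shown in the comments.

def is_ampere_or_newer(compute_cap: str) -> bool:
    """Return True for compute capability 8.0 (Ampere) or higher."""
    major = int(compute_cap.split(".")[0])
    return major >= 8

# With an NVIDIA driver installed, you can query the capability like this:
#   nvidia-smi --query-gpu=compute_cap --format=csv,noheader
# or, with PyTorch installed:
#   torch.cuda.get_device_capability()  # e.g. (8, 6) for an RTX 3090

print(is_ampere_or_newer("8.6"))  # RTX 3090 (Ampere) -> True
print(is_ampere_or_newer("7.5"))  # RTX 2080 (Turing) -> False
```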
How to Access The Kaitchup’s AI Toolboxes
You have two options:
Subscribers to The Kaitchup Pro:
Lifetime Access: Pro subscribers have lifelong access to all current and future toolboxes as part of their subscription. If you are a Pro subscriber, don't purchase individual access to the toolboxes; they are already included in your subscription!
Access Tokens: Each Pro subscriber receives a Hugging Face access token to download the repositories. You can see your access token on this page.
Hugging Face User ID: In addition, Pro subscribers can optionally request access directly on the repository's main page. I check the access requests daily.
If you plan to purchase several toolboxes, or if you are interested in the other perks offered by The Kaitchup Pro, this is your best option.
Individual Toolbox Purchases for Non-Pro Subscribers:
Lifetime Access to the Purchased Toolbox
Purchasing: If you don't have a Pro subscription, you can buy individual toolboxes directly through Gumroad. There is a 30-day refund guarantee if you are not satisfied.
Repository Access: After purchase, all the repository content is accessible directly from Gumroad. You can also request access to the repository on Hugging Face.
Once you gain access, whether through a Pro subscription or an individual purchase, you will be able to open issues for discussion or improvement, file bug reports, and suggest new notebooks, directly in the Hugging Face repository.
LLM Toolboxes
Qwen2.5 Toolbox
This toolbox already includes 18 Jupyter notebooks specially optimized for Qwen2.5. The logs of successful runs are also provided. More notebooks will be regularly added.
Once you've subscribed to The Kaitchup Pro or purchased access, request repository access here.
To run the code in the toolbox, CUDA 12.4 and PyTorch 2.4 are recommended. PyTorch 2.5 might already work, but I haven't tested it yet.
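To verify your environment against these recommendations, something like the following sketch can help. The `version_tuple` and `at_least` helpers are hypothetical names, and the commented attributes assume PyTorch is installed:

```python
# Compare dotted version strings against the recommended CUDA/PyTorch versions.
# With PyTorch installed, the installed versions come from:
#   import torch
#   torch.__version__   # e.g. "2.4.1+cu124"
#   torch.version.cuda  # e.g. "12.4"

def version_tuple(v: str) -> tuple:
    """Parse '12.4' or '2.4.1+cu124' into a tuple of ints for comparison."""
    return tuple(int(p) for p in v.split("+")[0].split("."))

def at_least(installed: str, required: str) -> bool:
    """True if the installed version meets or exceeds the required one."""
    return version_tuple(installed) >= version_tuple(required)

print(at_least("2.4.1+cu124", "2.4"))  # PyTorch -> True
print(at_least("12.4", "12.4"))        # CUDA, exactly at the recommendation -> True
```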
Toolbox content
Supervised Fine-Tuning with Chat Templates (5 notebooks)
Full fine-tuning
LoRA fine-tuning
QLoRA fine-tuning with Bitsandbytes quantization
QLoRA fine-tuning with AutoRound quantization
LoRA and QLoRA fine-tuning with Unsloth
Multi-GPU QLoRA/LoRA fine-tuning with FSDP
Preference Optimization (3 notebooks)
Full DPO training (TRL and Transformers)
DPO training with LoRA (TRL and Transformers)
ORPO training with LoRA (TRL and Transformers)
Multi-GPU QLoRA/LoRA DPO Training with FSDP
Quantization (3 notebooks)
AWQ
AutoRound (with code to quantize Qwen 2.5 72B)
GGUF for llama.cpp
Inference with Qwen2.5 Instruct and Your Own Fine-tuned Qwen2.5 (4 notebooks)
Transformers with and without a LoRA adapter
vLLM offline and online inference
Ollama (not released yet)
llama.cpp
Merging (3 notebooks)
Merge a LoRA adapter into the base model
Merge a QLoRA adapter into the base model
Merge several Qwen2.5 models into one with mergekit (not released yet)
Llama 3.1/3.2/3.3 Toolbox
This toolbox already includes 18 Jupyter notebooks specially optimized for Llama 3.1, Llama 3.2, and Llama 3.3 LLMs. The logs of successful runs are also provided. More notebooks will be regularly added.
Once you've subscribed to The Kaitchup Pro or purchased access, you can also request repository access here.
To run the code in the toolbox, CUDA 12.4 and PyTorch 2.4 are recommended. PyTorch 2.5 might already work, but I haven't tested it yet.
The scripts in this toolbox are very similar to those in the Qwen2.5 toolbox. The main differences concern the tokenizer (pad token, chat template, EOS token, …), and the logs of successful runs provided for each script are specific to Llama 3.
Toolbox content
Supervised Fine-Tuning with Chat Templates (6 notebooks)
Full fine-tuning
LoRA fine-tuning
LoRA fine-tuning (with Llama 3.1/3.2 Instruct)
QLoRA fine-tuning with Bitsandbytes quantization
QLoRA fine-tuning with AutoRound quantization
LoRA and QLoRA fine-tuning with Unsloth
Multi-GPU QLoRA/LoRA fine-tuning with FSDP
Preference Optimization (2 notebooks)
DPO training with LoRA (TRL and Transformers)
ORPO training with LoRA (TRL and Transformers)
Multi-GPU QLoRA/LoRA DPO Training with FSDP
Quantization (3 notebooks)
AWQ
AutoRound
GGUF for llama.cpp
Inference (4 notebooks)
Transformers with and without a LoRA adapter
vLLM offline and online inference
Ollama (not released yet)
llama.cpp
Merging (3 notebooks)
Merge a LoRA adapter into the base model
Merge a QLoRA adapter into the base model
Merge several Llama 3.1/3.2 models into one with mergekit (not released yet)
AI Application Toolboxes
Open NotebookLM Toolbox
Release planned for mid-December.