Recent base large language models (LLMs) are pre-trained on trillions of tokens. The pre-training data are usually raw text extracted from the Web, without targeting any specific domain or task. In contrast, fine-tuning a base LLM requires much less data and relies on data targeting specific tasks or domains.
“Continued” pre-training is yet another step that can be executed between pre-training and fine-tuning. Continued pre-training is especially helpful when we want to teach a pre-trained LLM a new language, or a very specific domain for which we have millions of tokens. You can see it as fine-tuning, but without any particular task in mind.
In this article, I show how to continue pre-training LLMs. We will review the main differences between fine-tuning and continued pre-training. I use Llama 3 8B and a recipe proposed by Unsloth that makes continued pre-training possible on consumer hardware.
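As a preview of the approach, here is a minimal sketch of how such a run can be set up with Unsloth. The checkpoint name and hyperparameters below are illustrative assumptions, not the exact values used later in the article:

```python
# A minimal sketch, not the full recipe: load Llama 3 8B with Unsloth
# and attach LoRA adapters for continued pre-training.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-bnb-4bit",  # 4-bit quantized base model
    max_seq_length=2048,
    load_in_4bit=True,  # keeps memory usage low enough for consumer GPUs
)

# For continued pre-training, Unsloth recommends making the token
# embeddings and the language modeling head trainable as well, not
# only the attention and MLP layers. Hyperparameters are illustrative.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
        "embed_tokens", "lm_head",
    ],
    lora_alpha=16,
    use_gradient_checkpointing="unsloth",  # reduces activation memory
)
```

The resulting model can then be trained on raw text, exactly like during pre-training, which is what the rest of the article walks through.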
Code examples of continued pre-training with Llama 3 are available in this notebook: