Llama 3.2 Embeddings: Training and Evaluation with LLM2Vec

A step-by-step tutorial

Benjamin Marie
Nov 04, 2024

(Image generated with Grok)

The embedding model plays a key role in many applications, such as Retrieval-Augmented Generation (RAG) for large language models (LLMs). In RAG systems, it encodes both the knowledge base and the user query. I explained the RAG concept in this article:

RAG for Mistral 7B Instruct with LlamaIndex and Transformers

Benjamin Marie · March 25, 2024
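
To make this role concrete, here is a minimal sketch of embedding-based retrieval: the documents of the knowledge base and the user query are encoded with the same model, and documents are ranked by cosine similarity to the query. The embed() function is a placeholder for any embedding model, such as the one we will train in this article.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Placeholder: stands in for any embedding model, e.g., the Llama 3.2
    # encoder trained below. It must map a string to a fixed-size vector.
    raise NotImplementedError

def retrieve(query: str, documents: list[str], top_k: int = 3) -> list[str]:
    # Encode the knowledge base and the query with the same embedding model.
    doc_vecs = np.stack([embed(d) for d in documents])
    q_vec = embed(query)
    # Rank documents by cosine similarity to the query and keep the top-k.
    sims = doc_vecs @ q_vec / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_vec))
    return [documents[i] for i in np.argsort(-sims)[:top_k]]
```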

Using an embedding model that is trained or fine-tuned on the same domain as the LLM can greatly improve a RAG system. With LLM2Vec, we can extract an embedding model directly from the LLM, although the resulting embeddings are initially of poor quality. We can then improve this model with two training stages: masked next-token prediction (MNTP) and contrastive learning. We saw how to do this in previous articles with Llama 3 8B. However, because Llama 3 8B is a large model, it produces high-dimensional text embeddings, which can be costly to train and deploy in downstream tasks.
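
For reference, here is a minimal sketch of how an embedding model is extracted from a decoder-only LLM with the llm2vec library. The model name and parameters are illustrative assumptions; before the two training stages, the embeddings obtained this way are usually of poor quality.

```python
import torch
from llm2vec import LLM2Vec

# Minimal sketch (illustrative parameters): wrap a decoder-only LLM as a
# text encoder. llm2vec makes the attention bidirectional and pools the
# token representations into a single vector per input.
l2v = LLM2Vec.from_pretrained(
    "meta-llama/Llama-3.2-1B",          # illustrative model name
    device_map="cuda" if torch.cuda.is_available() else "cpu",
    torch_dtype=torch.bfloat16,
    pooling_mode="mean",                # average the token embeddings
    max_length=512,
)

# Before MNTP and contrastive training, these embeddings are rough.
embeddings = l2v.encode(["What is LLM2Vec?", "LLM2Vec turns LLMs into encoders."])
print(embeddings.shape)  # (2, 2048): Llama 3.2 1B has a hidden size of 2,048
```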


In this article, we will see how to make text embeddings from Llama 3.2 1B. We will go through all the steps in detail: masked next-token prediction training, contrastive learning, and the evaluation of the resulting embeddings. I used an RTX 3090 from RunPod (currently $0.22/hour; referral link) for the training steps and the evaluation.
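
As a rough illustration of the last step, one common way to evaluate text embeddings is to score sentence pairs with cosine similarity and correlate the scores with human judgments, as on STS benchmarks. This is a generic sketch, not necessarily the exact protocol used below; the model is assumed to expose an encode() method returning one vector per input, as the llm2vec wrapper does.

```python
import numpy as np
from scipy.stats import spearmanr

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def sts_spearman(model, pairs: list[tuple[str, str]], gold: list[float]) -> float:
    # model.encode is assumed to return array-like float embeddings,
    # one vector per input sentence.
    left = np.asarray(model.encode([a for a, _ in pairs]), dtype=np.float32)
    right = np.asarray(model.encode([b for _, b in pairs]), dtype=np.float32)
    preds = [cosine(l, r) for l, r in zip(left, right)]
    # Spearman correlation between predicted and human similarity scores.
    return float(spearmanr(preds, gold).correlation)
```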

The notebook showing how to turn Llama 3.2 into an embedding model is here:

Get the notebook (#118)

Train an Embedding Model with LLM2Vec

Step 1: Masked Next-Token Prediction Training
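
As an orientation for this step, MNTP training with the LLM2Vec repository is typically driven by a JSON configuration file passed to its experiments/run_mntp.py script. The configuration below is a sketch with assumed, illustrative values, not the exact settings used in this tutorial.

```python
import json

# Sketch of an MNTP configuration for the LLM2Vec training script.
# All values are illustrative assumptions, not the tutorial's settings.
mntp_config = {
    "model_name_or_path": "meta-llama/Llama-3.2-1B",
    "dataset_name": "wikitext",
    "dataset_config_name": "wikitext-103-raw-v1",
    "mask_token_type": "blank",        # Llama has no [MASK] token
    "mlm_probability": 0.2,            # fraction of tokens masked
    "lora_r": 16,                      # LoRA rank for parameter-efficient training
    "per_device_train_batch_size": 32,
    "learning_rate": 5e-5,
    "num_train_epochs": 1,
    "bf16": True,
    "output_dir": "output/mntp/Llama-3.2-1B",
}

with open("mntp_llama32_1b.json", "w") as f:
    json.dump(mntp_config, f, indent=2)

# Roughly: python experiments/run_mntp.py mntp_llama32_1b.json
```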
