The Kaitchup – AI on a Budget

Turn Llama 3 into an Embedding Model with LLM2Vec

How to get and train a Llama 3 embedding model for RAG applications

Benjamin Marie
Apr 29, 2024

Embedding models are a critical component of retrieval-augmented generation (RAG) for large language models (LLMs): they encode both the knowledge base and the user's query. I explained RAG in this article:

RAG for Mistral 7B Instruct with LlamaIndex and Transformers
Benjamin Marie · March 25, 2024

Using an embedding model trained or fine-tuned for the same domain as the LLM can significantly improve a RAG system. However, finding or training such an embedding model is often difficult, as in-domain data are usually scarce.
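To make the embedding model's role in RAG concrete, here is a minimal retrieval sketch with toy vectors. The 2-dimensional vectors are made up for illustration; in a real system they would come from the embedding model encoding the knowledge base and the query:

```python
import numpy as np

def cosine_sim(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine similarity between two embedding vectors.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve(query_vec: np.ndarray, doc_vecs: list, k: int = 2) -> list:
    # Return the indices of the k knowledge-base chunks closest to the query.
    scores = [cosine_sim(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)[:k]

# Toy "embeddings" standing in for real model outputs.
doc_vecs = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
query = np.array([1.0, 0.1])
print(retrieve(query, doc_vecs, k=2))  # → [0, 2]
```

The retrieved chunks are then passed to the LLM as context; a domain-matched embedding model improves which chunks win this ranking.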


In this article, I show how to turn an LLM into a text embedding model using LLM2Vec. We will apply it to Llama 3 to create a RAG system that doesn't need any model other than Llama 3. The same method also applies to Llama 3.1.
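As a rough sketch of the idea, LLM2Vec adapts a decoder-only model into an encoder by enabling bidirectional attention and pooling the token representations into one vector per text. The `masked_mean_pool` helper below illustrates the pooling step; the `embed_with_llm2vec` function sketches the library call, where the checkpoint names and keyword arguments are assumptions based on the `llm2vec` package, not the article's exact code:

```python
import numpy as np

def masked_mean_pool(hidden_states: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    # LLM2Vec-style pooling: average the token representations of the
    # non-padding positions to get one embedding per sequence.
    mask = attention_mask.astype(hidden_states.dtype)[:, None]
    return (hidden_states * mask).sum(axis=0) / mask.sum()

def embed_with_llm2vec(texts):
    # Sketch only (not run here): needs a GPU and the `llm2vec` package,
    # and the model/adapter names below are assumptions.
    import torch
    from llm2vec import LLM2Vec
    l2v = LLM2Vec.from_pretrained(
        "meta-llama/Meta-Llama-3-8B-Instruct",
        peft_model_name_or_path="McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp",
        device_map="cuda",
        torch_dtype=torch.bfloat16,
    )
    return l2v.encode(texts)
```

The pooled vectors can then be indexed and queried exactly like those of a dedicated embedding model.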

I also wrote a follow-up article to further improve a Llama 3 embedding model with contrastive learning.
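As background for that follow-up (this is an illustration of the general technique, not the follow-up article's code), contrastive learning typically optimizes an InfoNCE-style objective: the query embedding is pulled toward a positive passage and pushed away from negatives. A minimal NumPy sketch:

```python
import numpy as np

def info_nce_loss(query, positive, negatives, temperature=0.1):
    # InfoNCE: softmax over similarities between the query and all
    # candidates; the loss is the negative log-probability of the positive.
    def norm(v):
        return v / np.linalg.norm(v)
    q = norm(query)
    cands = [norm(positive)] + [norm(n) for n in negatives]
    logits = np.array([q @ c for c in cands]) / temperature
    logits -= logits.max()  # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return float(-np.log(probs[0]))  # positive is at index 0
```

When the query is close to its positive and far from the negatives, the loss is near zero; training pushes the embedding space toward that state.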

The notebook showing how to convert Llama 3 into an embedding model is available here:

Get the notebook (#65)
