Fine-tune Tiny Chat Models with Apple OpenELM and ORPO
Can we make a good chat model with a 270M LLM?
Apple released OpenELM, a family of small open LLMs with sizes ranging from 270M to 3B parameters. With this release, Apple aims to provide LLMs that can run on devices with limited memory.
The OpenELM models have a small memory footprint and a low computational cost for inference, and they can be fully fine-tuned on consumer hardware. However, chat LLMs with fewer than 1 billion parameters are rarely released, as they are usually not good enough to be useful for most applications.
In this article, I review the OpenELM LLMs. I first go over the technical report published by Apple describing the models. Then, I show how to fine-tune the smallest OpenELM with ORPO, to see whether we can turn it into a capable tiny chat model.
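As background for the fine-tuning part: ORPO (Odds Ratio Preference Optimization) aligns a model without a separate reference model by adding an odds-ratio penalty to the usual language-modeling loss, pushing the length-averaged likelihood of the chosen response above that of the rejected one. A minimal, illustrative sketch of that penalty for a single preference pair (the function and variable names here are my own, not code from the notebook):

```python
import math

def odds_ratio_loss(logp_chosen: float, logp_rejected: float) -> float:
    """ORPO odds-ratio penalty for one preference pair.

    Inputs are length-averaged log-likelihoods (negative floats) of the
    chosen and rejected responses under the policy model.
    odds(p) = p / (1 - p); loss = -log sigmoid(log odds_c - log odds_r).
    """
    def log_odds(logp: float) -> float:
        # log(p / (1 - p)) = log p - log(1 - exp(log p))
        return logp - math.log1p(-math.exp(logp))

    z = log_odds(logp_chosen) - log_odds(logp_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-z)))  # -log sigmoid(z)

# When both responses are equally likely, the penalty is log 2;
# it shrinks as the chosen response becomes more likely than the rejected one.
print(odds_ratio_loss(-1.0, -1.0))  # log 2 ≈ 0.693
print(odds_ratio_loss(-0.5, -2.0))  # smaller penalty
```

In the full ORPO objective this term is scaled by a coefficient and added to the cross-entropy loss on the chosen response; TRL's `ORPOTrainer` handles this batching and scaling for us later in the article.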
My notebook implementing the full fine-tuning of OpenELM is here: