Fine-tune Tiny Chat Models with Apple OpenELM and ORPO
Can we make a good chat model with a 270M LLM?
Apple released OpenELM, a family of small open LLMs with sizes ranging from 270M to 3B parameters. With this release, Apple aims to provide LLMs that can run on devices with limited memory.
The OpenELM models have a small memory footprint and a low computational cost for inference, and they can be fully fine-tuned on consumer hardware. However, chat LLMs with fewer than 1 billion parameters are rarely released, as they are usually not good enough to be useful for most applications.
In this article, I review the OpenELM LLMs. I first go over the technical report published by Apple describing the models. Then, I show how to fine-tune the smallest OpenELM with ORPO, to see whether we can turn it into a capable tiny chat model.
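As background for the fine-tuning part: ORPO (Odds Ratio Preference Optimization) aligns a model without a separate reference model by adding an odds-ratio penalty to the usual language-modeling loss, pushing the length-averaged likelihood of the chosen response above that of the rejected one. A minimal, illustrative sketch of that penalty for a single preference pair (the function and variable names here are my own, not code from the notebook):

```python
import math

def odds_ratio_loss(logp_chosen: float, logp_rejected: float) -> float:
    """ORPO odds-ratio penalty for one preference pair.

    Inputs are length-averaged log-likelihoods (negative floats) of the
    chosen and rejected responses under the policy model.
    odds(p) = p / (1 - p); loss = -log sigmoid(log odds_c - log odds_r).
    """
    def log_odds(logp: float) -> float:
        # log(p / (1 - p)) = log p - log(1 - exp(log p))
        return logp - math.log1p(-math.exp(logp))

    z = log_odds(logp_chosen) - log_odds(logp_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-z)))  # -log sigmoid(z)

# When both responses are equally likely, the penalty is log 2;
# it shrinks as the chosen response becomes more likely than the rejected one.
print(odds_ratio_loss(-1.0, -1.0))  # log 2 ≈ 0.693
print(odds_ratio_loss(-0.5, -2.0))  # smaller penalty
```

In the full ORPO objective this term is scaled by a coefficient and added to the cross-entropy loss on the chosen response; TRL's `ORPOTrainer` handles this batching and scaling for us later in the article.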
My notebook implementing the full fine-tuning of OpenELM is here: