The Kaitchup – AI on a Budget

Device Map: Avoid Out-of-Memory Errors When Running Large Language Models


A small trick to run LLMs on any computer

Benjamin Marie
Jul 11, 2023

Device mapping is a feature implemented in Hugging Face's Accelerate library. It splits a large language model (LLM) into smaller parts that can be loaded individually onto different devices: GPU VRAM, CPU RAM, and the hard disk.

Device map — Case where the LLM doesn’t fit on the GPUs

In this article, I won’t explain again how it works: I have already written a detailed article about the device map that you can read here:

Run Very Large Language Models on Your Computer
Benjamin Marie, PhD · December 22, 2022

Instead, I will explain why, even with a device map, your GPU may still trigger out-of-memory (OOM) errors.


This post is for paid subscribers

Already a paid subscriber? Sign in
© 2025 The Kaitchup
Privacy ∙ Terms ∙ Collection notice
Start writingGet the app
Substack is the home for great culture

Share