Introduction to the Open LLM Falcon-40B: Performance, Training Data, and Architecture

Get started using Falcon-7B, Falcon-40B, and their instruct versions

Benjamin Marie

Jun 07, 2023


The Falcon models have drawn a lot of attention since their release in May 2023.

They are causal large language models (LLMs), or so-called “decoder-only” models, very much like GPT.

Definition: Causal Language Model

Causal language modeling involves predicting the token that follows a sequence of tokens. During training, the model’s attention is directed solely toward the left context; the right context is masked. These models are usually trained on billions of words.
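In practice, “masking the right context” is implemented as a triangular attention mask. Here is a minimal PyTorch sketch (not Falcon’s actual code) of how such a mask is built and applied to attention scores:

```python
import torch

# Scores for a sequence of 5 tokens: scores[i, j] is how much
# token i attends to token j (random values for illustration).
seq_len = 5
scores = torch.randn(seq_len, seq_len)

# Causal mask: token i may only attend to tokens j <= i (the left context).
# Positions in the upper triangle (the right context) are masked out.
causal_mask = torch.triu(torch.ones(seq_len, seq_len), diagonal=1).bool()
scores = scores.masked_fill(causal_mask, float("-inf"))

# After softmax, masked positions get exactly zero attention weight.
attn = torch.softmax(scores, dim=-1)
print(attn)  # lower-triangular: each token only "sees" itself and its left context
```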

Since May 31st, the Falcon models have been completely free, even for commercial use (Apache 2.0 license). They are developed and trained by the Technology Innovation Institute (TII) of Abu Dhabi.

According to the first results, Falcon-40B, the biggest of the Falcon models, outperforms all the other open causal LLMs, including LLaMA-65B and MPT-7B.

In this article, I introduce Falcon-40B, Falcon-7B, and their instruct versions in detail. We will see how they perform compared to other models, how they were trained, and how to run Falcon-7B on your own GPU with QLoRa.
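As a preview of the QLoRa setup, here is a minimal sketch of loading Falcon-7B in 4-bit with Hugging Face transformers and bitsandbytes. The hyperparameters are illustrative, not necessarily the exact configuration used later in this article:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "tiiuae/falcon-7b"

# 4-bit (NF4) quantization configuration, as used by QLoRa; values are illustrative.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",       # dispatch layers to the available GPU(s)
    trust_remote_code=True,  # Falcon's modeling code lived in the model repo at release
)

prompt = "The Falcon models were trained on"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```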


Performance on OpenLLM

The instruct version of Falcon-40B is ranked first on the OpenLLM leaderboard. The standard version is ranked second.

The OpenLLM leaderboard evaluates the performance of LLMs on four tasks (a sketch of what “n-shot” means follows the list):

  • AI2 Reasoning Challenge (25-shot): Grade-school science questions.

  • HellaSwag (10-shot): A commonsense inference benchmark.

  • MMLU (5-shot): 57 tasks in various domains such as maths, computer science, and law.

  • TruthfulQA (0-shot): A benchmark that evaluates how truthful the model is when answering questions.
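Each “n-shot” setting means that n solved examples are prepended to the test question in the prompt. Here is a toy sketch of how such a prompt is built; the examples and the template are hypothetical, not the leaderboard’s actual ones:

```python
def build_few_shot_prompt(examples, question, n_shot):
    """Prepend n_shot solved examples to the test question."""
    parts = [f"Question: {q}\nAnswer: {a}" for q, a in examples[:n_shot]]
    parts.append(f"Question: {question}\nAnswer:")
    return "\n\n".join(parts)

# Hypothetical grade-school science examples, in the spirit of ARC.
examples = [
    ("Which gas do plants absorb for photosynthesis?", "Carbon dioxide"),
    ("What force pulls objects toward the Earth?", "Gravity"),
]
prompt = build_few_shot_prompt(
    examples, "What is the boiling point of water at sea level?", n_shot=2
)
print(prompt)
```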

Falcon-40B outperforms Meta AI’s LLaMA-65B on all these tasks.

Falcon RefinedWeb

The Falcon models were mainly trained on the Falcon RefinedWeb dataset. It was also created by TII and is distributed under an Apache 2.0 license.

RefinedWeb was extracted from CommonCrawl and has been thoroughly curated. TII claims it is multimodal-friendly since the links and alt texts of images were preserved.

In the dataset card published on the Hugging Face Hub, TII wrote: “This public extract […]”. To me, it is thus unclear whether the Falcon models were trained on this public version of the dataset, which is only an “extract”, or on a bigger internal version.
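Whatever the answer, the public extract is what is available on the Hub, and it can be inspected with the datasets library. Streaming avoids downloading the very large dataset up front; the field names below follow my reading of the dataset card:

```python
from datasets import load_dataset

# Stream the public RefinedWeb extract so nothing is fully downloaded up front.
dataset = load_dataset("tiiuae/falcon-refinedweb", split="train", streaming=True)

# Peek at one document; the card lists fields such as "content" and "url".
sample = next(iter(dataset))
print(sample["content"][:500])
```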
