What You Cannot Do With Llama 2

A permissive license with one catch

Jul 29, 2023

Image from Pixabay — Edited by the author

Llama 2 is now the top trending large language model (LLM), and for good reasons. It performs better than previous LLMs on public benchmarks and, in contrast to Llama 1, it can be used in commercial applications.

You can already find many tutorials on how to use Llama 2 and deploy it to production.

On The Kaitchup, I have shown how to run Llama 2 on a consumer GPU and how to quantize it to reduce its size.

Quantization of Llama 2 with GPTQ for Fast Inference on Your Computer

Benjamin Marie

July 27, 2023


Before publishing more articles that use Llama 2, I want to stress one more thing: the limits of its license.

Llama 2 is not truly open, and you can’t use it commercially for just any purpose.


The Catch: You Can’t Use Llama 2 to Improve Another LLM

In the license, we can read the following:

v. You will not use the Llama Materials or any output or results of the Llama Materials to improve any other large language model (excluding Llama 2 or derivative works thereof).

“Llama Materials” include the model itself.

Note: “Improve” is a very important word here. It leaves the door open for Meta to qualify almost anything as an “improvement”.

For instance, you cannot generate a dataset with Llama 2 and use it to train or fine-tune another LLM. That’s extremely restrictive, even more so than OpenAI’s terms of use.

OpenAI prohibits this only if the LLM trained on the generated dataset competes with OpenAI’s “services”. For instance, the Falcon instruct models are trained on Baize, a dataset generated by ChatGPT. That’s fine according to OpenAI’s terms of use, as long as you don’t use the Falcon models in a product that competes with OpenAI’s products.

Can You Use the Falcon Models For Commercial Applications?

Benjamin Marie

June 26, 2023


With Llama 2, you can only use the generated dataset to improve Llama 2 or its derivatives. Even for research purposes, the license doesn’t grant you the necessary permissions.
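To make the comparison concrete, here is a minimal sketch of the two rules as they are described in this article. It is only an illustration of my simplified reading, not a compliance tool: the `TrainingPlan` class, its field names, the `permitted` function, and the model labels `"llama-2"` and `"chatgpt"` are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingPlan:
    """Hypothetical description of a plan to train on model-generated data."""
    generator: str                      # "llama-2" or "chatgpt" (illustrative labels)
    target_is_llama2_derivative: bool   # is the trained model Llama 2 or derived from it?
    competes_with_openai: bool = False  # will the trained model power a product competing with OpenAI?


def permitted(plan: TrainingPlan) -> bool:
    """Return True if the plan fits this article's simplified reading of the terms."""
    if plan.generator == "llama-2":
        # Clause v: outputs may only improve Llama 2 or its derivative works.
        return plan.target_is_llama2_derivative
    if plan.generator == "chatgpt":
        # OpenAI's terms, as summarized above: fine unless the result competes with OpenAI.
        return not plan.competes_with_openai
    raise ValueError(f"no rule sketched for generator {plan.generator!r}")


# Fine-tuning a non-Llama model on Llama 2 outputs.
print(permitted(TrainingPlan("llama-2", target_is_llama2_derivative=False)))
# Fine-tuning the same model on ChatGPT outputs, without competing with OpenAI.
print(permitted(TrainingPlan("chatgpt", target_is_llama2_derivative=False)))
```

The asymmetry is visible in the two branches: the Llama 2 rule depends only on what you train, while the OpenAI rule depends on how you use the trained model.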

In practice, I have no idea how Meta can verify compliance with this clause. I expect that a lot of data generated by Llama 2 will be published on the Internet. This data will then be naively crawled and used to train other LLMs, infringing Llama 2’s license.

So again, we are in a situation where Meta can crawl the entire Internet, including your data and intellectual property, to develop its own LLMs, but we can’t use the data these models generate to develop our own LLMs.

I don’t think this kind of license has a future. But this is just my opinion.

If you want to read the entire license attached to Llama 2, it’s here.
