Hey Benjamin! Thanks for this great series, loving the more practical content.
I think you left out a command in the notebook to install the repository itself as a package.
```
!pip install "deepspeed>=0.9.0"  # quote the specifier so the shell doesn't treat >= as a redirect
!git clone https://github.com/microsoft/DeepSpeedExamples.git
%cd DeepSpeedExamples/applications/DeepSpeed-Chat/
!pip install -r requirements.txt
!pip install -e . # This part is needed to run the code
```
Hi!
When I wrote the notebook, it seems I didn't need to install the package. Does the notebook fail to run now if you skip that step?
I'll update the notebook and article to add this.
Great series.
Btw that's a huge LoRA rank compared to the LoRA paper (which uses 4), but I suppose it's fine if DeepSpeed recommends it...?
I think overall, DeepSpeed Chat is weird regarding LoRA recommendations... Their default example uses LoRA while keeping the entire base model unfrozen. I've never seen that before, and I wonder what the benefit of such a configuration is. I tried with and without freezing the base model, and I can say that keeping it frozen works much better.
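For anyone wondering what "freezing the base model" looks like in practice, here is a minimal PyTorch sketch (this is my own illustration, not DeepSpeed Chat's actual implementation): the base layer's weights get `requires_grad = False`, so the optimizer only ever updates the low-rank `A` and `B` matrices.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Hypothetical minimal LoRA wrapper: frozen base linear + trainable low-rank update."""

    def __init__(self, base: nn.Linear, r: int = 4, alpha: float = 16.0):
        super().__init__()
        self.base = base
        # Freeze the base model weights -- this is the standard LoRA setup
        # (what DeepSpeed Chat's default example apparently does NOT do).
        for p in self.base.parameters():
            p.requires_grad = False
        # Low-rank factors: A is small random init, B starts at zero so the
        # adapter is a no-op before training (as in the LoRA paper).
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scaling = alpha / r

    def forward(self, x):
        # Base output plus the scaled low-rank update: W x + scaling * B A x
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling

layer = LoRALinear(nn.Linear(64, 64), r=4)
out = layer(torch.randn(2, 64))
trainable = [n for n, p in layer.named_parameters() if p.requires_grad]
# Only lora_A and lora_B are trainable; base.weight and base.bias are frozen.
```

With the base frozen, only `2 * r * d` parameters per wrapped layer receive gradients, which is where the memory savings come from; unfreezing the base (as in the default example) throws that advantage away.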
Yeah, unfreezing the base model completely goes against the LoRA paper... I don't get it.