Hey Benjamin! Thanks for this great series, loving the more practical content.
I think you left out a command in the notebook to install the repository itself as a package.
```
!pip install "deepspeed>=0.9.0"  # quote the specifier so the shell doesn't treat >= as a redirect
!git clone https://github.com/microsoft/DeepSpeedExamples.git
%cd DeepSpeedExamples/applications/DeepSpeed-Chat/
!pip install -r requirements.txt
!pip install -e . # This part is needed to run the code
```
Hi!
When I wrote the notebook, it seems I didn't need to install the package. Does the notebook fail to run now if you skip that step?
I'll update the notebook and article to add this.
Great series.
Btw that's a huge LoRA rank compared to the LoRA paper (which uses 4), but I suppose it's fine if DeepSpeed recommends it...?
I think overall, DeepSpeed Chat is weird regarding LoRA recommendations... Their default example uses LoRA while keeping the entire base model unfrozen. I've never seen that before, and I wonder what the benefit of such a configuration is. I tried with and without freezing the base model, and I can say that keeping it frozen works much better.
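For anyone wondering what "freezing the base model" looks like in practice, here is a minimal PyTorch sketch (this is my own illustration, not DeepSpeed Chat's actual implementation): the base layer's weights get `requires_grad = False`, so the optimizer only ever updates the low-rank `A` and `B` matrices.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Hypothetical minimal LoRA wrapper: frozen base linear + trainable low-rank update."""

    def __init__(self, base: nn.Linear, r: int = 4, alpha: float = 16.0):
        super().__init__()
        self.base = base
        # Freeze the base model weights -- this is the standard LoRA setup
        # (what DeepSpeed Chat's default example apparently does NOT do).
        for p in self.base.parameters():
            p.requires_grad = False
        # Low-rank factors: A is small random init, B starts at zero so the
        # adapter is a no-op before training (as in the LoRA paper).
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scaling = alpha / r

    def forward(self, x):
        # Base output plus the scaled low-rank update: W x + scaling * B A x
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling

layer = LoRALinear(nn.Linear(64, 64), r=4)
out = layer(torch.randn(2, 64))
trainable = [n for n, p in layer.named_parameters() if p.requires_grad]
# Only lora_A and lora_B are trainable; base.weight and base.bias are frozen.
```

With the base frozen, only `2 * r * d` parameters per wrapped layer receive gradients, which is where the memory savings come from; unfreezing the base (as in the default example) throws that advantage away.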
Yeah, unfreezing the base model completely goes against the LoRA paper... I don't get it.