Discussion about this post

Rui:

Thanks for the update. What do you think is the right pad/unk token to use for Llama 3?
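For context, one common workaround when a model ships without a dedicated pad token (as Llama 3 does) is to fall back to the eos token. The sketch below illustrates that pattern; `ensure_pad_token` is a hypothetical helper, not part of transformers, though `pad_token` and `eos_token` are the real tokenizer attributes:

```python
def ensure_pad_token(tokenizer):
    """Fall back to the eos token when no pad token is defined.

    Hypothetical helper: works with any tokenizer-like object exposing
    pad_token / eos_token attributes, e.g. a transformers AutoTokenizer
    loaded for Llama 3. Whether eos is the *right* choice (vs. adding a
    dedicated pad token and resizing embeddings) is the open question.
    """
    if getattr(tokenizer, "pad_token", None) is None:
        tokenizer.pad_token = tokenizer.eos_token
    return tokenizer
```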

Trelis Research:

Nice weekly summary. Just some comments/Qs on your notebook 14:

1. Do you have a reference for the dequantizing code, or did you develop it from scratch?

2. I notice this line in the dequantizing function:

```
def dequantize_model(model, to='./dequantized_model', dtype=torch.float16, device="cuda"):
    """
    'model': the peftmodel you loaded with qlora.
```

When you say 'model', do you in fact mean the base model or the PEFT model? It seems to me the dequantizing function expects a base model (but maybe it works with a PEFT model too?).

3. The dequantization and merging cell doesn't specify an adapter as an input, so I assume the adapter was specified earlier in the code. Would it be clearer to explicitly set (or reset) the adapter in that cell?
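On point 3, making the adapter explicit might look like the sketch below. `merge_with_explicit_adapter` is a hypothetical wrapper (not from the notebook), while `set_adapter` and `merge_and_unload` are real peft `PeftModel` methods:

```python
def merge_with_explicit_adapter(peft_model, adapter_name="default"):
    """Make the adapter choice explicit before merging.

    Hypothetical wrapper: 'peft_model' is assumed to be a peft PeftModel.
    Explicitly (re)setting the active adapter in the merging cell avoids
    relying on whatever was loaded earlier in the notebook.
    """
    peft_model.set_adapter(adapter_name)   # explicitly select the adapter
    return peft_model.merge_and_unload()   # fold LoRA weights into the base model
```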

