17 Comments

Yep, I had already done that, but the problem remains. In your Medium article about Phi-1.5, you mentioned this:

"The problem here is that phi-1.5 was pre-trained without padding and the implementation of MixFormerSequentialForCausalLM released by Microsoft with the model doesn’t support attention masking during training. In other words, we can’t properly fine-tune the model to learn when to stop generating. Pad tokens are interpreted as normal tokens. You would have to modify MixFormerSequentialForCausalLM to add support for the attention mask."

Is the same true with Phi-2?

https://medium.com/@bnjmn_marie/how-to-fine-tune-quantize-and-run-microsoft-phi-1-5-e14a1e22ec12

I haven't tried Phi-2 yet. I would guess this is still true. I'll investigate and reply if I find something.

I'm still stumped. Have you taken a crack at instruction finetuning Phi-2 yet? I do see a finetuned chat model on HF, but I haven't played with it yet: https://huggingface.co/cognitivecomputations/dolphin-2_6-phi-2

I'll write an article on Phi-2 fine-tuning for next week. I'm not sure whether I'll succeed in teaching it when to stop generating, but I have several ideas. I'll let you know here as soon as I have something that works.

I just LoRA-tuned Phi-2, but it refuses to stop generating until `max_new_tokens` is reached. Phi-1.5 suffered from the same problem. Do you know how to correct it?

When you load the tokenizer, do you set "add_eos_token=True"? This appends the EOS token to every training example.

Did you try to set "eos_token_id=tokenizer.eos_token_id" when calling "model.generate"?

For me, it works. The model stops generating when it generates the EOS token. Without that, the model generates the EOS token but ignores it and continues to generate.

The problem that remains is that it tends to never output the EOS token for several of my testing prompts. But maybe that's just because my model is under-trained to learn when to stop, so I'm fine-tuning it again.
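
For reference, here is a minimal sketch of the generation settings described above. The helper name `build_generate_kwargs` and the default of 256 new tokens are my own choices for illustration; `eos_token_id`, `pad_token_id`, and `max_new_tokens` are regular `model.generate` arguments. The helper only builds the keyword arguments, so it can be checked without loading the model. The stand-in tokenizer and the id 50256 for '<|endoftext|>' are assumptions; read the real value from `tokenizer.eos_token_id`.

```python
from types import SimpleNamespace


def build_generate_kwargs(tokenizer, max_new_tokens=256):
    """Build keyword arguments for model.generate so that decoding stops at EOS.

    `tokenizer` is any object exposing `eos_token_id` and `pad_token_id`
    (for example a Hugging Face tokenizer).
    """
    if tokenizer.eos_token_id is None:
        raise ValueError("The tokenizer has no EOS token id; generation cannot stop on EOS.")
    # Reuse EOS for padding when no pad token is defined
    pad_id = tokenizer.pad_token_id
    if pad_id is None:
        pad_id = tokenizer.eos_token_id
    return {
        "max_new_tokens": max_new_tokens,
        "eos_token_id": tokenizer.eos_token_id,
        "pad_token_id": pad_id,
    }


# Stand-in tokenizer for illustration only (assumed EOS id, no pad token)
fake_tokenizer = SimpleNamespace(eos_token_id=50256, pad_token_id=None)
generate_kwargs = build_generate_kwargs(fake_tokenizer)
# With a real model and tokenizer:
#   outputs = model.generate(**inputs, **generate_kwargs)
```

Passing the arguments explicitly at call time keeps the stopping behavior independent of whatever is stored in the model's configuration.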

I just tried adding `eos_token = tokenizer.eos_token` and `eos_token_id = tokenizer.eos_token_id` in every possible place:

* AutoModelForCausalLM.from_pretrained()

* model.config

* model.generation_config

* TextGenerationPipeline()

None of them worked. :(

My PEFT adapter was trained with over 10,000 examples. :'(

Does it generate the EOS token and ignore it, or does it never generate the EOS token? (To see the EOS token, set skip_special_tokens=False when calling decode.)

Currently, 1/4 of my testing prompts generate an EOS token.

Since Phi-2 doesn't seem to use an attention mask, 10k examples might not be enough to teach the model when to generate an EOS token.
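
To tell the two cases apart over a whole test set, something like the following sketch could help. It works on plain lists of generated token ids (for a tensor, call `.tolist()` first, and drop the prompt tokens). The function names `diagnose_eos` and `eos_rate` are hypothetical, and the EOS id used in the example is an assumption; use `tokenizer.eos_token_id` in practice.

```python
def diagnose_eos(generated_ids, eos_token_id):
    """Classify a generated sequence by where the EOS token appears.

    Returns one of:
      "never_generated"       - EOS does not appear at all
      "stopped_at_eos"        - nothing but EOS follows the first EOS
      "generated_but_ignored" - other tokens were generated after the first EOS
    """
    if eos_token_id not in generated_ids:
        return "never_generated"
    first_eos = generated_ids.index(eos_token_id)
    # Trailing EOS tokens are treated as padding, not as continued generation
    if all(token == eos_token_id for token in generated_ids[first_eos:]):
        return "stopped_at_eos"
    return "generated_but_ignored"


def eos_rate(sequences, eos_token_id):
    """Fraction of sequences that contain at least one EOS token."""
    if not sequences:
        return 0.0
    hits = sum(1 for ids in sequences if eos_token_id in ids)
    return hits / len(sequences)


EOS_ID = 50256  # assumed id of '<|endoftext|>'
example_outputs = [[11, 12, EOS_ID], [11, 12, 13], [21, 22, 23], [31, 32, 33]]
rate = eos_rate(example_outputs, EOS_ID)
```

A "generated_but_ignored" result points at the generation settings, while "never_generated" points at the fine-tuning data or training length.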

How many examples do you think are required?

Difficult question... I would say at least 1 epoch over 50k examples, for instance.

Oh, duh. I did do that. :)

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    'microsoft/phi-2',
    trust_remote_code=True,
    add_bos_token=False,
    add_eos_token=True,
    padding_side='right',
)

# Fall back to the unknown token for padding if no pad token is defined
if not tokenizer.pad_token:
    tokenizer.pad_token = tokenizer.unk_token
    tokenizer.pad_token_id = tokenizer.unk_token_id

No, that didn't work for me. Would it be possible to end all of my training examples with '<|endoftext|>'?

If you set "add_eos_token=True" when you load the tokenizer, it automatically adds '<|endoftext|>' (the EOS token) to all your training examples.
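
Ending the examples by hand is also possible, and it makes the result easy to verify. Here is a minimal sketch, assuming the training examples are plain strings. `append_eos` and `ends_with_eos` are hypothetical helper names, not part of any library, and the EOS id in the example is an assumption.

```python
EOS_TOKEN = "<|endoftext|>"  # the EOS string mentioned above


def append_eos(text, eos_token=EOS_TOKEN):
    """Return the text ending with the EOS marker, without adding it twice."""
    if text.endswith(eos_token):
        return text
    return text + eos_token


def ends_with_eos(input_ids, eos_token_id):
    """Check that a tokenized example really ends with the EOS token id."""
    return len(input_ids) > 0 and input_ids[-1] == eos_token_id


examples = ["### Question: 2+2?\n### Answer: 4", "Already terminated<|endoftext|>"]
terminated = [append_eos(example) for example in examples]
# After tokenizing, ends_with_eos(input_ids, tokenizer.eos_token_id) should be True
# for every training example, whichever way the EOS token was added.
```

Checking the last token id of a few tokenized examples is a quick way to confirm that the EOS token actually reaches the training data.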

Woo hoo! Looking forward to it. :)