Zephyr 7B - LongLLMLingua - gte-tiny - Mistral 7B paper - OpenWebMath
Is there a GitHub repo for doing DPO?
Yes — DPO is supported in Hugging Face's TRL library via its DPOTrainer:
https://huggingface.co/docs/trl/main/en/dpo_trainer
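A minimal sketch of what DPO training with TRL looks like is below. The model and dataset names are placeholders, and the exact keyword arguments (e.g. whether `beta` is passed to the trainer directly or via a `DPOConfig`) depend on the TRL version, so the linked docs remain the authoritative reference.

```python
# Minimal DPO fine-tuning sketch with TRL's DPOTrainer.
# Model/dataset names are placeholders; exact kwargs vary by TRL version.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

model_name = "mistralai/Mistral-7B-v0.1"  # placeholder base model
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token

# Placeholder dataset: must provide "prompt", "chosen", and "rejected" columns.
train_dataset = load_dataset("your-org/preference-data", split="train")

training_args = TrainingArguments(
    output_dir="dpo-out",
    per_device_train_batch_size=2,
    num_train_epochs=1,
    learning_rate=5e-7,
)

trainer = DPOTrainer(
    model=model,
    ref_model=None,          # TRL copies the policy as the frozen reference model
    args=training_args,
    beta=0.1,                # strength of the implicit KL penalty in the DPO loss
    train_dataset=train_dataset,
    tokenizer=tokenizer,
)
trainer.train()
```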