2.0.0
v2.0 release:
- Added support for fine-tuning the Meta Llama 3 model family
- Released shibing624/DPO-En-Zh-20k-Preference, a preference dataset for training ORPO/DPO/RM models
- Fine-tuned llama-3-8b-instruct-262k with the ORPO method, producing model weights: https://huggingface.co/shibing624/llama-3-8b-instruct-262k-chinese , and the corresponding LoRA weights: https://huggingface.co/shibing624/llama-3-8b-instruct-262k-chinese-lora
What's Changed
- Updates for readme and demo ipynb and a small update for deprecated function by @ker2xu in #360
- Typo by @ker2xu in #362
- add max_length and max_prompt_length by @ZhuangXialie in #367
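
For context on the `max_length` and `max_prompt_length` options added in #367: in preference training (DPO/ORPO), parameters with these names usually cap the prompt length and the combined prompt-plus-response length in tokens. These notes do not describe the exact behavior in this project, so the following is only a minimal sketch under assumptions. The helper `truncate_pair` is hypothetical, and keeping the end of the prompt while cutting the response from the right is an assumed policy.

```python
from typing import List, Tuple


def truncate_pair(
    prompt_ids: List[int],
    response_ids: List[int],
    max_length: int,
    max_prompt_length: int,
) -> Tuple[List[int], List[int]]:
    """Hypothetical helper: fit one prompt/response pair into a token budget.

    Assumed policy (not confirmed by the release notes):
    - the prompt keeps its last `max_prompt_length` tokens;
    - the response is cut from the right so the total is at most `max_length`.
    """
    if not 0 < max_prompt_length <= max_length:
        raise ValueError("expected 0 < max_prompt_length <= max_length")

    # Keep the most recent prompt tokens when the prompt is too long.
    prompt = list(prompt_ids[-max_prompt_length:])

    # Whatever budget remains after the prompt goes to the response.
    budget = max_length - len(prompt)
    response = list(response_ids[:budget])
    return prompt, response


if __name__ == "__main__":
    prompt, response = truncate_pair(
        prompt_ids=list(range(10)),
        response_ids=list(range(100, 110)),
        max_length=12,
        max_prompt_length=8,
    )
    print(len(prompt), len(response))
```

With a preference pair, the same truncated prompt would be shared by the chosen and rejected responses, each cut to the remaining budget.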
New Contributors
Full Changelog: 1.9.0...2.0.0