Releases · InternLM/xtuner
XTuner Release V0.1.23
What's Changed
- Support InternVL 1.5/2.0 finetune by @hhaAndroid in #737
- [Bug] fix preference_collate_fn attn_mask by @HIT-cwh in #859
- bump version to 0.1.23 by @HIT-cwh in #862
Full Changelog: v0.1.22...v0.1.23
XTuner Release V0.1.22
What's Changed
- [Refactor] fix internlm2 dispatch by @HIT-cwh in #779
- Fix zero3 compatibility issue for DPO by @Johnson-Wang in #781
- [Fix] Fix map_fn in custom_dataset/sft by @fanqiNO1 in #785
- [Fix] fix configs by @HIT-cwh in #783
- [Docs] DPO and Reward Model documents by @RangiLyu in #751
- Support internlm2.5 by @HIT-cwh in #803
- [Bugs] fix dispatch bugs when model not in LOWEST_TRANSFORMERS_VERSION by @HIT-cwh in #802
- [Docs] fix benchmark table by @HIT-cwh in #801
- [Feature] support output without loss in openai_map_fn by @HIT-cwh in #816
- [Docs] fix typos in sp docs by @HIT-cwh in #821
- [Feature] Support the DatasetInfoHook of DPO training by @xu-song in #787
- [Enhance]: Fix sequence parallel memory bottleneck in DPO & ORPO by @RangiLyu in #830
- [Fix] Fix typo by @bychen7 in #795
- [Fix] fix initialization of ref_llm for full param dpo training with zero-3 by @xu-song in #778
- [Bugs] Fix attn mask by @HIT-cwh in #852
- fix lint by @HIT-cwh in #854
- [Bugs] Fix dispatch attn bug by @HIT-cwh in #829
- [Docs]: update readme and DPO en docs by @RangiLyu in #853
- Added minicpm config files to support sft, qlora, lora and dpo by @LDLINGLINGLING in #847
- fix lint by @HIT-cwh in #856
- bump version to 0.1.22 by @HIT-cwh in #855
New Contributors
- @Johnson-Wang made their first contribution in #781
- @xu-song made their first contribution in #787
- @bychen7 made their first contribution in #795
- @LDLINGLINGLING made their first contribution in #847
Full Changelog: v0.1.21...v0.1.22
XTuner Release V0.1.21
What's Changed
- [Feature] Support DPO, ORPO and Reward Model by @RangiLyu in #743
- [Bugs] fix dispatch bugs by @HIT-cwh in #775
- [Bugs] Fix HFCheckpointHook bugs when training deepseekv2 and mixtral withou… by @HIT-cwh in #774
- [Feature] Support the scenario where sp size is not divisible by attn head num by @HIT-cwh in #769
- bump version to 0.1.21 by @HIT-cwh in #776
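The DPO, ORPO, and Reward Model support added in #743 above trains on preference pairs rather than plain SFT samples. Below is a minimal sketch of what such a preference record can look like; the field names follow a common convention and are not necessarily XTuner's exact schema, which depends on the dataset map function you choose.

```python
# Illustrative preference record for DPO/ORPO-style training.
# Field names ("prompt", "chosen", "rejected") are a common convention,
# not necessarily the exact schema XTuner's map functions expect.
preference_sample = {
    "prompt": "Summarize the benefit of sequence parallelism in one sentence.",
    "chosen": "It splits long sequences across GPUs so each device holds only a "
              "fraction of the activations, reducing per-GPU memory.",
    "rejected": "Sequence parallelism makes the model smarter.",
}

# DPO optimizes the policy so that, relative to a frozen reference model, the
# log-probability margin of `chosen` over `rejected` grows; a reward model
# instead learns a scalar score that ranks `chosen` above `rejected`.
```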
Full Changelog: v0.1.20...v0.1.21
XTuner Release V0.1.20
What's Changed
- [Enhancement] Optimizing Memory Usage during ZeRO Checkpoint Convert by @pppppM in #582
- [Fix] ZeRO2 Checkpoint Convert Bug by @pppppM in #684
- [Feature] support auto saving tokenizer by @HIT-cwh in #696
- [Bug] fix internlm2 flash attn by @HIT-cwh in #693
- [Bug] The LoRA model will have `meta-tensor` during the `pth_to_hf` phase. by @pppppM in #697
- [Bug] fix cfg check by @HIT-cwh in #729
- [Bugs] Fix bugs caused by sequence parallel when deepspeed is not used. by @HIT-cwh in #752
- [Fix] Avoid incorrect `torchrun` invocation with `--launcher slurm` by @LZHgrla in #728
- [Fix] fix saving eval results failing with multi-node pretrain by @HoBeedzc in #678
- [Improve] Support the export of various LLaVA formats with `pth_to_hf` by @LZHgrla in #708
- [Refactor] refactor dispatch_modules by @HIT-cwh in #731
- [Docs] Readthedocs ZH by @pppppM in #553
- [Feature] Support finetune Deepseek v2 by @HIT-cwh in #663
- bump version to 0.1.20 by @HIT-cwh in #766
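Several items in this release (#697, #708) touch the `pth_to_hf` step, which converts a training checkpoint into HuggingFace-format weights. A minimal sketch of driving that step from Python follows; only the `xtuner convert pth_to_hf` entry point is taken from XTuner's CLI, and the config and paths are placeholders.

```python
# Sketch: convert an XTuner training checkpoint to HuggingFace format.
# `xtuner convert pth_to_hf CONFIG PTH SAVE_DIR` is the CLI entry point;
# the config and paths below are placeholders for a real run.
import subprocess

subprocess.run(
    [
        "xtuner", "convert", "pth_to_hf",
        "my_finetune_config.py",           # config used for training (placeholder)
        "work_dirs/my_run/iter_5000.pth",  # checkpoint produced by training (placeholder)
        "work_dirs/my_run/hf",             # output dir for HF-format weights (placeholder)
    ],
    check=True,
)
```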
New Contributors
Full Changelog: v0.1.19...v0.1.20
XTuner Release V0.1.19
What's Changed
- [Fix] LLaVA-v1.5 official settings by @LZHgrla in #594
- [Feature] Release LLaVA-Llama-3-8B by @LZHgrla in #595
- [Improve] Add single-gpu configs for LLaVA-Llama-3-8B by @LZHgrla in #596
- [Docs] Add wisemodel badge by @LZHgrla in #597
- [Feature] Support load_json_file with json.load by @HIT-cwh in #610
- [Feature] Support Microsoft Phi3 4K & 128K Instruct Models by @pppppM in #603
- [Fix] set `dataloader_num_workers=4` for llava training by @LZHgrla in #611
- [Fix] Do not set attn_implementation to flash_attention_2 or sdpa if users already set it in XTuner configs. by @HIT-cwh in #609
- [Release] LLaVA-Phi-3-mini by @LZHgrla in #615
- Update README.md by @eltociear in #608
- [Feature] Refine sp api by @HIT-cwh in #619
- [Feature] Add conversion scripts for LLaVA-Llama-3-8B by @LZHgrla in #618
- [Fix] Convert nan to 0 just for logging by @HIT-cwh in #625
- [Docs] Delete colab and add speed benchmark by @HIT-cwh in #617
- [Feature] Support dsz3+qlora by @HIT-cwh in #600
- [Feature] Add qwen1.5 110b cfgs by @HIT-cwh in #632
- check transformers version before dispatch by @HIT-cwh in #672
- [Fix] `convert_xtuner_weights_to_hf` with frozen ViT by @LZHgrla in #661
- [Fix] Fix batch-size setting of single-card LLaVA-Llama-3-8B configs by @LZHgrla in #598
- [Feature] add HFCheckpointHook to auto save hf model after the whole training phase by @HIT-cwh in #621
- Remove test info in DatasetInfoHook by @hhaAndroid in #622
- [Improve] Support `safe_serialization` saving by @LZHgrla in #648
- bump version to 0.1.19 by @HIT-cwh in #675
New Contributors
- @eltociear made their first contribution in #608
Full Changelog: v0.1.18...v0.1.19
XTuner Release V0.1.18
What's Changed
- set dev version by @LZHgrla in #537
- [Fix] Fix typo by @KooSung in #547
- [Feature] support mixtral varlen attn by @HIT-cwh in #564
- [Feature] Support qwen sp and varlen attn by @HIT-cwh in #565
- [Fix] Fix attention mask in `default_collate_fn` by @pppppM in #567
- Accept pytorch==2.2 as the bugs in triton 2.2 are fixed by @HIT-cwh in #548
- [Feature] Refine Sequence Parallel API by @HIT-cwh in #555
- [Fix] Enhance `split_list` to support `value` at the beginning by @LZHgrla in #568
- [Feature] Support cohere by @HIT-cwh in #569
- [Fix] Fix rotary_seq_len in varlen attn in qwen by @HIT-cwh in #574
- [Docs] Add sequence parallel related to readme by @HIT-cwh in #578
- [Bug] SUPPORT_FLASH1 = digit_version(torch.__version__) >= digit_version('2… by @HIT-cwh in #587
- [Feature] Support Llama 3 by @LZHgrla in #585
- [Docs] Add llama3 8B readme by @HIT-cwh in #588
- [Bugs] Check whether cuda is available when choosing torch_dtype in sft.py by @HIT-cwh in #577
- [Bugs] fix bugs in tokenize_ftdp_datasets by @HIT-cwh in #581
- [Feature] Support qwen moe by @HIT-cwh in #579
- [Docs] Add tokenizer to sft in Case 2 by @HIT-cwh in #583
- bump version to 0.1.18 by @HIT-cwh in #590
Full Changelog: v0.1.17...v0.1.18
XTuner Release V0.1.17
What's Changed
- [Fix] Fix PyPI package by @LZHgrla in #540
- [Improve] Add LoRA fine-tuning configs for LLaVA-v1.5 by @LZHgrla in #536
- [Configs] Add sequence_parallel_size and SequenceParallelSampler to configs by @HIT-cwh in #538
- Check shape of attn_mask during attn forward by @HIT-cwh in #543
- bump version to v0.1.17 by @LZHgrla in #542
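#538 above adds `sequence_parallel_size` and `SequenceParallelSampler` to the configs. A minimal, hedged sketch of how those two options typically sit in a config file follows; the two names come from the release note, while the import path and the surrounding dataloader fields are assumptions for illustration.

```python
# Sketch: sequence-parallel options from #538 in an XTuner config.
# `sequence_parallel_size` and `SequenceParallelSampler` are the names from
# the release note; the import path and other fields are assumptions.
from xtuner.parallel.sequence import SequenceParallelSampler  # assumed location

sequence_parallel_size = 2  # shard each sequence across 2 GPUs

train_dataloader = dict(
    batch_size=1,
    num_workers=4,
    # The sampler keeps ranks within a sequence-parallel group in sync.
    sampler=dict(type=SequenceParallelSampler, shuffle=True),
    # dataset=..., collate_fn=... omitted for brevity
)
```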
Full Changelog: v0.1.16...v0.1.17
XTuner Release V0.1.16
What's Changed
- set dev version by @LZHgrla in #487
- Fix type error when the visual encoder is not CLIP by @hhaAndroid in #496
- [Feature] Support Sequence parallel by @HIT-cwh in #456
- [Bug] Fix bugs in flash_attn1_pytorch by @HIT-cwh in #513
- [Fix] delete cat in varlen attn by @HIT-cwh in #508
- bump version to 0.1.16 by @HIT-cwh in #520
- [Improve] Add `generation_kwargs` for `EvaluateChatHook` by @LZHgrla in #501
- [Bugs] Fix bugs when training in non-distributed env by @HIT-cwh in #522
- [Fix] Support transformers>=4.38 and require transformers>=4.36.0 by @HIT-cwh in #494
- [Fix] Fix throughput hook by @HIT-cwh in #527
- Update README.md by @JianxinDong in #528
- [Fix] dispatch internlm rope by @HIT-cwh in #530
- Limit transformers != 4.38 by @HIT-cwh in #531
New Contributors
- @hhaAndroid made their first contribution in #496
- @JianxinDong made their first contribution in #528
Full Changelog: v0.1.15...v0.1.16
XTuner Release V0.1.15
What's Changed
- set dev version by @LZHgrla in #437
- [Bugs] Fix bugs when using EpochBasedRunner by @HIT-cwh in #439
- [Feature] Support processing ftdp dataset and custom dataset offline by @HIT-cwh in #410
- Update prompt_template.md by @aJupyter in #441
- [Doc] Split finetune_custom_dataset.md to 6 parts by @HIT-cwh in #445
- [Improve] Add notes for demo_data examples by @LZHgrla in #458
- [Fix] Gemma prompt_template by @LZHgrla in #454
- [Feature] Add LLaVA-InternLM2-1.8B by @LZHgrla in #449
- show more info about datasets by @amulil in #464
- [Fix] write text with `encoding='utf-8'` by @LZHgrla in #477
- support offline processing of llava data by @HIT-cwh in #448
- [Fix] `msagent_react_map_fn` error by @LZHgrla in #470
- [Improve] Reorg `xtuner/configs/llava/` configs by @LZHgrla in #483
- limit pytorch version <= 2.1.2 as there may be some bugs in triton2… by @HIT-cwh in #452
- [Fix] fix batch sampler bs by @HIT-cwh in #468
- bump version to v0.1.15 by @LZHgrla in #486
New Contributors
Full Changelog: v0.1.14...v0.1.15
XTuner Release V0.1.14
What's Changed
- set dev version by @LZHgrla in #341
- [Feature] More flexible `TrainLoop` by @LZHgrla in #348
- [Feature] Support CEPH by @pppppM in #266
- [Improve] Add `--repetition-penalty` for `xtuner chat` by @LZHgrla in #351
- [Feature] Support MMBench DDP Evaluate by @pppppM in #300
- [Fix] `KeyError` of `encode_fn` by @LZHgrla in #361
- [Fix] Fix `batch_size` of full fine-tuning LLaVA-InternLM2 by @LZHgrla in #360
- [Fix] Remove `system` for `alpaca_map_fn` by @LZHgrla in #363
- [Fix] Use `DEFAULT_IMAGE_TOKEN` instead of `'<image>'` by @LZHgrla in #353
- [Feature] Support internlm sft by @HIT-cwh in #302
- [Fix] Add `attention_mask` for `default_collate_fn` by @LZHgrla in #371
- [Fix] Update requirements by @LZHgrla in #369
- [Fix] Fix rotary_base, add `colors_map_fn` to `DATASET_FORMAT_MAPPING`, and rename 'internlm_repo' to 'intern_repo' by @HIT-cwh in #372
- update by @HIT-cwh in #377
- Delete useless codes and refactor process_untokenized_datasets by @HIT-cwh in #379
- [Feature] support flash attn 2 in internlm1, internlm2 and llama by @HIT-cwh in #381
- [Fix] Fix installation docs of mmengine in `intern_repo_dataset.md` by @LZHgrla in #384
- [Fix] Update InternLM2 `apply_rotary_pos_emb` by @LZHgrla in #383
- [Feature] support saving eval output before saving checkpoints by @HIT-cwh in #385
- fix lr scheduler setting by @gzlong96 in #394
- [Fix] Remove pre-defined `system` of `alpaca_zh_map_fn` by @LZHgrla in #395
- [Feature] Support `Qwen1.5` by @LZHgrla in #407
- [Fix] Fix no space in chat output using InternLM2. (#357) by @KooSung in #404
- [Fix] typo: `--system-prompt` to `--system-template` by @LZHgrla in #406
- [Improve] Add `output_with_loss` for dataset processing by @LZHgrla in #408
- [Fix] Fix dispatch to support transformers>=4.36 & Add USE_TRITON_KERNEL environment variable by @HIT-cwh in #411
- [Feature]Add InternLM2-Chat-1_8b full config by @KMnO4-zx in #396
- [Fix] Fix extract_json_objects by @fanqiNO1 in #419
- [Fix] Fix pth_to_hf error by @LZHgrla in #426
- [Feature] Support `Gemma` by @PommesPeter in #429
- add refcoco to llava by @LKJacky in #425
- [Fix] Inconsistent BatchSize of `LengthGroupedSampler` by @LZHgrla in #436
- bump version to v0.1.14 by @LZHgrla in #431
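#351 above adds `--repetition-penalty` to `xtuner chat`. A small sketch of invoking the chat CLI with that flag from Python is below; only `xtuner chat` and `--repetition-penalty` are taken from the release note, and the model path is a placeholder.

```python
# Sketch: call `xtuner chat` with the --repetition-penalty flag added in #351.
# The model identifier is a placeholder; substitute your own HF id or local path.
import subprocess

subprocess.run(
    [
        "xtuner", "chat",
        "internlm/internlm2-chat-7b",   # placeholder model id or local path
        "--repetition-penalty", "1.1",  # penalize repeated tokens during generation
    ],
    check=True,
)
```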
New Contributors
- @gzlong96 made their first contribution in #394
- @KooSung made their first contribution in #404
- @KMnO4-zx made their first contribution in #396
- @fanqiNO1 made their first contribution in #419
- @PommesPeter made their first contribution in #429
- @LKJacky made their first contribution in #425
Full Changelog: v0.1.13...v0.1.14