Dataset format

Dataset formats used with --train_file_dir and --validation_file_dir:

The PT (pre-training) dataset format is as follows: a txt file, one sample per line.
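As a minimal sketch of the one-sample-per-line convention, the helpers below write and read a PT text file. The function names `write_pt_file` and `read_pt_file` are hypothetical and not part of the training scripts; collapsing embedded newlines into spaces is an assumption about how to keep each sample on a single line.

```python
import os
import tempfile


def write_pt_file(path, samples):
    """Write one sample per line.

    Assumption: embedded line breaks are replaced by spaces so that
    each sample stays on a single line.
    """
    with open(path, "w", encoding="utf-8") as f:
        for sample in samples:
            f.write(" ".join(sample.splitlines()) + "\n")


def read_pt_file(path):
    """Return the non-empty lines of a PT txt file as a list of samples."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]
```

A file written this way can be placed in the directory passed to --train_file_dir.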

The SFT (supervised fine-tuning) dataset format is as follows: alpaca format, a json file, one sample per line, each sample contains the following fields:

{"instruction": "text1", "input": "text2", "output": "text3"}
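The following sketch shows one way to produce and check a line in the instruction/input/output format above, using only the standard `json` module. The helper names `to_sft_line` and `parse_sft_line` are hypothetical, and treating a missing field as an error is an assumption rather than documented behavior of the training scripts.

```python
import json

# Field names taken from the sample line shown above.
SFT_FIELDS = ("instruction", "input", "output")


def to_sft_line(instruction, output, input_text=""):
    """Serialize one sample as a single JSON line.

    json.dumps escapes line breaks inside strings, so the result never
    spans more than one line; ensure_ascii=False keeps non-ASCII text readable.
    """
    record = {"instruction": instruction, "input": input_text, "output": output}
    return json.dumps(record, ensure_ascii=False)


def parse_sft_line(line):
    """Parse one line and check that all three fields are present."""
    record = json.loads(line)
    missing = [key for key in SFT_FIELDS if key not in record]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return record
```

Writing one `to_sft_line(...)` result per line yields a file in the one-sample-per-line layout described above.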

The Reward (reward model) dataset format is as follows: a json file, one sample per line, each sample contains the following fields:

{"question": "text1", "response_chosen": "text2", "response_rejected": "text3"}
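Before training a reward model, it can help to scan the file for malformed lines. The sketch below reports the 1-based numbers of lines that are not JSON objects or that lack one of the three fields shown above. The function name `find_invalid_reward_lines` is hypothetical, and skipping blank lines is an assumption.

```python
import json
import os
import tempfile

# Field names taken from the sample line shown above.
REWARD_FIELDS = {"question", "response_chosen", "response_rejected"}


def find_invalid_reward_lines(path):
    """Return 1-based line numbers that fail the format check."""
    bad = []
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            if not line.strip():
                continue  # assumption: blank lines are ignored
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                bad.append(lineno)  # not valid JSON
                continue
            # Must be a JSON object containing every required field.
            if not isinstance(record, dict) or not REWARD_FIELDS.issubset(record):
                bad.append(lineno)
    return bad
```

An empty result means every non-blank line carries a question together with a chosen and a rejected response.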

The RL (reinforcement learning) dataset format is as follows: a json file, one sample per line, each sample contains the following fields:

{"instruction": "text1", "input": "text2", "output": "text3"}

The SFT dataset can be reused for RL.

Use --dataset_name to load a Hugging Face (HF) dataset; for the expected format, refer to shibing624/medical.
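To inspect an HF dataset before passing its name to --dataset_name, a sketch along these lines may be useful. It assumes the third-party `datasets` library is installed and that network access is available; the wrapper `load_hf_dataset` and its `config_name` parameter are hypothetical, and whether a given dataset needs a configuration name or loads at all depends on the dataset and the installed library version.

```python
def load_hf_dataset(dataset_name, config_name=None, split=None):
    """Load a dataset from the Hugging Face Hub by name.

    Assumptions: the `datasets` package is installed, and `config_name`
    is only needed when the dataset defines several configurations.
    """
    # Imported lazily so that defining this helper does not require the library.
    from datasets import load_dataset

    return load_dataset(dataset_name, config_name, split=split)
```

For example, `load_hf_dataset("shibing624/medical")` would attempt to download that dataset so its fields can be compared with the formats described above.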