Commit

disabling source_data_dir
PicoCreator committed Aug 20, 2023
1 parent 1a5557c commit 126f71b
Showing 2 changed files with 4 additions and 4 deletions.
4 changes: 2 additions & 2 deletions RWKV-v4neo/config-example.yaml
@@ -343,9 +343,9 @@ data:
 # source_dataset_params:
 # language: en

-# Use data_dir, if you are using source=text/json/etc
+# Use source_data_dir, if you are using source=text/json/etc
 # If using relative path, this should be relative to the trainer script path
-source_data_dir: ../dataset-text/
+# source_data_dir: ../dataset-text/

 # After loading the dataset, split out test data used for validation,
 # This process is skipped if the dataset includes a test split
4 changes: 2 additions & 2 deletions RWKV-v4neo/config-minimum-example.yaml
@@ -160,9 +160,9 @@ data:
 # source: "teven/enwiki_00k" # Hugging face dataset
 # source: text # Text mode, used with source_data_dir

-# Use data_dir, if you are using source=text/json/etc
+# Use source_data_dir, if you are using source=text/json/etc
 # If using relative path, this should be relative to the trainer script path
-source_data_dir: ../dataset-json-dir/
+# source_data_dir: ../dataset-json-dir/

 # Tokenizer to use, use either the inbuilt 'neox', or 'world' tokenizer
 # If using a custom tokenizer, provide the HF tokenizer name/path
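
Note on the "relative to the trainer script path" comment in both configs: the sketch below illustrates what that rule would mean when source_data_dir is re-enabled. It is only an illustration under stated assumptions. The helper name resolve_data_dir and the script path /work/RWKV-v4neo/trainer.py are hypothetical and do not come from the trainer code, which may resolve paths differently.

```python
import os


def resolve_data_dir(source_data_dir: str, trainer_script: str) -> str:
    """Resolve source_data_dir against the trainer script's directory.

    Hypothetical helper for illustration only; not part of the trainer.
    """
    # Absolute paths are used as given (only normalized).
    if os.path.isabs(source_data_dir):
        return os.path.normpath(source_data_dir)
    # Relative paths are anchored at the folder containing the script,
    # not at the current working directory.
    script_dir = os.path.dirname(os.path.abspath(trainer_script))
    return os.path.normpath(os.path.join(script_dir, source_data_dir))


if __name__ == "__main__":
    # Assumed script location, chosen only for this example.
    print(resolve_data_dir("../dataset-text/", "/work/RWKV-v4neo/trainer.py"))
```

Under this assumption, ../dataset-text/ points at a folder that sits next to RWKV-v4neo, regardless of the directory the trainer is launched from.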