Set output-ngrams to default to true and update README #1776

Merged
andyzorigin merged 5 commits into main from andyz/output_ngrams
Aug 12, 2023

Conversation

andyzorigin
Contributor

No description provided.

@@ -13,7 +13,7 @@ def get_data_overlap_args() -> Any:
required=True,
help="The format of your input file for your training data, e.g. raw, custom, the_pile",
)
- parser.add_argument("--output-ngrams", type=bool, default=False, help="Whether to output ngrams")
+ parser.add_argument("--output-ngrams", type=bool, default=True, help="Whether to output ngrams")
Contributor

If you set the default to True, there's no way to turn it off, right?

Contributor Author

Thanks for the catch, I've updated it to be --no-output-ngrams which can be passed to not output ngrams.
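
For readers following the exchange above, here is a minimal sketch of why `type=bool` is hard to switch off and how a `--no-output-ngrams` flag avoids the problem. The PR text does not show the final implementation, so the `action="store_false"` and `dest="output_ngrams"` wiring below, along with the helper function names and help text, are assumptions for illustration only.

```python
import argparse


def build_type_bool_parser() -> argparse.ArgumentParser:
    # Mirrors the line under review: argparse calls bool() on the raw string.
    parser = argparse.ArgumentParser()
    parser.add_argument("--output-ngrams", type=bool, default=True, help="Whether to output ngrams")
    return parser


def build_no_flag_parser() -> argparse.ArgumentParser:
    # Hypothetical wiring for the follow-up fix: the flag takes no value,
    # and passing it stores False in args.output_ngrams (default stays True).
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--no-output-ngrams",
        dest="output_ngrams",
        action="store_false",
        help="Do not output ngrams",
    )
    return parser


buggy = build_type_bool_parser()
fixed = build_no_flag_parser()

# bool("False") is True because any non-empty string is truthy.
print(buggy.parse_args(["--output-ngrams", "False"]).output_ngrams)

# With the store_false flag, omitting it keeps the default and passing it disables output.
print(fixed.parse_args([]).output_ngrams)
print(fixed.parse_args(["--no-output-ngrams"]).output_ngrams)
```

With `type=bool`, only an empty string would parse as `False`, which is why a dedicated negative flag is the more usual pattern.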

@@ -24,14 +24,15 @@ This needs to be run from the data overlap directory; i.e. cd scripts/data_overl

Usage:

- python [compute_data_overlap_metrics.py OR run_data_overlap_beam.py] --input-data <input_data> --scenario-data <scenario_data> --output-stats <output_stats> --input-format <input_format> 
+ python [compute_data_overlap_metrics.py OR run_data_overlap_beam.py] --input-data <input_data> --scenario-data <scenario_data> --output-stats <output_stats> --input-format <input_format>
Contributor

delete trailing space?

Contributor Author

Deleted, thanks

@andyzorigin andyzorigin merged commit de02a1e into main Aug 12, 2023
3 checks passed
@andyzorigin andyzorigin deleted the andyz/output_ngrams branch August 12, 2023 00:41
danielz02 pushed a commit to danielz02/helm that referenced this pull request Sep 7, 2023