Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
* support remote jsonl files for IFT datasets
* improve docstring
* add support for other extensions
* don't duplicate validation check
* build dataset before tmpdir deletes
* parse uri
* only rank 0 download
* only download rank 0
* better error
* break earlier
* log more
* more reasonable destination str
* use data files format
* name points to a preprocessing function I guess
* debugging
* always something with HF
* json vs jsonl [no-ci]
* if hf wants it local, make it local [no-ci]
* back to tempfile [no-ci]
* debug
* debug hfds [no-ci]
* ... [no-ci]
* don't rename file
* use tempfile again
* updt

---------

Co-authored-by: Vitaliy Chiley <[email protected]>
Co-authored-by: root <[email protected]>
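The diff itself is not shown here, so the following is only a minimal sketch of the pattern several of the items above describe ("parse uri", "add support for other extensions", "use tempfile again", "build dataset before tmpdir deletes"). It is not the code from this commit. Every name in it (`is_remote`, `read_records`, `load_examples`, `fake_download`, `SUPPORTED_EXTENSIONS`) is hypothetical, the downloader is a stand-in that writes canned rows instead of calling an object store, and the rank-0-only download is left out because it needs a distributed runtime.

```python
import json
import os
import tempfile
from pathlib import Path
from urllib.parse import urlparse

# Assumption: the set of accepted extensions is illustrative only.
SUPPORTED_EXTENSIONS = ('.jsonl', '.json')


def is_remote(uri: str) -> bool:
    """Treat any URI with a non-file scheme (s3://, gs://, https://) as remote."""
    return urlparse(uri).scheme not in ('', 'file')


def read_records(path: str) -> list:
    """Read a .jsonl file (one JSON object per line) or a .json file."""
    with open(path, encoding='utf-8') as f:
        if path.endswith('.jsonl'):
            return [json.loads(line) for line in f if line.strip()]
        data = json.load(f)
    return data if isinstance(data, list) else [data]


def load_examples(uri: str, download) -> list:
    """Load records from a local path or a remote URI.

    `download(uri, destination)` is a caller-supplied, hypothetical callable
    that copies the remote object to a local file.
    """
    parsed = urlparse(uri)
    extension = Path(parsed.path).suffix
    if extension not in SUPPORTED_EXTENSIONS:
        raise ValueError(
            f'Unsupported extension {extension!r}; expected one of {SUPPORTED_EXTENSIONS}.'
        )
    if not is_remote(uri):
        return read_records(parsed.path)
    with tempfile.TemporaryDirectory() as tmp_dir:
        destination = os.path.join(tmp_dir, 'data' + extension)
        download(uri, destination)
        # Materialize the records while the temporary directory still exists;
        # the file is gone as soon as the `with` block exits.
        return read_records(destination)


def fake_download(uri: str, destination: str) -> None:
    """Stand-in for a real object-store client: writes two canned rows."""
    rows = [
        {'prompt': 'hi', 'response': 'hello'},
        {'prompt': 'bye', 'response': 'goodbye'},
    ]
    with open(destination, 'w', encoding='utf-8') as f:
        for row in rows:
            f.write(json.dumps(row) + '\n')
```

Calling `load_examples('s3://my-bucket/train.jsonl', download=fake_download)` returns the two canned rows. Reading the file inside the `with` block is the point of the sketch: anything that loads lazily from the temporary path would fail once the directory is cleaned up.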
1 parent 2f1bf41, commit af209b3
Showing 1 changed file with 51 additions and 3 deletions.