Describe the bug
```python
from datasets import Dataset, DownloadConfig, load_dataset

# Note: this line only references the method; it is never called,
# so no cache files are actually cleaned up here.
Dataset.cleanup_cache_files

scrolls_datasets = ["quality"]
download_config = DownloadConfig(force_download=True)
data = [
    load_dataset("tau/scrolls", dataset, force_download=True, download_config=download_config)
    for dataset in scrolls_datasets
]
```
Reproduction
No response
Logs
$ python main.py
Downloading builder script: 20%|██████████████████▌ | 3.68k/18.6k [00:00<00:01, 11.4kB/s]
Traceback (most recent call last):
File "main.py", line 12, in <module>
data = [load_dataset("tau/scrolls", dataset, force_download=True, download_config=download_config) for dataset in scrolls_datasets]
File "main.py", line 12, in <listcomp>
data = [load_dataset("tau/scrolls", dataset, force_download=True, download_config=download_config) for dataset in scrolls_datasets]
File "/opt/conda/lib/python3.8/site-packages/datasets/load.py", line 2606, in load_dataset
builder_instance = load_dataset_builder(
File "/opt/conda/lib/python3.8/site-packages/datasets/load.py", line 2277, in load_dataset_builder
dataset_module = dataset_module_factory(
File "/opt/conda/lib/python3.8/site-packages/datasets/load.py", line 1923, in dataset_module_factory
raise e1 from None
File "/opt/conda/lib/python3.8/site-packages/datasets/load.py", line 1889, in dataset_module_factory
return HubDatasetModuleFactoryWithScript(
File "/opt/conda/lib/python3.8/site-packages/datasets/load.py", line 1507, in get_module
local_path = self.download_loading_script()
File "/opt/conda/lib/python3.8/site-packages/datasets/load.py", line 1467, in download_loading_script
return cached_path(file_path, download_config=download_config)
File "/opt/conda/lib/python3.8/site-packages/datasets/utils/file_utils.py", line 211, in cached_path
output_path = get_from_cache(
File "/opt/conda/lib/python3.8/site-packages/datasets/utils/file_utils.py", line 690, in get_from_cache
fsspec_get(
File "/opt/conda/lib/python3.8/site-packages/datasets/utils/file_utils.py", line 396, in fsspec_get
fs.get_file(path, temp_file.name, callback=callback)
File "/opt/conda/lib/python3.8/site-packages/huggingface_hub/hf_file_system.py", line 640, in get_file
http_get(
File "/opt/conda/lib/python3.8/site-packages/huggingface_hub/file_download.py", line 570, in http_get
raise EnvironmentError(
OSError: Consistency check failed: file should be of size 18612 but has size 18605 (datasets/tau/scrolls@main/scrolls.py).
We are sorry for the inconvenience. Please retry with `force_download=True`.
If the issue persists, please let us know by opening an issue on https://github.com/huggingface/huggingface_hub.
Downloading builder script: 100%|█████████████████████████████████████████████████████████████████████████████████████████████▉| 18.6k/18.6k [00:00<00:00, 36.0kB/s]
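For context, the `OSError` in the log reports that the downloaded file is 7 bytes shorter than expected (18605 instead of 18612). The snippet below is only a minimal sketch of what such a size check amounts to; `check_downloaded_size` is a hypothetical helper written for illustration, not the actual `huggingface_hub` code.

```python
import os


def check_downloaded_size(path: str, expected_size: int) -> None:
    """Raise OSError if the file on disk does not have the expected size.

    Hypothetical helper for illustration; not part of huggingface_hub.
    """
    actual_size = os.path.getsize(path)
    if actual_size != expected_size:
        raise OSError(
            f"Consistency check failed: file should be of size {expected_size} "
            f"but has size {actual_size} ({path})."
        )
```

A truncated or otherwise altered download fails this comparison, which is why a transient network problem can surface as this error.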
Hi @kaiqinhu, sorry for the inconvenience. This is usually caused by a network issue during the download. Can you retry with `force_download=True`, or on a different network, and let us know whether the same error happens again on the same file? Thanks in advance.
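One way to act on the retry suggestion is sketched below. It assumes the failure is transient and simply retries on `OSError` with a short pause. `retry_on_oserror` and `load_scrolls_quality` are hypothetical helper names introduced here for illustration; `download_mode="force_redownload"` is the `load_dataset` option that bypasses the local cache.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_on_oserror(fetch: Callable[[], T], attempts: int = 3, delay_s: float = 1.0) -> T:
    """Call `fetch` until it succeeds or `attempts` calls have raised OSError.

    Hypothetical helper: the pause grows linearly with the attempt number.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except OSError:
            if attempt == attempts:
                raise  # out of attempts: surface the last error unchanged
            time.sleep(delay_s * attempt)
    raise AssertionError("unreachable")


def load_scrolls_quality():
    """Load the "quality" config of tau/scrolls, forcing a fresh download.

    Imported lazily so the retry helper can be used without `datasets` installed.
    Assumption: some `datasets` versions may additionally require
    `trust_remote_code=True` for script-based datasets such as this one.
    """
    from datasets import load_dataset

    return retry_on_oserror(
        lambda: load_dataset("tau/scrolls", "quality", download_mode="force_redownload")
    )
```

If the same size mismatch appears on every attempt and on a different network, the problem is probably not transient and is worth reporting with the full log.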