Download datasets


1. TU datasets

Nothing to do. The TU datasets are automatically downloaded.


2. MNIST/CIFAR10 super-pixel datasets

MNIST size is 1.39GB and CIFAR10 size is 2.51GB.

# At the root of the project
cd data/ 
bash script_download_superpixels.sh

The script script_download_superpixels.sh is located in the data/ directory. Code to reproduce the MNIST and CIFAR10 super-pixel datasets is also available.
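Because the two super-pixel archives total about 3.9GB, it can help to confirm there is enough free disk space before starting the download. The following is a minimal sketch, not part of the repository: the helper name has_free_space is hypothetical, and it treats 1MB as 1024KB, which slightly overestimates the space required.

```shell
# Hypothetical helper (not part of the repository):
# succeed only if DIR has at least REQUIRED_MB megabytes free.
has_free_space() {
  dir="$1"
  required_mb="$2"
  # 'df -Pk' prints one POSIX-format line per filesystem;
  # column 4 of the second line is the available space in 1024-byte blocks.
  available_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
  [ "$available_kb" -ge $((required_mb * 1024)) ]
}

# Example use, from the root of the project
# (3900 MB = 1.39GB for MNIST + 2.51GB for CIFAR10):
#   has_free_space data/ 3900 && (cd data/ && bash script_download_superpixels.sh)
```

The check only guards against starting a download that cannot finish; it does not account for any extra space needed while archives are unpacked.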


3. ZINC molecular dataset

ZINC size is 58.9MB.

ZINC-full size is 1.14GB.

# At the root of the project
cd data/ 
bash script_download_molecules.sh

The script script_download_molecules.sh is located in the data/ directory. Code to reproduce the ZINC and ZINC-full datasets is available; see ../data/molecules/prepare_molecules.ipynb.


4. PATTERN/CLUSTER SBM datasets

PATTERN size is 1.98GB and CLUSTER size is 1.26GB.

# At the root of the project
cd data/ 
bash script_download_SBMs.sh

The script script_download_SBMs.sh is located in the data/ directory. Code to reproduce the PATTERN and CLUSTER datasets is also available.


5. TSP dataset

TSP size is 1.87GB.

# At the root of the project
cd data/ 
bash script_download_TSP.sh

The script script_download_TSP.sh is located in the data/ directory. Code to reproduce the TSP dataset is also available.
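Each download section follows the same pattern: start at the root of the project, enter data/, and run one script. A minimal sketch of a wrapper for that pattern is shown below. The function name run_download is hypothetical and not part of the repository; it runs the script in a subshell so the caller's working directory is left unchanged.

```shell
# Hypothetical helper (not part of the repository):
# run one download script from the root of the project.
run_download() {
  script="$1"
  if [ ! -f "data/$script" ]; then
    echo "data/$script not found; run this from the root of the project" >&2
    return 1
  fi
  # The parentheses start a subshell, so the 'cd' does not leak out.
  ( cd data/ && bash "$script" )
}

# Example use, from the root of the project:
#   run_download script_download_TSP.sh
```

The early file check turns a wrong starting directory into a clear error message instead of a failed cd.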


6. CSL dataset

CSL size is 27KB.

# At the root of the project
cd data/ 
bash script_download_CSL.sh

The script script_download_CSL.sh is located in the data/ directory.


7. COLLAB dataset

COLLAB size is 360MB.

No script to run. The COLLAB dataset files will be automatically downloaded from OGB when running the experiment files for COLLAB.


8. All datasets

# At the root of the project
cd data/ 
bash script_download_all_datasets.sh

The script script_download_all_datasets.sh is located in the data/ directory.
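To estimate the disk space needed for everything at once, the sizes listed in sections 2 to 6 can be summed. The sketch below does that arithmetic under two assumptions: the sizes use decimal units (1GB = 1000MB), and the combined script fetches exactly the datasets that have their own download script. The function name total_download_gb is hypothetical. COLLAB (360MB) is left out because it is downloaded from OGB when the experiments run.

```shell
# Hypothetical helper (not part of the repository):
# sum the dataset sizes quoted in this document, in MB, and print the total in GB.
total_download_gb() {
  # MNIST, CIFAR10, ZINC, ZINC-full, PATTERN, CLUSTER, TSP, CSL
  printf '%s\n' 1390 2510 58.9 1140 1980 1260 1870 0.027 |
    LC_ALL=C awk '{ sum += $1 } END { printf "%.2f\n", sum / 1000 }'
}

# Example use:
#   total_download_gb    # prints 10.21
```

Adding COLLAB brings the total to roughly 10.57GB, so about 11GB of free space is a reasonable working figure under these assumptions.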