CommonsDownloadTool is a Python script used to download and package thousands of Lingua Libre recordings from Wikimedia Commons into .zip archives.
- Operations/create_datasets.sh - LinguaLibre.org server script which calls and runs CommonsDownloadTool.
- lingualibre.org/datasets/ - LinguaLibre.org page to access the generated archives.
.zip archives generated by this tool ship with a comment detailing how many recordings they contain.
You can read this comment with the following commands:
- On Linux:
zipinfo -z archive.zip
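On systems without zipinfo, the same comment can probably be read with Python's standard zipfile module. The following is a minimal sketch, not part of CommonsDownloadTool itself: the function name read_zip_comment is hypothetical, and decoding the comment as UTF-8 is an assumption about how the archives are written.

```python
import zipfile


def read_zip_comment(path):
    """Return the archive-level comment of a .zip file as text.

    Hypothetical helper for illustration; not part of CommonsDownloadTool.
    """
    with zipfile.ZipFile(path) as archive:
        # ZipFile.comment holds the raw comment as bytes.
        # UTF-8 is assumed here; undecodable bytes are replaced, not fatal.
        return archive.comment.decode("utf-8", errors="replace")
```

Calling print(read_zip_comment("archive.zip")) should then show the same text that zipinfo -z prints as the archive comment.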
- Phabricator: Lingua-libre > Datasets and mass downloads column (ticket manager)
- GitHub: Lingua-libre/CommonsDownloadTool (Python code)