Giga is a UNICEF-ITU global initiative to connect every school to the Internet and every young person to information, opportunity, and choice. By connecting all schools to the Internet, we ensure that every child has a fair shot at success in an increasingly digital world.
This work leverages deep learning and high-resolution satellite images for automated school mapping and is developed under Giga, a global initiative by UNICEF-ITU to connect every school to the internet by 2030.
Obtaining complete and accurate information on school locations is a critical first step to accelerating digital connectivity and driving progress towards SDG4: Quality Education. However, GPS coordinates of schools are often inaccurate, incomplete, or missing entirely in many developing countries. In support of the Giga initiative, we leverage computer vision and remote sensing data to accelerate school mapping. This work aims to support government agencies and connectivity providers in improving school location data, so that they can better estimate the costs of digitally connecting schools and plan the strategic allocation of their financial resources.
- Present a publicly available, end-to-end pipeline for automated school location detection from high-resolution satellite images.
- Help governments improve the quality of school location information in their national register.
- Identify new, previously unmapped schools in a way that is quick, efficient, and scalable.
For each school and non-school location in our dataset, we downloaded 300 x 300 m, 500 x 500 px high-resolution satellite images from Maxar with a spatial resolution of 60 cm/px.
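The tile dimensions and the resolution quoted above agree with each other: 300 m spread over 500 px is 0.6 m/px, or 60 cm/px. The sketch below shows that arithmetic, plus a helper that computes a square bounding box around a point in a projected, metre-based coordinate system. The function names and the sample centre point are illustrative assumptions and are not taken from the repository.

```python
def ground_resolution(extent_m: float, size_px: int) -> float:
    """Return the ground sampling distance in metres per pixel."""
    return extent_m / size_px


def square_bounds(x: float, y: float, extent_m: float) -> tuple:
    """Return (min_x, min_y, max_x, max_y) for a square tile centred on (x, y).

    Assumes x and y are expressed in a projected CRS whose units are metres.
    """
    half = extent_m / 2
    return (x - half, y - half, x + half, y + half)


# 300 m across 500 px gives 0.6 m/px, i.e. 60 cm/px.
resolution = ground_resolution(300, 500)

# Hypothetical centre point, used only to illustrate the bounding box.
bounds = square_bounds(500_000.0, 1_000_000.0, 300)
```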
- ML/DL Frameworks: Scikit-learn, PyTorch
- Programming Language: Python
- Geospatial Libraries: GeoPandas, Rasterio, Fiona, GDAL
conda create -n envname python==3.10.13
conda activate envname
pip install -r requirements.txt
Fixing the Google Maps bug:
Navigate to your site packages, e.g. /anaconda/envs/envname/lib/python3.10/site-packages. Under leafmap/common.py, find the function download_google_buildings() and replace the building URL as follows:
#building_url = "https://sites.research.google/open-buildings/tiles.geojson"
building_url = "https://openbuildings-public-dot-gweb-research.uw.r.appspot.com/public/tiles.geojson"
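If you would rather not edit the file by hand, the same substitution can be scripted. The sketch below assumes the old URL appears verbatim in your installed copy of leafmap/common.py, which may not hold for every leafmap version. The helper name `patch_building_url` is made up for this example.

```python
OLD_URL = "https://sites.research.google/open-buildings/tiles.geojson"
NEW_URL = "https://openbuildings-public-dot-gweb-research.uw.r.appspot.com/public/tiles.geojson"


def patch_building_url(source: str) -> str:
    """Return the source text with the old tiles URL swapped for the new one."""
    return source.replace(OLD_URL, NEW_URL)


# Demonstration on an in-memory snippet. To patch the real file, read
# leafmap/common.py from your site packages, apply the function, and
# write the result back (keeping a backup copy is advisable).
snippet = 'building_url = "https://sites.research.google/open-buildings/tiles.geojson"'
patched = patch_building_url(snippet)
```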
To install GDAL/OGR, follow these instructions.
To download the relevant datasets, run python src/data_download.py:
usage: data_download.py [-h] [--config CONFIG] [--profile PROFILE]
Data Download
options:
-h, --help show this help message and exit
--config CONFIG Path to the configuration file
--profile PROFILE Path to the profile file
To run the data cleaning pipeline, run python src/data_preparation.py:
usage: data_preparation.py [-h] [--config CONFIG] [--name NAME]
[--sources SOURCES [SOURCES ...]] [--clean_pos CLEAN_POS] [--clean_neg CLEAN_NEG]
Data Cleaning Pipeline
options:
-h, --help show this help message and exit
--config CONFIG Path to the configuration file
--name NAME Folder name
--sources SOURCES [SOURCES ...] Sources (e.g. unicef, osm, overture)
--clean_pos CLEAN_POS Clean positive samples (Boolean indicator)
--clean_neg CLEAN_NEG Clean negative samples (Boolean indicator)
To download Maxar satellite images, run python src/sat_download.py:
usage: sat_download.py [-h] [--config CONFIG] [--creds CREDS]
[--category CATEGORY] [--iso_code ISO_CODE] [--filename FILENAME]
Satellite Image Download
options:
-h, --help show this help message and exit
--config CONFIG Path to the configuration file
--creds CREDS Path to the credentials file
--category CATEGORY Category (e.g. school or non_school)
--iso_code ISO_CODE ISO 3166-1 alpha-3 code
--filename FILENAME Filename of data (optional)
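Since --iso_code expects an ISO 3166-1 alpha-3 code, a quick format check before launching a long download can catch typos such as a two-letter code. The sketch below checks only the shape (three uppercase ASCII letters), not whether the code is actually assigned to a country. Requiring uppercase is an assumption, and the function name is illustrative.

```python
import re

# Three uppercase ASCII letters, e.g. "KEN". Format only, not a registry lookup.
_ALPHA3_PATTERN = re.compile(r"[A-Z]{3}")


def looks_like_alpha3(code: str) -> bool:
    """Return True if the string has the shape of an ISO 3166-1 alpha-3 code."""
    return _ALPHA3_PATTERN.fullmatch(code) is not None


valid = looks_like_alpha3("KEN")
too_short = looks_like_alpha3("KE")
```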
To train the computer vision models, run python src/train_cnn.py:
usage: train_cnn.py [-h] [--cnn_config CNN_CONFIG] [--lr_finder LR_FINDER] [--iso ISO [ISO ...]]
Model Training
options:
-h, --help show this help message and exit
--cnn_config CNN_CONFIG Path to the configuration file
--lr_finder LR_FINDER Learning rate finder (Boolean indicator)
--iso ISO [ISO ...] ISO 3166-1 alpha-3 codes
For model prediction, run:
python sat_predict.py \
--data_config="configs/<DATA_CONFIG_FILE_NAME>.yaml" \
--model_config="configs/cnn_configs/<CNN_CONFIG_FILE_NAME>.yaml" \
--sat_config="configs/sat_configs/<SAT_CONFIG_FILE_NAME>.yaml" \
--sat_creds="configs/sat_configs/<SAT_CREDENTIALS_FILE_NAME>.yaml" \
--iso="<ISO_CODE>"
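When running predictions for several countries, it can be convenient to assemble the command above from Python. The sketch below only builds the argument list and does not execute anything. The function name and the config file names passed to it are hypothetical placeholders, so substitute the files that exist in your configs/ folder.

```python
import shlex


def build_predict_command(data_config, model_config, sat_config, sat_creds, iso):
    """Return the sat_predict.py invocation as an argument list."""
    return [
        "python",
        "sat_predict.py",
        f"--data_config={data_config}",
        f"--model_config={model_config}",
        f"--sat_config={sat_config}",
        f"--sat_creds={sat_creds}",
        f"--iso={iso}",
    ]


# Hypothetical file names, for illustration only.
command = build_predict_command(
    "configs/data.yaml",
    "configs/cnn_configs/model.yaml",
    "configs/sat_configs/sat.yaml",
    "configs/sat_configs/creds.yaml",
    "KEN",
)

# Shell-quoted form, useful for logging what will be run.
printable = shlex.join(command)
```

Passing `command` to `subprocess.run(command, check=True)` would then launch the script and raise an error on a non-zero exit code.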
Thank you for considering contributing to Giga! We value your input and aim to make the contribution process as accessible and transparent as possible. Whether you're interested in reporting bugs, discussing code, submitting fixes, proposing features, becoming a maintainer, or engaging with the Giga community, we welcome your involvement.
See the Contribution Guidelines for details.
This repository is divided into the following files and folders:
- notebooks/: contains all Jupyter notebooks for exploratory data analysis and model prediction.
- utils/: contains utility methods for loading datasets, building models, and performing training routines.
- src/: contains runnable scripts for automated data cleaning and model training/evaluation.
The datasets are organized as follows:
data
├── rasters
│   ├── maxar
│   │   ├── ISO
│   │   │   ├── school
│   │   │   │   ├── UNICEF-ISO-SCHOOL-00000001.tiff
│   │   │   │   └── ...
│   │   │   ├── non_school
│   │   │   │   ├── UNICEF-ISO-NON_SCHOOL-00000001.tiff
│   │   │   │   └── ...
│   │   │   └── ...
│   │   └── ...
└── vectors
    ├── school
    │   ├── unicef
    │   │   ├── ISO_unicef.geojson
    │   │   └── ...
    │   ├── osm
    │   │   ├── ISO_osm.geojson
    │   │   └── ...
    │   ├── overture
    │   │   ├── ISO_overture.geojson
    │   │   └── ...
    └── non_school
        ├── osm
        │   ├── ISO_osm.geojson
        │   └── ...
        └── overture
            ├── ISO_overture.geojson
            └── ...
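The layout above implies a predictable path for each file. The sketch below builds those paths, assuming that ISO in the tree stands for the country's alpha-3 code and that image IDs are zero-padded to eight digits. Both assumptions are inferred from the single example file names shown, and the helper names are illustrative rather than taken from the repository.

```python
from pathlib import PurePosixPath


def raster_path(iso: str, category: str, index: int, root: str = "data") -> PurePosixPath:
    """Path of a Maxar image tile, e.g. category 'school' or 'non_school'."""
    # The category is upper-cased inside the file name, as in UNICEF-ISO-NON_SCHOOL-...
    name = f"UNICEF-{iso}-{category.upper()}-{index:08d}.tiff"
    return PurePosixPath(root, "rasters", "maxar", iso, category, name)


def vector_path(iso: str, category: str, source: str, root: str = "data") -> PurePosixPath:
    """Path of a vector file, e.g. source 'unicef', 'osm', or 'overture'."""
    return PurePosixPath(root, "vectors", category, f"{iso}_{source}.geojson")


# "KEN" is used purely as a sample country code.
tile = raster_path("KEN", "non_school", 1)
points = vector_path("KEN", "school", "osm")
```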
At Giga, we're committed to maintaining an environment that's respectful, inclusive, and harassment-free for everyone involved in our project and community. We welcome contributors and participants from diverse backgrounds and pledge to uphold these standards.
See the Code of Conduct for details.
Applied Science AI-enabled School Mapping Team:
- Isabelle Tingzon: [email protected]
- Ivan Dotu Rodriguez: [email protected]
Giga Website: https://giga.global/contact-us/
Global high-resolution satellite images (60 cm/px) from Maxar were made available with the generous support of the US State Department. We are also grateful to Dell for providing us with access to High Performance Computing (HPC) clusters with NVIDIA GPU support.