Towards global school mapping using AI & satellite images in support of Giga, a UNICEF-ITU Initiative

unicef/giga-global-school-mapping

UNICEF Giga: AI-enabled School Mapping

Table of Contents

  1. About Giga
  2. About
  3. Getting Started
  4. Contribution Guidelines
  5. Code Design
  6. Code of Conduct
  7. License
  8. Contact
  9. Acknowledgments

About Giga

Giga is a UNICEF-ITU global initiative to connect every school to the Internet and every young person to information, opportunity, and choice. By connecting all schools to the Internet, we ensure that every child has a fair shot at success in an increasingly digital world.

About

This work leverages deep learning and high-resolution satellite images for automated school mapping and is developed under Giga, a global initiative by UNICEF-ITU to connect every school to the internet by 2030.

Obtaining complete and accurate information on school locations is a critical first step to accelerating digital connectivity and driving progress towards SDG4: Quality Education. However, the GPS coordinates of schools are often inaccurate, incomplete, or missing entirely in many developing countries. In support of the Giga initiative, we leverage computer vision and remote sensing data to accelerate school mapping. This work aims to support government agencies and connectivity providers in improving school location data so they can better estimate the costs of digitally connecting schools and plan the strategic allocation of their financial resources.

Project Objective

  • Present a publicly available, end-to-end pipeline for automated school location detection from high-resolution satellite images.
  • Help governments improve the quality of school location information in their national register.
  • Identify new, previously unmapped schools in a way that is quick, efficient, and scalable.

System Flow Diagram

For each school and non-school location in our dataset, we downloaded 300 x 300 m, 500 x 500 px high-resolution satellite images from Maxar with a spatial resolution of 60 cm/px.
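
The figures above are mutually consistent: 300 m spread over 500 px gives 0.6 m (60 cm) per pixel. The sketch below checks that arithmetic and shows one way to approximate a 300 x 300 m bounding box around a point. It is an illustration only; the function names and the spherical-Earth approximation are assumptions, and the repository may compute image extents differently (for example, by buffering points in a projected CRS).

```python
import math

TILE_SIZE_M = 300.0  # ground footprint per side, as stated above
TILE_SIZE_PX = 500   # image width and height in pixels, as stated above
METERS_PER_DEGREE_LAT = 111_320.0  # rough spherical-Earth value (assumption)


def ground_resolution_m(tile_size_m=TILE_SIZE_M, tile_size_px=TILE_SIZE_PX):
    """Ground distance covered by one pixel, in metres."""
    return tile_size_m / tile_size_px


def square_bbox(lat, lon, size_m=TILE_SIZE_M):
    """Approximate (min_lon, min_lat, max_lon, max_lat) of a square centred on a point.

    Hypothetical helper: good enough for illustration, not for precise tiling.
    """
    half = size_m / 2.0
    dlat = half / METERS_PER_DEGREE_LAT
    # Degrees of longitude shrink with the cosine of the latitude.
    dlon = half / (METERS_PER_DEGREE_LAT * math.cos(math.radians(lat)))
    return (lon - dlon, lat - dlat, lon + dlon, lat + dlat)
```

The approximation degrades near the poles, where the cosine term approaches zero.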

Built With

  • ML/DL Frameworks: scikit-learn, PyTorch
  • Programming Language: Python
  • Geospatial Libraries: GeoPandas, Rasterio, Fiona, GDAL

Getting Started

Setup

conda create -n envname python==3.10.13
conda activate envname
pip install -r requirements.txt

Fixing the Google Maps bug: Navigate to your site-packages directory, e.g. /anaconda/envs/envname/lib/python3.10/site-packages. In leafmap/common.py, find the function download_google_buildings() and replace the building URL as follows:

#building_url = "https://sites.research.google/open-buildings/tiles.geojson"
building_url = "https://openbuildings-public-dot-gweb-research.uw.r.appspot.com/public/tiles.geojson"
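
If you are unsure where site-packages lives in your environment, Python can report the location of the file to patch. The helper below is a hypothetical convenience, not part of this repository or of leafmap; it relies only on the standard library's importlib.util.find_spec.

```python
import importlib.util


def find_module_file(module_name):
    """Return the file path of an installed module, or None if it cannot be found."""
    try:
        spec = importlib.util.find_spec(module_name)
    except ModuleNotFoundError:
        # Raised for dotted names when the parent package is not installed.
        return None
    return spec.origin if spec is not None else None
```

Running print(find_module_file("leafmap.common")) inside the activated environment should print the path of the common.py file to edit, or None if leafmap is not installed. Note that resolving a dotted name imports the parent package.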

To install GDAL/OGR, follow these instructions.

Data Download

To download the relevant datasets, run python src/data_download.py:

usage: data_download.py [-h] [--config CONFIG] [--profile PROFILE]

Data Download
options:
  -h, --help         show this help message and exit
  --config CONFIG    Path to the configuration file
  --profile PROFILE  Path to the profile file

Data Preparation

To run the data cleaning pipeline, run python src/data_preparation.py:

usage: data_preparation.py [-h] [--config CONFIG] [--name NAME] 
[--sources SOURCES [SOURCES ...]] [--clean_pos CLEAN_POS] [--clean_neg CLEAN_NEG]

Data Cleaning Pipeline
options:
  -h, --help            show this help message and exit
  --config CONFIG       Path to the configuration file
  --name NAME           Folder name
  --sources SOURCES [SOURCES ...] Sources (e.g. unicef, osm, overture)
  --clean_pos CLEAN_POS Clean positive samples (Boolean indicator)
  --clean_neg CLEAN_NEG Clean negative samples (Boolean indicator)
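
For readers who want to see how such an interface could be declared, here is a minimal argparse sketch that reproduces the options listed above. It is not the repository's actual parser: the str2bool helper, the absence of defaults, and the accepted boolean spellings are all assumptions. The helper is included because argparse's type=bool treats any non-empty string, including "False", as True.

```python
import argparse


def str2bool(value):
    """Parse a textual boolean (hypothetical helper, not taken from the repository)."""
    text = str(value).strip().lower()
    if text in {"true", "t", "yes", "y", "1"}:
        return True
    if text in {"false", "f", "no", "n", "0"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    """Build a parser mirroring the usage text above; defaults are left unset."""
    parser = argparse.ArgumentParser(description="Data Cleaning Pipeline")
    parser.add_argument("--config", help="Path to the configuration file")
    parser.add_argument("--name", help="Folder name")
    parser.add_argument(
        "--sources", nargs="+", help="Sources (e.g. unicef, osm, overture)"
    )
    parser.add_argument(
        "--clean_pos", type=str2bool, help="Clean positive samples (Boolean indicator)"
    )
    parser.add_argument(
        "--clean_neg", type=str2bool, help="Clean negative samples (Boolean indicator)"
    )
    return parser
```

With this sketch, passing --sources unicef osm yields the list ["unicef", "osm"], and --clean_neg False is parsed as the boolean False rather than a truthy string.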

Satellite Image Download

To download Maxar satellite images, run python src/sat_download.py:

usage: sat_download.py [-h] [--config CONFIG] [--creds CREDS] 
[--category CATEGORY] [--iso_code ISO_CODE] [--filename FILENAME]

Satellite Image Download
options:
  -h, --help           show this help message and exit
  --config CONFIG      Path to the configuration file
  --creds CREDS        Path to the credentials file
  --category CATEGORY  Category (e.g. school or non_school)
  --iso_code ISO_CODE  ISO 3166-1 alpha-3 code
  --filename FILENAME  Filename of data (optional)

Model Training

To train the computer vision models, run python src/train_cnn.py:

usage: train_cnn.py [-h] [--cnn_config CNN_CONFIG] [--lr_finder LR_FINDER] [--iso ISO [ISO ...]]

Model Training
options:
  -h, --help              show this help message and exit
  --cnn_config CNN_CONFIG Path to the configuration file
  --lr_finder LR_FINDER   Learning rate finder (Boolean indicator)
  --iso ISO [ISO ...]     ISO 3166-1 alpha-3 codes

Model Prediction

For model prediction, run:

python sat_predict.py \
--data_config="configs/<DATA_CONFIG_FILE_NAME>.yaml" \
--model_config="configs/cnn_configs/<CNN_CONFIG_FILE_NAME>.yaml" \
--sat_config="configs/sat_configs/<SAT_CONFIG_FILE_NAME>.yaml" \
--sat_creds="configs/sat_configs/<SAT_CREDENTIALS_FILE_NAME>.yaml" \
--iso="<ISO_CODE>"
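
When predictions are needed for several countries, the command above can be assembled programmatically. The sketch below only builds the argument list from the flags shown above; the function names are hypothetical, and any config file names you pass in are your own.

```python
import shlex


def build_predict_command(data_config, model_config, sat_config, sat_creds, iso):
    """Assemble the sat_predict.py invocation shown above as an argument list."""
    return [
        "python",
        "sat_predict.py",
        f"--data_config={data_config}",
        f"--model_config={model_config}",
        f"--sat_config={sat_config}",
        f"--sat_creds={sat_creds}",
        f"--iso={iso}",
    ]


def format_command(args):
    """Render the argument list as a shell-safe string for logging or review."""
    return shlex.join(args)
```

The returned list can be passed to subprocess.run(cmd, check=True) from the directory that contains sat_predict.py, once per ISO code.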

Contribution Guidelines

Thank you for considering contributing to Giga! We value your input and aim to make the contribution process as accessible and transparent as possible. Whether you're interested in reporting bugs, discussing code, submitting fixes, proposing features, becoming a maintainer, or engaging with the Giga community, we welcome your involvement.

Click here for detailed Contribution Guidelines

Code Design

This repository is divided into the following files and folders:

  • notebooks/: contains all Jupyter notebooks for exploratory data analysis and model prediction.
  • utils/: contains utility methods for loading datasets, building models, and performing training routines.
  • src/: contains runnable scripts for automated data cleaning and model training/evaluation.

File Organization

The datasets are organized as follows:

data
├── rasters
│   ├── maxar
│   │   ├── ISO
│   │   │   ├── school
│   │   │   │    ├── UNICEF-ISO-SCHOOL-00000001.tiff
│   │   │   │    └── ...
│   │   │   ├── non_school
│   │   │   │    ├── UNICEF-ISO-NON_SCHOOL-00000001.tiff
│   │   │   │    └── ...
│   │   │   └── ...
│   │   └── ...
└── vectors
    ├── school
    │   ├── unicef
    │   │   ├──ISO_unicef.geojson
    │   │   └── ...
    │   ├── osm
    │   │   ├──ISO_osm.geojson
    │   │   └── ...
    │   ├── overture
    │   │   ├──ISO_overture.geojson
    │   │   └── ...
    └── non_school
        ├── osm
        │   ├──ISO_osm.geojson
        │   └── ...
        └── overture
            ├──ISO_overture.geojson
            └── ...
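
The layout above can be expressed as two small path builders. This is a sketch under stated assumptions: that ISO in the tree stands for the country's ISO 3166-1 alpha-3 code, that the numeric suffix is a zero-padded 8-digit index, and that the function names (which are hypothetical) do not clash with the repository's own utilities.

```python
from pathlib import Path

CATEGORIES = {"school", "non_school"}


def raster_path(data_dir, iso, category, index):
    """Path of one satellite image tile, following the tree above."""
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category!r}")
    # Example result: UNICEF-ISO-NON_SCHOOL-00000001.tiff
    filename = f"UNICEF-{iso}-{category.upper()}-{index:08d}.tiff"
    return Path(data_dir) / "rasters" / "maxar" / iso / category / filename


def vector_path(data_dir, iso, category, source):
    """Path of one GeoJSON file for a given category and data source."""
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category!r}")
    return Path(data_dir) / "vectors" / category / source / f"{iso}_{source}.geojson"
```

Keeping path construction in one place makes it easier to check that downloaded tiles and vector files land where the rest of the pipeline expects them.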
    

Code of Conduct

At Giga, we're committed to maintaining an environment that's respectful, inclusive, and harassment-free for everyone involved in our project and community. We welcome contributors and participants from diverse backgrounds and pledge to uphold these standards.

Click here for detailed Code of Conduct

Contact

Applied Science AI-enabled School Mapping Team:

Giga Website: https://giga.global/contact-us/

Acknowledgments💜

Global high-resolution satellite images (60 cm/px) from Maxar were made available with the generous support of the US State Department. We are also grateful to Dell for providing us with access to High Performance Computing (HPC) clusters with NVIDIA GPU support.
