The code in this project compiles a dataset of temporary flight restrictions issued by the FAA and their corresponding shapefiles.
The scripts are written to run on an AWS EC2 Linux instance, so you may need to fork the repo and tweak the code for your operating system or production environment. I wrote a blog post outlining the process of setting up an EC2 instance with R to run these scripts.
The FAA lists its active TFRs online. Unfortunately, no online repository exists for expired TFRs, stymieing analysis of temporary flight restrictions. The data on active TFRs are conveniently formatted in HTML tables, making them easy to parse with the `readHTMLTable` function provided by R's XML package.
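As a rough illustration of that approach, the sketch below pulls every HTML table from the FAA's active-TFR page into R data frames. This is a minimal sketch, not the repo's actual scraper; the list URL and the position of the TFR table on the page are assumptions to verify against the live site.

```r
library(RCurl)  # fetch pages over HTTPS
library(XML)    # parse HTML and extract tables

# Assumed URL of the FAA's active TFR list -- confirm against the live site.
url <- "https://tfr.faa.gov/tfr2/list.html"
page <- getURL(url)

# Parse every <table> on the page into a data frame; the TFR listing
# is one of them -- identify it by inspecting the page source.
tables <- readHTMLTable(htmlParse(page, asText = TRUE),
                        stringsAsFactors = FALSE)
```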
Download my current TFR data, collected since Sept. 20, 2017.
You are free to copy and modify this code as you see fit to collect your own TFR data. If you find a way to improve the dataset, then please submit a pull request!
To run these scripts, you'll need the following (a snippet for installing the R packages appears below the list):

- R (download)
- R's `XML` package (R Documentation)
- R's `RCurl` package (R Documentation)
- R's `httr` package (GitHub)
- `wget` (install with Homebrew if running on macOS)
- `sed`
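If any of the R packages are missing, they can be installed from CRAN in the usual way:

```r
# Install the R package dependencies from CRAN
install.packages(c("XML", "RCurl", "httr"))
```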
Before running the scripts, note the following:

- The shell script `shapefile-download.sh` is written specifically to run on an EC2 instance, so directory paths (i.e., `/home/ec2-user/...`) may need to be modified to match the working directory in your production environment.
- Edit line 8 of `FAA-TFR-scraper.R` to set the working directory to the path of your installation of the `tfr-data` repo. Paths may also need to be edited on lines 110, 131, and 156 of that file.
- For the first collection of TFR data, you'll need to copy `tfrData-export-memory-head.csv` to a new file named `tfrData-export-memory.csv` in the working directory (a one-line R snippet for this step follows the list). Additional scrapes append new TFRs as rows to the end of that file, so you won't want to overwrite it!
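That first-run copy can be done with `cp` in a shell, or from an R session at the repo root:

```r
# One-time setup: seed the running export file from the header template.
# overwrite = FALSE guards against clobbering previously collected data.
file.copy("tfrData-export-memory-head.csv", "tfrData-export-memory.csv",
          overwrite = FALSE)
```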
The scripts can be run manually to compile the list of active TFRs, download their shapefiles, and append the new records to the dataset you've collected so far:
```sh
R CMD BATCH /path/to/scripts/FAA-TFR-scraper.R && /path/to/scripts/shapefile-download.sh
```
These scripts are designed to run unattended, with no need to oversee their operation. For instance, I have them running on an AWS EC2 instance with a cron job that fires at the top of every hour:
```
0 * * * * R CMD BATCH /home/ec2-user/scripts/FAA-TFR-scraper.R && /home/ec2-user/scripts/shapefile-download.sh
```
I wrote a blog post outlining the process of setting up an EC2 instance to execute the web scrapes and store your data in an S3 bucket. If you do not have an AWS account currently, you can sign up and run an EC2 instance under the "free tier" for 12 months. You can collect TFR data for a year free of charge!
The FAA posts shapefiles for most TFRs that contain valuable information about the land area and altitudes affected, information that R's XML package cannot scrape from the HTML tables. Running the shell script after the R script loops over the URLs where each active TFR's shapefiles are stored and downloads them using `wget`. The script in this repo places the downloaded archives in `/tmp/` and then unpacks them into a specified directory.
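If you'd rather handle this step from R, base R can manage the same download-and-unpack loop. The sketch below is a rough equivalent, not the repo's script: the function name, URL argument, and `dest_dir` default are hypothetical placeholders.

```r
# Download one TFR's zipped shapefile and unpack it.
# The URL pattern and dest_dir are hypothetical placeholders.
download_tfr_shapefile <- function(url, dest_dir = "shapefiles") {
  archive <- file.path(tempdir(), basename(url))  # stage the .zip in a temp dir
  download.file(url, archive, mode = "wb")        # binary mode for archives
  unzip(archive, exdir = dest_dir)                # extract into the target directory
}
```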
You can download my copy of the TFR data and shapefiles at this webpage on my personal website. My dataset has TFR information and corresponding shapefiles dating back to September 20, 2017.
You can use the files as the starting point for your own scraper or for analysis.
```r
# Read the compiled TFR records; empty fields become NA
tfrData <- read.table("path/to/tfrData-export-memory.csv",
                      header = TRUE, sep = ",", na.strings = "")
```
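To bring one of the unpacked shapefiles into R for spatial analysis, one option is the `sf` package; note that `sf` and the file path here are assumptions for illustration, not part of this repo's scripts.

```r
library(sf)  # simple-features package for reading spatial vector data

# Read an unpacked TFR shapefile; the path is a hypothetical placeholder.
tfr_shape <- st_read("path/to/unpacked/tfr_boundary.shp")
```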