WaterlooRegionAddresses

Smaller, more manageable address data files extracted from the StatCan ODA, and the scripts to generate them, targeted at the Region of Waterloo.

What is the StatCan ODA?

Statistics Canada has released a collection of address data from across the country called The Open Database of Addresses, or ODA for short. All of this data was nominally publicly available before, but from various sources, under different licenses. This made it more difficult to find, and more difficult to have legal assurance that it could be collated into other projects. Notably, the license the ODA is released under (the OGL Canada) is one of the licenses that have been approved by the legal working group for integration into OpenStreetMap.

Why is this necessary?

Statistics Canada only provides the ODA divided at the provincial/territorial level. That is far too much data to work with comfortably in any GUI OpenStreetMap editor. These scripts filter and break down the data into what is relevant for OpenStreetMap contributors in Waterloo Region. The scripts are fairly simple, so anyone with some basic familiarity with Unix shell scripting can adapt them to other regions as desired.
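To illustrate the kind of filtering involved, here is a minimal Python sketch that keeps only the rows of a provincial CSV falling inside a bounding box. It is not the repository's actual method (the real scripts are shell scripts; see DEVELOPING.md). The column names `latitude`, `longitude`, `street_no` and `street`, the bounding box values, and the sample rows are all assumptions made for the example.

```python
import csv
import io

# Rough bounding box around Waterloo Region. These values are illustrative
# assumptions, not the extent the repository's scripts actually use.
WEST, SOUTH, EAST, NORTH = -80.9, 43.2, -80.2, 43.7


def filter_to_bbox(rows, west=WEST, south=SOUTH, east=EAST, north=NORTH):
    """Yield only the rows whose coordinates fall inside the bounding box."""
    for row in rows:
        try:
            lat = float(row["latitude"])
            lon = float(row["longitude"])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with missing or malformed coordinates
        if south <= lat <= north and west <= lon <= east:
            yield row


# Tiny in-memory stand-in for a provincial CSV file (made-up rows).
sample = io.StringIO(
    "latitude,longitude,street_no,street\n"
    "43.4643,-80.5204,100,EXAMPLE ST\n"
    "45.4215,-75.6972,200,OTHER ST\n"
    ",,300,NO COORDS RD\n"
)
kept = list(filter_to_bbox(csv.DictReader(sample)))
print([row["street_no"] for row in kept])
```

Only the first sample row lies inside the box; the second is elsewhere in the province and the third has no coordinates. Porting to another region would, under these assumptions, mostly mean changing the four bounding box values.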

How can I use this?

Do not import this data into OSM. It is not structured for direct import, and the quality isn't always great. The Region already has, from previous StatCan imports, block-level address data represented as address interpolations in OSM, which are in many ways better than this data. What you can use it for is as another reference while editing, similar to aerial imagery, but for addresses. For example, if you notice that all the houses on a block have address data except one, you can use this to look up the address of that house.

To access the data, you'll likely want to start in the overview map. Normally you would zoom in if necessary and click on the box covering the area you're interested in, which tells you the name of that box and gives two links you can use. However, GitHub seems to have broken its integrated viewer, and you can no longer see area data from mapped elements. As an alternative, you can open the bounding_boxes.geojson file in your map editor instead (see the next paragraph), or view the slices directly in GitHub, which seems to work for individual nodes. If you download the entire snapshot here, make a note of the box's name and skip to the next paragraph. If you'd rather only use the slice you need, try the GitHub link, which takes you to GitHub's built-in rendering of the data in that square. This lets you browse the data, but with limited zoom and often pretty bad performance. If that isn't sufficient, use the raw_url given to download the data (if it opens the file in your browser, right-click to save it).
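If the map viewer is not cooperating, the box lookup can also be done by hand. The following is a minimal Python sketch that finds which box in a bounding-box GeoJSON file contains a given point. It assumes each feature is a rectangular `Polygon` and that its properties include `name` and `raw_url`; the sample feature, its name, and its URL are made up for the example and may not match the real bounding_boxes.geojson.

```python
import json


def box_containing(geojson_text, lon, lat):
    """Return the properties of the first box that contains (lon, lat)."""
    collection = json.loads(geojson_text)
    for feature in collection["features"]:
        # Outer ring of the polygon; assumed to be an axis-aligned rectangle.
        ring = feature["geometry"]["coordinates"][0]
        lons = [point[0] for point in ring]
        lats = [point[1] for point in ring]
        if min(lons) <= lon <= max(lons) and min(lats) <= lat <= max(lats):
            return feature["properties"]
    return None  # the point is outside every box


# Made-up stand-in for the real file; property names are assumptions.
sample = json.dumps({
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "properties": {
            "name": "example_box",
            "raw_url": "https://example.invalid/example_box.geojson",
        },
        "geometry": {
            "type": "Polygon",
            "coordinates": [[
                [-80.6, 43.4], [-80.5, 43.4], [-80.5, 43.5],
                [-80.6, 43.5], [-80.6, 43.4],
            ]],
        },
    }],
})

match = box_containing(sample, -80.55, 43.45)
print(match["name"] if match else "no box found")
```

With the real file, you would pass its contents as `geojson_text` along with the longitude and latitude of the spot you are editing.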

In iD (the default web editor on the OpenStreetMap website), press F or the Map Data button on the right, then hit the three dot menu on Custom Map Data to upload a GeoJSON file (if you have the whole snapshot, you'll find them in the slices directory). It'll take a bit to open, but once it does, these slices seem small enough not to have a significant impact on performance. In JOSM, you can just open any of the GeoJSON files; just be sure to keep OSM data and this data in separate layers. If you have enabled the opendata plugin in JOSM's settings, you can also open any of the CSV files generated by the scripts, which are described in the developer documentation.

Some tips for using the data:

  • Cite it as StatCan 46-26-0001 in your changeset's sources. Even if your edits in this changeset consist mostly of other things, if you used this data, you need to cite it.
  • When you're using this data, do some basic sanity checks on it. Errors are sometimes obvious from context: an address that's half a block away from the street it's supposedly on, a house number that's far too large or small for the block, or an even number on the odd side of the street.
  • City data is usually, but not always, more accurate than Region of Waterloo (RoW) data. If the two disagree, settle it with a ground survey (or, if the place has a website, check that); if even that is inconclusive, defer to city data.
  • Sometimes there will be a bunch of apartment addresses (typically with unit numbers) clustered in an apartment building, haphazardly scattered across it. These are essentially unusable, since they often don't correspond to an actual physical location within the building, and will sometimes list units that don't exist or miss units that do. The situation is similar for row houses.
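Two of the sanity checks from the tips above (odd/even parity and out-of-range house numbers) can be sketched in Python as follows. This is only an illustration: the function name `sanity_flags`, the `slack` threshold of 20, and the sample numbers are all hypothetical, and a flag means "look closer", not "this is wrong".

```python
def sanity_flags(candidate, neighbours, slack=20):
    """Return a list of reasons a house number looks suspicious.

    candidate  -- house number from the address data
    neighbours -- known house numbers on the same side of the same block
    slack      -- how far outside the neighbours' range is still tolerated
    """
    flags = []
    if not neighbours:
        return flags  # nothing to compare against

    # Parity check: does the candidate disagree with a clear majority?
    evens = sum(1 for number in neighbours if number % 2 == 0)
    odds = len(neighbours) - evens
    if evens > odds and candidate % 2 == 1:
        flags.append("parity differs from neighbours")
    elif odds > evens and candidate % 2 == 0:
        flags.append("parity differs from neighbours")

    # Range check: is the number far too large or small for this block?
    if candidate < min(neighbours) - slack or candidate > max(neighbours) + slack:
        flags.append("number outside the block's range")

    return flags


same_side = [40, 42, 44, 48]  # made-up neighbouring house numbers
for number in (46, 47, 512):
    print(number, sanity_flags(number, same_side))
```

Here 46 raises no flags, 47 is flagged for parity, and 512 is flagged for being out of range. Anything flagged is a candidate for a ground survey rather than something to correct automatically.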

Advanced usage and development

Information on how to modify and run the scripts, whether to cater towards your needs or to port to a different location, can be found in the DEVELOPING.md file.

A Note About Licensing

This repository is licensed under an MIT license, but as stated above, the address data itself is released under the terms of the OGL Canada. That is, any scripts or code hosted here are under the terms given in the LICENSE file, but any downloaded or generated data, such as CSV or GeoJSON files, including those in the slices directory, is provided under the terms of the OGL Canada. More information about the OGL Canada usage in this repository can be found in the slices/LICENSE file.