Intro
The Department of City Planning (DCP) is New York City’s primary land use agency and is instrumental in designing the City’s physical and socioeconomic framework. DCP’s ambition is to make all of New York a better place to live, to maintain what works, and improve what doesn’t.
Data Engineering (DE) is a team within DCP’s Geographic Data and Engineering (GDE) group. Historically, various complex datasets were produced, published, and maintained by technologists, planners, and analysts throughout the agency. In 2018, it was decided that a dedicated team was needed to maintain the infrastructure for transforming data, and Data Engineering at DCP began.
The data we produce are used by planners and civic technologists alike in analyses that help inform decisions that ultimately shape NYC. Our data products also power downstream applications, such as NYC’s Zoning and Land Use Map, used by planners, policy makers, and the public. It is therefore imperative that we publish data and supporting documentation of the highest quality. We currently produce 11 primary datasets (or rather, data products), and the production, QA, and distribution of these products are our primary focus and responsibility.
While producing data is our primary responsibility, that work breaks down into more discrete tasks. On the technical side, we are responsible for:
- Maintenance/creation of infrastructure
  - for long-term storage of data, both produced by DE and ingested from external sources
  - for transformation of data
  - for QA of data, codifying knowledge of domain experts and making it easier to explore datasets for irregularities
  - for packaging and distribution of data products
- Operations
  - building of data products
  - running internal QA/facilitating QA by stakeholders
  - packaging and distributing data products
At a slightly higher level, our mission breaks down into four categories:
- Create and release high quality public datasets about NYC
- Build highly transparent and automated data pipelines using open source technologies
- Offer more than just data, but also comprehensive documentation and metadata
- Bring people together across teams and agencies to share data and to learn from each other