Adding tables and endpoints related to capital projects to Zoning API #211
Replies: 1 comment
@TylerMatteo Thank you for this write-up. I think it makes sense for where we are in the evolution of the projects. I wanted to tie in user feedback to the context: we heard a lot of requests along the lines of "break down the data barriers between the applications". By collecting the data into a single source (or eventually organizing it into generally available services), we are able to break down those barriers. In this case, introducing the "community district" and "city council" boundaries benefits more than the capital planning project; it also benefits the tax lot and zoning district data, which now have access to these boundaries as well. At the same time, there have been whispers of adding borough boundary data to this project. We already have a borough table in "zoning api", but it does not have boundary data. By co-locating the capital and zoning datasets in the same database, we can enhance the existing "borough" table with geographic data rather than creating a duplicate "borough" table that only serves the capital data.
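To make that concrete, here is a minimal sketch (not this repo's actual schema) of how the existing "borough" table could gain a boundary column instead of the capital data getting its own duplicate table. The column name, geometry type, and SRID below are assumptions for illustration.

```ts
// Hypothetical sketch: extend the shared "borough" table with a boundary column.
import { pgTable, char, text, customType } from "drizzle-orm/pg-core";

// Drizzle has no built-in multipolygon column type, so a custom type is one way
// to map a PostGIS geometry column (type and SRID assumed here).
const multiPolygon = customType<{ data: string }>({
  dataType() {
    return "geometry(MultiPolygon, 4326)";
  },
});

export const borough = pgTable("borough", {
  id: char("id", { length: 1 }).primaryKey(),
  title: text("title").notNull(),
  abbr: text("abbr").notNull(),
  // New: boundary geometry usable by zoning, tax lot, and capital project features alike
  geom: multiPolygon("geom"),
});
```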
With #210, we start adding Drizzle schemas for tables related to the Capital Projects map project to this repo. This will be the first of a few Issues for adding tables, which will then be followed by Issues for designing and implementing API endpoints that use those tables. I recognize that adding data that doesn't logically have much to do with zoning to a repo called "Zoning API" might rightly raise some eyebrows. With that in mind, I wanted to capture my thoughts on why we are taking this approach and what it means for our API development strategy going forward.
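For illustration, the kind of Drizzle schema #210 introduces looks roughly like the sketch below; the actual table and column names in that PR may differ, and the composite primary key is an assumption.

```ts
// Hedged sketch of a capital project table schema; names and key shape are assumptions.
import { pgTable, char, text, primaryKey } from "drizzle-orm/pg-core";

export const capitalProject = pgTable(
  "capital_project",
  {
    // Capital projects are commonly identified by a managing agency code plus a
    // project id, so a composite primary key is assumed here.
    managingCode: char("managing_code", { length: 3 }).notNull(),
    id: text("id").notNull(),
    description: text("description").notNull(),
  },
  (table) => ({
    pk: primaryKey({ columns: [table.managingCode, table.id] }),
  }),
);
```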
First, some context/history about Zoning API
This repo was started as part of a larger project in Q4 2023 to build a full stack proof of concept application. The two other repos involved in that project were the front end ae-zoning-map-poc and the design system implementation ae-streetscape. The goal was to experiment with a new "tech stack" (technologies, libraries, frameworks, etc.) to build something that could solve technical problems common in our existing portfolio, such as building geospatial web apps, APIs, vector tiling, geospatial querying, etc. The front end of the proof of concept was never meant to be a new "product", nor was it meant to evolve into one. I (@TylerMatteo) named this repo "Zoning API" without too much thought - we just needed to call it something, and we knew that the proof of concept app would deal with datasets that we associate with zoning, namely zoning districts and tax lots. I chose this name in keeping with a best practice I've been taught: APIs should be named for the domain they are concerned with.
While the front end was intended purely as a proof of concept, my approach for this repo was that it may evolve into a production-ready application. The rationale for this approach is that it's more efficient to improve and refactor this repo than it would be to start entirely from scratch if:
For an API like this, evolving could mean changing/adding/removing tables, redesigning APIs, cleaning up rushed code, writing more/better tests, or even renaming the repo and API itself. My overall assessment from the PoC was that we do feel good about the backend stack in this repo. However, if the Capital Projects map isn't going to use any of the entities already in Zoning API, why are we using this repo?
Why use Zoning API for the capital projects map?
When we started planning out work for the capital projects map, we had to decide whether we should continue building on Zoning API for the backend or start a new repo that used the same stack and structure as Zoning API. The decision to build it into Zoning API ultimately came down to practicality and a feeling that making a separate "Capital API" would, to some extent, be a premature optimization.
In scoping out the work for the CP map, a few things became clear:
As we build out more and more APIs to replace Carto and our existing custom backends, we will have to be comfortable with the idea of adding tables and APIs to one repo that eventually get broken out into others. Deciding how many backends we eventually have and what data goes where is a broader design question, which leads me to...
What's the long term strategy?
As of writing this, the long term strategy for how we design APIs is to build an ecosystem of client-agnostic, loosely coupled, domain-driven APIs. That's a lot of jargon, so allow me to break it down:
This strategy can and should change as we get more experience building new APIs, but I thought it was important to capture where my head is at on this topic at the moment.
Conclusion
By building more and more tables and endpoints into Zoning API, we are essentially making a conscious choice to build a monolith now, with the understanding that we will break it apart as we gain more insight into where the seams should be and build out more plumbing for sharing code across cross-cutting concerns. Fortunately, we can still get the benefits of client-agnostic APIs in a monolith.
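As a sketch of what that can look like in practice, assuming a NestJS-style layout (the module names below are hypothetical, not this repo's actual ones): each domain lives in its own module inside the one app, which keeps the seams clean if a domain is later extracted into its own service.

```ts
// Minimal sketch of domain modules inside one monolithic Nest app.
import { Module } from "@nestjs/common";

@Module({}) // zoning districts, tax lots, etc. would register their own providers here
export class ZoningDomainModule {}

@Module({}) // capital projects, community districts, city councils, etc.
export class CapitalPlanningDomainModule {}

@Module({
  // The root module only wires domains together; nothing reaches across domain
  // boundaries directly, which is what keeps an eventual split cheap.
  imports: [ZoningDomainModule, CapitalPlanningDomainModule],
})
export class AppModule {}
```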
If we use our existing portfolio as a rough guide for how many new APIs we will eventually build, I think it's clear why a monolith may become unsustainable in the future. We have more than 100 tables in use in Carto right now, in addition to a few dozen in our home-grown APIs. That number will likely increase in the new APIs as we do a better job normalizing the data. The cognitive complexity of one repo with dozens of entities (each with its own schemas, repositories, services, controllers, etc.) could become hard to manage over time. A monolith also introduces a single point of failure to our systems compared to separate, loosely coupled APIs.
I hope this clarifies why we're doing things the way we are for this project and provides some insight into where I see us going in the future 😃 I'm happy to answer any questions in the comments below.