
Measuring and Monitoring APIs


Convener

Carl Zetie (carl dot zetie at gmail dot com)

Attendees

Notes

One possibly useful taxonomy is to think about measures from three perspectives: an external perspective, an internal perspective, and a "boundary" perspective.

External measures focus on how the API appears to an external user.

  • Performance/responsiveness of API calls; conformance with SLAs
  • Availability and uptime
  • Equity/consistency of responses across clients
  • Accuracy of responses; one way to verify this is to run your own unit tests externally against the API (see the sketch after this list)
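
As a rough illustration of the external perspective, here is a minimal sketch of such an externally run check in Python using the `requests` library. The endpoint URL, the 500 ms SLA limit, and the expected payload are all hypothetical placeholders, not anything specific from the session.

```python
import time

import requests  # third-party; pip install requests

BASE_URL = "https://api.example.com/v1"  # hypothetical endpoint

def check_resource():
    """External probe: verify availability, responsiveness, and accuracy."""
    start = time.monotonic()
    resp = requests.get(f"{BASE_URL}/products/42", timeout=5)
    elapsed = time.monotonic() - start

    # Availability and SLA conformance (500 ms is an illustrative limit)
    assert resp.status_code == 200, f"unexpected status {resp.status_code}"
    assert elapsed < 0.5, f"SLA breach: {elapsed:.3f}s"

    # Accuracy: compare against a known-good fixture, like a unit test
    body = resp.json()
    assert body.get("id") == 42, "response does not match expected resource"

if __name__ == "__main__":
    check_resource()
    print("external check passed")
```

Run from outside your own data center, a probe like this exercises the same path a real consumer would, so it catches problems that internal checks can miss.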

Internal measures focus on underlying systems. Whereas external measures might identify a symptom, you generally need internal measures to diagnose the cause. Tyler discussed the dashboard of internal measures he has developed, primarily using home-grown tooling.

  • The dashboard includes security, telemetry, and metrics views
  • Many of the metrics are pulled from the VMs, primarily server-level counters and events (see the sketch after this list)
  • The underlying logging can persist trending information, but instantaneous values are typically not persisted (other than out-of-limit events)
  • The dashboard is intended for the sysadmin role, not for developers in general to monitor their own services
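
For flavor, here is a minimal sketch of the kind of VM-level polling such a dashboard might do, written in Python with the `psutil` library. The counters and the persist-only-out-of-limit-events behavior follow the notes above, but the thresholds are illustrative and this is a generic example, not the home-grown tooling actually discussed.

```python
import time

import psutil  # third-party; pip install psutil

# Illustrative limits; real thresholds depend on the service
CPU_LIMIT = 90.0  # percent
MEM_LIMIT = 90.0  # percent

def sample_counters():
    """Collect server-level counters of the kind a VM dashboard polls."""
    return {
        "cpu_percent": psutil.cpu_percent(interval=1),
        "mem_percent": psutil.virtual_memory().percent,
        "disk_percent": psutil.disk_usage("/").percent,
    }

def poll(interval_s=60):
    while True:
        counters = sample_counters()
        # Per the notes: instantaneous values are displayed but not
        # persisted; only out-of-limit events are recorded for trending.
        if counters["cpu_percent"] > CPU_LIMIT or counters["mem_percent"] > MEM_LIMIT:
            print("ALERT (persisted):", counters)
        time.sleep(interval_s)

if __name__ == "__main__":
    poll()
```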

The boundary perspective means measuring what's happening at the API itself, which can provide design insights among other things.

  • Can learn a lot from watching response times and response codes: e.g., if the number of 404s spikes, it could be because a resource was moved or removed, or it could be a client using the API incorrectly
  • Can look for common sequences of APIs, perhaps indicating something is too granular; or options/parameters that are never used
  • A client that calls the same API over and over may indicate a design flaw on your part (you really need a collection API!); or it may warn of "bad faith" by the client, e.g. somebody scraping your catalog for their own use (a log-analysis sketch follows this list)
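
Here is a minimal sketch of that kind of boundary-level log analysis, assuming access-log entries have already been parsed into (client, method, path, status) tuples; the 5% 404 rate and the 1,000-call threshold are arbitrary illustrative values.

```python
from collections import Counter

def analyze(entries):
    """Scan parsed access-log entries for the boundary-level signals
    discussed above: status-code anomalies and repetitive callers."""
    status_counts = Counter()
    call_counts = Counter()

    for client, method, path, status in entries:
        status_counts[status] += 1
        call_counts[(client, method, path)] += 1

    total = sum(status_counts.values())
    # A 404 spike may mean a moved/removed resource, or a confused client
    if total and status_counts[404] / total > 0.05:
        rate = status_counts[404] / total
        print(f"404 rate {rate:.1%}: resource moved, or client misuse?")

    # One client hammering one endpoint hints at a missing collection
    # API, or at "bad faith" catalog scraping
    for (client, method, path), n in call_counts.most_common(3):
        if n > 1000:
            print(f"{client}: {n}x {method} {path}: granularity flaw or scraping?")

# Example entry shape: ("10.0.0.7", "GET", "/products/42", 404)
```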

Additional Considerations (added later)

  • TBD