Background

Lightning Talk

Treat Data Like APIs

Microservices encapsulate the code and data of a domain and expose standard interfaces to that domain via APIs, but service owners sometimes forget that service data also flows into a data lake for analysis and reporting. Traditionally, this data lands in the data lake via change data capture (CDC), and data teams run transformation jobs on this “raw” data to build curated datasets. But this breaks microservice encapsulation and exposes the internal data layout to external offline consumers. Changing a microservice's database becomes hard because numerous downstream data pipelines depend on it. Data bugs are discovered only days after they first occur, because those data assets are a couple of degrees removed from the team that owns the source-of-truth microservice database schemas.

To solve this problem, we adopted a simple principle: treat data like APIs. Design documents should include both API specs and data specs. Events then serve as the data contract for exposing data outside of the microservice, and database CDC can be replaced by dense, fact-based events and event streaming that provide data for offline consumption. Microservices remain free to modify their internal database layouts, while they evolve their event contracts more deliberately, following the same best practices used for API evolution.
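
As a rough illustration of an event acting as a data contract, the sketch below defines two versions of a published event and a check that flags changes likely to break downstream consumers. Everything in it is an assumption made for illustration and not taken from the talk: the `OrderPlaced` event, its fields, the `breaking_changes` rules (no removed fields, no retyped fields, no new required fields), and the `to_wire` envelope.

```python
import json
from dataclasses import MISSING, asdict, dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class OrderPlacedV1:
    """Hypothetical published event: a dense fact, not a row-level diff."""
    event_id: str
    occurred_at: str  # ISO 8601 timestamp
    order_id: str
    customer_id: str
    total_cents: int


@dataclass(frozen=True)
class OrderPlacedV2:
    """Next contract version: adds one optional field, removes nothing."""
    event_id: str
    occurred_at: str
    order_id: str
    customer_id: str
    total_cents: int
    currency: Optional[str] = None


def breaking_changes(old, new):
    """Return reasons why contract `new` would break consumers of `old`."""
    old_fields = {f.name: f for f in fields(old)}
    new_fields = {f.name: f for f in fields(new)}
    problems = []
    for name, f in old_fields.items():
        if name not in new_fields:
            problems.append(f"removed field: {name}")
        elif new_fields[name].type != f.type:
            problems.append(f"retyped field: {name}")
    for name, f in new_fields.items():
        has_default = f.default is not MISSING or f.default_factory is not MISSING
        if name not in old_fields and not has_default:
            problems.append(f"new required field: {name}")
    return problems


def to_wire(event):
    """Serialize an event for the stream; no internal table layout appears here."""
    return json.dumps({"type": type(event).__name__, "data": asdict(event)}, sort_keys=True)
```

Under these assumed rules, `breaking_changes(OrderPlacedV1, OrderPlacedV2)` returns an empty list because the only change is an added optional field, while the reverse direction reports the removed `currency` field. A check along these lines could run in CI next to API compatibility checks, so the service's database schema can change freely as long as the published events remain compatible.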

This talk will cover how to architect your microservices to truly encapsulate their data and to avoid leaking implementation details or creating hidden data dependencies.

Gaurav Tungatkar

HingeHealth Inc.