Breakout Session

Change Data Capture & Kafka: How Slack Transitioned to CDC with Debezium & Kafka Connect

Change Data Capture (CDC) has become a critical component of modern data processing architectures, enabling organizations to capture and process data changes in real time. This talk presents a detailed case study of Slack's transition to CDC, showing how Vitess, Kafka, Debezium, and Kafka Connect were combined to rebuild its data processing workflows.

Slack, a leading communication platform, relies on Vitess as its production relational database, with hundreds of tables, thousands of shards, and some tables receiving up to 25k writes per second. To replicate this data for OLAP and analytics, Slack previously used a batch pipeline that read Vitess backups and wrote them to an in-house data lake built on S3 and Parquet.

To reduce cost and improve performance, we built a streaming-based CDC pipeline. Slack helped develop the open source Debezium Vitess Connector, which runs on Kafka Connect, reads changes from Vitess binlogs, and writes them to Kafka, with seamless support for Vitess resharding operations. A Kafka Connect sink connector archives the change-log data in Iceberg format. Finally, a Spark job transforms the append-only data into a performant columnar table that is easier to access and query.
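The last step, turning an append-only change log into a table of current rows, can be illustrated with a minimal sketch. This is plain Python rather than the Spark job described in the talk, and it makes several assumptions: events arrive in commit order, each event follows a simplified version of the Debezium envelope (`op`, `before`, `after`), and the primary key column is named `id`. The function name `materialize` and the sample rows are hypothetical.

```python
from typing import Any, Dict, Iterable

# Simplified change event, loosely modeled on the Debezium envelope:
#   op     - "c" (create), "u" (update), "d" (delete), "r" (snapshot read)
#   before - row state before the change (None for creates)
#   after  - row state after the change (None for deletes)
ChangeEvent = Dict[str, Any]


def materialize(events: Iterable[ChangeEvent], key: str = "id") -> Dict[Any, Dict[str, Any]]:
    """Replay an ordered, append-only change log into the latest row per key."""
    table: Dict[Any, Dict[str, Any]] = {}
    for event in events:
        if event["op"] == "d":
            # A delete removes the row identified by its previous state.
            table.pop(event["before"][key], None)
        else:
            # Creates, updates, and snapshot reads all upsert the new row state.
            row = event["after"]
            table[row[key]] = row
    return table


# Hypothetical change log for a small table (assumed to be in commit order).
changelog = [
    {"op": "c", "before": None, "after": {"id": 1, "name": "general"}},
    {"op": "c", "before": None, "after": {"id": 2, "name": "random"}},
    {"op": "u", "before": {"id": 1, "name": "general"},
     "after": {"id": 1, "name": "announcements"}},
    {"op": "d", "before": {"id": 2, "name": "random"}, "after": None},
]

current = materialize(changelog)
print(current)  # {1: {'id': 1, 'name': 'announcements'}}
```

A production job would also need to handle ordering across shards and late or duplicate events, which this sketch deliberately ignores.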

The transition to CDC at Slack yielded substantial benefits in performance and cost. Compared to the previous batch pipeline, the CDC pipeline cut cost by millions and reduced latency from 24 hours to less than 10 minutes.

We will discuss scalability, fault tolerance, and the architectural design decisions made to support CDC at Slack. Attendees will gain insight into the benefits and best practices of CDC and real-time streaming architectures, and learn how to apply them to their own work.

Joseph Thaidigsman

Slack

Tom Thornton

Slack