Background

Breakout Session

3 Lessons We Learned Running Stateful Streaming Pipelines with Apache Flink and Kafka

At Bloomberg, we deal with tremendously high volumes of data. Financial markets can move significantly from one millisecond to the next. Clients must react to these immense amounts of market data and trust Bloomberg to synthesize and analyze the information in real time.

To process the vast amount of data related to fixed income markets and calculate intraday bond pricing, our team relies on Apache Kafka and Apache Flink to construct our stateful streaming pipeline. State is critical in our pipeline, as it captures market movements. Losing state would be catastrophic to the data quality we provide to our clients.

In this talk, we will explain why we chose Apache Flink and Apache Kafka, how we set up our stateful streaming pipeline using them, and three challenges we faced when running our Flink applications with fault tolerance using savepoints. Specifically, we will share our experience restoring source Kafka partitions/offsets from Flink savepoints, restoring from savepoints with event-time timers, and retaining state while dealing with backward-incompatible changes to the Flink topology.

By the end of this talk, you will walk away with the confidence to run your stateful Flink applications without worrying about losing state.

Da Huo

Bloomberg