Flink, Kafka and Prometheus: better together to improve efficiency of your observability platform

Breakout Session

Prometheus has become the go-to solution for monitoring and alerting, ingesting metrics from applications and infrastructure. The ability to efficiently store high volumes of dimensional time series also makes Prometheus a good fit for broader operational analytics use cases. Examples include observing fleets of IoT devices, connected vehicles, media streaming devices, and other distributed resources. However, the high cardinality and frequency of events generated by these sources can be challenging to ingest and store efficiently.

Apache Flink can preprocess observability events in real time before writing them to Prometheus. Reducing cardinality or frequency can improve the efficiency of your observability platform. Adding contextual information and calculating derived metrics enables deeper operational analysis in real time.
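
As a rough illustration of what reducing cardinality and frequency means, the following plain-Python sketch drops a high-cardinality label and averages samples over a fixed window. It does not use the Flink API; the sample layout, the function name `preaggregate`, the `device_id` label and the 60-second window are all assumptions made for this example.

```python
from collections import defaultdict


def preaggregate(samples, drop_labels=("device_id",), window_ms=60_000):
    """Average samples per (metric, remaining labels, time window).

    Hypothetical sample layout: a dict with "metric", "labels",
    "timestamp_ms" and "value" keys.
    """
    buckets = defaultdict(list)
    for s in samples:
        # Cardinality reduction: discard the labels listed in drop_labels.
        labels = tuple(sorted(
            (k, v) for k, v in s["labels"].items() if k not in drop_labels
        ))
        # Frequency reduction: align each sample to the start of its window.
        window_start = s["timestamp_ms"] - s["timestamp_ms"] % window_ms
        buckets[(s["metric"], labels, window_start)].append(s["value"])

    aggregated = []
    for (metric, labels, window_start), values in sorted(buckets.items()):
        aggregated.append({
            "metric": metric,
            "labels": dict(labels),
            # Emit one sample per window, stamped with the window end.
            "timestamp_ms": window_start + window_ms,
            "value": sum(values) / len(values),
        })
    return aggregated
```

In a streaming job the same idea would typically be expressed as a keyed, windowed aggregation; the batch function above only shows the shape of the transformation.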

Observing Flink with Prometheus is a solved problem, using Flink Prometheus Exporters. The new Flink-Prometheus connector, a recent addition to the Apache Flink connector family, addresses a different challenge. It enables using Flink to preprocess large volumes of observability data from various sources and write directly to Prometheus at scale.

Kafka completes this architecture by providing reliable stream storage and ordered delivery of high-volume raw metrics into Flink, which is critical for maintaining Prometheus time series integrity.
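
Kafka preserves ordering only within a partition, so one common way to keep each time series in order is to key every record by its series identity. The sketch below shows the idea in plain Python. The helper names `series_key` and `partition_for` are hypothetical, and CRC32 is only a stand-in for the hash a real Kafka producer would apply to the record key.

```python
import zlib


def series_key(metric, labels):
    """Build a stable key from the metric name and its sorted labels."""
    parts = [metric] + [f"{k}={v}" for k, v in sorted(labels.items())]
    return "|".join(parts)


def partition_for(key, num_partitions):
    """Map a key to a partition (CRC32 used here purely for illustration)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# Samples of the same series get the same key, hence the same partition,
# regardless of the order in which the labels were supplied.
key_a = series_key("temp", {"region": "eu", "site": "plant-1"})
key_b = series_key("temp", {"site": "plant-1", "region": "eu"})
same_partition = partition_for(key_a, 12) == partition_for(key_b, 12)
```

With this keying, all samples of a given series travel through a single partition and reach the downstream consumer in the order they were produced.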

In this talk, an Apache Flink committer and the maintainer of the new Flink-Prometheus connector will explore real-world use cases, key challenges, and best practices for using Flink and Prometheus together to improve your observability platform.


Lorenzo Nicora

Amazon Web Services