Current London 2025

Session Archive

Check out our session archive to catch up on anything you missed or rewatch your favorites to make sure you hear all of the industry-changing insights from the best minds in data streaming.


Flink & Metered Billing - a reprocessing story

Join us to explore how we built a metered billing system on Apache Flink and used it to achieve accurate numbers. We’ll cover session window aggregation, ordering, idempotency, late enrichment, metered billing, reconciliation, watermark alignment, and data accuracy. Learn how we iteratively modified and expanded our business logic, idempotently reprocessing over 500M events more than ten times to ensure precision for our customers. We’ll dive into the architectural decisions, operational strategies, and challenges we faced, providing valuable insights into building robust, real-time data processing systems with Flink.
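The session-window aggregation the abstract mentions can be illustrated without Flink at all. Below is a minimal Python sketch of the core idea (the function name and values are illustrative, not the speakers' code): events are grouped into the same session as long as consecutive timestamps are within a gap, mirroring Flink's event-time session windows.

```python
from typing import Iterable

def session_windows(timestamps: Iterable[int], gap: int) -> list[tuple[int, int]]:
    """Group event timestamps into session windows.

    A new session starts whenever the gap between consecutive
    events exceeds `gap`, the same rule Flink's event-time
    session windows apply per key.
    """
    sessions: list[tuple[int, int]] = []
    start = end = None
    for ts in sorted(timestamps):
        if start is None:
            start = end = ts        # first event opens a session
        elif ts - end > gap:
            sessions.append((start, end))  # gap exceeded: close session
            start = end = ts
        else:
            end = ts                # extend the current session
    if start is not None:
        sessions.append((start, end))
    return sessions

# Events at t=1,2,3 and t=10,11 with a gap of 5 form two sessions.
print(session_windows([1, 2, 3, 10, 11], gap=5))  # [(1, 3), (10, 11)]
```

Because the grouping is a pure function of the input events, replaying the same 500M events yields the same sessions, which is what makes idempotent reprocessing safe.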

Presenters

Pedro Mázala

Breakout Session
May 21

Kafka Connection Chaos: Surviving the Storm

It is 9 AM, and the support team has begun maintenance to renew the Kafka brokers' certificates. At 9:30 AM, half of the cluster has been updated correctly, but the liveness probe metric seems unstable. We check connectivity — everything looks fine. Our monitoring stack shows it is able to consume from and produce to all brokers. Connections are a bit higher than usual but still within limits. At 9:40 AM, some teams start complaining that they can neither consume nor produce. What is happening? Suddenly, we discover the acceptor metric indicating that brokers are blocking 80% of connections. What is an acceptor, and why is it blocking our connections?

The scenario above describes a real incident in which our Kafka platform experienced a connection storm, leading to significant degradation. This event highlighted the crucial need for effective connection management and exposed gaps in our understanding of Kafka’s connection handling, especially for new connections.

In this talk, we will share our journey and insights with platform teams maintaining Kafka. You’ll learn how Kafka manages connections on Linux servers and the challenges you might encounter. We will dive into the metrics and mechanisms Kafka offers to detect and protect against connection storms. And last but not least, we’ll share tips from our experience to help you avoid the mistakes we made.
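The connection-protection mechanisms the abstract alludes to are exposed as broker settings. A hedged sketch of the relevant `server.properties` entries (the values here are illustrative, not recommendations):

```properties
# Cap the total connections a broker will hold open; once reached,
# the acceptor delays or refuses new connections.
max.connections=5000

# Cap the connections accepted from any single client IP.
max.connections.per.ip=200

# Throttle the rate of *new* connections per broker (KIP-612),
# which limits the damage of a reconnection storm.
max.connection.creation.rate=50
```

Watching the acceptor's blocked-time metrics alongside these limits is what lets you distinguish "connections are high but fine" from "the broker has started refusing work."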

Presenters

Javier Hortal, Rafael García Ortega

Breakout Session
May 21

No More Hiding: Kafka Streams as a First-Class Citizen in the Kafka Protocol

Kafka Streams, the native stream processing technology of Kafka, stands out for its unique ability to assign work and propagate metadata among clients without the need for separate coordination infrastructure — requiring only a Kafka broker. Traditionally, this has been accomplished by embedding Streams-specific payloads within generic consumer group RPCs of the Kafka protocol and plugging custom assignment logic into the standard Kafka Consumer.

As the Kafka consumer transitions to the next-generation consumer protocol, Streams applications are starting to reveal their true identity when communicating with the broker. In the accepted Kafka Improvement Proposal 1071, new Streams-specific RPCs are being added to the Kafka protocol. These will be used by applications to form groups, exchange metadata, and distribute workloads. A new type of group, the Streams group, will coexist alongside the consumer group, elevating Kafka Streams to a first-class citizen on the Kafka broker. Assignments, member metadata, and group configurations will have a single source of truth, and the assignment logic will run iteratively and centrally as part of the broker’s group coordinator.

This deeply technical talk will delve into the new Streams rebalance protocol. We will begin with a brief overview of the existing rebalance mechanism in Kafka Streams, highlighting the challenges it presents to users. We will then introduce the new Streams rebalance protocol, comparing it to both the current protocol and the new consumer protocol. You will learn about heartbeats that are more than just liveness signals, why the consumer offsets topic doesn’t just contain consumer offsets, and open-heart surgery on the Kafka consumer — all packaged in a compelling story.
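The key architectural shift described above is that assignment moves from a client-side assignor to a deterministic function run centrally by the broker's group coordinator. A toy Python sketch of that idea (a plain round-robin stand-in, not KIP-1071's actual assignor):

```python
def assign_tasks(members: list[str], tasks: list[str]) -> dict[str, list[str]]:
    """Deterministic round-robin task assignment.

    A stand-in for the assignor the new Streams protocol runs on the
    broker's group coordinator: because the coordinator computes the
    assignment from one authoritative view of members and tasks,
    every member receives a consistent result.
    """
    ordered = sorted(members)
    assignment: dict[str, list[str]] = {m: [] for m in ordered}
    for i, task in enumerate(sorted(tasks)):
        assignment[ordered[i % len(ordered)]].append(task)
    return assignment

# Two members, three stream tasks (named like Streams' subtopology_partition).
print(assign_tasks(["b", "a"], ["0_0", "0_1", "1_0"]))
# {'a': ['0_0', '1_0'], 'b': ['0_1']}
```

Running the logic in one place removes the need for every client to embed (and agree on) the same assignment code, which is one of the pain points of the current client-driven rebalance.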

Presenters

Lucas Brutschy, Bruno Cadonna

Breakout Session
May 21

The Luge: Apache Iceberg™ and Streaming

Apache Iceberg™ is known for its rapid ascent as a go-to table format for batch analytics, but did you know it’s also great for streaming applications? In this talk, we’ll cover the current state of the Iceberg project and how the V3 table format has introduced some exciting new features that make streaming even easier. Ever had to deal with a rapidly changing message schema? Say hello to the Variant type. Need to track changes over time? Hi there, Row Lineage. The Iceberg Table Spec V3 is bringing all sorts of great new functionality to streaming, but this is just the beginning. We’ll also look at some of the features currently in the works for V4, many of them likewise aimed at making the streaming experience better than ever. Join us for an overview of the present and future of Apache Iceberg.
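The appeal of a Variant-style column for streaming can be shown with a small schema-on-read sketch in plain Python (illustrative only, not Iceberg's API): new fields appearing in the payload don't require a table schema migration, because the semi-structured value is interpreted at read time.

```python
import json

# Rapidly changing payloads land in one variant-like column instead
# of forcing a table schema change for every producer update.
rows = [
    {"id": 1, "payload": json.dumps({"clicks": 3})},
    {"id": 2, "payload": json.dumps({"clicks": 5, "region": "EU"})},  # new field
]

def read_field(row: dict, name: str, default=None):
    """Schema-on-read access: the pattern Iceberg V3's Variant type
    supports efficiently at the storage layer."""
    return json.loads(row["payload"]).get(name, default)

print([read_field(r, "region", "n/a") for r in rows])  # ['n/a', 'EU']
```

The storage-level Variant type does this far more efficiently than raw JSON strings, but the reader-side contract — tolerate fields the writer added yesterday — is the same.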

Presenters

Eric Maynard

Breakout Session
May 21

Native Data Lineage Support in Apache Flink with OpenLineage

Apache Flink has made significant strides in native data lineage support, which is essential for auditing, data governance, regulatory compliance, troubleshooting, and data discovery. In this presentation, we will delve into Flink's built-in lineage graph and listener mechanism, showcasing its current capabilities and recent enhancements brought by FLIP-314. We will emphasize how Flink's native lineage features provide a robust framework for understanding and managing data flows within streaming applications. Furthermore, we will explore the integration of Flink lineage with OpenLineage, an open framework designed for the systematic collection and analysis of data lineage. This integration facilitates seamless lineage data management and visualization across modern data ecosystems. Join us to gain insights into the advancements of native lineage support within Apache Flink and learn how it can significantly enhance your data operations and compliance initiatives.
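The kind of information a lineage listener collects can be pictured as a small graph of datasets and jobs. A toy Python model (all names are hypothetical; this is not Flink's or OpenLineage's API) of what a listener registered through Flink's lineage mechanism would forward to a lineage backend:

```python
from dataclasses import dataclass, field

@dataclass
class LineageGraph:
    """Toy model of listener-collected lineage: edges record which
    job read which sources and wrote which sinks, the core facts a
    backend like OpenLineage stores per run."""
    edges: list[tuple[str, str]] = field(default_factory=list)

    def record(self, job: str, sources: list[str], sinks: list[str]) -> None:
        self.edges += [(src, job) for src in sources]
        self.edges += [(job, sink) for sink in sinks]

    def upstream(self, node: str) -> set[str]:
        """Everything a node transitively depends on (for audits
        and impact analysis)."""
        direct = {src for src, dst in self.edges if dst == node}
        return direct | {u for d in direct for u in self.upstream(d)}

g = LineageGraph()
g.record("enrich_orders", sources=["kafka.orders"], sinks=["iceberg.orders_enriched"])
g.record("daily_report", sources=["iceberg.orders_enriched"], sinks=["s3.report"])
print(sorted(g.upstream("s3.report")))
# ['daily_report', 'enrich_orders', 'iceberg.orders_enriched', 'kafka.orders']
```

Answering "what feeds this report?" with one traversal is exactly the governance and troubleshooting value the talk attributes to native lineage support.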

Presenters

Pawel Leszczynski

Breakout Session
May 21

Disposable Vibes: Great Pacific Garbage Patch of AI Slopware

Vibe Coding is everywhere. AI makes churning out endless code, data, and software easier than ever. Alongside this rise, a new mindset has emerged: forget craftsmanship, abandon discipline—just vibe and let the AI cook. If this gives you pause, that's good! This talk explores a brave new world where practitioners suspend judgment and treat their craft as merely cranking out endless disposable AI-generated content. The consequences are severe: a digital ecosystem drowning in slopware, poor data quality, and crushing technical debt that makes our current challenges seem trivial by comparison. But this talk isn’t all doom and gloom. Joe offers practical advice and solutions for practitioners and leaders to elevate their AI practices today. His strategies help organizations harness AI's power while maintaining code quality, data quality, and craftsmanship, preventing what could become a gigantic mess beyond comprehension.

Presenters

Joe Reis

Breakout Session
May 21