Current London 2025
Session Archive
Check out our session archive to catch up on anything you missed or rewatch your favorites to make sure you hear all of the industry-changing insights from the best minds in data streaming.


No More Hiding: Kafka Streams as a First-Class Citizen in the Kafka Protocol
Kafka Streams, the native stream processing technology of Kafka, stands out for its unique ability to assign work and propagate metadata among clients without the need for separate coordination infrastructure — requiring only a Kafka broker. Traditionally, this has been accomplished by embedding Streams-specific payloads within generic consumer group RPCs of the Kafka protocol and plugging custom assignment logic into the standard Kafka Consumer. As the Kafka consumer transitions to the next-generation consumer protocol, Streams applications are starting to reveal their true identity when communicating with the broker. In the accepted Kafka Improvement Proposal 1071, new Streams-specific RPCs are being added to the Kafka protocol. These will be used by applications to form groups, exchange metadata, and distribute workloads. A new type of group, the Streams group, will coexist alongside the consumer group, elevating Kafka Streams to a first-class citizen on the Kafka broker. Assignments, member metadata, and group configurations will have a single source of truth, and the assignment logic will run iteratively and centrally as part of the broker’s group coordinator. This deeply technical talk will delve into the new Streams rebalance protocol. We will begin with a brief overview of the existing rebalance mechanism in Kafka Streams, highlighting the challenges it presents to users. We will then introduce the new Streams rebalance protocol, comparing it to both the current protocol and the new consumer protocol. You will learn about heartbeats that are more than just liveness signals, why the consumer offsets topic doesn’t just contain consumer offsets, and open-heart surgery on the Kafka consumer — all packaged in a compelling story.
Lucas Brutschy, Bruno Cadonna
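
To make the change concrete, here is a minimal Java sketch of a Kafka Streams application that would opt into the broker-side Streams rebalance protocol. The application id, topic names, and the group.protocol setting are illustrative assumptions based on KIP-1071; the exact configuration name and its availability depend on your client and broker versions.

    import java.util.Properties;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.KStream;

    public class StreamsGroupSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "orders-enricher");   // becomes the Streams group id
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
            props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());
            // Assumed opt-in for the KIP-1071 Streams group protocol; the config name and
            // its availability depend on the Kafka version you run.
            props.put("group.protocol", "streams");

            StreamsBuilder builder = new StreamsBuilder();
            KStream<String, String> orders = builder.stream("orders");           // illustrative topic
            orders.filter((key, value) -> value != null && !value.isEmpty())
                  .to("orders-clean");

            KafkaStreams streams = new KafkaStreams(builder.build(), props);
            Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
            streams.start();
        }
    }

With the new protocol, group membership and task assignment for an application like this are handled centrally by the broker's group coordinator rather than by a client-side leader.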


The Luge: Apache Iceberg™ and Streaming
Apache Iceberg™ is best known for its rapid ascent as the go-to table format for batch analytics, but did you know it's also great for streaming applications? In this talk we'll cover the current state of the Iceberg project and how the V3 table format brings exciting new features that make streaming even easier. Ever had to deal with rapidly changing message schemas? Say hello to the Variant type. Need to track changes over time? Hi there, Row Lineage. The Iceberg Table Spec V3 brings all sorts of great new functionality to streaming, and this is just the beginning. We'll also look at some of the features in the works for V4, many of them likewise aimed at making the streaming experience better than ever. Join us for an overview of the present and future of Apache Iceberg.
Eric Maynard
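
As a rough illustration of the V3 features mentioned above, the following Java sketch declares an Iceberg table with a variant column for semi-structured payloads. The catalog endpoint, namespace, and table name are made up, and both the Types.VariantType accessor and the format-version=3 property are assumptions that require an Iceberg release with V3 support.

    import java.util.Map;
    import org.apache.iceberg.PartitionSpec;
    import org.apache.iceberg.Schema;
    import org.apache.iceberg.Table;
    import org.apache.iceberg.catalog.TableIdentifier;
    import org.apache.iceberg.rest.RESTCatalog;
    import org.apache.iceberg.types.Types;

    public class VariantTableSketch {
        public static void main(String[] args) {
            // Illustrative REST catalog endpoint; substitute your own catalog configuration.
            RESTCatalog catalog = new RESTCatalog();
            catalog.initialize("demo", Map.of("uri", "http://localhost:8181"));

            Schema schema = new Schema(
                Types.NestedField.required(1, "event_id", Types.StringType.get()),
                Types.NestedField.required(2, "event_time", Types.TimestampType.withZone()),
                // V3 variant column for rapidly changing, semi-structured message payloads.
                // Assumes an Iceberg release that exposes the V3 variant type in its Java API.
                Types.NestedField.optional(3, "payload", Types.VariantType.get())
            );

            Table table = catalog.createTable(
                TableIdentifier.of("streaming", "events"),   // illustrative namespace and table name
                schema,
                PartitionSpec.unpartitioned(),
                Map.of("format-version", "3")                // assumed property for opting into the V3 spec
            );
            System.out.println("Created " + table.name());
        }
    }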


Native Data Lineage Support in Apache Flink with OpenLineage
Apache Flink has made significant strides in native data lineage support, which is essential for auditing, data governance, regulatory compliance, troubleshooting, and data discovery. In this presentation, we will delve into Flink's built-in lineage graph and listener mechanism, showcasing its current capabilities and recent enhancements brought by FLIP-314. We will emphasize how Flink's native lineage features provide a robust framework for understanding and managing data flows within streaming applications. Furthermore, we will explore the integration of Flink lineage with OpenLineage, an open framework designed for the systematic collection and analysis of data lineage. This integration facilitates seamless lineage data management and visualization across modern data ecosystems. Join us to gain insights into the advancements of native lineage support within Apache Flink and learn how it can significantly enhance your data operations and compliance initiatives.
Pawel Leszczynski
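
For a sense of the listener mechanism the talk covers, here is a rough Java sketch of a FLIP-314-style job status listener that picks up the lineage graph when a job is created. The class and method names follow the FLIP-314 text and may differ from the exact API in your Flink release; forwarding to OpenLineage is only indicated by a comment.

    // Sketch of a FLIP-314-style listener; names follow the proposal and may differ
    // slightly from the API shipped in a given Flink release.
    import org.apache.flink.core.execution.JobCreatedEvent;
    import org.apache.flink.core.execution.JobStatusChangedEvent;
    import org.apache.flink.core.execution.JobStatusChangedListener;
    import org.apache.flink.streaming.api.lineage.LineageGraph;

    public class OpenLineageForwardingListener implements JobStatusChangedListener {

        @Override
        public void onEvent(JobStatusChangedEvent event) {
            if (event instanceof JobCreatedEvent created) {
                // The lineage graph describes the sources and sinks of the submitted job.
                LineageGraph graph = created.lineageGraph();
                // Map the graph's sources and sinks to OpenLineage datasets and emit a run
                // event with the OpenLineage client here (omitted in this sketch).
                System.out.println("Captured lineage graph: " + graph);
            }
        }
    }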


Disposable Vibes: Great Pacific Garbage Patch of AI Slopware
Vibe Coding is everywhere. AI makes churning out endless code, data, and software easier than ever. Alongside this rise, a new mindset has emerged: forget craftsmanship, abandon discipline, just vibe and let the AI cook. If this gives you pause, that's good! This talk explores a brave new world where practitioners suspend judgment and treat their craft as merely cranking out endless disposable AI-generated content. The consequences are severe: a digital ecosystem drowning in slopware, poor data quality, and crushing technical debt that makes our current challenges seem trivial by comparison. But this talk isn’t all doom and gloom. Joe offers practical advice and solutions for practitioners and leaders to elevate their AI practices today. His strategies help organizations harness AI's power while maintaining code quality, data quality, and craftsmanship, preventing what could become a gigantic mess beyond comprehension.
Joe Reis


Melting Icebergs: Enabling Analytical Access to Kafka Data through Iceberg Projections
An organisation's data has traditionally been split between the operational estate, for daily business operations, and the analytical estate, for after-the-fact analysis and reporting. The journey from one side to the other is today a long and tortuous one. But does it have to be? In the modern data stack, Apache Kafka is the de facto standard operational platform and Apache Iceberg has emerged as the champion of table formats for powering analytical applications. Can we leverage the best of Iceberg and Kafka to create a solution greater than the sum of its parts? Yes, you can, and we did! This isn't a typical story of connectors, ELT, and separate data stores. We've developed an advanced projection of Kafka data in an Iceberg-compatible format, allowing direct access from warehouses and analytical tools. In this talk, we'll cover:
* How we presented Kafka data to Iceberg processors without moving or transforming data up front: no hidden ETL!
* Integrating Kafka's ecosystem into Iceberg, leveraging Schema Registry, consumer groups, and more.
* Meeting Iceberg's performance and cost-reduction expectations while sourcing data directly from Kafka.
Expect a technical deep dive into the protocols, formats, and services we used, all while staying true to our core principles:
* Kafka is the single source of truth: no separate stores.
* Analytical processors shouldn't need Kafka-specific adjustments.
* Operational performance must remain uncompromised.
* Kafka's mature ecosystem features, like ACLs and quotas, should be reused, not reinvented.
Join us for a thrilling account of the highs and lows of merging two data giants, and stay tuned for the surprise twist at the end!
Tom Scott, Roman Kolesnev
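
To show what "no Kafka-specific adjustments" can mean in practice, here is a small Java sketch of an analytical client reading such a projection as an ordinary Iceberg table. The catalog endpoint, namespace, and table name are invented for illustration and are not the speakers' actual service.

    import java.util.Map;
    import org.apache.iceberg.Table;
    import org.apache.iceberg.catalog.TableIdentifier;
    import org.apache.iceberg.data.IcebergGenerics;
    import org.apache.iceberg.data.Record;
    import org.apache.iceberg.io.CloseableIterable;
    import org.apache.iceberg.rest.RESTCatalog;

    public class ReadKafkaProjectionSketch {
        public static void main(String[] args) throws Exception {
            // Illustrative REST catalog endpoint fronting the Kafka-to-Iceberg projection.
            RESTCatalog catalog = new RESTCatalog();
            catalog.initialize("kafka-projection", Map.of("uri", "http://localhost:8181"));

            // The topic shows up as a regular Iceberg table; names here are made up.
            Table table = catalog.loadTable(TableIdentifier.of("kafka", "orders"));

            // A plain Iceberg scan: no consumer groups, offsets, or Kafka clients involved.
            try (CloseableIterable<Record> records = IcebergGenerics.read(table).build()) {
                for (Record record : records) {
                    System.out.println(record);
                }
            }
        }
    }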


Empowering Developers with a Centralized Kafka Library
This presentation details how our platform enablement team developed a centralized Kafka library, empowering developers to build applications with ease. Faced with inconsistent Kafka processing approaches across teams, we created a common library, inspired by the multi-threaded consumer approach described in this Confluent blog post: https://www.confluent.io/blog/kafka-consumer-multi-threaded-messaging (re-implemented in Kotlin). We'll share our challenges, successes, and the current state of this library, now used in over 20 services. Initially, varying team approaches to Kafka processing led to inconsistencies and duplicated effort, and our team recognized the need for standardization. The internal library simplifies Kafka development, promotes best practices, and centralizes key functionality. It wraps the kafka-clients library, offering simple interfaces for building Kafka consumers and producers that integrate with our Confluent clusters, schema registry, and Avro serialization. Its core feature is a multi-threaded consumer implementation, enabling efficient consumption from multiple partitions. We'll share the technical hurdles we encountered during development, discussing our design decisions, multi-threading challenges, and lessons learned. Crucially, the library supports dead-letter queues and message redelivery. It also supports cross-cluster consumers, essential for GDPR compliance, allowing production to multiple Confluent clusters in different regions. We'll cover our versioning strategy and package-overlap issues, explaining how we created thin, relocated, and uber JAR versions. Another interesting feature is runtime consumer control: by producing events to an internal topic, we can start and stop consumers in live applications without redeployment. The library has simplified Kafka development, promoted consistency, reduced the learning curve, and centralized core functionality. This presentation is ideal for Kafka developers seeking to build internal Kafka libraries.
Ademir Spahic, Ammar Latifovic
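
As background for the consumer design discussed above, here is a stripped-down Java sketch of the multi-threaded pattern from the referenced Confluent blog post: one thread polls, and each partition's batch is handed to a worker pool. It is not the speakers' library; all names are illustrative, and the offset and pause/resume bookkeeping that a production implementation needs is only noted in comments.

    import java.time.Duration;
    import java.util.List;
    import java.util.Properties;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class MultiThreadedConsumerSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "orders-service");            // illustrative group id
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
            props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");            // commit only after processing

            ExecutorService workers = Executors.newFixedThreadPool(8);

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(List.of("orders"));                               // illustrative topic
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(200));
                    // Hand each partition's batch to the pool as a single task.
                    for (TopicPartition partition : records.partitions()) {
                        List<ConsumerRecord<String, String>> batch = records.records(partition);
                        workers.submit(() -> batch.forEach(r ->
                                System.out.printf("processed %s-%d@%d%n", r.topic(), r.partition(), r.offset())));
                    }
                    // A production version would pause partitions with in-flight work, track
                    // per-partition progress, and commit offsets once processing completes;
                    // that bookkeeping is what an internal library encapsulates.
                }
            }
        }
    }

The value of a shared internal library lies in hiding exactly that bookkeeping, along with dead-letter queues, redelivery, and runtime consumer control, behind a simple interface.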