Current London 2025
Session Archive
Check out our session archive to catch up on anything you missed or rewatch your favorites to make sure you hear all of the industry-changing insights from the best minds in data streaming.


Taming the Data Beast: Processing Real-Time Market Data at a Crypto Exchange
As one of Europe’s leading crypto exchanges, Bitvavo enables its ~2 million customers to buy, sell, and store over 300 digital assets, providing a 24/7 service that processes many thousands of transactions per second with stable sub-millisecond execution times on its order flow. In this talk I will deep-dive into the high-level architecture of the Bitvavo exchange and the details of how we process and transform trading data in real time using Confluent Cloud and Imply Druid, in order to provide useful insights to our customers, with a focus on candle charts. Specifically, I will cover architectural patterns, lessons learned, and good practices for routing and processing high volumes of market data from low-latency systems while maintaining the high performance and scalability required of a leading European crypto exchange.
Marcos Maia
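The talk covers Bitvavo’s actual pipeline; purely as a flavor of the kind of transformation involved, here is a minimal Kafka Streams sketch that folds raw trades into one-minute OHLC candles. The topic names ("trades", "candles") and the string-encoded candle value are illustrative assumptions, not Bitvavo’s design.

```java
import java.time.Duration;
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.kstream.TimeWindows;

public class CandleSketch {

    // Fold one trade price into a candle encoded as "open,high,low,close".
    // A real pipeline would use a structured value type with its own Serde.
    static String update(String candle, double price) {
        if (candle.isEmpty()) {
            return price + "," + price + "," + price + "," + price;
        }
        String[] p = candle.split(",");
        double high = Math.max(Double.parseDouble(p[1]), price);
        double low = Math.min(Double.parseDouble(p[2]), price);
        return p[0] + "," + high + "," + low + "," + price;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "candle-sketch");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        StreamsBuilder builder = new StreamsBuilder();
        // Assumed input topic "trades": key = trading pair, value = trade price as text.
        builder.stream("trades", Consumed.with(Serdes.String(), Serdes.String()))
            .groupByKey(Grouped.with(Serdes.String(), Serdes.String()))
            // One tumbling one-minute window per pair = one candle per minute.
            .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(1)))
            .aggregate(() -> "",
                (pair, price, candle) -> update(candle, Double.parseDouble(price)),
                Materialized.with(Serdes.String(), Serdes.String()))
            .toStream()
            // Re-key by pair + window start and emit, e.g. for ingestion into Druid.
            .map((window, candle) -> KeyValue.pair(
                window.key() + "@" + window.window().startTime(), candle))
            .to("candles", Produced.with(Serdes.String(), Serdes.String()));

        new KafkaStreams(builder.build(), props).start();
    }
}
```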


From Zero to Hero: petabyte-scale Tiered Storage lessons
Whether you’re running mission-critical applications or just shipping logs in real time, Tiered Storage can make your Kafka cluster cheaper, easier to manage, and faster. To understand the benefits, tradeoffs, and development history, join this talk, where we’ll unpack KIP-405 and showcase how the community built this important feature for Apache Kafka. We’ll roll back through the KIP’s history, starting from 2018, to understand the major milestones, and share details on how industry leaders like Apple, Datadog, and Slack helped develop and test both the Tiered Storage functionality and the first open source AWS S3 plugin. Furthermore, we’ll share details, gotchas, and tradeoffs from users successfully adopting Tiered Storage in production at scale, surpassing 150 GB/s of throughput. If you want to optimize your Apache Kafka cluster for performance, cost, and overall health, this session is for you.
Francesco Tisiot, Filip Yonov
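For context on what enabling the feature looks like, here is a minimal sketch using the standard Java AdminClient to switch on KIP-405 tiered storage for a topic. The topic name and retention values are assumptions, and it presupposes a broker already running with remote.log.storage.system.enable=true and a RemoteStorageManager plugin (such as the open source S3 one mentioned in the abstract).

```java
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class EnableTieredStorage {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");

        try (Admin admin = Admin.create(props)) {
            // Topic-level switches from KIP-405; "my-topic" is a placeholder.
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "my-topic");
            List<AlterConfigOp> ops = List.of(
                new AlterConfigOp(new ConfigEntry("remote.storage.enable", "true"),
                                  AlterConfigOp.OpType.SET),
                // Keep only 1 hour on local disk; older segments are served remotely.
                new AlterConfigOp(new ConfigEntry("local.retention.ms", "3600000"),
                                  AlterConfigOp.OpType.SET),
                // Total retention (local + remote) stays at 7 days.
                new AlterConfigOp(new ConfigEntry("retention.ms", "604800000"),
                                  AlterConfigOp.OpType.SET));
            admin.incrementalAlterConfigs(Map.of(topic, ops)).all().get();
        }
    }
}
```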


Into the Otter-Verse: Using time travel to transform Kafka Streams development and operations
One potential benefit of using a stream processor like Kafka Streams to build applications on the log is the ability to time travel. What if you could go back in time and query state stores to see when a bug was introduced? Or what if you could freeze the state of a running application and make a copy for pre-deploy testing? This potential has largely gone unrealized because of a missing primitive in Kafka Streams: the ability to create a consistent snapshot that can be read and even cloned into a new application. Until now.

We first explain exactly what snapshots and clones are. In short, a snapshot contains all of the application’s state up to some point in time, and no state after it. A clone is a copied application created from this state. Next, we’ll make the case for why snapshots are a game-changing feature for Kafka Streams. Snapshots take your application into a multiverse (or otter-verse) of histories and branches. We’ll show how you can use them to explore your application’s history, interactively debug, test changes against real data, do blue/green deploys, and more.

The remainder of the talk dives into the theory and practice of Kafka Streams snapshots. First we cover what’s been missing from Kafka Streams to support them. In particular, Kafka Streams currently lacks the synchronization mechanisms to enable a consistent topology-wide snapshot. It also maintains state locally, which makes a snapshot difficult to access. Next, we discuss how we fill these gaps with Responsive. Specifically, we give an overview of RS3, our S3-backed store built on SlateDB, and how we use it with our SDK to take consistent snapshots. We’ll close this section with our vision for how snapshots can be contributed back to Kafka Streams.

Finally, we’ll close the talk with a demo to show the power of snapshots in action. Attendees should come away with an understanding of snapshots and clones, how they can be used to solve common problems, and how we’ve built them in Responsive.
Rohan Desai
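Responsive’s actual SDK is not shown here; purely as a way to picture the primitive the abstract describes, here is a hypothetical Java interface for consistent snapshots and clones. Every name in it is invented for illustration.

```java
import java.time.Instant;

// Hypothetical API sketch, NOT Responsive's actual SDK: just a way to
// picture the snapshot/clone primitive described in the abstract.
interface SnapshotStore {

    // Freeze every state store in the topology at one consistent point:
    // all state up to that point in time, and no state after it.
    SnapshotHandle snapshot(String applicationId);

    // Materialize an independent copy of the application from a snapshot,
    // e.g. for pre-deploy testing, debugging, or blue/green deploys.
    void cloneTo(SnapshotHandle snapshot, String newApplicationId);
}

interface SnapshotHandle {
    String id();        // stable identifier for the snapshot
    Instant asOf();     // the consistent point in time it captures
}
```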


Kafka Consumer 4.0 - Major version, major improvements, get to know it all
A new major version of the KafkaConsumer is out, bringing fundamental changes and improvements: it’s the first version to fully implement the next generation of the Consumer Group Rebalance Protocol, introduced with KIP-848, and it’s all brand new and production-ready. Want to hear how these major changes materialize in the KafkaConsumer? What’s in? What’s out? What’s different? Then this talk is for you! We will cover the core of the new rebalance protocol, its implementation in the Java client, and how it significantly improves and simplifies the whole group consumption experience, addressing its main pain points. We will also cover the revamped KafkaConsumer threading model, shipped alongside the new rebalance protocol client implementation. It all sounds promising, but we know that upgrades can be scary, right? Whether you’re a Kafka developer, operator, or architect, this talk will equip you with everything you need to confidently adopt KafkaConsumer 4.0 in your client applications: from how the live upgrade and protocol interoperability work, to detailed client changes, including configuration changes, API deprecations and additions, improved API behavior, new metrics…
Lianet Magrans, David Jacot
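As a taste of the client-side changes, here is a minimal sketch of a Java consumer opting in to the KIP-848 protocol. The broker address, group, and topic are placeholders; group.protocol and group.remote.assignor are the configuration keys introduced for the new protocol, which moves assignment to the broker side.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class NewProtocolConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group");
        // Opt in to the KIP-848 rebalance protocol (requires brokers that support it).
        props.put(ConsumerConfig.GROUP_PROTOCOL_CONFIG, "consumer");
        // Assignment now happens broker-side: pick a server-side assignor by name
        // instead of the old client-side partition.assignment.strategy.
        props.put(ConsumerConfig.GROUP_REMOTE_ASSIGNOR_CONFIG, "uniform");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("my-topic"));
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
            records.forEach(r -> System.out.println(r.key() + " -> " + r.value()));
        }
    }
}
```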


Stream Processing and Cascading Materialized Views: Why, How, and What
Materialized views (MVs) are a core concept in databases. In streaming databases like ksqlDB and RisingWave, MVs are maintained by continuous, incremental stream processing engines. Users can define cascading MVs, that is, MVs on top of other MVs, to express complex stream processing logic. However, managing cascading MVs can introduce substantial technical hurdles for the database system. To illustrate, consider the scenario where an MV within the stack is unable to promptly process events from its upstream sources. This not only results in immediate latency spikes for downstream MVs but also creates backpressure, potentially causing a system crash. Additionally, if an MV crashes, it can pause processing for the entire MV stack. Overcoming these challenges to recover the MV and its downstream MVs while preserving data consistency is a formidable task. In this presentation, I will begin by exploring the critical considerations in maintaining cascading materialized views: namely consistency, elasticity, and fault tolerance. Subsequently, I will delve into the potential advantages and disadvantages of various approaches, along with strategies for efficient logging and checkpointing to minimize system downtime. Finally, I will share insights gained from our experience managing hundreds of cascading materialized views in real-world production environments.
Yingjun Wu
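To make “MVs on top of other MVs” concrete, here is an illustrative two-level cascade expressed through ksqlDB’s Java client. The stream name, table names, and threshold are invented for the example; a lagging or crashed trades_per_symbol stalls everything defined downstream of it, which is exactly the failure mode the talk examines.

```java
import io.confluent.ksql.api.client.Client;
import io.confluent.ksql.api.client.ClientOptions;

public class CascadingViews {
    public static void main(String[] args) throws Exception {
        ClientOptions options = ClientOptions.create()
            .setHost("localhost")
            .setPort(8088);
        Client client = Client.create(options);

        // First-level MV: per-symbol counts over a raw stream (assumed to exist).
        client.executeStatement(
            "CREATE TABLE trades_per_symbol AS " +
            "SELECT symbol, COUNT(*) AS trades " +
            "FROM trades_stream GROUP BY symbol EMIT CHANGES;").get();

        // Second-level MV defined on top of the first: the cascading case.
        // If trades_per_symbol falls behind, this view falls behind too.
        client.executeStatement(
            "CREATE TABLE busy_symbols AS " +
            "SELECT symbol, trades FROM trades_per_symbol " +
            "WHERE trades > 1000 EMIT CHANGES;").get();

        client.close();
    }
}
```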


Why is Kafka always late? Is that really a problem?
Kafka is fast, but lag is everywhere. Data falls behind, consumers can’t keep up, and alerts keep firing. The usual reaction? Blame Kafka. The real issue? Kafka does exactly what it’s built to do: decouple producers and consumers. Lag isn’t a bug, it’s a side effect. Tracking offsets won’t save you. The real problem is time lag: the gap between when data is produced and when it’s actually processed. Consumer rebalances, inefficient commits, slow APIs, and bad scaling decisions all make it worse. Little’s Law predicts when lag will spiral, but most teams ignore it. This talk breaks down what’s really happening when Kafka "falls behind", why it happens, and what you can do about it. Batching, commit strategies, parallel consumption, dropping messages: many options are available. Start controlling lag before it controls you.
Stephane Derosiaux
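To make the Little’s Law point concrete, a back-of-the-envelope calculation (the numbers are illustrative, not from the talk):

```latex
% Little's Law relates the standing backlog to arrival rate and time in system:
L = \lambda W
% Example: producers write \lambda = 10\,000 msg/s, and each message spends
% W = 30 s between production and processing, so the steady-state backlog is
L = 10\,000 \times 30 = 300\,000 \ \text{messages}
% If the consumption rate stays below \lambda, W keeps growing and lag spirals.
```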