Current London 2025
Session Archive
Check out our session archive to catch up on anything you missed, or rewatch your favorites - and hear all the industry-changing insights from the best minds in data streaming.


Apache Kafka: meet Apache Druid DART
Druid and Kafka have been best buddies for 10 years, courting and sparking their way around data analytics parties to excess. At the end of 2024, the Apache Druid community released a new query API, DART, giving them access to even more parties and fun times - this time, ones where executing complex queries quickly matters more than concurrency. Join to see Druid's DART engine get the slideware treatment and to watch a Kafka + DART-powered Druid + Grafana analytics pipeline in action, complete with step-by-step instructions for building your own.
Peter Marshall, Dave Klein
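
For a taste of what the demo involves: Dart queries are submitted to Druid over its SQL HTTP API. Here's a minimal sketch in Python - the endpoint path, host/port, and the quickstart "wikipedia" datasource are assumptions for a default local Druid deployment, not details taken from the session.

```python
# Minimal sketch: submit a SQL query to Druid's Dart engine over HTTP.
# The URL below assumes a default local Druid router; adjust for your setup.
import requests

DART_URL = "http://localhost:8888/druid/v2/sql/dart"  # assumed Dart SQL endpoint

query = {
    "query": """
        SELECT channel, COUNT(*) AS edits
        FROM wikipedia              -- classic Druid quickstart datasource
        GROUP BY channel
        ORDER BY edits DESC
        LIMIT 10
    """,
    "resultFormat": "objectLines",  # one JSON object per result row
}

resp = requests.post(DART_URL, json=query, timeout=60)
resp.raise_for_status()
print(resp.text)
```

The appeal of Dart here is that the same SQL shape works for complex multi-stage queries that would strain the classic interactive engine.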


How We Replaced Node.js with Apache Flink for Real-Time Deduplication and Cut Costs by 7x
ShareChat is one of the largest social media platforms in India, with over 180 million monthly active users. We were processing a high-throughput real-time stream (>200K RPS) with a Node.js + Redis-based deduplication pipeline using a 24-hour window. In this talk, I'll walk you through how we transitioned to an Apache Flink-based solution, the challenges we faced, and the strategies that led to a 7x cost reduction. Topics covered:
1. State Management at Scale: our early attempts to structure Flink state efficiently for massive-scale deduplication, and lessons learned in keeping the job manageable and performant despite the huge state size.
2. Autoscaling Challenges: how we leveraged the Flink Kubernetes Operator to enable autoscaling, why autoscaling initially increased duplication, and how we solved it.
3. When the Async API Matters in Apache Flink: understanding the role of Async I/O in Flink and how it impacts performance and resource efficiency in real-time streaming.
4. How We Achieved 7x Cost Savings.
Andrei Manakov
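
The session covers ShareChat's production job; as a rough illustration of the core pattern - keyed state with a 24-hour TTL acting as the dedup window - here's a minimal PyFlink sketch. The names and toy input are hypothetical, not ShareChat's actual code.

```python
# Minimal PyFlink sketch of keyed deduplication with a 24-hour state TTL.
# Key/event names are hypothetical; this shows the idea, not ShareChat's job.
from pyflink.common import Time, Types
from pyflink.datastream import StreamExecutionEnvironment, KeyedProcessFunction
from pyflink.datastream.state import StateTtlConfig, ValueStateDescriptor


class Dedup(KeyedProcessFunction):
    def open(self, runtime_context):
        desc = ValueStateDescriptor("seen", Types.BOOLEAN())
        # Expire entries 24h after they are written, bounding total state size.
        ttl = (StateTtlConfig.new_builder(Time.hours(24))
               .set_update_type(StateTtlConfig.UpdateType.OnCreateAndWrite)
               .set_state_visibility(StateTtlConfig.StateVisibility.NeverReturnExpired)
               .build())
        desc.enable_time_to_live(ttl)
        self.seen = runtime_context.get_state(desc)

    def process_element(self, value, ctx):
        if self.seen.value() is None:  # first occurrence of this key in 24h
            self.seen.update(True)
            yield value                # emit it; later duplicates are dropped


env = StreamExecutionEnvironment.get_execution_environment()
events = env.from_collection(
    [("a", 1), ("a", 1), ("b", 2)],
    type_info=Types.TUPLE([Types.STRING(), Types.INT()]))
events.key_by(lambda e: e[0]).process(Dedup()).print()
env.execute("dedup-sketch")
```

Keeping only a boolean per key (rather than the full event) is one of the simplest ways to keep huge dedup state manageable.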


Massive Kafka Streams Topology Revamp in Production: No Chaos, No Headaches! My Key Takeaways 🦾!
You've been rocking Kafka Streams in production for a while, but guess what? Times have changed! Your Kafka skills have leveled up, and/or your business is pushing for a fresh twist... 🚀 Now you need to revamp your entire Kafka Streams topology without breaking everything! 😱 But how do you pull this off without disrupting consumers, while ensuring the latest data updates land correctly in your internal topics, and without the headache of renaming your microservice or tweaking input/output topics? 🫨 Join me as we dive into the "remapping" functionality from Kstreamplify, Michelin's open-source library that adds extra capabilities to Kafka Streams. Through a simple, hands-on example, I'll show you how to make these changes smoothly. Grab a seat 🪑 - let's make topology changes a breeze! 🌪️✨
Marie-Laure Momplot


PyFlink Table API on Streaming Data - Fearless Python Data Engineering
Data engineers around the world have embraced Python as the language of choice for designing and developing data engineering pipelines. For streaming data, Python DSLs serve the purpose of writing complex business logic in a fluent, readable, and efficient way. Apache Flink Table API Python transforms enable data streaming engineers to write sophisticated stream transforms, such as tumbling and hopping windows and group-by-key aggregations, with pure Pythonic fluent DSLs. Join this session to learn the beauty and ease of writing Python Table API transformations on streaming data with Kafka as the source. The session will also include a live demo of writing Python Table API aggregations on streaming data with Kafka. You'll come out of this session armed with the knowledge to write complex streaming data transformations in your language of choice, Python, and an understanding of how to construct streaming data pipelines using the Apache Flink Table API.
Diptiman Raichaudhuri
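
As a preview of the kind of transform the session demonstrates, here's a minimal PyFlink Table API sketch of a one-minute tumbling-window count over a Kafka topic. The topic, field names, and connector settings are illustrative, and the Kafka SQL connector jar must be available to Flink.

```python
# Minimal PyFlink Table API sketch: 1-minute tumbling-window counts per user
# over a Kafka topic. Topic/fields are illustrative.
from pyflink.table import EnvironmentSettings, TableEnvironment
from pyflink.table.expressions import col, lit
from pyflink.table.window import Tumble

t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

# Kafka source with an event-time column and watermark.
t_env.execute_sql("""
    CREATE TABLE clicks (
        user_id STRING,
        ts TIMESTAMP(3),
        WATERMARK FOR ts AS ts - INTERVAL '5' SECOND
    ) WITH (
        'connector' = 'kafka',
        'topic' = 'clicks',
        'properties.bootstrap.servers' = 'localhost:9092',
        'scan.startup.mode' = 'earliest-offset',
        'format' = 'json'
    )
""")

result = (
    t_env.from_path("clicks")
         .window(Tumble.over(lit(1).minutes).on(col("ts")).alias("w"))
         .group_by(col("w"), col("user_id"))
         .select(col("user_id"),
                 col("w").start.alias("window_start"),
                 col("user_id").count.alias("clicks"))
)
result.execute().print()
```

The fluent expression style (col, lit, window aliases) is exactly the "pythonic DSL" the abstract refers to: the same logic in SQL or Java maps one-to-one onto these chained calls.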


Kafka Tiered Storage in Production?
A third of the cost of a typical Kafka cluster is storage. Beyond costing money, fluctuating usage means storage space needs to be monitored, and it has been a source of on-call pain for us. Tiered storage for Kafka is a newly released feature that promises to dramatically reduce storage costs by offloading most data to cheap storage (e.g., S3) rather than expensive local or network-attached disks (e.g., EBS). It's marked as production-ready, but it's not widely adopted yet. Stripe is currently in the process of migrating to tiered storage across our fleet of more than 50 Kafka clusters. We've already encountered problems like JVM crashes and metadata calls that occasionally time out only for tiered storage topics, and we're still early in the migration process (though we'll be done one way or the other by the time this conference takes place!). In this talk you'll learn about the problems we encountered that either made us abandon the use of tiered storage or that we had to solve to run it successfully in production.
Donny Nadolny
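
For context on what enabling the feature looks like, here's a minimal sketch that creates a topic with tiered storage enabled, using the confluent-kafka Python AdminClient. It assumes the brokers already run with remote.log.storage.system.enable=true and a configured remote storage plugin; the topic name and retention values are illustrative, not Stripe's settings.

```python
# Minimal sketch: create a tiered-storage topic via the confluent-kafka
# AdminClient. Brokers must already have remote log storage configured.
from confluent_kafka.admin import AdminClient, NewTopic

admin = AdminClient({"bootstrap.servers": "localhost:9092"})

topic = NewTopic(
    "payments",                            # illustrative topic name
    num_partitions=6,
    replication_factor=3,
    config={
        "remote.storage.enable": "true",   # tier this topic's closed segments
        "local.retention.ms": "3600000",   # keep ~1 hour on local disk
        "retention.ms": "604800000",       # 7 days total, mostly in object storage
    },
)

futures = admin.create_topics([topic])
for name, fut in futures.items():
    fut.result()  # raises on failure
    print(f"created {name}")
```

The cost win comes from the gap between local.retention.ms and retention.ms: only the short local window lives on expensive disks, while the rest sits in object storage.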


Wrangling Iceberg Tables with PyIceberg CLI: A Hands-On Journey
Apache Iceberg has become a go-to solution for managing massive datasets, and with PyIceberg, it’s now easier than ever to work with Iceberg tables using a pure Pythonic approach, without depending on distributed query engines. In this talk, I’ll introduce PyIceberg and dive straight into a live demo showcasing its CLI commands. Starting with setting up a local Iceberg catalog, I’ll guide you through the basic developer workflow of working with a catalog and tables. Following that, I’ll create a table in the catalog with Python, insert data, and demonstrate various CLI operations on tables and namespaces, such as listing, describing, dropping, and managing properties. By the end of the session, you’ll have a solid understanding of PyIceberg’s capabilities and how it simplifies managing Iceberg tables in Python-centric workflows. If you love Python and Apache Iceberg, this talk is for you!
Dunith Danushka
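
As a flavor of the workflow the demo walks through, here's a minimal PyIceberg sketch that loads a local SQLite-backed catalog, creates a table, and appends rows. The catalog settings, namespace, and table names are illustrative.

```python
# Minimal PyIceberg sketch: local SQL catalog, create a table, append rows.
# The ./warehouse directory must exist before running this.
import pyarrow as pa
from pyiceberg.catalog import load_catalog

catalog = load_catalog(
    "local",
    **{
        "type": "sql",
        "uri": "sqlite:///warehouse/catalog.db",  # catalog metadata store
        "warehouse": "file://warehouse",          # where table data lands
    },
)

catalog.create_namespace("demo")

data = pa.table({"id": [1, 2, 3], "name": ["a", "b", "c"]})
table = catalog.create_table("demo.events", schema=data.schema)
table.append(data)

print(table.scan().to_arrow())  # read the rows back
```

Once the same catalog is configured in .pyiceberg.yaml, the CLI commands covered in the session (e.g., pyiceberg list demo, pyiceberg describe demo.events) can inspect and manage these tables without any query engine.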