Current London 2025

Session Archive

Check out our session archive to catch up on anything you missed, or rewatch your favorites to make sure you hear all of the industry-changing insights from the best minds in data streaming.

The Future of AI is Event-Driven

Autonomous agents are reshaping enterprise operations, but scaling them isn’t just about smarter AI—it’s about better infrastructure. Agents need real-time data, seamless tool integration, and shared outputs across systems. Rigid request/response models create bottlenecks, while event-driven architecture (EDA) unlocks the flexibility and scalability agents require. This session will show how EDA enables autonomous agents to thrive. Key takeaways include:
- How EDA enables real-time, adaptive agent workflows and multi-agent problem solving.
- Key design patterns like Orchestrator-Worker, Multi-Agent Collaboration, and Market-Based Competition.
- Strategies for leveraging Kafka to handle scalability, fault tolerance, and low latency.
- Lessons from microservices evolution to solve interoperability and context-sharing challenges.
This talk is for engineers and architects building scalable AI systems. You’ll leave with actionable insights to design resilient, event-driven agents and future-proof your infrastructure for enterprise-scale AI.
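As a rough illustration of the Orchestrator-Worker pattern mentioned above (not code from the talk), the sketch below shows a Kafka-based worker that consumes task events and publishes results using the plain Java kafka-clients API; the topic names agent.tasks and agent.results and the handle() step are hypothetical.

```java
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

import java.time.Duration;
import java.util.List;
import java.util.Properties;

// Minimal worker in an orchestrator-worker setup: an orchestrator publishes task events to
// agent.tasks, a pool of workers shares one consumer group so tasks are load-balanced, and
// each worker emits its output to agent.results for downstream agents or services to consume.
public class AgentWorker {
    public static void main(String[] args) {
        Properties consumerProps = new Properties();
        consumerProps.put("bootstrap.servers", "localhost:9092");
        consumerProps.put("group.id", "agent-workers");
        consumerProps.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        consumerProps.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        Properties producerProps = new Properties();
        producerProps.put("bootstrap.servers", "localhost:9092");
        producerProps.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        producerProps.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaConsumer<String, String> tasks = new KafkaConsumer<>(consumerProps);
             KafkaProducer<String, String> results = new KafkaProducer<>(producerProps)) {
            tasks.subscribe(List.of("agent.tasks"));           // hypothetical task topic
            while (true) {
                ConsumerRecords<String, String> batch = tasks.poll(Duration.ofMillis(500));
                batch.forEach(task -> {
                    String output = handle(task.value());      // placeholder for the agent / tool call
                    results.send(new ProducerRecord<>("agent.results", task.key(), output));
                });
            }
        }
    }

    private static String handle(String task) {
        return "processed: " + task;                           // real agent logic would go here
    }
}
```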

Presenters

Sean Falconer, Andrew Sellers

Breakout Session
May 21

Observability Made Easy: Unlocking Kafka Client Insights with KIP-714

Kafka is the backbone of modern data streaming architectures, but understanding what’s happening inside your clients has long been a challenge. KIP-714 changes the game by introducing a standardized and extensible way to expose client metrics, making observability accessible to everyone—not just Kafka experts. In this talk, we’ll explore why KIP-714 is a must-have for non-trivial systems, how it seamlessly integrates with popular observability stacks like OpenTelemetry, and what it means for debugging, performance tuning, and SLA monitoring. With real-world examples and a live demo, you’ll see how easy it is to connect Kafka clients to your telemetry and logging pipelines, unlocking deep insights with minimal effort. Whether you’re an engineer, SRE, or architect, you’ll walk away with practical knowledge on leveraging KIP-714 to make your Kafka-powered systems more transparent, resilient, and debuggable. No prior Kafka internals knowledge required—just a desire to see your data streams with clarity!
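For orientation only (not the speaker's demo), the client side of KIP-714 boils down to running a recent Java client with metrics push enabled; the sketch below assumes a Kafka 3.7+ client and a broker that already has a client-metrics subscription and a telemetry-capable metrics reporter configured.

```java
import org.apache.kafka.clients.consumer.KafkaConsumer;

import java.util.List;
import java.util.Properties;

// Client side of KIP-714: with a recent Java client, metrics push is controlled by
// enable.metrics.push (true by default). The broker decides which metrics to collect and
// how often via a client-metrics subscription; this sketch only shows the client setup.
public class TelemetryEnabledConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "orders-service");               // hypothetical consumer group
        props.put("client.id", "orders-service-1");            // shows up in the pushed telemetry
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("enable.metrics.push", "true");              // explicit here, though true is the default

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("orders"));              // hypothetical topic
            // ... normal poll loop; client metrics are pushed to the broker in the background
        }
    }
}
```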

Presenters

Florent Ramiere

Breakout Session
May 21

Before and After: Transforming Wix’s Online Feature Store with Apache Flink

At Wix, our Feature Store processes billions of events every day to power data-driven experiences, from real-time personalization to machine learning model inference. Our initial, Apache Storm–based design struggled under massive event volumes, resulting in significant data loss and complex maintenance challenges that limited our ability to scale. In this session, we'll share how we re-architected our online feature store with Apache Flink. You'll learn about the limitations of our previous design, the challenges we faced, and the principles that guided our shift to a high-performance online feature store. We'll illustrate how we combined Apache Spark, Apache Kafka, Aerospike, and Apache Flink to achieve high-throughput, low-latency feature computations and seamless real-time updates to over 2,500 features, without data loss. Expect a direct, architecture-focused session where we’ll compare our old and new designs, sharing the lessons learned along the way, without the philosophical debates.
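Purely as an illustration of the Flink side of such a pipeline (not Wix's implementation), the sketch below reads events from a hypothetical Kafka topic, keys them by user, and maintains a running count as a stand-in feature, with a print sink in place of a real feature store such as Aerospike.

```java
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

// Illustrative Flink job: consume raw events from Kafka, key by user id, and keep a running
// event count per user as a simple "feature". A real feature store job would write the
// aggregates to a low-latency store such as Aerospike instead of printing them.
public class FeatureCountJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        KafkaSource<String> source = KafkaSource.<String>builder()
                .setBootstrapServers("localhost:9092")
                .setTopics("user-events")                       // hypothetical topic
                .setGroupId("feature-store")
                .setStartingOffsets(OffsetsInitializer.latest())
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();

        env.fromSource(source, WatermarkStrategy.noWatermarks(), "user-events")
                // assume events are "userId,eventType" strings for brevity
                .map(line -> Tuple2.of(line.split(",")[0], 1L))
                .returns(Types.TUPLE(Types.STRING, Types.LONG))
                .keyBy(t -> t.f0)
                .sum(1)                                         // running per-user event count
                .print();                                       // replace with a feature-store sink

        env.execute("feature-count");
    }
}
```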

Presenters

Tal Sheldon, Guy Levinhr

Breakout Session
May 21

From days to seconds: adidas' journey to scalable Kafka self-service

This is the story of a team on the verge of becoming a victim of its own success: massive adoption of a technology, combined with the challenge of maintaining decent service quality while keeping the infrastructure stable and reliable. Implementing multi-tenancy in Kafka is not too complex when the number of use cases sharing the cluster is low. A central team can operate the infrastructure, taking care of the heavy lifting and creating the required assets on demand. This holds until adoption starts growing and the solution becomes a problem: you are a bottleneck, and every service request piles up until an agent can resolve it, increasing resolution times and frustration at the same pace. The number of mistakes made when everything is done by hand is also very high, causing toil, unexpected side effects, and operational complexity. In this talk, we'll explain how we reversed this trend by implementing a non-opinionated, vendor-agnostic self-service solution, fully delegating responsibility for maintaining assets (topics, permissions, schemas, connectors) to our stakeholders and reducing resolution times for these activities by several orders of magnitude, from days to seconds, all while keeping the balance between governance and autonomy. We'll also explain how we implemented a standards-based documentation model using AsyncAPI specs, enabling data discovery and reusability and reducing duplication. The main takeaways of the talk will be:
* Technical architecture, architectural decisions, and trade-offs
* Operational model of the solution
* DSL specification
* Rollout strategy to reach the Globally Available state
* SLAs and adoption KPIs
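As a rough sketch of the kind of automation such a self-service layer performs under the hood (not adidas' implementation), the snippet below uses the Kafka AdminClient to create a requested topic and grant the requesting team read access; the topic name, principal, and sizing values are hypothetical.

```java
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.common.acl.AccessControlEntry;
import org.apache.kafka.common.acl.AclBinding;
import org.apache.kafka.common.acl.AclOperation;
import org.apache.kafka.common.acl.AclPermissionType;
import org.apache.kafka.common.resource.PatternType;
import org.apache.kafka.common.resource.ResourcePattern;
import org.apache.kafka.common.resource.ResourceType;

import java.util.List;
import java.util.Properties;

// Sketch of what a self-service layer does when a team requests a topic plus read access:
// create the topic and the matching ACL through the AdminClient, driven by a declarative request.
public class TopicProvisioner {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");

        try (Admin admin = Admin.create(props)) {
            // 1. Create the requested topic (name, partitions, replication factor come from the request).
            admin.createTopics(List.of(new NewTopic("orders.events", 6, (short) 3))).all().get();

            // 2. Grant the requesting team's principal read access to it.
            AclBinding readAcl = new AclBinding(
                    new ResourcePattern(ResourceType.TOPIC, "orders.events", PatternType.LITERAL),
                    new AccessControlEntry("User:orders-team", "*",
                            AclOperation.READ, AclPermissionType.ALLOW));
            admin.createAcls(List.of(readAcl)).all().get();
        }
    }
}
```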

Presenters

Guillermo Lagunas, Jose Manuel Cristobal

Breakout Session
May 21

Ensuring Client Continuity in Kafka: High Availability in Confluent Kafka

Managing large-scale Kafka clusters is both a technical challenge and an art. At Trendyol, our Data Streaming team operates Kafka as the backbone of a vast event-driven ecosystem, ensuring stability and seamless client experiences. However, we faced recurring issues during broker restarts—applications experienced connectivity errors due to misconfigured topics and improper bootstrap server configurations. To address this, we leveraged Confluent Stretch Kafka across multiple data centers, enabling automatic leader elections without service disruptions. Additionally, we enforced topic creation and alter policies and built a custom Prometheus exporter to detect misconfigured topics in real time, allowing us to notify owners and take corrective actions proactively. Through rigorous alerting mechanisms and enforcement via our Internal Development Platform (IDP), we have successfully eliminated disruptions during broker restarts, enabling smooth cluster upgrades and chaos testing. This session will provide practical insights into architecting resilient Kafka deployments, enforcing best practices, and ensuring high availability in a production environment handling thousands of clients. Attendees will learn:
- How multi-DC Kafka clusters ensure client continuity
- The impact of misconfigured replication factors and how to prevent them
- How real-time monitoring and alerts reduce operational risks
- Practical strategies to enforce resilient topic configurations
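To make the misconfigured-topic detection idea concrete (an illustrative sketch, not Trendyol's exporter), the snippet below uses the Kafka AdminClient to flag topics whose replication factor falls below an assumed policy threshold; a production exporter would publish the result as a Prometheus metric instead of printing it.

```java
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.TopicDescription;

import java.util.Map;
import java.util.Properties;
import java.util.Set;

// Illustrative check: flag topics whose replication factor is below a minimum policy value.
// A real exporter would expose the count as a Prometheus gauge and alert owners on it.
public class TopicConfigAudit {
    private static final int MIN_REPLICATION_FACTOR = 3;   // assumed policy threshold

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");

        try (Admin admin = Admin.create(props)) {
            Set<String> topicNames = admin.listTopics().names().get();
            Map<String, TopicDescription> topics =
                    admin.describeTopics(topicNames).allTopicNames().get();

            topics.forEach((name, description) -> {
                int replicationFactor = description.partitions().get(0).replicas().size();
                if (replicationFactor < MIN_REPLICATION_FACTOR) {
                    System.out.printf("misconfigured topic %s: replication factor %d < %d%n",
                            name, replicationFactor, MIN_REPLICATION_FACTOR);
                }
            });
        }
    }
}
```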

Presenters

Yalın Doğu Şahin, Mehmetcan Güleşçi

Lightning Talk
May 21

Simplifying Real-Time Vector Store Ingestion with Apache Flink

Retrieval-Augmented Generation (RAG) has become a foundational paradigm that augments the capabilities of language models—small or large—by attaching information stored in vector databases to provide grounding data. While the concept is straightforward, maintaining up-to-date embeddings as data constantly evolves across various source systems remains a persistent challenge. This lightning talk explores how to build a real-time vector ingestion pipeline on top of Apache Flink and its extensive connector ecosystem to seamlessly keep vector stores fresh at all times. To eliminate the need for custom code while still preserving a reasonable level of configurability, a handful of composable user-defined functions (UDFs) are discussed that address loading, parsing, chunking, and embedding of data directly from within Flink's Table API or Flink SQL jobs. Easy-to-follow examples demonstrate how the discussed approach significantly lowers the entry barrier for RAG adoption, ensuring that retrieval remains consistent with your latest knowledge.
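As a small, hypothetical example of one such composable function (not the speaker's actual UDFs), the sketch below registers a scalar function that splits a document into fixed-size character chunks so each chunk can later be embedded and written to a vector store.

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.functions.ScalarFunction;

import java.util.ArrayList;
import java.util.List;

// A composable chunking UDF for Flink SQL / Table API jobs: splits a document into
// fixed-size character chunks. Parsing and embedding would be separate UDFs in the same spirit.
public class ChunkingExample {

    public static class Chunk extends ScalarFunction {
        public String[] eval(String document, Integer chunkSize) {
            if (document == null || chunkSize == null || chunkSize <= 0) {
                return new String[0];
            }
            List<String> chunks = new ArrayList<>();
            for (int start = 0; start < document.length(); start += chunkSize) {
                chunks.add(document.substring(start, Math.min(start + chunkSize, document.length())));
            }
            return chunks.toArray(new String[0]);
        }
    }

    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.newInstance().inStreamingMode().build());
        tEnv.createTemporarySystemFunction("CHUNK", Chunk.class);

        // Example usage from SQL (the docs table and column names are hypothetical):
        // SELECT doc_id, CHUNK(body, 512) AS chunks FROM docs;
    }
}
```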

Presenters

Hans-Peter Grahsl

Lightning Talk
May 21