Current Bengaluru 2025

Session Archive

Check out our session archive to catch up on anything you missed or rewatch your favorites to make sure you hear all of the industry-changing insights from the best minds in data streaming.


Transforming APIs into an event source with Kafka Streams

REST-based request-response APIs are the lifeblood of most architectures. API transactions, specifically the CUDs (creates, updates, deletes), are great sources of events and change data; the same is true of various RPC-style APIs. However, because of the client-server nature of these APIs, such events must be published by the service itself, using an "outbox" mechanism, and subsequently disseminated to a pub-sub event broker, most often Apache Kafka. This involves code changes to the application and, in addition, a more expansive solution like CDC (change data capture) to collect those changes from the outbox table. In this session, we will present an alternative pattern that leverages standard API proxy/gateway solutions to process and push events to Kafka in real time, as they occur in the API data plane. We will address the following important questions during this Show Me How session:

1. How to source event data from APIs (and API gateways) such as Kong and Istio
2. How to synthesize event data into the desired form using Kafka Streams
3. How to source change feeds against aggregates using Kafka Streams state stores
4. Considerations and trade-offs for this architecture
5. Key use cases (hint: event sourcing, data exchange, audit trails, and more!)

This is a potential no-code alternative to CDC, with added benefits such as broad access to domain- and context-specific request-response data, as opposed to having to process multiple row-level changes. It also enables treating eventing as a cross-cutting concern associated with API management and operations. Attendees can expect to take away a new and interesting approach to a progressive, low-friction transition to an event-driven architecture.
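For a rough sense of what items 2 and 3 involve, here is a minimal Kafka Streams sketch (illustrative only, not taken from the session): it consumes gateway-emitted API events from an assumed `api-events` topic keyed by resource ID, folds them into a per-resource aggregate backed by a state store, and publishes the table's change stream as a change feed. The topic names, string payloads, and trivial merge logic are all placeholders.

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.kstream.Produced;

public class ApiEventAggregator {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "api-event-aggregator"); // assumed app id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");    // assumed broker

        StreamsBuilder builder = new StreamsBuilder();

        // Assumed input: the API gateway publishes one record per create/update/delete,
        // keyed by resource ID, with the request/response payload as the value.
        KTable<String, String> aggregates = builder
                .stream("api-events", Consumed.with(Serdes.String(), Serdes.String()))
                .groupByKey(Grouped.with(Serdes.String(), Serdes.String()))
                // Fold each resource's CUD events into an aggregate held in a local state store.
                .aggregate(
                        () -> "",                                // initial aggregate
                        (resourceId, event, aggregate) -> event, // placeholder merge: keep latest event
                        Materialized.with(Serdes.String(), Serdes.String()));

        // The table's change stream is the "change feed against aggregates".
        aggregates.toStream()
                  .to("resource-change-feed", Produced.with(Serdes.String(), Serdes.String()));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```

In a real deployment the aggregator would merge the request/response payload into a domain aggregate rather than simply keeping the latest event.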

Presenters

Avinash Upadhyaya

Breakout Session

Serverless Streaming Pipelines: Integrating Delta Live Tables, Kafka, and Serverless Architectures

Real-time data processing is at the heart of modern applications, demanding scalability, reliability, and efficiency. This talk showcases how Delta Live Tables (DLT), Apache Kafka, and serverless architectures converge to create dynamic, end-to-end streaming pipelines without the need for version management. We’ll delve into:

- Advanced Use Cases: Learn how industries are leveraging DLT and Kafka to build resilient, real-time streaming workflows for applications.
- Versionless Pipelines: Understand how Databricks' automatic runtime upgrades for Delta Live Tables simplify maintenance while supporting platform enhancements.
- Optimization Strategies: Explore cost-effective, scalable design principles that serverless architectures bring to streaming workflows.
- Technical Deep-Dive: Discover how DLT handles checkpointing and data quality enforcement in streaming workflows, ensuring data reliability and fault tolerance.
- Interactive Demo: Witness a hands-on deployment of a zero-maintenance, serverless pipeline integrating DLT and Kafka for real-world data challenges.

Attendees will be equipped with actionable insights and best practices to modernize their streaming workflows, reduce operational complexity, and unlock the full potential of real-time analytics.
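DLT pipelines themselves are authored in Python or SQL on Databricks, so purely as an illustration of the checkpointed Kafka-to-Delta ingestion that DLT automates, here is a hand-rolled Spark Structured Streaming sketch in Java; the broker address, topic, and storage paths are assumptions, not anything shown in the session.

```java
import java.util.concurrent.TimeoutException;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.streaming.StreamingQueryException;

public class KafkaToDeltaSketch {
    public static void main(String[] args) throws TimeoutException, StreamingQueryException {
        SparkSession spark = SparkSession.builder()
                .appName("kafka-to-delta-sketch") // assumed app name
                .getOrCreate();

        // Read the Kafka topic as a streaming Dataset (topic and brokers are assumptions).
        Dataset<Row> events = spark.readStream()
                .format("kafka")
                .option("kafka.bootstrap.servers", "localhost:9092")
                .option("subscribe", "events")
                .load()
                .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)", "timestamp");

        // Write to a Delta table; the checkpoint directory is what gives the pipeline
        // restartable, exactly-once progress tracking.
        events.writeStream()
                .format("delta")
                .option("checkpointLocation", "/tmp/checkpoints/events") // assumed path
                .start("/tmp/delta/events")                              // assumed table path
                .awaitTermination();
    }
}
```

The checkpoint location, restart behaviour, and runtime versioning visible here are exactly the pieces a versionless DLT pipeline manages on your behalf.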

Presenters

Vikas Reddy Aravabhumi, Shasidhar Eranti

Breakout Session

Building a Scalable Real-Time ML Platform for competitive gaming, harnessing the power of Kafka & Flink

In the dynamic world of competitive gaming, delivering personalised user experiences and driving meaningful business outcomes requires an advanced machine learning (ML) platform. Join this session to explore the architecture and implementation of a scalable ML ecosystem that enables real-time feature engineering, inference, and personalisation at scale, powered by Flink and Kafka, with BigTable and Redis acting as feature storage. We will dive into the ML platform architecture around the following key points:

- Real-time personalisation: Real-time decisions and recommendations that influence users, e.g. adapting game lobbies, missions, and rewards to user preferences with a turnaround time of a few seconds or less.
- Feature store: The heart of the ML platform, automating real-time feature preparation and processing pipelines. It also acts as a low-latency, high-frequency store for real-time raw data, aggregate calculations, and transformations used to build and serve features.
- Observability: System observability, tracking and monitoring of ML experiments, and data quality assessment (DQA).
- Business impact: Personalised user journeys (new and returning), game and lobby recommendations, missions, and rewards (offers, coupons) boost user retention and engagement. Skill benchmarking gives users a fair opportunity to play against a comparable opponent, increasing engagement and time spent on the platform, and it helps map the user's graduation journey through the platform, opening opportunities to cross-sell and up-sell.
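To give a concrete flavour of real-time feature engineering with Kafka and Flink, the sketch below (illustrative only, not the presenters' code) computes a simple per-user feature, the number of game events in the last 60 seconds, from an assumed `game-events` topic; in the platform described above such values would be written to Redis/BigTable rather than printed.

```java
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class GameFeaturePipeline {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Assumed Kafka topic carrying one record per game event, value = "userId,eventType".
        KafkaSource<String> source = KafkaSource.<String>builder()
                .setBootstrapServers("localhost:9092")
                .setTopics("game-events")
                .setGroupId("feature-pipeline")
                .setStartingOffsets(OffsetsInitializer.latest())
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();

        DataStream<String> events =
                env.fromSource(source, WatermarkStrategy.noWatermarks(), "game-events");

        // Per-user feature: number of game events in the last 60 seconds.
        events.map(line -> Tuple2.of(line.split(",")[0], 1L))
              .returns(Types.TUPLE(Types.STRING, Types.LONG))
              .keyBy(t -> t.f0)
              .window(TumblingProcessingTimeWindows.of(Time.seconds(60)))
              .reduce((a, b) -> Tuple2.of(a.f0, a.f1 + b.f1))
              // In the real platform this would be written to the feature store; here we print.
              .print();

        env.execute("game-feature-pipeline");
    }
}
```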

Presenters

Mahesh Jadhav, Lakhan Marda

Breakout Session

Inside Uber's Large-Scale Real-Time Analytics Platform

At Uber, the EVA platform drives substantial advancements in our real-time analytics capabilities, empowering business use cases across marketing, engineering, data science, and operations, as well as internal use cases around metrics, logs, and query analytics. The platform features Apache Kafka for real-time data transport, Apache Flink for stream processing, Spark for batch processing, HDFS for deep storage, and Apache Pinot as the core analytics engine. It also includes the internal service Neutrion for Presto-like queries on Pinot and a metadata service for dataset management. In this talk, we cover the mature architecture of the real-time analytics ecosystem powering Uber’s use cases, which serves up to tens of thousands of queries/sec and several million writes/sec and hosts tens of petabytes of Pinot datasets. We also cover:

1. Real-time processing and ingestion using AthenaX (SQL-based transformation on Flink), Flink, and Kafka to provide analytics on real-time data.
2. Real-time analytics powered by Apache Pinot to serve queries at high QPS with sub-second latency.
3. Disaster resiliency and disaster recovery strategies for Apache Pinot datasets.

The talk then walks through two use cases that solve real-time analytics challenges for business and observability:

1. A business use case (rides/eats related)
2. An observability use case (metrics/logs related)

The audience will gain practical insights into designing real-time analytics systems centered around Apache Pinot and effectively leveraging complementary real-time technologies to build robust, high-performing solutions.
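To make the serving side tangible, here is a minimal example of querying Pinot from Java with its client library; the broker address, table, and column names are assumptions, and the query is only representative of the kind of aggregation a dashboard might issue at high QPS.

```java
import org.apache.pinot.client.Connection;
import org.apache.pinot.client.ConnectionFactory;
import org.apache.pinot.client.ResultSet;
import org.apache.pinot.client.ResultSetGroup;

public class PinotQueryExample {
    public static void main(String[] args) {
        // Connect to a Pinot broker (host/port is an assumption).
        Connection connection = ConnectionFactory.fromHostList("localhost:8099");

        // A typical slice-and-dice aggregation; table and columns are hypothetical.
        ResultSetGroup results = connection.execute(
                "SELECT city, COUNT(*) AS trips "
              + "FROM rides "
              + "GROUP BY city ORDER BY trips DESC LIMIT 10");

        ResultSet rows = results.getResultSet(0);
        for (int i = 0; i < rows.getRowCount(); i++) {
            System.out.println(rows.getString(i, 0) + " -> " + rows.getLong(i, 1));
        }
    }
}
```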

Presenters

Rohit Yadav, Satish Duggana

Breakout Session

One Client to Rule Them All

Let’s be honest: who wants to have more than one client to connect to a data system? Now consider Apache Kafka. It ships with four different Java clients: producer, consumer, admin, and streams. Want to create a topic in a producer application? Use the admin client and the producer client. Want to produce and consume? Either use the producer and the consumer, or use Kafka Streams. So how did we get here? And more importantly: how can we simplify it? Are incremental improvements enough? In this talk, we’ll propose a radical approach: a single unified Java client built from scratch for producing, consuming, processing, and administration tasks. We take you on a brainstorming session about what we can and cannot do, and what we want to achieve. How can we make simple things easy and difficult things possible? What does a modern Java API look like, using the standard library, a reasonable threading model, lambdas, and futures for async calls? We think it's high time that we take another look at the Java clients and build a client ready for the next decade. Come and join the conversation about the future of Kafka clients.
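The friction the talk starts from is easy to show: even the simple "create a topic, then produce to it" flow takes two separate clients today. A minimal sketch of the status quo, with broker address, topic name, and payload as assumptions:

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class TwoClientsToday {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker

        // Client #1: the admin client, just to create the topic.
        try (Admin admin = Admin.create(props)) {
            admin.createTopics(Collections.singleton(new NewTopic("orders", 3, (short) 1)))
                 .all()
                 .get();
        }

        // Client #2: the producer client, to actually write to it.
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());
        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("orders", "order-1", "{\"amount\": 42}")).get();
        }
    }
}
```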

Presenters

Matthias Sax

Breakout Session

Real-Time Social Security: Leveraging Apache Kafka and Flink for Efficient Benefits Disbursements

Imagine a social security system that's faster, fairer, and more compassionate. A system where benefits are disbursed in real time, fraud is detected before it happens, and eligibility verification is accurate and efficient. This isn't just a vision: it's a reality made possible by harnessing the power of real-time data streaming technologies like Apache Kafka, Apache Flink, and cloud-based data platforms. This talk will explore the transformative potential of real-time data streaming in social security. Discover how this technology can:

- Accelerate benefit disbursement using Apache Kafka's event-driven architecture, ensuring timely support for those in need
- Proactively detect and prevent fraud using Apache Flink's real-time processing and machine learning capabilities, protecting the integrity of the system
- Enhance eligibility verification using cloud-based data platforms, reducing errors and overpayments
- Personalize benefits using real-time data analytics, tailoring support to individual needs

Through real-world examples and case studies, we'll demonstrate the power of real-time data streaming to create a more equitable, efficient, and effective social security system. Don't miss this opportunity to revolutionize the future of social security.
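As one hypothetical illustration of the fraud-detection point (not an example from the talk), the Flink sketch below flags a beneficiary who files three disbursement requests in quick succession; a real pipeline would read claims from Kafka with event-time windows and richer features rather than a hard-coded sample.

```java
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class DuplicateClaimDetector {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Stand-in for a Kafka topic of disbursement requests; each element is a beneficiary ID.
        DataStream<String> claims = env.fromElements("B-100", "B-101", "B-100", "B-100");

        claims.map(id -> Tuple2.of(id, 1L))
              .returns(Types.TUPLE(Types.STRING, Types.LONG))
              .keyBy(t -> t.f0)
              // Fire as soon as the same beneficiary has submitted three requests;
              // the emitted (beneficiaryId, count) record is the fraud flag.
              .countWindow(3)
              .reduce((a, b) -> Tuple2.of(a.f0, a.f1 + b.f1))
              .print(); // in practice this would feed an alerting or case-management sink

        env.execute("duplicate-claim-detector");
    }
}
```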

Presenters

Venkata Kalyana Chakravarthy Bitragunta, Varsha Sharma

Breakout Session