Current Bengaluru 2025
Session Archive
Check out our session archive to catch up on anything you missed or rewatch your favorites to make sure you hear all of the industry-changing insights from the best minds in data streaming.


Building Knowledge Graph RAG using Neo4j
Knowledge graphs are used in development to structure complex data relationships, drive intelligent search functionality, and build powerful AI applications that can reason over different data types.
• Knowledge graphs can connect data from both structured and unstructured sources (databases, documents, etc.), providing an intuitive and flexible way to model complex, real-world scenarios.
• Unlike tables or simple lists, knowledge graphs can capture the meaning and context behind the data, allowing you to uncover insights and connections that would be difficult to find with conventional databases.
• This rich, structured context is ideal for improving the output of large language models (LLMs), because you can build more relevant context for the model than with semantic search alone.
Jayita Bhattacharyya
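As a rough sketch of the graph-RAG idea summarized above, the snippet below uses the official Neo4j Python driver to pull a one-hop neighborhood around an entity with Cypher and turn it into context for an LLM prompt. The connection details, graph schema, and the ask_llm placeholder are illustrative assumptions, not material from the session.

```python
from neo4j import GraphDatabase  # official Neo4j Python driver

# Connection details and graph schema below are illustrative assumptions.
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

def graph_context(entity_name: str) -> str:
    """Fetch a one-hop neighborhood around an entity and render it as plain text."""
    query = (
        "MATCH (e {name: $name})-[r]-(n) "
        "RETURN e.name AS source, type(r) AS rel, n.name AS target LIMIT 25"
    )
    with driver.session() as session:
        rows = session.run(query, name=entity_name)
        return "\n".join(f"{r['source']} -{r['rel']}- {r['target']}" for r in rows)

def ask_llm(prompt: str) -> str:
    # Placeholder: swap in a real LLM client call here.
    return f"(LLM answer grounded in {len(prompt)} characters of graph context)"

context = graph_context("Acme Corp")
print(ask_llm(f"Using only this graph context:\n{context}\n\nQuestion: Who works for Acme Corp?"))
```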


Build ‘India Scale’ (~1 billion events/day) E-commerce with Kafka - The Shiprocket Story
In the fast-evolving e-commerce space, managing and processing vast amounts of data is paramount to delivering superior experiences. Join this session to understand how Shiprocket solves this business challenge and handles over 1 billion events daily, with Apache Kafka serving as the backbone of our architecture and enabling seamless real-time data streaming, processing, and event decoupling. In this session, we dissect and discuss the following:
- How a multi-tenant architecture, supporting more than 1,400 databases and 50,000 tables, addresses the complexity of decentralized data management and real-time querying.
- How implementing effective Change Data Capture (CDC) processes has been key to achieving real-time insights across this distributed landscape.
- How we enable seamless real-time microservices communication, buyer communications, and third-party webhooks, managing ~50 million interactions daily. This capability ensures a smooth e-commerce experience for all stakeholders.
Shiprocket has built deep innovations on this platform, which streamlines commerce and empowers merchants to thrive in the dynamic e-commerce ecosystem, both domestically and internationally. Attendees will leave the room with practical insights into scaling high-volume, multi-tenant systems using Apache Kafka. They will also learn how Kafka drives Shiprocket’s data platform and how our hybrid architecture, combining Confluent-managed Kafka with in-house Kubernetes deployments (powered by Strimzi), strikes the perfect balance between cost efficiency and operational control. The session will also highlight key challenges faced while scaling Kafka and the architectural optimizations that significantly enhanced its performance.
Shashi Kant Singh, Kavya Ramaiah, Ratnesh Kumar, Pratibha Singh
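As a rough sketch of the CDC consumption the abstract mentions, the snippet below reads change events from a Kafka topic with confluent-kafka-python and routes them per tenant. The topic, consumer group, and Debezium-style op/after envelope are illustrative assumptions about the setup, not details from the session.

```python
import json
from confluent_kafka import Consumer  # assumes confluent-kafka-python is installed

# Topic, group id, and envelope fields below are illustrative assumptions.
consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "cdc-order-sync",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["cdc.orders"])

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None or msg.error():
            continue
        event = json.loads(msg.value())
        # A Debezium-style envelope carries the operation type and row images.
        op, row = event.get("op"), event.get("after") or {}
        tenant_id = row.get("tenant_id")
        if op in ("c", "u"):      # create / update
            print(f"upsert order {row.get('order_id')} for tenant {tenant_id}")
        elif op == "d":           # delete
            print(f"delete order for tenant {tenant_id}")
finally:
    consumer.close()
```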


Ins and Outs of the Outbox Pattern
The outbox pattern is a common solution for implementing data flows between microservices. By channeling messages through an outbox table, it enables services to update their own local datastore and, at the same time, send out notifications to other services via data streaming platforms such as Apache Kafka, in a reliable and consistent way. However, as with everything in IT, there’s no free lunch: how do you handle backfills of outbox events, and how do you ensure idempotency for event consumers? Doesn’t the pattern cause the database to become a bottleneck? And what about alternatives such as “Listen-to-Yourself”, or the upcoming Kafka support for 2-phase commit transactions (KIP-939)? It’s time to take another look at the outbox pattern! In this session I’ll start by bringing you up to speed on what the outbox pattern *is*, and then go on to discuss more details such as:
- Implementing the pattern safely and efficiently
- Its semantics, pros, and cons
- Dealing with backfills
- Potential alternatives to the outbox pattern and the trade-offs they make
Gunnar Morling
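To make the mechanics concrete, here is a minimal sketch of the pattern's write side and relay loop. SQLite stands in for the service's local datastore and all table, column, and event names are illustrative; in production the relay would typically be a CDC connector or a poller producing to Kafka rather than a print statement.

```python
import json
import sqlite3
import uuid

# SQLite stands in for the service's local datastore; table names are illustrative.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT);
    CREATE TABLE outbox (id TEXT PRIMARY KEY, aggregate_id TEXT,
                         event_type TEXT, payload TEXT, published INTEGER DEFAULT 0);
""")

def place_order(order_id: str) -> None:
    """Update local state and record the outgoing event atomically in one transaction."""
    with db:  # a single local transaction covers both writes
        db.execute("INSERT INTO orders (id, status) VALUES (?, ?)", (order_id, "PLACED"))
        db.execute(
            "INSERT INTO outbox (id, aggregate_id, event_type, payload) VALUES (?, ?, ?, ?)",
            (str(uuid.uuid4()), order_id, "OrderPlaced", json.dumps({"order_id": order_id})),
        )

def relay_once() -> None:
    """Poll unpublished outbox rows; a real relay would produce them to Kafka."""
    rows = db.execute(
        "SELECT id, event_type, payload FROM outbox WHERE published = 0").fetchall()
    for event_id, event_type, payload in rows:
        print(f"publish {event_type}: {payload}")   # producer.produce(...) in real life
        db.execute("UPDATE outbox SET published = 1 WHERE id = ?", (event_id,))
    db.commit()

place_order("order-42")
relay_once()
```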


Introduction to Stateful Stream Processing with Apache Flink
Stream processing has evolved quickly in a short time: only a few years ago, it mostly meant simple real-time aggregations with limited throughput and consistency. Today, many stream processing applications implement sophisticated business logic with strict correctness guarantees, high performance, and low latency, and maintain terabytes of state without external databases. Stream processing frameworks also abstract away many of the low-level details, such as routing data streams, managing concurrent execution, and handling various failure scenarios while ensuring correctness.
Viktor Gamov
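The abstract above stays conceptual, so here is a minimal sketch of what managed keyed state looks like in PyFlink (assuming the apache-flink package): a per-key counter kept in Flink-managed ValueState, with checkpointing enabled so the state survives failures. The event shape and names are illustrative, not taken from the session.

```python
from pyflink.common import Types
from pyflink.datastream import StreamExecutionEnvironment
from pyflink.datastream.functions import KeyedProcessFunction, RuntimeContext
from pyflink.datastream.state import ValueStateDescriptor

class CountPerKey(KeyedProcessFunction):
    """Keeps one counter per key in Flink-managed, fault-tolerant state."""

    def open(self, runtime_context: RuntimeContext):
        self.count = runtime_context.get_state(ValueStateDescriptor("count", Types.LONG()))

    def process_element(self, value, ctx):
        current = (self.count.value() or 0) + 1
        self.count.update(current)
        yield value[0], current  # (key, running count)

env = StreamExecutionEnvironment.get_execution_environment()
env.set_parallelism(1)
env.enable_checkpointing(10_000)  # snapshot state every 10s for recovery

events = env.from_collection(
    [("user-1", 1), ("user-2", 1), ("user-1", 1)],
    type_info=Types.TUPLE([Types.STRING(), Types.INT()]),
)
events.key_by(lambda e: e[0]) \
    .process(CountPerKey(), output_type=Types.TUPLE([Types.STRING(), Types.LONG()])) \
    .print()

env.execute("stateful-count-sketch")
```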


Reducing Kafka Producer Cross-Network Costs with RackAwareStickyPartitioner
Dream11, the world’s largest fantasy sports platform, manages unparalleled scale, with RPM surpassing 300 million during flagship events like IPL 2024. With Kafka producers forming the backbone of real-time data pipelines, Dream11 faced a significant challenge: soaring cross-availability-zone (AZ) network costs due to the indiscriminate partitioning strategy of regular producer partitioners. To address this, Dream11 engineering developed the RackAwareStickyPartitioner, a custom solution for Kafka producers that achieved a 70% reduction in cross-AZ network costs. By intelligently routing producer batches to Kafka partitions within the same AZ, this innovation minimized cross-AZ traffic while preserving high throughput. A 10-day controlled experiment demonstrated a dramatic cost reduction in “DataTransfer-Regional-Bytes” by over 30%. This optimization is tailored for high-throughput scenarios, with careful consideration required for low-volume applications to avoid partition skew. Join this session to explore how Dream11 engineered a cost-efficient solution for Kafka producers at scale, sharing insights on architecture, challenges, and real-world impact.
Sunaim Nazar
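The production RackAwareStickyPartitioner plugs into the Kafka producer's partitioner interface; the Python sketch below only illustrates the core selection idea under assumed inputs: restrict candidate partitions to those whose leader broker sits in the producer's own AZ, and stick to one of them until the current batch completes. As the abstract notes, this trades some partition balance for lower cross-AZ traffic, so low-volume topics can end up skewed.

```python
import random

class RackAwareStickyChooser:
    """Illustrative sketch of rack-aware sticky partition selection; not the
    production RackAwareStickyPartitioner, which runs inside the Kafka producer."""

    def __init__(self, partition_leader_racks: dict, local_rack: str):
        # partition -> rack of its leader broker; comes from cluster metadata in real life.
        local = [p for p, rack in partition_leader_racks.items() if rack == local_rack]
        # Fall back to all partitions if none are local, so producing never stalls.
        self.candidates = local or list(partition_leader_racks)
        self.sticky = random.choice(self.candidates)

    def choose(self) -> int:
        """Return the sticky partition; every record in the current batch goes here."""
        return self.sticky

    def on_batch_complete(self) -> None:
        """Rotate to another same-AZ partition once the batch has been sent."""
        self.sticky = random.choice(self.candidates)

# Example: 6 partitions spread across 3 AZs, producer running in ap-south-1a (assumed names).
leaders = {0: "ap-south-1a", 1: "ap-south-1b", 2: "ap-south-1c",
           3: "ap-south-1a", 4: "ap-south-1b", 5: "ap-south-1c"}
chooser = RackAwareStickyChooser(leaders, local_rack="ap-south-1a")
print("sending batch to partition", chooser.choose())   # 0 or 3: both have local leaders
chooser.on_batch_complete()
```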


Empowering AI Agents with Real-Time Insights: From Streams to Smarter Actions
This session covers the architecture of a pipeline that brings together real-time data ingestion, analytics, intelligent retrieval systems, and AI models. See how users can integrate RAG and AI agents to augment real-time decision-making with contextually relevant information, illustrated through real-world use cases that emphasize practical tips for architecting scalable systems that seamlessly blend AI and streaming technologies. Core themes:
- Introduction to real-time analytics with Apache Pinot
- Introduction to RAG and compound AI systems
- Integrating AI agents with streaming, and why real-time systems are necessary
- Real-world use cases such as brand sentiment analysis and travel agent bots
- Demo of the integration: Kafka → Pinot → AI agent
Jayesh Asrani
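As a rough sketch of the Pinot → AI agent hop in that demo flow, the snippet below queries Pinot through the pinotdb DB-API client and folds the result into an LLM prompt. The broker address, table, columns, and the ask_llm placeholder are illustrative assumptions, not material from the session.

```python
from pinotdb import connect  # Python DB-API client for Apache Pinot

# Broker address, table name, and columns below are illustrative assumptions.
conn = connect(host="localhost", port=8099, path="/query/sql", scheme="http")
cursor = conn.cursor()
cursor.execute("""
    SELECT brand, sentiment, COUNT(*) AS mentions
    FROM tweets
    WHERE ts > ago('PT15M')
    GROUP BY brand, sentiment
    ORDER BY mentions DESC
    LIMIT 20
""")
rows = cursor.fetchall()

def ask_llm(prompt: str) -> str:
    # Placeholder: replace with a call to whatever agent framework / LLM client the demo uses.
    return "(placeholder LLM response)"

context = "\n".join(f"{brand}: {sentiment} x{mentions}" for brand, sentiment, mentions in rows)
print(ask_llm(
    "Summarize how brand sentiment shifted in the last 15 minutes and suggest one action:\n"
    + context
))
```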