Explore our lineup

With over 50 sessions, Current 2025 features an exciting catalog of presentations, workshops, and breakout sessions that will transform the way you work.

8:50 AM
Keynote
Now Streaming Live

Stream On: From Bottlenecks to Streamline with Kafka Streams Template

Hadar Federovsky, Akamai / Yulia Antonovsky, Akamai
ACC - Hall 1

Keynote
Mar 19, 2025 10:00
The future starts here! Confluent CEO Jay Kreps and some of the top minds in data took to the keynote stage at Current Bengaluru to demonstrate how Data Streaming Platforms are transforming organizations and powering next generation AI with unified and reliable real-time data. It’s a game-changer for every industry and every data practitioner. Welcome to what’s next.

Lightning Talk
Mar 19, 2025 12:30
With Apache Kafka 4.0 around the corner, Kafka users will have no choice but to migrate ZooKeeper-based clusters to KRaft. In this talk, I will cover how to prepare existing ZooKeeper-based Kafka clusters for the migration to KRaft. We will talk about the considerations before the migration, common mistakes, and how to avoid them. Session overview: - Introduction to KRaft - Migration prep: minimizing the impact of potential downtime - KRaft-specific configs like process.roles, node.id, and controller.quorum.voters - Common mistakes and how to avoid them - Demo. Through this session, attendees will gain the knowledge and tools necessary to navigate this transition effectively, ensuring their Kafka deployments are poised for future growth and innovation.
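
The controller settings named in the overview live in server.properties; below is a minimal sketch, with placeholder addresses and illustrative config values, of how the standard Admin API (Kafka 3.3+) can verify the KRaft metadata quorum during or after a migration:

```java
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.QuorumInfo;

public class QuorumHealthCheck {
    public static void main(String[] args) throws Exception {
        // Example controller settings from server.properties (illustrative values):
        //   process.roles=controller
        //   node.id=3001
        //   controller.quorum.voters=3001@ctrl1:9093,3002@ctrl2:9093,3003@ctrl3:9093
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        try (Admin admin = Admin.create(props)) {
            // Available since Kafka 3.3: inspect the KRaft metadata quorum
            QuorumInfo quorum = admin.describeMetadataQuorum().quorumInfo().get();
            System.out.println("Controller leader id: " + quorum.leaderId());
            quorum.voters().forEach(v -> System.out.println(
                "voter " + v.replicaId() + " logEndOffset=" + v.logEndOffset()));
        }
    }
}
```

Voters that lag the leader's log-end offset during the migration are an early warning of the downtime risks discussed above.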

Lightning Talk
Mar 19, 2025 12:30
Swiggy, India’s leading food delivery platform, processes millions of messages every second to power real-time recommendations, predictions, order tracking, and personalized user experiences. In this session, we’ll explore the challenges Swiggy faced while managing open-source Kafka and how we successfully migrated to Confluent’s managed Kafka cluster, streamlining operations and significantly improving performance. We’ll also dive into the critical role Confluent Kafka plays in our microservices architecture, with a special focus on the complexities of Kafka consumer canary testing. We’ll discuss why this process is complex and how we uniquely solved these challenges to ensure reliable, efficient service delivery. Finally, we’ll demonstrate how Confluent Kafka enables Swiggy to handle millions of messages per second, empowering real-time analytics, predictive models like SLA predictions, and personalized user experiences at scale. This session will provide valuable insights into Kafka’s central role in modern microservices architectures and how Confluent Kafka supports high-performance, scalable, and real-time data pipelines for large-scale applications.

Lightning Talk
Mar 19, 2025 12:30
The growing adoption of Kubernetes and Kafka for distributed systems presents exciting opportunities alongside unique challenges for enhancing the availability and resilience of Kafka deployments. While Kubernetes offers powerful orchestration capabilities, deploying a Kafka cluster within a single Kubernetes cluster can expose organizations to limitations. A Kubernetes cluster outage may render the entire Kafka system unavailable, disrupting applications and clients. To overcome this, many organizations including us are working to achieve scalable, distributed, multi-zone Kafka clusters where the Kafka nodes span across multiple Kubernetes clusters in nearby availability zones. This multi-cluster approach provides several key benefits. It ensures high availability by preventing single-cluster outages, supports migration efforts by allowing Kafka nodes to be deployed across clusters with minimal disruption, and optimizes resource usage by leveraging the combined capacity of multiple Kubernetes environments. However, implementing such deployments introduces significant challenges, including managing increased network complexity and costs, ensuring low-latency connectivity for performance, and maintaining data consistency in latency-sensitive environments. This session explores practical methodologies and principles for deploying Kafka across Kubernetes clusters, focusing on broker and controller distribution, fault tolerance, scalability, cross-cluster communication, and resource synchronization. Attendees will gain insights into challenges associated with distributing Kafka across Kubernetes clusters and explore potential solutions within the Operator framework. Tailored for developers and operators, this talk provides actionable takeaways for enhancing Kafka’s resilience, scalability, and flexibility on Kubernetes, including best practices for resource integration, configuration management, and performance tuning.

Lightning Talk
Mar 19, 2025 12:30
In this session, Team Yubi demonstrates how an intelligent streaming data pipeline leveraging Apache Kafka creates a unified analytical platform to deliver near real-time insights from a centralized Redshift data warehouse. Business operations teams face challenges approving large-ticket trades due to fragmented data across multiple systems managed by different teams. Fetching and reconciling this data often involves writing complex queries—expertise many operations teams lack—leading to delays in due diligence and decision-making. To solve this, we built a robust streaming data pipeline that centralizes disparate data sources into Redshift. The pipeline uses Apache Kafka for streaming, Kubernetes for scalability, dbt for data transformations, and Redshift WLM with data sharing for optimized query execution. Our custom Kafka sink connectors process data efficiently in two modes—snapshot (replicating the source RDS) and CDC (capturing incremental changes)—within a single flush cycle. This approach keeps the warehouse up-to-date, reduces ETL loads, lowers infrastructure costs, and enables quick data refresh cycles. The unified platform also lays the foundation for AI-based Text-to-SQL (TTS) capabilities, allowing teams to generate SQL queries using natural language for ad-hoc requests and reports. By enabling real-time streaming, Team Yubi empowers operations teams to process high-value transactions—disbursing amounts worth hundreds of crores—quickly and efficiently. The ability to reinitiate actions seamlessly in case of failures minimizes operational bottlenecks and ensures smooth transaction workflows, reducing revenue impact. Join us to learn how real-time data streaming transforms operational efficiency and decision-making.
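
The connector itself is not public, but the two flush modes the abstract describes map naturally onto the standard Kafka Connect SinkTask lifecycle. A minimal sketch, assuming a hypothetical "mode" connector config and hypothetical warehouse-load helpers:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.connect.sink.SinkRecord;
import org.apache.kafka.connect.sink.SinkTask;

public class WarehouseSinkTask extends SinkTask {
    private final List<SinkRecord> buffer = new ArrayList<>();
    private boolean cdcMode;

    @Override public String version() { return "0.1.0"; }

    @Override public void start(Map<String, String> props) {
        // "mode" is a hypothetical connector config: "snapshot" or "cdc"
        cdcMode = "cdc".equals(props.getOrDefault("mode", "snapshot"));
    }

    @Override public void put(Collection<SinkRecord> records) {
        buffer.addAll(records); // accumulate until the next flush cycle
    }

    @Override public void flush(Map<TopicPartition, OffsetAndMetadata> offsets) {
        if (buffer.isEmpty()) return;
        if (cdcMode) {
            mergeIncrementalChanges(buffer); // hypothetical: MERGE changed rows
        } else {
            reloadSnapshot(buffer);          // hypothetical: reload the source table
        }
        buffer.clear();
    }

    @Override public void stop() { }

    private void mergeIncrementalChanges(List<SinkRecord> records) { /* warehouse MERGE */ }
    private void reloadSnapshot(List<SinkRecord> records) { /* warehouse COPY */ }
}
```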

Lightning Talk
Mar 19, 2025 12:30
In today’s fast-paced world, where actionable business insights drive competitive advantage, tapping into dynamic real-time streams marks the evolution of data-driven decision-making and revolutionizes business intelligence. Traditional batch-based data pipelines slowed down decision-making, causing delays in business insights and limiting our ability to respond in real time. Join this session to learn how, at Myntra, we revamped our data infrastructure by transforming batch-based pipelines into a robust, real-time streaming architecture, reducing latency from hours to mere minutes. This session will also delve into how we leveraged Kafka, Spark Structured Streaming, and Delta Lake to create a scalable, low-latency ingestion pipeline. By implementing exactly-once semantics and optimizing data flows, we achieved the reliability and scalability needed to power mission-critical use cases. We’ll also explore how this transformation addressed the inherent limitations of traditional batch systems, enabling data freshness, operational agility, and the delivery of actionable near real-time business insights. These advancements have redefined how Myntra supports its dynamic ecosystem, driving unprecedented agility. The audience will gain actionable strategies for building real-time streaming pipelines, overcoming data freshness challenges, and unlocking the potential of near real-time insights to fuel innovation and growth at scale. Key highlights: 1. Kafka-Centric Streaming Architecture: Delve into the architectural design where Kafka powers seamless integration between streaming and batch workflows, efficiently handling millions of events/minute. 2. Data Freshness & Completeness Challenges: Understand how Myntra ensures data freshness and completeness using write-ahead logs and micro-batch freshness propagation. 3. Operational Innovations with Delta and Spark: Explore how Apache Spark enabled efficient real-time ingestion, exactly-once semantics, and fault tolerance in high-throughput environments.
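
A minimal sketch of the kind of Kafka-to-Delta ingestion described here, using Spark Structured Streaming's Java API; broker address, topic, and paths are placeholders, and exactly-once delivery to the sink rests on the checkpoint location:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.streaming.StreamingQuery;

public class KafkaToDelta {
    public static void main(String[] args) throws Exception {
        SparkSession spark = SparkSession.builder()
            .appName("kafka-to-delta").getOrCreate();

        // Continuous read from a Kafka topic (placeholder names)
        Dataset<Row> events = spark.readStream()
            .format("kafka")
            .option("kafka.bootstrap.servers", "broker:9092")
            .option("subscribe", "clickstream")
            .load();

        // The checkpoint gives the Delta sink exactly-once semantics across restarts
        StreamingQuery query = events
            .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")
            .writeStream()
            .format("delta")
            .option("checkpointLocation", "/checkpoints/clickstream")
            .start("/delta/clickstream");

        query.awaitTermination();
    }
}
```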

Breakout Session
Mar 19, 2025 13:00
Event streaming is great but sometimes it’s easier to use a queue, especially when parallel consumption is more important than ordering. Wouldn't it be great if you had the option of consuming your data in Apache Kafka just like a message queue? For workloads where each message is an independent work item, you’d really like to be able to run as many consumers as you need, cooperating to handle the load, and to acknowledge messages one at a time as the work is completed. You might even want to be able to retry specific messages. This is much easier to achieve using a queue rather than a topic with a consumer group. KIP-932 brings queuing semantics to Apache Kafka. It introduces the concept of share groups. Share groups let your applications consume data off regular Kafka topics with per-message acknowledgement and without worrying about balancing the number of partitions and consumers. With this KIP, you can bring your queuing workloads to Apache Kafka. Come and hear about this innovative new feature being added to Apache Kafka 4.0.
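
A sketch of what a share-group consumer could look like, based on the interfaces proposed in KIP-932 for the Kafka 4.0 early access; class, enum, and config names may still change before general availability, and the topic and handler below are hypothetical:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.AcknowledgeType;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaShareConsumer;

public class WorkQueueConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");  // placeholder
        props.put("group.id", "image-resize-workers");     // names the share group
        props.put("key.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaShareConsumer<String, String> consumer = new KafkaShareConsumer<>(props)) {
            consumer.subscribe(List.of("work-items"));
            while (true) {
                for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofSeconds(1))) {
                    try {
                        process(record); // hypothetical work-item handler
                        consumer.acknowledge(record, AcknowledgeType.ACCEPT);
                    } catch (Exception e) {
                        // RELEASE makes this one record available for redelivery
                        consumer.acknowledge(record, AcknowledgeType.RELEASE);
                    }
                }
                consumer.commitSync(); // flush the per-message acknowledgements
            }
        }
    }

    private static void process(ConsumerRecord<String, String> record) { /* work item */ }
}
```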

Breakout Session
Mar 19, 2025 13:00
Kafka and relational databases have long been part of event-driven architectures and streaming applications. However, Kafka topics and database tables have historically been separate abstractions with independent storage and transaction mechanisms. Making them work together seamlessly can be challenging, especially because queuing has been viewed as an anti-pattern in a stock database. This talk will describe how to close this gap by providing a customized queuing abstraction inside the database that can be accessed via both SQL and Kafka’s Java APIs. Since topics are directly supported by the database engine, applications can easily leverage ACID properties of local database transactions allowing exactly-once event processing. Patterns such as Transactional Outbox (writing a data value and sending an event) or any atomicity required across many discrete database and streaming operations can be supported out of the box. In addition, the full power of SQL queries can be used to view records in topics and also to join records in topics with rows in database tables. In this talk we cover the synergy between Kafka's Java APIs, SQL, and the transactional capabilities of the Oracle Database. We describe the implementation, which uses a transactional event queue (TxEventQ) to implement a Kafka topic and a modified Kafka client that provides a single, unified JDBC connection to the database for event processing and traditional database access.

Breakout Session
Mar 19, 2025 13:00
Apache Kafka has become an essential technology for modern data streaming applications. However, its learning curve can be steep for developers. This presentation will help you overcome the everyday challenges of Kafka development and streamline your development experience using the Confluent VS Code Extension. We begin by exploring the hurdles developers face when starting with Kafka: grappling with complex concepts, bootstrapping initial code, managing data, and achieving meaningful interaction with their applications. Then, we introduce the game-changing Confluent VS Code Extension - an open-source, free tool designed to transform the Kafka development experience. Through a practical, live demonstration, we'll follow a new application developer's day and showcase how the extension simplifies everything from environment setup to schema management. You'll see how to rapidly generate and deploy producer and consumer applications, handle schema evolution, debug message validation issues, and manage your development environment effectively without leaving your IDE. The presentation concludes with real-world implementation strategies, including GitOps integration and multi-environment management. Join the growing community of developers revolutionizing their Kafka development workflow. Start building faster, more intelligently, and more reliably with the Confluent VS Code Extension today.

Breakout Session
Mar 19, 2025 13:00
John Deere manufacturing factories are equipped with thousands of state-of-the-art smart industrial robots and other machines. These next-gen factories are on a path to Industry 5.0, which requires equally advanced and well-integrated OT and IT systems to enable real-time availability and processing of OT data, for faster decision making near the source of data in the factory and for improved overall operational efficiency in the organization. We present our Manufacturing IoT Edge platform, designed to fulfil the vision of Manufacturing 5.0, using open-source tools and standard protocols like MQTT, Sparkplug, Apache Kafka, Kafka Connect, Kafka Streams and more for collection, contextualization, stream processing, historization and analysis of manufacturing OT data in real time. We cover technical details like core concepts of the MQTT protocol with the Sparkplug specification, how it is optimized for SCADA/IIoT solutions, and how the Sparkplug data is processed using open-source Apache Kafka and its ecosystem, including custom-built Kafka Connectors for ingestion and stateful Kafka Streams processors. All the details we plan to present are relevant for building IoT Edge platforms for any other industrial domain as well. If you want to learn about any of the following, come join us! * Classic challenges of industrial edge IoT platforms * Solution architecture and design trade-offs * Technical details of MQTT, Sparkplug, Kafka Connect and Kafka Streams * Specific complexities of stream processing of Sparkplug data with Kafka and ways to handle these * Overall, how an industrial IoT Edge use case is implemented

Breakout Session
Mar 19, 2025 13:00
In this session, the Dream11 engineering team will share the secret sauce and the innovation around Apache Kafka consumers, processing tens of millions of events using a re-engineered Kafka consumer library. Dream11 is one of the largest fantasy sports platforms in the world, handling peak user concurrency of over 15 million during IPL 2024, with edge RPM surpassing 300 million. The business operates under highly time-sensitive conditions, experiencing hockey-stick traffic surges just before the start of matches. To ensure real-time updates for game users, the Dream11 platform heavily relies on Apache Kafka in the critical pipelines of end-user services. As the scale grew, the legacy Kafka consumers (simple, high-level) began facing challenges such as delays and data loss, severely impacting user trust. To address these issues, the Dream11 engineering team developed a low-level Kafka consumer in which polling is decoupled from processing, with both executing in parallel, which fixed our frequent rebalancing problem. For processing the messages, we created a dedicated worker pool, which improved our speed significantly. We disabled auto-commit and performed commits in batches, guaranteeing at-least-once processing and ensuring no data loss. With the growth of the microservices ecosystem, Kafka pipelines became integral to many services. Building on the success of the low-level consumer, the Dream11 engineering team turned it into a platform-grade Kafka consumer library that abstracts away the complexities of Kafka integration. This library provides simple interfaces for developers to implement business logic seamlessly. Over time, it matured with features like backpressure, enabling developers to process messages locally during incidents or to scale across a consumer pool with varied core counts. Join this session to learn strategies to optimize Kafka consumers for low latency and high reliability at massive scale.
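
Dream11's library is not public, but the core ideas the abstract lists (a worker pool, auto-commit disabled, batched commits for at-least-once delivery) can be sketched with the plain Java consumer. Topic, group, and pool size are placeholders, and the production design additionally decouples the poll loop from processing:

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class WorkerPoolConsumer {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder
        props.put("group.id", "score-updates");
        props.put("enable.auto.commit", "false");         // commit only after processing
        props.put("key.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");

        ExecutorService workers = Executors.newFixedThreadPool(8);
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("score-events"));
            while (true) {
                ConsumerRecords<String, String> batch = consumer.poll(Duration.ofMillis(500));
                List<Callable<Void>> tasks = new ArrayList<>();
                for (ConsumerRecord<String, String> record : batch) {
                    tasks.add(() -> { handle(record); return null; });
                }
                workers.invokeAll(tasks); // process the batch in parallel
                consumer.commitSync();    // batched commit: at-least-once, no data loss
            }
        }
    }

    private static void handle(ConsumerRecord<String, String> record) { /* business logic */ }
}
```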

Breakout Session
Mar 19, 2025 14:00
In August 2023, WarpStream introduced itself as a Kafka-compatible, S3-native streaming solution offering powerful features such as a BYOC-native approach, decoupling of storage and compute as well as data and metadata, offset-preserving replication, and direct-to-S3 writes. It shines in a specific niche—logging, observability, and data lake feeding—where a slight increase in latency is a fair trade-off for substantial cloud cost savings and simplified operations. In this session, we'll take a look into ShareChat's journey of migrating our logging systems from managed Kafka-compatible solutions to WarpStream. At ShareChat, logging suffered from 2 issues: highly unpredictable workloads and high inter-zone fees for data replication across brokers. Logging volume could spike up to 5 times the normal rate for brief periods before returning to baseline. We had to over-provision our Kafka clusters to prevent costly rebalancing and scaling issues, resulting in unnecessary expenses. WarpStream offers a solution with its stateless, autoscaling agents—eliminating the need to manage local disks or rebalance brokers. Moreover, by leveraging S3 for replication, WarpStream allows us to eliminate inter-zone fees. In this session, we’ll discuss things like setting up WarpStream in your cloud, best practices for agents (brokers) and clients, fine-tuning your cluster's latency, and offer advice for local testing. You'll see a detailed cost comparison between WarpStream and both multi-zone and single-zone Kafka-compatible solutions. Additionally, we'll demonstrate how to set up comprehensive monitoring for your WarpStream cluster at various levels of granularity—including agent, topic, and zone. Finally, we'll cover essential alerts you should configure for your agents and our experience in consuming from WarpStream from inside Spark jobs and share the best Spark configs that worked for us.

Breakout Session
Mar 19, 2025 14:00
Apache Kafka v4.0 introduces significant changes that will affect how you deploy, configure, and operate your Kafka clusters. Beyond the famous ZooKeeper removal, what else does Kafka v4.0 bring us? As a Kafka user, which changes will impact my existing Kafka cluster? As a Flink user, am I safe from the Kafka v4.0 upgrade when using the Flink Kafka connector? In this session, I'll go through all the important changes in Apache Kafka v4.0, like the log4j2 upgrade (KIP-653), the next-generation rebalance protocol (KIP-848), the async consumer, Eligible Leader Replicas (KIP-996), some default configuration changes (KIP-1030), and the many component and API deprecations and removals. Not just introducing these changes: most importantly, I'll explain how they impact your existing cluster, for both Kafka and Flink. For example: with KIP-996, will the existing unclean leader election mechanism change? With KIP-848, will the Flink Kafka connector need to adopt the new async consumer for the new rebalance protocol? With the removal of old client protocol API versions (KIP-896), will my existing Kafka clients or the Flink Kafka connector become incompatible? After this session, you will have a better understanding of the changes Apache Kafka v4.0 brings, and you'll know what "actions" to take for your existing clusters, whether Kafka or Flink. Finally, you can upgrade to Kafka v4.0 without any "surprises".
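
As one concrete example of those client-side actions, a consumer opts in to the KIP-848 rebalance protocol purely via configuration. A minimal sketch, with placeholder addresses and group name, assuming brokers that support the new protocol:

```java
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class Kip848Consumer {
    public static KafkaConsumer<String, String> create() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder
        props.put("group.id", "orders-app");
        props.put("key.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");
        // KIP-848: opt in to the next-generation rebalance protocol
        props.put("group.protocol", "consumer");
        // Optional server-side assignor: "uniform" or "range"
        props.put("group.remote.assignor", "uniform");
        return new KafkaConsumer<>(props);
    }
}
```

Clients that do not set group.protocol keep using the classic protocol unchanged.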

Breakout Session
Mar 19, 2025 14:00
At Atlassian, we have numerous inter-connected services whose shards are deployed on data centers across the globe. These services together participate in a complex stream-based ‘data migration’ workflow. To enable this, within the Platform org, we built a scalable, resilient and globally consistent Orchestrator. This Orchestrator leverages Kafka’s State store, Kafka streams and Kafka Connect. It provides a “service mesh” equivalent for Kafka-integrated services, enabling seamless coordination and communication between different “steps” of the workflow. The architecture allows different shards and nodes of services to enter and exit the service mesh, specify tenants, allocate usage-based quota to callers and so on. Central to our solution are 5 things: 1️⃣ A State store based context management that’s globally consistent 2️⃣ An SDK that services can readily integrate. This SDK seamlessly abstracts out Kafka for application developers. 3️⃣ A dynamic registry of services which not only catalogs the available services but also maintains an up-to-date map of service deployments across data centers, their health and their usage. 4️⃣ The orchestrator's intelligent routing algorithms that enable an application developer to seamlessly run a workflow that automatically resolves the most appropriate ‘data shard’ for each ‘step’ of the workflow, based on the application's requirements and the callee service’s constraints 5️⃣ A Kafka Connect based “message relay” service which handles cross data center message movement and which optionally provides exactly-once guarantees. Join us to explore the inner workings of this Orchestrator, how it leverages Kafka’s (possibly less popular) capabilities to address modern distributed stream-based data migration applications. We'll discuss real-world use cases from Jira, Confluence and other popular products of Atlassian and share our insights which can help you push the boundaries of what's possible with Kafka.

Breakout Session
Mar 19, 2025 14:00
Within Uber, we have numerous Kafka clusters comprising thousands of nodes tailored to different use cases. These clusters collectively handle a few trillion messages daily, amassing multiple Petabytes of data ingestion. These messages are distributed across thousands of topics. Several of these clusters are exceptionally large, exceeding 150 nodes in some cases. Kafka serves a crucial role in enabling inter-service communication, transporting database changelogs, facilitating data lake ingestion, and more. Notably, Kafka houses business-critical data like billing and payment information. Kafka is a tier-0 technology at Uber, guaranteeing 99.99% data durability, and its availability is tied to the health of the underlying nodes. However, these nodes are ageing, leading to increasing disk failures and the need for replacements, with potential risks of offline partitions and data loss. To ensure uninterrupted operations, there is a need to migrate topics and partitions to newer, high-performance SKUs. The migration introduces several challenges: 1. Preserving rack-aware distribution to maintain zone failure resiliency during the migration. 2. Managing significant differences in disk capacity between the old SKU (legacy) nodes and the new SKU (H20A) nodes. 3. Adhering to disk usage thresholds on the new SKU nodes to avoid performance degradation. 4. Balancing nodes within racks to ensure continuous resiliency and fault tolerance. 5. Handling variability in Kafka cluster configurations, especially for low-latency clusters, where introducing new replicas could increase latency. Join us to learn how we overcame these challenges using strategies like tiered storage and cluster rebalance to successfully migrate Kafka infrastructure at Uber.

Breakout Session
Mar 19, 2025 14:00
Are you struggling to come to terms with Flink SQL WINDOW functions for processing your stream? Are you new to Flink SQL? If the answers to both these questions are ‘Yes’, then join my session to get introduced to WINDOW functions on a data stream, with live examples. No presentations, please! Expect live coding of Flink SQL WINDOW operations on real-world streaming data, and get your hands dirty! We will start by understanding the syntax of a WINDOW function in general and then dive deeper into Flink Table-Valued Functions (TVF) with Flink 1.20. Then we’ll understand how the TUMBLE and HOP WINDOW functions operate, using live SQL examples. Next, we will build an end-to-end demo with data streams generated by Kafka and apply Flink SQL WINDOW operations on the data stream to transform and aggregate data. You will come out of my session with enhanced knowledge of data stream WINDOW functions using Flink SQL and will be able to run the examples and adapt them to your use case.
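
A minimal sketch of the windowing-TVF syntax the session demonstrates, wrapped in Flink's Java TableEnvironment; the orders table, topic, and broker address are made up for illustration:

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class TumbleDemo {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
            EnvironmentSettings.inStreamingMode());

        // Hypothetical Kafka-backed table with an event-time watermark
        tEnv.executeSql(
            "CREATE TABLE orders (" +
            "  order_id STRING," +
            "  amount DOUBLE," +
            "  ts TIMESTAMP(3)," +
            "  WATERMARK FOR ts AS ts - INTERVAL '5' SECOND" +
            ") WITH (" +
            "  'connector' = 'kafka', 'topic' = 'orders'," +
            "  'properties.bootstrap.servers' = 'broker:9092'," +
            "  'format' = 'json', 'scan.startup.mode' = 'earliest-offset')");

        // TUMBLE as a table-valued function (windowing TVF syntax)
        tEnv.executeSql(
            "SELECT window_start, window_end, SUM(amount) AS revenue " +
            "FROM TABLE(TUMBLE(TABLE orders, DESCRIPTOR(ts), INTERVAL '10' MINUTES)) " +
            "GROUP BY window_start, window_end").print();
    }
}
```

HOP follows the same TVF shape, with a slide interval added before the window size.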

Breakout Session
Mar 19, 2025 15:00
Operational resilience and disaster recovery (DR) through Kafka are indispensable for businesses to grow at a rapid pace in the high-velocity, high-risk environment of digital payments that PhonePe operates in. It has been core to our platform-first approach from day 1 and has helped us drive widespread digital adoption in India. It ensures data integrity and availability, thereby safeguarding user trust, business operations, and regulatory compliance, and preventing financial losses. PhonePe has revolutionised India’s financial and digital landscape to build a cashless economy with financial traceability, and processes ~9 billion transactions/month, almost 4 times the volume of other global digital payment giants. This has enabled financial inclusion by giving millions of Indians easy access to digital payments across both urban and rural India, including a vast merchant network of 30 million. Join this session to learn how we demystified MirrorMaker2 for achieving DR in various ways through Kafka. We will talk about selection criteria for applications, different types of outages, and the cost-benefit analysis for these applications. With MM2, we will explain how the application-side dual-writes challenge was resolved, along with the intricacies of setting up shallow mirroring, switch implementations with offset translation, and building automatic failure detectors. The session will also cover monitoring and alerting and an L7 proxy setup for transparent failovers. The audience will also take away key lessons on cluster architecture, rack-awareness for brokers, producers, and consumers, and implementation considerations from our experience of setting up a platform which is compliant by design and able to scale to handle a high volume and speed of data flow.
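
On the consumer side, the offset translation mentioned above is exposed through a small public API in the MirrorMaker2 client library. A hedged sketch of a failover helper; the cluster alias, group name, and addresses are assumptions matching a typical MM2 setup:

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.connect.mirror.RemoteClusterUtils;

public class FailoverOffsets {
    public static void main(String[] args) throws Exception {
        // Connect to the DR cluster and translate the offsets that MM2
        // checkpointed from the primary ("primary" is the source-cluster
        // alias used in the MM2 configuration, an assumption here).
        Map<String, Object> props = new HashMap<>();
        props.put("bootstrap.servers", "dr-cluster:9092"); // placeholder

        Map<TopicPartition, OffsetAndMetadata> translated =
            RemoteClusterUtils.translateOffsets(
                props, "primary", "payments-consumer", Duration.ofSeconds(30));

        // A consumer on the DR cluster can now seek to these offsets
        translated.forEach((tp, offset) ->
            System.out.println(tp + " -> " + offset.offset()));
    }
}
```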

Breakout Session
Mar 19, 2025 15:00
AI-powered agent systems are becoming essential for automation, personalization, and real-time decision-making. But how do we ensure that these agents can process information continuously, maintain context, and provide intelligent responses at scale? This talk explores how Apache Kafka and Apache Flink can be used to build dynamic real-time agent systems. We'll start with the basics of agent-based systems - how they work, how they communicate, and how they retrieve and generate relevant knowledge using Retrieval-Augmented Generation. Then, we'll look into real-time streaming architectures, showing how Kafka handles message passing between agents and Flink processes events to track context and enable intelligent responses. By the end of this session, you'll have a clear roadmap for designing AI-driven agent systems that are context-aware, efficient and work with a continuous stream of data. Whether you're working on chatbots, monitoring systems, or intelligent automation, this talk will provide practical insights into bridging streaming data with generative AI to power the next generation of autonomous agents. Perfect for beginners and experts alike, this session offers valuable insights for all skill levels.

Breakout Session
Mar 19, 2025 15:00
REST-based request-response APIs are the lifeblood of most architectures. API transactions, specifically the CUDs (creates, updates, deletes), are great sources of events and/or change data. This is also true of various RPC-style APIs. However, because of the client-server nature, such events need to be published by the service using an "outboxing" mechanism and subsequently disseminated to a pub-sub event broker, most often Apache Kafka. This involves code changes to the application and, in addition, a more expansive solution like CDC (change data capture) to collect those changes from the outbox table. In this session, we will present an alternative pattern that leverages standard API proxy/gateway solutions to process and push events to Kafka as they occur in the API data plane, in real time. We will address the following important questions during this Show Me How session: 1. How to source event data from APIs (and API gateways) such as Kong and Istio 2. How to synthesize event data into the desired form using Kafka Streams 3. How to source change feeds against aggregates using Kafka Streams state stores 4. Considerations and trade-offs for this architecture 5. Key use cases (hint: event sourcing, data-exchange audit trails, and more!) This is a potential no-code alternative to CDC, but with added benefits such as broad access to domain- and context-specific request-response data, as opposed to having to process multiple row-level changes. Finally, this also enables treating eventing as a cross-cutting concern associated with API management and operations. Attendees can expect to take away a new and interesting approach towards a progressive, low-friction transition to an event-driven architecture.
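
A minimal Kafka Streams sketch of question 2 above, synthesizing domain events from raw gateway access records; topic names, the CUD filter, and the payload reshaping are all hypothetical stand-ins:

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class GatewayEventSynthesizer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "gateway-event-synth");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> accessLog = builder.stream("gateway-access-log");
        accessLog
            .filter((key, value) -> isCud(value))              // keep POST/PUT/DELETE only
            .mapValues(GatewayEventSynthesizer::toDomainEvent) // reshape into a domain event
            .to("order-events");

        new KafkaStreams(builder.build(), props).start();
    }

    private static boolean isCud(String accessRecord) {
        return accessRecord.contains("POST") || accessRecord.contains("PUT")
            || accessRecord.contains("DELETE");
    }

    private static String toDomainEvent(String accessRecord) {
        return accessRecord; // in practice: extract the request body, add context metadata
    }
}
```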

Breakout Session
Mar 19, 2025 15:00
Real-time data processing is at the heart of modern applications, demanding scalability, reliability, and efficiency. This talk showcases how Delta Live Tables (DLT), Apache Kafka, and serverless architectures converge to create dynamic, end-to-end streaming pipelines without the need for version management. We’ll delve into: - Advanced Use Cases: Learn how industries are leveraging DLT and Kafka to build resilient, real-time streaming workflows for applications. - Versionless Pipelines: Understand how Databricks' automatic runtime upgrades for Delta Live Tables simplify maintenance while supporting platform enhancements. - Optimization Strategies: Explore cost-effective, scalable design principles that serverless architectures bring to streaming workflows. - Technical Deep-Dive: Discover how DLT handles checkpointing and data quality enforcement in streaming workflows, ensuring data reliability and fault tolerance. - Interactive Demo: Witness a hands-on deployment of a zero-maintenance, serverless pipeline integrating DLT and Kafka for real-world data challenges. Attendees will be equipped with actionable insights and best practices to modernize their streaming workflows, reduce operational complexity, and unlock the full potential of real-time analytics.

Breakout Session
Mar 19, 2025 15:00
In the dynamic world of competitive gaming, delivering personalised user experiences and driving meaningful business outcomes requires an advanced machine learning (ML) platform. Join this session to explore the architecture and implementation of a scalable ML ecosystem designed to enable real-time feature engineering, inference, and personalisation at scale, using the power of Flink and Kafka combined, with the duo of BigTable and Redis acting as feature storage. We will dive into the ML platform architecture along the following key points: Realtime personalisation: real-time decisions and recommendations to influence users, viz. adapting game lobbies, missions, and rewards to user preferences, with a turnaround time of less than a few seconds. Feature store: the heart of the ML platform, which automates real-time feature preparation and processing pipelines; it also acts as a low-latency, high-frequency store for real-time raw data, aggregate calculations, and transformations used to provision, build, and serve features. Observability: system observability, tracking and monitoring of ML experiments, and DQA (data quality assessment). Business impact: personalised user journeys (new and old), game and lobby recommendations, missions, and rewards (offers, coupons) boost user retention and drive higher user engagement. Skill benchmarking gives users a fair opportunity to compete against comparable opponents, which increases engagement and the clock time users spend on the platform; it also helps decide and draw the user graduation journey on the platform, opening up opportunities to cross-sell and up-sell.

Breakout Session
Mar 19, 2025 16:00
At Uber, the EVA platform drives substantial advancements in our real-time analytics capabilities, empowering various business use cases across marketing, engineering, data science, and operations, as well as internal use cases around metrics, logs, and query analytics. The platform features Apache Kafka for real-time data transport, Apache Flink for stream processing, Spark for batch processing, HDFS for deep storage needs, and Apache Pinot as the core analytics engine. Additionally, it features the internal service Neutrion for Presto-like queries on Pinot and a metadata service for dataset management. As part of the talk, we cover the matured architecture of the real-time analytics ecosystem powering Uber’s use cases, which serves up to tens of thousands of queries/sec and several million writes/sec, and hosts up to tens of petabytes of Pinot datasets. We also cover: 1. Real-time processing and ingestion using AthenaX (SQL-based transformation on Flink), Flink, and Kafka to provide analytics on real-time data. 2. Real-time analytics powered by Apache Pinot, serving analytics at high QPS with sub-second latency. 3. Disaster resiliency and disaster recovery strategies for Apache Pinot datasets. The talk covers two Uber use cases that solve real-time analytics challenges for business and observability: 1. a business use case (rides/eats related), and 2. an observability use case (metrics/logs related). The audience will gain practical insights into designing real-time analytics systems centered around Apache Pinot and effectively leveraging complementary real-time technologies to build robust and high-performing solutions.

Breakout Session
Mar 19, 2025 16:00
Let’s be honest: who wants to have more than one client to connect to a data system? Now consider Apache Kafka. It ships with four different Java clients: producer, consumer, admin, and streams. Want to create a topic in a producer application? Use the admin client and the producer client. Want to produce and consume? Either use the producer and the consumer, or use Kafka Streams. So how did we get here? And more importantly: how can we simplify it? Are incremental improvements enough? In this talk, we’ll propose a radical approach: a single unified Java client built from scratch for producing, consuming, processing, and administration tasks. We take you on a brainstorming session about what we can and cannot do, and what we want to achieve. How can we make simple things easy and difficult things possible? What does a modern Java API look like, using the standard library, a reasonable threading model, lambdas, and futures for async calls? We think it's high time that we take another look at the Java clients and build a client ready for the next decade. Come and join the conversation about the future of Kafka clients.
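
In the spirit of that brainstorming, here is a purely hypothetical strawman of what a single, future-based client surface could look like; nothing like this exists in Apache Kafka today:

```java
// Hypothetical, for discussion only: no such API exists in Apache Kafka.
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

public interface UnifiedKafkaClient extends AutoCloseable {
    // Admin: create a topic if missing, without a second client
    CompletableFuture<Void> ensureTopic(String name, int partitions);

    // Producing: async by default, a future instead of a callback
    CompletableFuture<Long> send(String topic, byte[] key, byte[] value);

    // Consuming: push-style and lambda-friendly; close the handle to unsubscribe
    AutoCloseable subscribe(String topic, Consumer<byte[]> handler);
}
```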

Breakout Session
Mar 19, 2025 16:00
Imagine a social security system that's faster, fairer, and more compassionate. A system where benefits are disbursed in real-time, fraud is detected before it happens, and eligibility verification is accurate and efficient. This isn't just a vision – it's a reality made possible by harnessing the power of real-time data streaming technologies like Apache Kafka, Apache Flink, and cloud-based data platforms. This talk will explore the transformative potential of real-time data streaming in social security. Discover how this technology can: - Accelerate benefit disbursement using Apache Kafka's event-driven architecture, ensuring timely support for those in need - Proactively detect and prevent fraud using Apache Flink's real-time processing and machine learning capabilities, protecting the integrity of the system - Enhance eligibility verification using cloud-based data platforms, reducing errors and overpayments - Personalize benefits using real-time data analytics, tailoring support to individual needs Through real-world examples and case studies, we'll demonstrate the power of real-time data streaming to create a more equitable, efficient, and effective social security system. Don't miss this opportunity to revolutionize the future of social security.

Breakout Session
Mar 19, 2025 16:00
Zach built an entire data platform by himself that thousands of data engineers use each year. In this talk, he goes through how he did it and the choices and lessons he learned along the way!

Breakout Session
Mar 19, 2025 16:00
The session will demonstrate the steps required to set up Single Sign-On (SSO) in Confluent Control Center using Confluent for Kubernetes (CFK). It will use a practical example to cover the technical aspects of configuring SSO using OpenID Connect (OIDC) with Confluent Control Center, focusing on the necessary configurations with CFK to automate the setup process. Attendees will gain insights into enhancing security and user management in data streaming environments and preparing for OAuth deployment in Confluent Platform.

Breakout Session
Mar 19, 2025 16:00
Windowed aggregation on streams is a powerful concept for analytical data processing, especially in cases where waiting hours or even minutes for data to be available is inconceivable. While most people think of aggregations as an analytical requirement, they also help trim down data to size and can be critical to scaling systems without leading to ballooning costs. We had a similar use case in ShareChat where we had hundreds of thousands of counter increments (updates)/sec for everything from the number of views on a post to revenue numbers. Our databases could not keep up with the write volume, and there were frequent hot-spotting issues. Furthermore, the data was often inconsistent due to multiple writers, and taking locks further added to our misery. To solve this issue, we used Kafka and Kafka Streams to build an aggregation framework that can handle hundreds of thousands of counter increments per second. Streaming aggregation helped us batch updates and helped reduce the write throughput in our DBs. Further, it helped solve our hot-spotting issues and eliminated the need to take locks. This talk discusses the challenges of building the platform, managing an in-house Kafka setup, and the lessons learned from tuning Kafka Streams. Furthermore, we discuss how we optimised the solution to scale to even 1M updates/sec with zero hiccups or manual involvement. Today, the offering forms an integral part of our core streaming platform, and the talk will be helpful for developers who have similar requirements for streaming aggregations or want to know more about event-driven architectures using Kafka and Kafka Streams.
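
The framework itself is internal, but the batching effect described above is essentially a windowed aggregation. A minimal Kafka Streams sketch with hypothetical topic names, where each ten-second window emits one consolidated count per key instead of one database write per increment:

```java
import java.time.Duration;
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.TimeWindows;

public class ViewCounterAggregator {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "view-counter");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        // One event per view, keyed by post id (hypothetical topics)
        KStream<String, String> views = builder.stream("post-view-events");
        views.groupByKey()
             .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofSeconds(10)))
             .count()
             .toStream()
             // one consolidated update per post per window
             .map((windowedKey, count) -> KeyValue.pair(windowedKey.key(), count.toString()))
             .to("post-view-counts");

        new KafkaStreams(builder.build(), props).start();
    }
}
```

A downstream writer consuming post-view-counts then issues one update per key per window, which is what relieves the hot-spotting and locking described above.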

Breakout Session
Mar 19, 2025 17:00
Confluent Managed solution for Apache Flink is expanding its analytical capabilities with the introduction of ML_FORECAST and ML_ANOMALY_DETECTION functions. Developers can now harness the power of established models like ARIMA for continuous forecasting and anomaly detection, all within the familiar SQL interface. This advancement eliminates the need for external ML services and enables continuous processing by embedding these analytical capabilities directly in your streaming pipeline. In this 20-minute session, tailored for developers with stream processing experience, we'll explore how to integrate sophisticated time series analysis into Flink SQL applications. We'll start by introducing the newly developed ML_FORECAST function, which brings ARIMA modeling capabilities to streaming data. We'll then demonstrate the ML_ANOMALY_DETECTION function and show how it can be combined with Kafka-sourced data streams for real-time anomaly detection. Finally, we'll build a complete streaming application that combines both functions to forecast metrics and detect anomalies in a continuous manner. By the end of the session, attendees will understand how to leverage these powerful new functions to build production-ready continuous forecasting and anomaly detection systems using just Flink SQL.

Breakout Session
Mar 19, 2025 17:00
You love Jupyter Notebooks and Python? This session shows you how to also do all kinds of Kafka-related tasks - directly in your Jupyter Notebook. The key for a seamless Kafka/Python/Jupyter Notebook experience is the Open Source library "kafi", the "Swiss Army Knife" for Kafka. We will show you how to cover all of the following use cases (and more) with "kafi" in your Jupyter Notebook, each with only a tiny number of lines of Python code: * Kafka administration * Schema Registry administration * Kafka Backups * Simple stream processing * Microservices/agents * Building a bridge from Kafka to Pandas dataframes/files (e.g. turn a Kafka topic into a Pandas dataframe or Parquet file) After this session, many Kafka use cases that might have been terrifying for you before will have turned into easy as pie.

Breakout Session
Mar 19, 2025 17:00
Seamless authentication to Kafka can be efficiently achieved through the integration of OAuth 2.0 and OpenID Connect (OIDC), enabling secure, token-based access to Kafka clusters. By leveraging these protocols, organizations can significantly enhance their security posture while simplifying identity management. OAuth 2.0 provides a robust framework for token-based authentication, reducing the reliance on long-term user credentials and mitigating the risks associated with credential exposure. This integration allows businesses to centralize authentication, improve access control, and ensure that only authorized users and applications can interact with Kafka resources. With OAuth 2.0 and OIDC, organizations can enforce role-based access control (RBAC), which ensures that users and applications only have access to the resources they need. This level of granularity in access management helps prevent unauthorized access and minimizes the potential attack surface. Throughout the integration process, key concepts of OAuth 2.0 and OIDC will be covered, along with practical steps for configuring them within Kafka. By the end of the session, participants will understand how to implement OAuth 2.0 and OIDC to streamline authentication, improve security, and simplify Kafka client access management in enterprise environments, all while maintaining a high level of control and compliance.
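
A minimal sketch of the client-side configuration for SASL/OAUTHBEARER with the built-in OIDC login handler from KIP-768; the endpoint URL, client id, and secret are placeholders, and the handler's package differs slightly across Kafka versions:

```java
import java.util.Properties;

public class OAuthBearerClientConfig {
    public static Properties build() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker.example.com:9093"); // placeholder
        props.put("security.protocol", "SASL_SSL");
        props.put("sasl.mechanism", "OAUTHBEARER");
        // Built-in OIDC handler (KIP-768); in Kafka 3.1-3.3 the class lives
        // in the ...oauthbearer.secured package instead.
        props.put("sasl.login.callback.handler.class",
            "org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginCallbackHandler");
        // Token endpoint of the OIDC provider (placeholder URL)
        props.put("sasl.oauthbearer.token.endpoint.url",
            "https://idp.example.com/oauth2/token");
        props.put("sasl.jaas.config",
            "org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required "
            + "clientId=\"kafka-client\" clientSecret=\"<secret>\" scope=\"kafka\";");
        return props;
    }
}
```

With this in place the client obtains short-lived tokens via the client_credentials grant instead of holding long-term credentials, which is the security benefit described above.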

Breakout Session
Mar 19, 2025 17:00
Knowledge graphs are used in development to structure complex data relationships, drive intelligent search functionality, and build powerful AI applications that can reason over different data types. • Knowledge graphs can connect data from both structured and unstructured sources (databases, documents, etc.), providing an intuitive and flexible way to model complex, real-world scenarios. • Unlike tables or simple lists, knowledge graphs can capture the meaning and context behind the data, allowing you to uncover insights and connections that would be difficult to find with conventional databases. • This rich, structured context is ideal for improving the output of large language models (LLMs), because you can build more relevant context for the model than with semantic search alone.

Breakout Session
Mar 19, 2025 17:00
In the fast-evolving e-commerce space, managing and processing vast amounts of data is paramount to delivering superior experiences. Join this session to understand how Shiprocket solves this business challenge and handles over ~1 billion events daily, with Apache Kafka serving as the backbone of our architecture, enabling seamless real-time data streaming and processing, and Event decoupling. In this session, we dissect and discuss the following : - How a multi-tenant architecture, supporting more than 1,400 databases and 50,000 tables, addresses the complexity of decentralized data management and real-time querying. - Implementing effective Change Data Capture (CDC) processes has been key to achieving real-time insights across this distributed landscape. - How we enable seamless real-time microservices communication, Buyer communications, and third-party webhooks, managing ~50 million interactions daily. This capability ensures a smooth e-commerce experience for all stakeholders. Shiprocket has deep innovations linked to this platform which streamlines commerce and empowers merchants to thrive in the dynamic e-commerce ecosystem, domestically and internationally. Attendees will leave the room with practical insights into scaling high-volume, multi-tenant systems using Apache Kafka. They will also learn how Kafka drives Shiprocket’s data platform and how our hybrid architecture—combining Confluent-managed Kafka with In-house Kubernetes deployments (powered by Strimzi)—strikes the perfect balance between cost efficiency and operational control. The session will also highlight key challenges faced while scaling Kafka and the architectural optimizations that significantly enhanced its performance.

Breakout Session
Mar 19, 2025 18:00
The outbox pattern is a common solution for implementing data flows between microservices. By channeling messages through an outbox table, it enables services to update their own local datastore and at the same time send out notifications to other services via data streaming platforms such as Apache Kafka, in a reliable and consistent way. However, as with everything in IT, there’s no free lunch. How to handle backfills of outbox events, how to ensure idempotency for event consumers? Doesn’t the pattern cause the database to become a bottleneck? And what about alternatives such as “Listen-to-Yourself”, or the upcoming Kafka support for 2-phase commit transactions (KIP-939)? It’s time to take another look at the outbox pattern! In this session I’ll start by bringing you up to speed on what the outbox pattern *is*, and then go on to discuss more details such as: - Implementing the pattern safely and efficiently - Its semantics, pros and cons - Dealing with backfills - Potential alternatives to the outbox pattern and the trade-offs they make
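
For readers new to the pattern, a minimal JDBC sketch of the core move: the business row and the outbox event are written in one local transaction, and a relay (for example CDC reading the outbox table) then publishes to Kafka. Table names, columns, and connection details are hypothetical:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class OrderService {
    public void placeOrder(String orderId, String eventJson) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://db:5432/shop", "app", "secret")) { // placeholders
            conn.setAutoCommit(false);
            try (PreparedStatement order = conn.prepareStatement(
                     "INSERT INTO orders (id, status) VALUES (?, 'PLACED')");
                 PreparedStatement outbox = conn.prepareStatement(
                     "INSERT INTO outbox (aggregate_id, event_type, payload) "
                     + "VALUES (?, 'OrderPlaced', ?)")) {
                order.setString(1, orderId);
                order.executeUpdate();
                outbox.setString(1, orderId);
                outbox.setString(2, eventJson);
                outbox.executeUpdate();
                conn.commit(); // both rows or neither: no dual-write problem
            } catch (Exception e) {
                conn.rollback();
                throw e;
            }
        }
    }
}
```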

Breakout Session
Mar 19, 2025 18:00
Stream Processing has evolved quickly in a short time: only a few years ago, it was mostly simple real-time aggregations with limited throughput and consistency. Today, many stream processing applications have sophisticated business logic, strict correctness guarantees, high performance, low latency, and maintain terabytes of state without databases. Stream processing frameworks also abstract a lot of the low-level details away, such as routing the data streams, taking care of concurrent executions, and handling various failure scenarios while ensuring correctness.

Breakout Session
Mar 19, 2025 18:00
Dream11, the world’s largest fantasy sports platform, manages unparalleled scale, with RPM surpassing 300 million during flagship events like IPL 2024. With Kafka producers forming the backbone of real-time data pipelines, Dream11 faced a significant challenge: soaring cross-availability-zone (AZ) network costs due to the indiscriminate partitioning strategy of regular producer partitioners. To address this, Dream11 engineering developed the RackAwareStickyPartitioner, a custom solution for Kafka producers that achieved a 70% reduction in cross-AZ network costs. By intelligently routing producer batches to Kafka partitions within the same AZ, this innovation minimized cross-AZ traffic while preserving high throughput. A 10-day controlled experiment demonstrated a dramatic cost reduction in “DataTransfer-Regional-Bytes” by over 30%. This optimization is tailored for high-throughput scenarios, with careful consideration required for low-volume applications to avoid partition skew. Join this session to explore how Dream11 engineered a cost-efficient solution for Kafka producers at scale, sharing insights on architecture, challenges, and real-world impact.
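
Dream11's partitioner is not public, but the idea can be sketched against the standard producer Partitioner SPI: prefer partitions whose leader sits in the producer's own zone, and stick to one until the batch rolls over. This is an illustration, not their implementation; reusing a "client.rack" property and the fallback policy are assumptions, and like the real thing it ignores record keys, so it only suits keyless high-throughput workloads:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.PartitionInfo;

// Illustrative only; not Dream11's actual code.
public class RackAwareStickyPartitioner implements Partitioner {
    private String clientRack;
    private final AtomicInteger sticky = new AtomicInteger(-1);

    @Override
    public void configure(Map<String, ?> configs) {
        // Assumption: the producer config carries a "client.rack" property
        Object rack = configs.get("client.rack");
        clientRack = rack == null ? "" : rack.toString();
    }

    @Override
    public int partition(String topic, Object key, byte[] keyBytes,
                         Object value, byte[] valueBytes, Cluster cluster) {
        int current = sticky.get();
        if (current >= 0) return current;
        // Prefer partitions whose leader is in our own availability zone
        List<PartitionInfo> candidates = cluster.partitionsForTopic(topic).stream()
            .filter(p -> p.leader() != null && clientRack.equals(p.leader().rack()))
            .collect(Collectors.toList());
        if (candidates.isEmpty()) {
            candidates = cluster.partitionsForTopic(topic); // no same-AZ leader: any AZ
        }
        int chosen = candidates.get(
            ThreadLocalRandom.current().nextInt(candidates.size())).partition();
        sticky.compareAndSet(-1, chosen);
        return sticky.get();
    }

    @Override
    public void onNewBatch(String topic, Cluster cluster, int prevPartition) {
        sticky.set(-1); // rotate to a new same-AZ partition per batch
    }

    @Override
    public void close() { }
}
```

A partitioner like this would be wired in via the producer's partitioner.class setting.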

Breakout Session
Mar 19, 2025 18:00
This session presents the architecture of a pipeline that brings together real-time data ingestion, analytics, intelligent retrieval systems, and AI models. See how users can integrate RAG and AI agents to augment real-time decision-making with contextually relevant information, with real-world use cases that emphasize practical tips for architecting scalable systems that seamlessly blend AI and streaming technologies. Core themes: an intro to real-time analytics with Pinot; an intro to RAG; compound AI systems; integrating AI agents with streaming; and why real-time systems are necessary. We'll discuss real-world use cases like brand sentiment analysis and travel-agent bots, and close with a demo of the integration: Kafka -> Pinot -> AI agent.
Lightning Talk
May 20, 2025 12:30
For years, PPC—Greece’s leading electric utility—focused on power generation, distribution, and supply. However, our digital channels lagged behind. As part of a major digital transformation, we needed to refactor the core backend engine powering these channels, due to system decommissioning. This raised a critical challenge: How could we bring customer data closer to our digital channels—reliably, at scale, and without escalating operational costs? Our solution: Confluent. We faced skepticism—tight project timelines, a steep learning curve, resistance to moving beyond Microsoft’s Event Hub, and the ever-present temptation to rely on legacy API calls. Instead of a big-bang approach, we started small: streaming CRM data via Confluent’s CDC connector into PostgreSQL on Azure. This eliminated API bottlenecks, mitigated quota limitations, improved resilience, and optimized operational costs. Challenges arose—simultaneous CRM migrations overloaded connectors, requiring fine-tuned data handling. But we pushed through. Today, our digital channels operate with a real-time, unified customer view, improved response times, and Confluent serving as the foundation of our data strategy. Now that we’ve learned to walk, it’s time to run. What’s next? Real-time energy insights from PV systems, heat pumps, and smart meters, plus proactive customer operations. Join us to explore how Confluent transformed our data strategy—and what’s ahead.
Lightning Talk
May 20, 2025 12:30
Managing user access in Confluent Cloud within modern, dynamic environments can be challenging, especially as teams scale. In this talk, we’ll explore how Just-in-Time (JIT) user provisioning combined with group mappings can redefine access control for your Kafka deployments. Learn how this automated approach streamlines user onboarding and ensures that access permissions align dynamically with your organization’s evolving structure. I’ll share practical examples and best practices for integrating these features with your identity provider, reducing administrative overhead, and tightening security without slowing down your operations. Key takeaways:
- Automation in Action: Understand how JIT provisioning automates user creation at the point of authentication, reducing manual overhead.
- Streamlined Group Management: Insight into how dynamic group mappings simplify permission management, aligning user roles with organizational policies.
- Security & Scalability: Learn how automated access control strengthens security, reduces manual errors, and scales with your organization’s needs.
Lightning Talk
May 20, 2025 12:30
At OpenAI, Kafka fuels real-time data streaming at massive scale, but traditional consumers struggle under the burden of partition management, offset tracking, error handling, retries, dead letter queues (DLQs), and dynamic scaling—all while racing to maintain ultra-high throughput. As deployments scale, complexity multiplies. Enter Kafka Forwarder—a game-changing Kafka Consumer Proxy that flips the script on traditional Kafka consumption. By offloading client-side complexity and pushing messages to consumers, it ensures at-least-once delivery, automated retries, and seamless DLQ management via Databricks. The result? Scalable, reliable, and effortless Kafka consumption that lets teams focus on what truly matters. Want to see how OpenAI cracked the code for frictionless, high-scale Kafka streaming? Join us as we dive into the motivation, architecture, and hidden challenges behind Kafka Forwarder—and discover how OpenAI orchestrates Kafka consumption across multiple clusters and regions with unparalleled efficiency.
Lightning Talk
May 20, 2025 12:30
Real-time retrieval-augmented generation (RAG) is poised to revolutionize how businesses leverage streaming vector data, but many current RAG architectures fall short of meeting the demands of real-time use cases. These architectures, originally designed for batch-based workflows, struggle with latency issues that prevent applications like real-time personalization, financial analysis, and fleet optimization from achieving their full potential. In this session, we’ll introduce an emerging real-time RAG reference architecture, originally developed at Uber, designed specifically to handle the complexities of streaming vector data. We’ll explore how this architecture overcomes the limitations of traditional RAG systems by enabling real-time analysis on freshly created vector embeddings. Attendees will leave this session with actionable insights into building and deploying real-time RAG systems, unlocking new possibilities for applications that demand both speed and accuracy in vector-driven analysis.
Lightning Talk
May 20, 2025 12:30
While Apache Kafka has typically ensured backward and forward compatibility, Kafka 4.0 will introduce breaking changes by dropping support for some older API versions (KIP-896). This session will detail these changes, explain the reasoning behind them, and equip platform teams to adapt. We'll explore the real-world impact, provide essential warnings for app developers, review the added metrics for identifying unsupported APIs, and develop an action plan to ready your clients for a smooth upgrade to Kafka 4.0.
Lightning Talk
May 20, 2025 12:30
The Kubernetes Gateway API is the preferred method for specifying how traffic flows both from clients outside a Kubernetes cluster to services running inside the cluster (aka north/south traffic), as well as how services can communicate inside a cluster (aka east/west traffic). When vendors support the standard, end-users reap benefits such as portability and reduced vendor lock-in. The Kubernetes Gateway API, like the rest of Kubernetes, is under the governance of the Cloud Native Computing Foundation (CNCF), which in turn is part of the Linux Foundation. Today, the Gateway API includes standard ways to define HTTP and gRPC traffic into and within a Kubernetes cluster, with experimental work under way for TLS, TCP and UDP traffic. For HTTP, this means for example that given any incoming HTTP request, you can define filters, transformations, and routing rules that are applied before the request is passed to its final destination in the cluster. In this talk, I argue that event-driven architectures deserve the same treatment. Organisations want to unlock the data in Kafka, which puts pressure on Kafka admins who need to expose data to additional internal and external clients while maintaining strong governance. However, there isn’t a standard way to safely expose Kafka to clients at the scale and speed required by businesses. Existing Kubernetes solutions like the TCP support in the Gateway API are helpful but are not Kafka protocol-aware. In this talk, I’ll explain a new proposal for a Kafka extension to the Kubernetes Gateway API standard. This proposal makes it very easy for Kubernetes and Kafka administrators to manage access to their Kafka clusters in a cloud-native way. Kafka can even be securely exposed to consumers outside of the Kubernetes cluster, which opens new doors and ways of leveraging the valuable data within. We’ll review early implementations that support this initiative.
Lightning Talk
May 20, 2025 12:30
Migrating from a monolithic Postgres system to a distributed architecture is a high-stakes balancing act. Over five years, we transformed our legacy infrastructure, with Kafka Streams emerging as the backbone bridging old and modern systems, ensuring uninterrupted compliance, real-time reporting, and ML-driven insights. This talk details how we collaborated across legacy teams, new service developers, external partners, and ML engineers to build a resilient streaming platform. Our layered Kafka Streams topologies served as a universal abstraction layer, addressing key challenges:
- Orchestrating Cross-Team Workflows: Legacy monoliths (using CDC with Debezium), Kafka-based new services, and external systems often produced conflicting schemas. We unified these data streams, enabling downstream innovation without tight coupling to source systems.
- Simplifying Operations: To manage dozens of complex topologies, we developed internal tools for automated topology validation, state store monitoring, simplified replays, and efficient debugging, significantly reducing new-engineer onboarding time.
- Compliance at Streaming Speed: Processing every transaction through Kafka Streams allowed us to implement real-time compliance checks with sub-100ms latency. This stream-first approach cut regulatory implementation time from weeks to days without altering legacy systems.
- Reporting & Machine Learning: Integrating with Databricks, we converted real-time streams into batch-compatible datasets using Spark Structured Streaming and Delta tables for sub-minute processing. Our pipeline also enabled real-time feature engineering, enhancing ML model performance for recommendations and risk scoring.
The target audience is data engineers, architects, and team leads tackling legacy modernization, cross-team collaboration, and real-time analytics. Attendees will learn strategies to align priorities, accelerate compliance, and unify real-time and batch pipelines for reporting and ML.
Breakout Session
May 20, 2025 13:00
Confluent’s managed solution for Apache Flink is expanding its analytical capabilities with the introduction of the ML_FORECAST and ML_ANOMALY_DETECTION functions. Developers can now harness the power of established models like ARIMA for continuous forecasting and anomaly detection, all within the familiar SQL interface. This advancement eliminates the need for external ML services and enables continuous processing by embedding these analytical capabilities directly in your streaming pipeline. In this 20-minute session, tailored for developers with stream processing experience, we'll explore how to integrate sophisticated time series analysis into Flink SQL applications. We'll start by introducing the newly developed ML_FORECAST function, which brings ARIMA modeling capabilities to streaming data. We'll then demonstrate the ML_ANOMALY_DETECTION function and show how it can be combined with Kafka-sourced data streams for real-time anomaly detection. Finally, we'll build a complete streaming application that combines both functions to forecast metrics and detect anomalies in a continuous manner. By the end of the session, attendees will understand how to leverage these powerful new functions to build production-ready continuous forecasting and anomaly detection systems using just Flink SQL.
Breakout Session
May 20, 2025 13:00
Apache Flink has grown to be a large, complex piece of software that does one thing extremely well: it supports a wide range of stream processing applications with difficult-to-satisfy demands for scalability, high performance, and fault tolerance, all while managing large amounts of application state. Flink owes its success to its adherence to some well-chosen design principles. But many software developers have never worked with a framework organized this way, and struggle to adapt their application ideas to the constraints imposed by Flink's architecture. After helping thousands of developers get started with Flink, I've seen that once you learn to appreciate why Flink's APIs are organized the way they are, it becomes easier to relax and accept what its developers have intended, and to organize your applications accordingly. The key to demystifying Apache Flink is to understand how the combination of stream processing plus application state has influenced its design and APIs. A framework that cares only about batch processing would be much simpler than Flink, and the same would be true for a stream processing framework without support for state. In this talk I will explain how Flink's managed state is organized in its state backends, and how this relates to the programming model exposed by its APIs. We'll look at checkpointing: how it works, the correctness guarantees that Flink offers, how state snapshots are organized, and what happens during recovery and rescaling. We'll also look at watermarking, which is a major source of complexity and confusion for new Flink developers. Watermarking epitomizes the requirement Flink has to manage application state in a way that doesn't explode as those applications run continuously on unbounded streams. This talk will give you a mental model for understanding Apache Flink. I'll conclude by explaining how these concepts that govern the implementation of Flink's runtime have shaped the design of Flink's SQL API.
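As a small, concrete illustration of the two concepts the talk centers on, the following DataStream sketch combines managed keyed state with a bounded-out-of-orderness watermark strategy and checkpointing. The job itself (counting occurrences of strings) is deliberately trivial and ours, not the speaker's.

```java
import java.time.Duration;

import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

public class StatefulCountJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(10_000); // periodic consistent snapshots of all state

        env.fromElements("a", "b", "a", "c", "a")
           // Event-time processing needs watermarks; here we tolerate 5s of disorder.
           .assignTimestampsAndWatermarks(
               WatermarkStrategy.<String>forBoundedOutOfOrderness(Duration.ofSeconds(5))
                   .withTimestampAssigner((event, ts) -> System.currentTimeMillis()))
           .keyBy(s -> s)
           .process(new KeyedProcessFunction<String, String, String>() {
               private transient ValueState<Long> count;

               @Override
               public void open(Configuration parameters) {
                   // Managed keyed state lives in the configured state backend
                   // and is included in every checkpoint.
                   count = getRuntimeContext().getState(
                       new ValueStateDescriptor<>("count", Long.class));
               }

               @Override
               public void processElement(String value, Context ctx,
                                          Collector<String> out) throws Exception {
                   long c = count.value() == null ? 1 : count.value() + 1;
                   count.update(c);
                   out.collect(value + " seen " + c + " times");
               }
           })
           .print();

        env.execute("stateful-count");
    }
}
```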
Breakout Session
May 20, 2025 13:00
Streaming data is a critical component of modern data architectures. This talk explores how to determine your streaming needs and design a robust solution using Apache Iceberg, a next-generation table format built for flexibility and scalability. We’ll dive into the foundational tools that enable streaming pipelines, including Apache Flink, Apache Kafka, Debezium, Kafka Connect, and Apache Spark, breaking down their roles and use cases in processing, transporting, and transforming streaming data. The talk will also highlight Iceberg-specific considerations, such as managing compaction to optimize query performance and dealing with delete files for handling record-level updates and deletes. Whether you’re building real-time analytics, powering machine learning models, or streaming raw data into your data lakehouse, this session will provide actionable insights and best practices for building reliable and efficient streaming workflows with Apache Iceberg.
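A minimal sketch of one such pipeline, ingesting a Kafka topic into an Iceberg table with Flink SQL, might look as follows. The connector options shown (a Hadoop catalog on local disk, a JSON-encoded topic) are illustrative placeholders; production setups typically use object storage and a proper catalog service.

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class IcebergIngestSketch {
    public static void main(String[] args) {
        TableEnvironment env =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        // Hadoop-catalog Iceberg warehouse on local disk, for illustration.
        env.executeSql(
            "CREATE CATALOG lake WITH (" +
            " 'type'='iceberg'," +
            " 'catalog-type'='hadoop'," +
            " 'warehouse'='file:///tmp/warehouse')");

        env.executeSql(
            "CREATE TABLE IF NOT EXISTS lake.db.events (id BIGINT, payload STRING)");

        // Kafka source table; topic and bootstrap values are placeholders.
        env.executeSql(
            "CREATE TEMPORARY TABLE kafka_events (id BIGINT, payload STRING) WITH (" +
            " 'connector'='kafka'," +
            " 'topic'='events'," +
            " 'properties.bootstrap.servers'='localhost:9092'," +
            " 'scan.startup.mode'='earliest-offset'," +
            " 'format'='json')");

        // Continuous streaming insert from Kafka into Iceberg; note that the
        // resulting small files are why compaction matters, as discussed above.
        env.executeSql("INSERT INTO lake.db.events SELECT id, payload FROM kafka_events");
    }
}
```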
Breakout Session
May 20, 2025 13:00
Almost overnight, AI has rewritten the modern tech stack. At the top of the stack, Cursor, CoPilot, and Claude can now be found in most developer IDEs. At the bottom, foundational models like o1, Llama, and Gemini increasingly power backend business logic. What does that mean for everything else in the middle, like developer tools? And what does that especially mean for developers who need to be productive in managing, operating, and testing Kafka and its applications? Whether you use Flink, Confluent, WarpStream, or whatever else, attendees of this talk will learn an approach to Kafka tooling that balances short-term AI gains with long-term engineering best practices.
Breakout Session
May 20, 2025 13:00
As one of Europe’s leading crypto exchanges, Bitvavo enables its ~2 million customers to buy, sell, and store over 300 digital assets, providing a 24/7 service that processes many thousands of transactions per second with stable sub-millisecond execution times on its order flow. In this talk I will dive deep into the high-level architecture of the Bitvavo Exchange and the details of how we process and transform trading data using Confluent Cloud and Imply Druid in real time in order to provide useful insights to our customers, focused on candle charts. Specifically, I will cover architectural patterns, lessons learned, and good practices for routing and processing high volumes of market data from low-latency systems while maintaining the high performance and scalability required of a leading European crypto exchange.
Breakout Session
May 20, 2025 13:00
Whether you’re running mission-critical applications or just shipping logs in real time, Tiered Storage can make your Kafka cluster cheaper, easier to manage, and faster. To understand the benefits, tradeoffs, and development history, join this talk where we’ll uncover KIP-405 and showcase how the community delivered this important feature for Apache Kafka. We’ll roll back through the KIP’s history, starting from 2018, to understand the major milestones and share details on how major industry leaders like Apple, Datadog, and Slack helped out and tested both the Tiered Storage functionality and the first AWS S3 open source plugin. Furthermore, we’ll share details, gotchas, and tradeoffs from users successfully adopting Tiered Storage in production at scale, surpassing 150 GB/s of throughput. If you want to optimize your Apache Kafka cluster for performance, cost, and overall health, this session is for you.
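For orientation, enabling KIP-405 Tiered Storage is largely a matter of configuration. Assuming brokers run Kafka 3.6+ with remote.log.storage.system.enable=true and a remote storage plugin (such as the AWS S3 plugin mentioned above) configured via remote.log.storage.manager.class.name, a topic can opt in individually; the topic name and retention values below are illustrative.

```java
import java.util.List;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class TieredTopicSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (Admin admin = Admin.create(props)) {
            // Keep 30 days of data overall, but only ~1 hour on broker disks;
            // the rest of the log is served from remote (e.g. S3) storage.
            NewTopic topic = new NewTopic("clickstream", 12, (short) 3)
                .configs(Map.of(
                    "remote.storage.enable", "true",   // KIP-405 per-topic switch
                    "retention.ms", "2592000000",      // 30 days total retention
                    "local.retention.ms", "3600000")); // 1 hour of local retention
            admin.createTopics(List.of(topic)).all().get();
        }
    }
}
```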
Breakout Session
May 20, 2025 13:00
One potential benefit of using a stream processor like Kafka Streams to build applications on the log is the ability to time travel. What if you could go back in time and query state stores to see when a bug was introduced? Or what if you could freeze the state of a running application and make a copy to do pre-deploy testing? This potential has largely gone unrealized because of a missing primitive in Kafka Streams - the ability to create a consistent snapshot that can be read and even cloned into a new application. Until now. We first explain exactly what snapshots and clones are. In short, a snapshot contains all the application's state up to some point in time, and no state after. A clone is a copied application created from this state. Next, we’ll make the case for why snapshots are a game-changing feature for Kafka Streams. Snapshots take your application into a multiverse (or otter-verse) of histories + branches. We’ll show how you can use them to explore your application’s history, interactively debug, test changes against real data, do blue/green deploys, and more. The remainder of the talk dives into the theory + practice of Kafka Streams snapshots. First we cover what’s been missing from Kafka Streams to support them. In particular, Kafka Streams currently lacks synchronization mechanisms to enable a consistent topology-wide snapshot. It also maintains state locally, which makes a snapshot difficult to access. Next, we discuss how we fill these gaps with Responsive. Specifically, we give an overview of RS3, our S3-backed store built on SlateDB, and how we use it with our SDK to take consistent snapshots. We’ll close this section with our vision for how snapshots can be contributed back to Kafka Streams. Finally, we’ll close the talk with a demo to show the power of snapshots in action. Viewers should come away with an understanding of snapshots and clones, how they can be used to solve common problems, and how we’ve built them in Responsive.
Breakout Session
May 20, 2025 14:00
A new major version of the KafkaConsumer is out, bringing fundamental changes and improvements: it’s the first version to fully implement the next generation of the Consumer Group Rebalance Protocol, introduced with KIP-848, now a production-ready feature. Want to hear how these major changes materialize in the KafkaConsumer? What’s in? What’s out? What’s different? Then this talk is for you! We will cover the core of the new rebalance protocol, its implementation in the Java client, and how it significantly improves and simplifies the whole group consumption experience, addressing its main pain points. We will also cover the revamped KafkaConsumer threading model, shipped alongside the new rebalance protocol client implementation. It all sounds promising, but we do know that upgrades might be scary, right? Whether you’re a Kafka developer, operator, or architect, this talk will equip you with everything you need to confidently adopt KafkaConsumer 4.0 in your client applications: from how the live upgrade and protocol interoperability work, to detailed client changes such as configuration changes, API deprecations and additions, improved API behavior, and new metrics.
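On the client side, opting in to the new protocol is essentially a one-line configuration change, sketched below (topic and group names are placeholders). Note that under the new protocol, classic client-side assignor settings no longer apply; assignment is driven by the group coordinator.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class NextGenGroupConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "orders-app");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        // Opt in to the KIP-848 rebalance protocol (requires brokers that
        // support it); "classic" keeps the old protocol during a migration.
        props.put(ConsumerConfig.GROUP_PROTOCOL_CONFIG, "consumer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("orders"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> r : records) {
                    System.out.printf("%s-%d@%d: %s%n",
                            r.topic(), r.partition(), r.offset(), r.value());
                }
            }
        }
    }
}
```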
Breakout Session
May 20, 2025 14:00
Materialized views (MV) are a core concept in databases. In streaming databases like ksqlDB and RisingWave, MVs are maintained through continuous incremental stream processing engines. Users can define cascading MVs, or more specifically, MVs on top of other MVs, to express complex stream processing logic. However, the management of cascading MVs can introduce substantial technical hurdles for the database system. To illustrate, consider the scenario where an MV within the stack is unable to promptly process events from its upstream sources. This not only results in immediate spikes in latency for downstream MVs but also creates backpressure, potentially causing a system crash. Additionally, if an MV experiences a crash, it can trigger a pause in the entire MV stack's processing. Overcoming these challenges to recover the MV and its downstream MVs while preserving data consistency is a formidable task. In this presentation, I will begin by exploring the critical considerations when it comes to maintaining cascading materialized views: namely, consistency, elasticity, and fault tolerance. Subsequently, I will delve into the potential advantages and disadvantages of various approaches, along with strategies for efficient logging and checkpointing to minimize system downtime. Finally, I will share insights gained from our experiences in managing hundreds of cascading materialized views in real-world production environments.
Breakout Session
May 20, 2025 14:00
Kafka is fast, but lag is everywhere. Data falls behind, consumers can’t keep up, and alerts keep firing. The usual reaction? Blame Kafka. The real issue? Kafka does exactly what it’s built to do: decouple producers and consumers. Lag isn’t a bug, it’s a side effect. Tracking offsets won’t save you. The real problem is time lag: the gap between when data is produced and when it’s actually processed. Consumer rebalances, inefficient commits, slow APIs, and bad scaling decisions all make it worse. Little’s Law predicts when lag will spiral, but most teams ignore it. This talk breaks down what’s really happening when Kafka "falls behind", why, and what you can do about it. Batching, commit strategies, parallel consumption, dropping messages, many options are available. Start controlling lag before it controls you.
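For readers unfamiliar with the Little's Law argument the abstract alludes to, here is the gist (symbols and numbers are our illustration, not the speaker's):

```latex
\[
  L \;=\; \lambda\, W
  \qquad
  \text{lag (records)} \;=\; \text{arrival rate} \times \text{time lag}
\]
```

For example, if producers write 10,000 records/s and a consumer group sustains only 8,000 records/s, the backlog grows by 2,000 records/s; after one hour that is 7,200,000 records, which takes 7,200,000 / 8,000 = 900 s, i.e. 15 minutes of time lag to work off even if the producers stop entirely.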
Breakout Session
May 20, 2025 14:00
Detecting problems as they happen is essential in today’s fast-moving world. This talk shows how to build a simple, powerful system for real-time anomaly detection. We’ll use Apache Kafka for streaming data, Apache Flink for processing it, and AI to find unusual patterns. Whether it’s spotting fraud, monitoring systems, or tracking IoT devices, this solution is flexible and reliable. First, we’ll explain how Kafka helps collect and manage fast-moving data. Then, we’ll show how Flink processes this data in real time to detect events as they happen. We’ll also explore how to add AI to the pipeline, using pre-trained models to find anomalies with high accuracy. Finally, we’ll look at how Apache Iceberg can store past data for analysis and model improvements. Combining real-time detection with historical data makes the system smarter and more effective over time. This talk includes clear examples and practical steps to help you build your own pipeline. It’s perfect for anyone who wants to learn how to use open-source tools to spot problems in real-time data streams.
Breakout Session
May 20, 2025 14:00
Ever wondered how OpenAI keeps Kafka running smoothly while scaling, upgrading, or replacing clusters? Join us for an inside look at the strategies and tools we use for seamless Kafka migrations at massive scale — without ever missing a message. We'll also explore best practices for Kafka consumers, patterns for high availability and disaster recovery, and lessons learned from real-world incidents and edge cases. Attendees will learn a new set of tools and tactics for making infrastructure changes safely and transparently. We'll cover applications to specific technologies including Apache Kafka, Apache Flink for stateful stream processing, Apache Spark (Structured Streaming) for streaming ELT, and Uber uForwarder as a platform for managed Kafka consumers.
Breakout Session
May 20, 2025 14:00
Autoscaling is an important part of modern cloud-native architecture. It allows applications to handle heavy load at peak times while helping to optimize costs and make deployments greener and more sustainable at the same time. Apache Kafka is well known for its scalability. It can grow with your project from a small cluster up to hundreds of brokers. But for a long time it was not very elastic, and dynamic autoscaling with it was very hard. This talk will guide attendees through the main challenges of auto-scaling Apache Kafka on Kubernetes. It will show how these challenges can be solved with the help of features added recently to the Strimzi and Apache Kafka projects, such as auto-rebalancing, node pools, and tiered storage. And it will help users get started with the auto-scaling of Apache Kafka.
Breakout Session
May 20, 2025 15:30
How can you leverage AI and LLMs in a regulated environment without overwhelming development teams with security overhead? At Alpian—a fast-moving Swiss digital bank—Kafka and event-driven architecture form the backbone of our cloud-native platform. This event-first approach has enabled us to scale tenfold with a lean, expert team, paving the way for a new generation of internal and client-facing LLM applications. We’ve found that RAG is essential for enhancing accuracy and extending prompt context in generative AI. Continuous integration of real-time data is key to delivering the most recent and relevant information, as demonstrated by our budget assistant—a conversational tool advising clients on financial transactions. However, as a bank we must adhere to strict regulations on data management, encryption, locality, and sensitive data access. Robust guarantees on what data is shared, where it is stored, and how it’s managed are critical—even if these requirements seem at odds with using foundational models. How do we push innovation while remaining compliant? In this talk, you’ll learn about:
- System Design & Architecture: How the Alpian platform leverages Kafka events for service communication and as the foundation for AI and machine learning models with built-in security and privacy.
- Data Regulation Compliance: How Alpian meets data regulations by using Schema Registry and field-level encryption via Confluent CSFLE, and how we integrated schema management and tagging rules directly into our CI/CD pipeline.
- Streaming RAG: How streaming is used to generate embeddings for the budget assistant, demonstrating that a central, secure event model can support LLM-based analytics and real-time AI without compromising data privacy or developer productivity.
This “secure by design” approach shows how addressing data sensitivity at the event level protects your entire architecture—from analytics to microservices and AI-driven platforms—while maintaining innovation and compliance.
Breakout Session
May 20, 2025 15:30
There’s a shift towards disaggregated architectures using object storage and open table formats. Cost efficiency, avoidance of vendor lock-in, standardization, and proper governance with a single source of truth are benefits of this new paradigm. However, there are also challenges. Most of our systems have been designed to work with physical disks, with their own optimization and debugging methods. Object storage works in a totally different way than physical disks and requires a new set of capabilities to minimize latency and decrease cloud costs. In this talk, Anton will share the lessons learned from moving data and systems from block storage to object storage. Using Apache Flink, a popular stream processing engine often used for data lake ingestion, as a case study, we’ll start with an overview of Iceberg and the FileIO pluggable module for reading, writing, and deleting files. We’ll continue with the journey of cost optimization with the Flink File Connector. Then, we'll delve into the creation of a custom Flink connector for object storage, addressing the limitations of the built-in File Connector. This custom connector uses techniques like metadata synchronization and optimized partitioning to reduce the number of requests without introducing additional latency. This talk is ideal for data engineers and architects who are building data lakes on object storage and using Apache Flink for data processing. You'll learn practical strategies and best practices for optimizing performance and cost in disaggregated architectures, including how to build custom Flink connectors tailored to object storage.
Breakout Session
May 20, 2025 15:30
Event streaming is great but sometimes it’s easier to use a queue, especially when parallel consumption is more important than ordering. Wouldn't it be great if you had the option of consuming your data in Apache Kafka just like a message queue? For workloads where each message is an independent work item, you’d really like to be able to run as many consumers as you need, cooperating to handle the load, and to acknowledge messages one at a time as the work is completed. You might even want to be able to retry specific messages. This is much easier to achieve using a queue rather than a topic with a consumer group. KIP-932 brings queuing semantics to Apache Kafka. It introduces the concept of share groups. Share groups let your applications consume data off regular Kafka topics with per-message acknowledgement and without worrying about balancing the number of partitions and consumers. With this KIP, you can bring your queuing workloads to Apache Kafka. Come and hear about this innovative new feature starting with Early Access in Apache Kafka 4.0, and then Preview in Apache Kafka 4.1.
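As a rough sketch of what consuming via a share group could look like, based on the client API proposed in KIP-932 (Early Access at the time of writing, so details may well change), the worker below acknowledges each record individually and releases failed ones for redelivery. Topic and group names are placeholders.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.AcknowledgeType;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaShareConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ShareGroupWorker {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "billing-workers"); // the share group
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        // KafkaShareConsumer is the KIP-932 client: no partition balancing to
        // worry about, any number of members can consume the same partitions.
        try (KafkaShareConsumer<String, String> consumer = new KafkaShareConsumer<>(props)) {
            consumer.subscribe(List.of("work-items"));
            while (true) {
                for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofSeconds(1))) {
                    try {
                        process(record.value());
                        consumer.acknowledge(record, AcknowledgeType.ACCEPT);
                    } catch (Exception e) {
                        // Make just this one record available for redelivery.
                        consumer.acknowledge(record, AcknowledgeType.RELEASE);
                    }
                }
                consumer.commitSync(); // deliver the acknowledgements
            }
        }
    }

    static void process(String item) { /* per-message work goes here */ }
}
```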
Breakout Session
May 20, 2025 15:30
Traditional monolithic applications are migrated to the cloud, typically using a microservice-like architecture. Although this migration leads to significant benefits such as scalability and development agility, it also leaves behind the transactional amenities, such as serializability, that database systems have provided developers for decades. Today’s transactional cloud applications forgo these database amenities, combining aspects of state management, service messaging, and service coordination in application logic. In this talk, I will present Styx, a novel open-source dataflow-based cloud application runtime that executes scalable, low-latency transactional applications. Cloud applications in Styx can be developed as Stateful Entities: simple objects that can form arbitrary stateful function orchestrations. The Styx runtime takes care of serializable state consistency, exactly-once processing, state and event partitioning, parallelization, and scaling. In this session, you will learn how Kafka, together with ideas from stateful stream processing and database transactions, can be combined to create transactional cloud application runtimes, bringing us back to the 80s: the time when developers did not have to deploy complex technology stacks, but rather author pure business logic and trust the database for the rest.
Breakout Session
May 20, 2025 15:30
Event streaming with Kafka Streams is powerful but can feel overwhelming to understand and implement. Breaking down advanced concepts into smaller single-purpose topologies makes learning more approachable. Kafka Streams concepts will be introduced with an interactive web application that allows you to visualize input topics, output topics, changelog topics, state stores, and more. What happens when state store caching is disabled? What if topology optimization is enabled? Or what if stream time isn't advanced? These questions will be easily explored by visualizing the topology and Kafka Streams configurations. This interactive tutorial's real-time events are generated by actual data on your laptop, including running processes, thread details, windows, services, and user sessions. Moving a window on your laptop can trigger many examples, allowing you to see how the topology handles them. The audience will select from an interactive poll of concepts to cover for the session, selecting from concepts on branching, emitting on change, windowing, repartitioning, joining, and more. Join me on this journey of learning Kafka Streams. You'll deepen your understanding of Kafka Streams concepts and gain access to tools that let you explore advanced concepts independently. All examples and visualization will be available in an open-source project.
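Even without the interactive web application, the internals the talk visualizes can be inspected directly. A minimal single-purpose topology like the sketch below (topic and store names are ours) reveals its state store, changelog topic, and any repartition topic via Topology#describe().

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.kstream.Produced;

public class TopologyPeek {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        // A tiny single-purpose topology: count events per key. The count is
        // backed by a state store with a changelog topic - exactly the kind of
        // internals the visualization in this talk makes visible.
        builder.stream("process-events", Consumed.with(Serdes.String(), Serdes.String()))
               .groupByKey(Grouped.with(Serdes.String(), Serdes.String()))
               .count(Materialized.as("event-counts"))
               .toStream()
               .to("event-counts-output", Produced.with(Serdes.String(), Serdes.Long()));

        Topology topology = builder.build();
        // Prints sources, processors, sinks, and state stores; the output can
        // be pasted into a topology visualizer to render the graph.
        System.out.println(topology.describe());
    }
}
```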
Breakout Session
May 20, 2025 15:30
Apache Flink is uniquely positioned to serve as the backbone for AI agents, enhancing them with stream processing as a new, powerful tool. We’ll explore how Flink jobs can be transformed into autonomous, goal-driven "Agents" that interact with data streams, trigger actions, and adapt in real time. We’ll showcase Flink jobs as AI agents through two key stream processing & AI use cases: 1) financial planning and detection of spending anomalies, and 2) demand forecasting and monitoring supply chains for disruptions. AI agents need business context. We’ll discuss embedding foundation models with schema registries and data catalogs for contextual intelligence while ensuring data governance and security. We’ll integrate Apache Kafka event streams with data lakes in open-table formats like Apache Iceberg, enabling AI agents to leverage real-time and historical data for consistency and reasoning. We’ll also cover latency optimization for time-sensitive use cases while preventing hallucinations. Finally, we’ll demonstrate an open-source conversational platform on Apache Kafka, where multiple AI agents are assigned to a business process and continuously process real-time events while optimizing for their individual goals, interacting, and negotiating with each other. By combining Flink and Kafka, we can build systems that are not just reactive but proactive and predictive, paving the way for next-generation agentic AI.
Breakout Session
May 20, 2025 16:30
This talk presents a performance-tuned Apache Kafka pipeline for generating embeddings on large-scale text data streams. To store embeddings, our implementation supports various vector databases, making it highly adaptable to many applications. Text embeddings are fundamental for semantic search and recommendation, representing text in high-dimensional vector spaces for efficient similarity search using approximate k-nearest neighbors (kNN). By storing these embeddings and providing semantic search results given a query, vector databases are central to retrieval-augmented generation systems. We present our Kafka pipeline for continuously embedding texts to enable semantic search on live data. We demonstrate its end-to-end implementation while addressing key technical challenges:
- First, the pipeline performs text chunking to adhere to the maximum input sequence length of the embedding model. We use an optimized overlapping text chunking strategy to ensure that context is maintained across chunks.
- Using HuggingFace’s Text Embeddings Inference (TEI) toolkit in a lightweight, containerized GPU environment, we achieve efficient, scalable text embedding computation. TEI supports a wide range of state-of-the-art embedding models.
- As an alternative to relying on Kafka Streams, our solution implements optimized processing of small batches using Kafka consumer and producer client APIs, allowing batched API calls to TEI. Our benchmark results confirm this choice, indicating high efficiency with significantly improved throughput and reduced latency compared to other approaches.
- Finally, Kafka Connect allows real-time ingestion into vector databases like Qdrant, Milvus, or Vespa, making embeddings instantly available for semantic search and recommendation.
With Kafka’s high-throughput streaming, optimized interactions with GPU-accelerated TEI, and efficient vector serialization, our pipeline achieves scalable embedding computation and ingestion into vector databases.
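The batched consumer/producer approach described above can be sketched roughly as follows. The embedBatch helper stands in for one batched HTTP call to a TEI endpoint; topic names, batch size, and embedding dimensionality are illustrative assumptions, not the speakers' actual values.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class EmbeddingPipelineSketch {
    public static void main(String[] args) {
        Properties cons = new Properties();
        cons.put("bootstrap.servers", "localhost:9092");
        cons.put("group.id", "embedder");
        cons.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        cons.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        cons.put("enable.auto.commit", "false");
        cons.put("max.poll.records", "32"); // cap the micro-batch size

        Properties prod = new Properties();
        prod.put("bootstrap.servers", "localhost:9092");
        prod.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        prod.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(cons);
             KafkaProducer<String, String> producer = new KafkaProducer<>(prod)) {
            consumer.subscribe(List.of("text-chunks"));
            while (true) {
                var records = consumer.poll(Duration.ofMillis(200));
                if (records.isEmpty()) continue;

                List<String> keys = new ArrayList<>();
                List<String> batch = new ArrayList<>();
                for (ConsumerRecord<String, String> r : records) {
                    keys.add(r.key());
                    batch.add(r.value());
                }
                // One batched call to the embedding service per poll.
                List<float[]> vectors = embedBatch(batch);
                for (int i = 0; i < vectors.size(); i++) {
                    producer.send(new ProducerRecord<>("embeddings", keys.get(i),
                            java.util.Arrays.toString(vectors.get(i))));
                }
                producer.flush();
                consumer.commitSync(); // commit only after embeddings are produced
            }
        }
    }

    // Placeholder for the TEI client call (e.g. POST /embed with a list of
    // inputs); returns dummy vectors here so the sketch is self-contained.
    static List<float[]> embedBatch(List<String> texts) {
        return texts.stream().map(t -> new float[384]).toList();
    }
}
```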
Breakout Session
May 20, 2025 16:30
Data streaming engineers need tooling to efficiently provision, maintain, and evolve the data stream platform. The Confluent Terraform Provider does just that, providing human-readable infrastructure-as-code to build a Confluent Cloud environment in a matter of minutes. In this session, we’ll start from a blank canvas and create a new environment - complete with an Apache Kafka® cluster, stream governance, and processing with Flink. Next we’ll create Kafka topics, define data contracts, and determine how to transform our input data. We won’t forget about security and access controls - so let’s create service accounts with the necessary roles and permissions. Finally, we’ll set it all in motion by streaming events into Kafka and querying the output of our new data pipeline. When we’re done, you’ll have the tools needed to build and maintain your data streaming platform. Let’s do this!
Breakout Session
May 20, 2025 16:30
Curious about how OpenAI leverages Apache Flink for real-time data processing? In this session, we will dive into the technical intricacies of building the Flink platform at OpenAI. We’ll walk you through our Flink infrastructure setup—including deployment strategies, integration with Kafka, and our multi-region architecture. Additionally, we’ll explore how we’ve enhanced PyFlink to operate effectively at our scale. Finally, we’ll discuss the challenges we face, share our strategies for overcoming them, and outline our future roadmap.
Breakout Session
May 20, 2025 16:30
Authenticating users is crucial in every production Kafka deployment. Apache Kafka ships with diverse authentication options, including password-based SASL mechanisms and mTLS. As computing workloads adopt identities in the form of short-lived X.509 certificates, using them for mTLS offers significant advantages over passwords, as they limit the impact of a credential leak and cannot be brute-forced. This talk starts by looking into how authentication works in Kafka and the different configurations to customise it. We'll cover challenges faced when migrating users to mTLS and review options to minimise the operational effort. Then, we will share an approach that adds support for mTLS on the SASL listener so users can continue using their existing KafkaPrincipal and fall back to passwords seamlessly during the migration, giving cluster administrators and users confidence before moving away from SASL. Finally, we will talk about how enabling Kafka brokers to serve distinct server and client certificates supports the adoption of mTLS for inter-broker communication, and the learnings and pitfalls of rolling this out across the fleet.
Breakout Session
May 20, 2025 16:30
Ingesting data from Apache Kafka into Apache Iceberg presents a recurring challenge in modern ETL workflows. The conventional approach relies on connectors, yet this method introduces operational hurdles due to the fundamental differences between these systems. Kafka excels at real-time streaming workloads, while Iceberg is optimized for analytical data storage and batch ingestion. Bridging these paradigms creates several inefficiencies:
1. Batch Operations on Streaming Storage: Attempting batch operations on Kafka, a system designed for streaming, results in ingestion bottlenecks and increased strain on Kafka brokers. One example is initial table hydration, where historical data retrieval often means uncached reads. This significantly delays topic-to-table hydration, impacting broker performance and straining resources in latency-sensitive environments.
2. Streaming Operations on Batch Storage: Applying streaming-like ingestion patterns to Iceberg generates numerous small Parquet files. These files pollute Iceberg’s metadata, degrade query performance, and increase the need for maintenance operations.
3. Lack of Unified Table Maintenance: Aggressive creation of small files containing updates will conflict with maintenance operations running in the background, leading to wasteful retries.
In this talk, Alex will share insights and lessons learned from building Tableflow, a unified batch/streaming storage system that allowed us to address all three. He will talk about specific solutions implemented in the Kora storage engine that mitigate these issues, making both systems work cohesively. Attendees will gain actionable knowledge on overcoming operational challenges, implementing innovative solutions, and designing scalable pipelines that maximize the potential of both Kafka and Iceberg.
Breakout Session
May 20, 2025 17:30
Prometheus has become the go-to solution for monitoring and alerting, ingesting metrics from applications and infrastructure. The ability to efficiently store high volumes of dimensional time series also makes Prometheus a perfect fit for broader operational analytics use cases. Examples include observing fleets of IoT devices, connected vehicles, media streaming devices, and any distributed resources. However, the high cardinality and frequency of events generated by these sources can be challenging. Apache Flink can preprocess observability events in real time before writing to Prometheus. Reducing cardinality or frequency can improve the efficiency of your observability platform. Adding contextual information and calculating derived metrics enables deeper operational analysis in real time. Observing Flink with Prometheus is a solved problem, using Flink Prometheus Exporters. The new Flink-Prometheus connector, a recent addition to the Apache Flink connector family, addresses a different challenge. It enables using Flink to preprocess large volumes of observability data from various sources and write directly to Prometheus at scale. Kafka completes this architecture by providing reliable stream storage, ensuring ordered delivery of high-volume raw metrics into Flink—critical for maintaining Prometheus time series integrity. In this talk, an Apache Flink committer and the maintainer of the new Flink-Prometheus connector will explore real-world use cases, key challenges, and best practices for leveraging Flink and Prometheus together to supercharge your observability platform.
Breakout Session
May 20, 2025 17:30
In this session, we will delve into the practical boundaries of the Kafka Streams DSL and showcase why the Processor API stands out as the ultimate tool for addressing complex streaming scenarios. Using Michelin’s tire delivery process as a guiding example, we will illustrate how the different join types can be implemented with the DSL and where its limitations begin to emerge. The challenges of joining events from multiple topics, whether driven by event-based or time-based logic, and achieving fine-grained control over state stores led us to embrace the Processor API. While the DSL is convenient and expressive for many use cases, the Processor API consistently proves to be the most powerful solution for real-world applications requiring precision and flexibility. Whether you’re an architect, developer, or Kafka enthusiast, this session will equip you with actionable insights into designing custom state stores, optimizing for low latency, and implementing adaptable join logic to meet evolving business needs. Rather than advocating for abandoning the DSL entirely, the session highlights the importance of recognizing its limitations and understanding why the Processor API is often worth the additional effort.
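To illustrate the kind of event-driven join the DSL struggles to express, here is a simplified Processor API sketch (ours, not Michelin's): it buffers an event in an explicitly managed state store until the partner event with the same key arrives, then emits the joined pair and cleans up, with no time window involved.

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.processor.api.Processor;
import org.apache.kafka.streams.processor.api.ProcessorContext;
import org.apache.kafka.streams.processor.api.Record;
import org.apache.kafka.streams.state.KeyValueStore;
import org.apache.kafka.streams.state.StoreBuilder;
import org.apache.kafka.streams.state.Stores;

public class WaitingJoinSketch {

    // Event-driven join: hold each event until its counterpart arrives,
    // with full control over what is stored and when it is deleted.
    static class PairJoiner implements Processor<String, String, String, String> {
        private ProcessorContext<String, String> context;
        private KeyValueStore<String, String> pending;

        @Override
        public void init(ProcessorContext<String, String> context) {
            this.context = context;
            this.pending = context.getStateStore("pending-events");
        }

        @Override
        public void process(Record<String, String> record) {
            String other = pending.get(record.key());
            if (other == null) {
                pending.put(record.key(), record.value()); // wait for the partner
            } else {
                pending.delete(record.key()); // explicit state cleanup
                context.forward(record.withValue(other + "|" + record.value()));
            }
        }
    }

    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();
        StoreBuilder<KeyValueStore<String, String>> store =
            Stores.keyValueStoreBuilder(Stores.persistentKeyValueStore("pending-events"),
                                        Serdes.String(), Serdes.String());
        builder.addStateStore(store);
        builder.stream("orders-and-shipments",
                       Consumed.with(Serdes.String(), Serdes.String()))
               .process(PairJoiner::new, "pending-events")
               .to("matched-deliveries", Produced.with(Serdes.String(), Serdes.String()));
        System.out.println(builder.build().describe());
    }
}
```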
Breakout Session
May 20, 2025 17:30
Let’s be honest: who wants to have more than one client to connect to a data system? Now consider Apache Kafka. It ships with four different Java clients: producer, consumer, admin, and streams. Want to create a topic in a producer application? Use the admin client and the producer client. Want to produce and consume? Either use the producer and the consumer, or use Kafka Streams. So how did we get here? And more importantly: how can we simplify it? Are incremental improvements enough? In this talk, we’ll propose a radical approach: a single unified Java client built from scratch for producing, consuming, processing, and administration tasks. We take you on a brainstorming session about what we can and cannot do, and what we want to achieve. How can we make simple things easy and difficult things possible? What does a modern Java API look like, using the standard library, a reasonable threading model, lambdas, and futures for async calls? We think it's high time that we take another look at the Java clients and build a client ready for the next decade. Come and join the conversation about the future of Kafka clients.
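In the spirit of the talk's brainstorming, one might imagine a unified facade roughly like the following. To be clear, this is a purely hypothetical sketch; none of these types exist in Apache Kafka today.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Purely hypothetical API sketch, in the spirit of the talk's thought
// experiment - none of these types exist in Apache Kafka.
interface KafkaClient extends AutoCloseable {

    static KafkaClient connect(String bootstrapServers) {
        throw new UnsupportedOperationException("thought experiment only");
    }

    // Admin, producing, and consuming behind one facade, async by default.
    CompletableFuture<Void> createTopic(String name, int partitions, short replicas);

    CompletableFuture<Void> send(String topic, byte[] key, byte[] value);

    // A subscription exposed as a standard Java Flow publisher.
    java.util.concurrent.Flow.Publisher<byte[]> subscribe(List<String> topics);
}

class Demo {
    public static void main(String[] args) throws Exception {
        // How usage might read if simple things were easy:
        try (KafkaClient kafka = KafkaClient.connect("localhost:9092")) {
            kafka.createTopic("greetings", 3, (short) 1)
                 .thenCompose(v -> kafka.send("greetings", null, "hello".getBytes()))
                 .join();
        }
    }
}
```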
Breakout Session
May 20, 2025 17:30
Pinterest’s rule engine platform, known as Guardian, allows Subject Matter Experts (SMEs) to analyze real-time event streams for patterns of abuse and create rules to block those patterns. Guardian addresses various domain-specific challenges, including spam/fraud enforcement, Media Rating Council (MRC) compliance, account takeover (ATO) attacks, risk monitoring, and unsafe content enforcement fanout. However, the legacy Guardian platform was built on a monolithic architecture and is unable to keep up with the data scale and the increasing demands and risks faced by stakeholders. To tackle these challenges, we redesigned next-gen Guardian with an event-driven architecture, choosing Flink SQL for scalable event processing and integrating with various data storage systems like Kafka, StarRocks, Iceberg, and an internal KV store that cater to specific data access requirements. In this talk, we would like to share the design and learnings from building the new system. Specifically, we’ll focus on how Flink SQL interacts with different storage systems and how Flink SQL is leveraged to support asynchronous data processing needs, including stream splitting & pruning, data ingestion, rule enforcement, and rewind & replay. Our revamped architecture has yielded significant improvements in scalability, efficiency, development velocity, and data compliance. Additionally, we will touch on some ongoing efforts around safe schema evolution, which has become more challenging under the event-driven design with the various storage systems and Flink SQL introduced.
Breakout Session
May 20, 2025 17:30
In this session, we will share our journey of modernizing a 40-year-old mainframe legacy system at BEC Financial Technologies, a financial tech provider of a core banking platform for over 20 banks in Denmark. We will discuss how we leveraged Kafka to enable real-time data streaming from our mainframe to the new Salesforce platform, creating new business opportunities. Our presentation will cover the transition from traditional end-of-day batch processes to real-time data synchronization, highlighting the challenges and solutions we encountered. We will delve into the importance of DevOps in managing Kafka topics and the implementation of a Kappa architecture to handle both massive spikes and usual real-time data volumes. Key patterns such as event-carried state transfer, compacted topics, and change data capture (CDC) will be explored, along with our data reconciliation mechanisms to ensure consistency between DB2 and Kafka. We will also share lessons learned from our experience, including mistakes to avoid, such as relying on centralized components for data transformation and not using a schema registry. Additionally, we will discuss the benefits of using Kafka for both online events and batch jobs, and the considerations for deciding between bulk and REST at runtime. Furthermore, we will walk through the overall architecture design and some critical design decisions made during the implementation. This talk is ideal for architects, data engineers, and developers looking to modernize their legacy systems and integrate real-time data streaming into their platforms. Join us to learn how BEC Financial Technologies and its subsidiary Scoutz are transforming the banking industry with innovative data streaming solutions.
Breakout Session
May 20, 2025 17:30
How do you make 10TB of data per hour accessible, scalable, and easy to integrate for multiple internal consumers? In this talk, we’ll share how we overcame storage throughput limitations by migrating to Kafka Streams and developing a unified template application. Our solution not only eliminated bottlenecks but also empowered internal clients to build reliable Kafka Streams applications in just a few clicks—focusing solely on business logic without worrying about infrastructure complexity. We’ll dive into our architecture, implementation strategies, and key optimizations, covering performance tuning, monitoring, and how our approach accelerates adoption across teams. Whether you're managing massive data pipelines or seeking to streamline access for diverse stakeholders, this session will provide practical insights into leveraging Kafka Streams for seamless, scalable data flow.
Breakout Session
May 20, 2025 17:30
Have you ever wondered what happens under the hood when your Kafka client talks to the broker? In this session, we’ll take a deep dive into the Kafka wire protocol - the low-level language that powers communication between Kafka components. We’ll break it down step by step to make it easy to understand. You’ll see how requests and responses are structured and get a clear picture of how everything fits together. To make it even more concrete, we’ll look at code examples that show how to build a Kafka request byte by byte. By the end of this session, you’ll have a solid grasp of the Kafka wire protocol, giving you the tools to create your own Kafka client - if you wish!
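For a preview of what building a request byte by byte involves, here is a minimal sketch (not the speakers' code) that frames an ApiVersions v0 request the way the protocol prescribes: a length prefix, then api_key, api_version, correlation_id, and client_id; the v0 body is empty, so the frame is header-only:

    import java.nio.ByteBuffer;
    import java.nio.charset.StandardCharsets;

    public class WireProtocolDemo {
        // api_key 18 = ApiVersions; ApiVersions v0 uses request header v1.
        public static byte[] apiVersionsRequest(int correlationId, String clientId) {
            byte[] id = clientId.getBytes(StandardCharsets.UTF_8);
            int size = 2 + 2 + 4 + 2 + id.length;  // api_key, api_version, correlation_id, client_id
            ByteBuffer buf = ByteBuffer.allocate(4 + size);
            buf.putInt(size);                      // length prefix (excludes itself)
            buf.putShort((short) 18);              // api_key: ApiVersions
            buf.putShort((short) 0);               // api_version
            buf.putInt(correlationId);             // echoed back in the response
            buf.putShort((short) id.length);       // client_id as a nullable STRING
            buf.put(id);
            return buf.array();
        }
    }

Writing those bytes to a plain TCP socket on a broker's listener port is enough to get a real response back, which is the whole point of the exercise.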
Breakout Session
May 21, 2025 9:00
Artificial Intelligence thrives on data—especially timely data. In this talk, we’ll explore how to integrate event-driven architectures with popular AI/ML frameworks to unlock real-time intelligence. We’ll dive into the nuts and bolts of constructing a continuous data pipeline using open-source technologies like Kafka Streams, Apache Flink, and popular AI libraries such as TensorFlow or PyTorch. We’ll walk through end-to-end examples: from data ingestion, cleaning, and feature extraction, to model inference in near-real time. You’ll discover how to optimize model performance under streaming conditions, employing sliding windows and advanced time-series techniques. Additionally, we’ll address operational challenges such as model updates in production, handling concept drift, and balancing compute resources with streaming throughput demands. Attendees will leave with a blueprint for setting up an event-driven AI pipeline, armed with concrete tips on choosing the right open-source frameworks, monitoring streaming model performance, and orchestrating seamless model deployments. If you’ve ever wondered how to blend AI with real-time event processing to deliver actionable insights the moment they matter, this session is for you.
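To make the windowing concrete, here is a minimal Flink sketch (key, values, and the scoring stub are hypothetical; a real job would call out to a served TensorFlow or PyTorch model) that refreshes a per-key feature every minute over the last ten minutes of events:

    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.windowing.assigners.SlidingProcessingTimeWindows;
    import org.apache.flink.streaming.api.windowing.time.Time;

    public class StreamingInferenceJob {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            DataStream<Tuple2<String, Double>> events = env.fromElements(
                    Tuple2.of("user-1", 3.2), Tuple2.of("user-1", 4.1));
            events.keyBy(t -> t.f0)
                  // a feature refreshed every minute over a 10-minute sliding window
                  .window(SlidingProcessingTimeWindows.of(Time.minutes(10), Time.minutes(1)))
                  .reduce((a, b) -> Tuple2.of(a.f0, a.f1 + b.f1))
                  .map(f -> f.f0 + " score=" + score(f.f1))   // near-real-time inference
                  .print();
            env.execute("streaming-inference");
        }
        // stand-in for a model call; here, a logistic score over the aggregated feature
        static double score(double feature) { return 1.0 / (1.0 + Math.exp(-feature)); }
    }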
Breakout Session
May 21, 2025 9:00
At Pinterest, counters are at the core of feature engineering, enabling teams to uncover event patterns and transform discoveries into actionable features. Our journey to build a robust counter framework surfaced several distinctive challenges: 1. The demand for a scalable architecture capable of managing hundreds of counters. 2. The ability to explore multiple window sizes from a minute to a week for the same counter with frequent updates to gain richer and faster insights. 3. The continual onboarding of new counters to stay ahead of emerging trends. In this session, we will delve into how we tackled these challenges by building a scalable and efficient real-time event counter framework with Apache Kafka, Apache Flink and a wide-column store. Our approach involves a two-stage data processing layer: - Stage 1: Flink jobs read event streams, apply filtering, enrich them with metadata outlining aggregation logic, and write intermediate records to Kafka. The stateless FlinkSQL queries, dynamically generated from user-supplied SQL scripts, ensure seamless addition and swift deployment of new counters. - Stage 2: A stateful Flink job consumes intermediate records, computes counter results and writes them to a wide-column store for online serving. To facilitate multiple window sizes with frequent updates, we leveraged a chain-of-window technique to efficiently cascade aggregated results from smaller to larger windows, thereby minimizing redundant computations and reducing data shuffling. We group counter results to emit multiple records in a single write. To avert write traffic surges as windows close, a custom rate limiter intelligently spreads out writes over time. These optimizations efficiently reduce write requests and avoid traffic spikes to the wide-column store, thus lowering costs and improving stability of the overall system. Attendees will gain insights into Flink’s SQL and windowing functionalities for scalable stream processing in real-world applications.
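The chain-of-window technique can be sketched in a few lines of the DataStream API (illustrative only; Pinterest's production path uses FlinkSQL and writes to a wide-column store). The hourly counter consumes one-minute partials instead of raw events, so each event is aggregated once:

    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
    import org.apache.flink.streaming.api.windowing.time.Time;

    public class ChainOfWindows {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            DataStream<Tuple2<String, Long>> events = env.fromElements(Tuple2.of("pin-1", 1L));

            // Stage 1: small windows do the per-event counting once
            DataStream<Tuple2<String, Long>> perMinute = events
                    .keyBy(t -> t.f0)
                    .window(TumblingProcessingTimeWindows.of(Time.minutes(1)))
                    .reduce((a, b) -> Tuple2.of(a.f0, a.f1 + b.f1));

            // Stage 2: larger windows re-aggregate the one-minute partials,
            // minimizing redundant computation and data shuffling
            perMinute.keyBy(t -> t.f0)
                     .window(TumblingProcessingTimeWindows.of(Time.hours(1)))
                     .reduce((a, b) -> Tuple2.of(a.f0, a.f1 + b.f1))
                     .print();

            env.execute("chain-of-windows");
        }
    }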
Breakout Session
May 21, 2025 9:00
A major concern when starting with Kafka Streams is how to handle (un)expected errors. Generally, you want to track these errors, identify the records that caused the failures, and possibly reprocess them. To achieve this, you often need to implement a custom try-catch mechanism and send these errors to a dedicated topic. Does this challenge sound familiar? Welcome aboard! At Michelin, we face it too. For our own needs, we embedded this kind of error-handling mechanism in a home-made solution, but that solution has its limitations. Thus, we proposed two Kafka Improvement Proposals to enhance the Kafka Streams exception-handling experience. KIP-1033 introduces a new processing exception handler, complementing the existing deserialization and production exception handlers. Now, any exceptions that occur during processing are caught and transmitted to the handler, allowing you to define your error-handling logic. Complementary to this, KIP-1034 adds native support for routing failed records to a dead-letter queue topic of your choice. By the end of this talk, you will walk away with the latest updates these KIPs bring, helping you build Kafka Streams applications that are more robust against processing errors, with less effort.
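Until those KIPs land in your Kafka Streams version, the custom try-catch mechanism described above usually looks like this minimal DSL sketch (topic names and the Result wrapper are hypothetical; the branch stays within one task, so no extra serde is needed for the wrapper):

    import java.util.Map;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.kstream.Branched;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.kstream.Named;

    public class DlqTopology {
        record Result(String value, String error) {
            boolean failed() { return error != null; }
        }

        public static void build(StreamsBuilder builder) {
            KStream<String, String> input = builder.stream("orders");
            KStream<String, Result> attempted = input.mapValues(v -> {
                try {
                    return new Result(transform(v), null);
                } catch (Exception e) {
                    return new Result(v, e.getMessage());   // keep the failing record
                }
            });
            Map<String, KStream<String, Result>> routes = attempted
                    .split(Named.as("route-"))
                    .branch((k, v) -> v.failed(), Branched.as("dlq"))
                    .defaultBranch(Branched.as("ok"));
            routes.get("route-dlq").mapValues(r -> r.error() + ": " + r.value()).to("orders-dlq");
            routes.get("route-ok").mapValues(Result::value).to("orders-processed");
        }

        static String transform(String v) { return v.toUpperCase(); }   // business logic stub
    }

KIP-1033 and KIP-1034 aim to make most of this boilerplate unnecessary.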
Breakout Session
May 21, 2025 9:00
When building an event-driven architecture, teams often discuss exactly-once delivery and idempotency as if they were interchangeable concepts. This misunderstanding can lead to unnecessary complexity, increased operational overhead, and, in some cases, unreliable systems. In this talk, I will share a real-world case study from a project where our team fell into this trap. Initially, we assumed that enabling exactly-once semantics in Kafka would solve all our deduplication problems. However, as the system evolved, we realized that this approach didn’t eliminate the need for idempotency at the application level. The result? A complex, hard-to-debug system with redundant safeguards that sometimes worked against each other. Attendees will learn: The key differences between exactly-once delivery and idempotency. Why assuming one implies the other can introduce unnecessary complexity. How our team untangled this confusion and simplified our architecture. Practical guidelines for designing robust, event-driven systems without over-engineering them. This talk is ideal for engineers and architects working with Kafka and event-driven systems who want to avoid common pitfalls and build more maintainable, scalable architectures.
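A compact way to see the distinction: the first half of this sketch is exactly-once delivery (pure configuration), the second is application-level idempotency (logic that survives replays). All names are hypothetical:

    import java.util.HashSet;
    import java.util.Properties;
    import java.util.Set;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.producer.ProducerConfig;

    public class IdempotencyVsEos {
        // Exactly-once *delivery* lives in configuration...
        static Properties producerProps() {
            Properties p = new Properties();
            p.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            p.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");
            p.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "payments-processor-1");
            return p;
        }

        // ...but *idempotency* is a property of the handler: replaying the
        // same business event must not apply its effect twice.
        private final Set<String> applied = new HashSet<>();  // stand-in for a persistent store

        void handle(ConsumerRecord<String, String> record) {
            String paymentId = record.key();       // a business identifier, not an offset
            if (!applied.add(paymentId)) {
                return;                            // duplicate: effect already applied
            }
            chargeCustomer(record.value());
        }

        void chargeCustomer(String payload) { /* side effect on an external system */ }
    }

Kafka's transactions protect the pipeline between topics; they do nothing for side effects outside Kafka, which is where the second half earns its keep.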
Breakout Session
May 21, 2025 9:00
To offer its customers state-of-the-art digital services, Daimler Truck manages anonymized data from more than 12,000 connected buses operating in Europe using the CTP, an installed piece of technology that streams telemetry data (such as vehicle speed, GPS position, acceleration values, and braking force). The throughput going through the system is around 500k messages per second, with an average latency of around 5 seconds between the vehicle and the moment the data is available for consumption. Follow our three-year journey of developing self-managed, stateful Apache Flink applications on top of a treasure trove of near-real-time data, with the ultimate goal of delivering business-critical products like Driver Performance Analysis, Geofencing, EV Battery Health and Signal Visualization. Starting with a team completely new to Flink, we learned through trial, error, and iteration—eventually building a modern, resilient data processing setup. In this session, we'll share our victories, setbacks, and key lessons learned, focusing on practical tips for managing self-hosted Flink clusters. Topics will include working with Flink operators, understanding load distributions, scaling pipelines, and achieving operational reliability. We'll also delve into the mindset shifts required to succeed in building robust, real-time data systems. Whether you're new to Flink, transitioning from batch to streaming, or scaling existing pipelines, this talk offers actionable insights to help you architect, deploy, and optimize your self-managed Flink environment with confidence.
May 21, 2025 9:00
Ever since Apache Kafka spearheaded the real-time revolution, there has been a real-time vs batch divide in the data engineering community. The tools, architectures, and mindsets were so different that most people worked with one or the other, and companies had to effectively maintain two data engineering teams to meet their data processing needs. But the rise of Apache Iceberg is bringing a dramatic shift in the data landscape. We have batch data powerhouses like Snowflake and Databricks racing to adopt Iceberg support, followed by streaming tools like Apache Flink, and Confluent, arguably the leader in real-time data, adopting Iceberg with its Tableflow product. Now, real-time databases like Apache Druid are integrating Iceberg as well, so that we can query both our real-time and batch data with a single tool, often in a single query. I believe we really are seeing a revolution in data engineering. In this session, we’ll take a look at three key players in this data revolution: Kafka, Druid, and Iceberg. We’ll start with a brief introduction to each tool, and then we’ll see some examples of architectures that allow us to get the most value from our data regardless of how old it is. Finally, we’ll talk about where this might be heading and how we, as data engineers, can thrive in this brave new world. It is my hope that you’ll leave this session with an understanding of some key tools, architectural patterns, and ways of looking at data that will equip you to deliver the quality data your organization needs more efficiently.
Breakout Session
May 21, 2025 9:00
Hybrid was once thought to be a temporary state, but more and more organizations are finding that maintaining on-prem Kafka alongside a cloud deployment may last years or even forever. Running disparate deployments means dealing with the inherent differences between on-prem and cloud Kafka. Whether you are using a service provider or maintaining your own, there are important items to tackle for long term success. In this talk we will cover the most important strategies to ensure a successful hybrid deployment such as: * Entitlement: How to manage and unify AUTHN and AUTHZ * Data availability: Patterns for data migration and continual sync between on-prem and cloud * One onboarding to rule them all: Altering your existing control plane to accommodate hybrid * Monitoring: Creating a standard for your entire Kafka estate At the end of this talk you will understand the critical aspects that need to be addressed to cut through the confusion, and enjoy long term hybrid stability.
Breakout Session
May 21, 2025 10:00
Apache Iceberg is a robust foundation for large-scale data lakehouses, yet its incremental processing model lacks native support for CDC, making updates and deletes challenging. While many teams turn to Kafka and Flink for CDC processing, this comes with high infrastructure costs and operational complexity. We needed a cost-effective solution with minute-level latency that supports dozens of terabytes of CDC data processing per day. Since we were already using Flink for Iceberg ingestion, we set out to extend it for CDC processing as well. In this session, we’ll share how we tackled this challenge by writing change data streams as append tables and reading append tables as change streams. This approach makes Iceberg tables function like Kafka topics, with two added benefits: Iceberg tables remain directly queryable, making troubleshooting and application integration more approachable and streamlined. Similar to Kafka consumers, multiple engines can independently process Iceberg tables. However, unlike Kafka clusters, there is no need to scale infrastructure. We will also explore optimization opportunities with Iceberg and Flink, including when to materialize tables and how to choose between append and upsert modes to enhance integration. If you’re working on data processing over Iceberg, this session will provide practical, battle-tested strategies to overcome limitations and scale efficiently while keeping the infrastructure simple.
Breakout Session
May 21, 2025 10:00
The day starts with one problem: how to get content from a CMS to be reflected in multiple systems, especially our global search (knauf.com), “right away”! That’s easy if you can wait (welcome back to the 1920s). Today, milliseconds can mean the difference between a happy customer (in our case, editor) and one lost to frustration. Why is that impressive? Around 200 editors working and roughly 3.3 million connections a day! Here is where streaming helps us with (near) real-time data processing. Our system efficiently integrates Contentful CMS, Confluent Kafka, and Apache Flink to create a real-time data pipeline that captures, processes, and analyzes content updates with lightning-fast speed and precision.
Breakout Session
May 21, 2025 10:00
Attention, Data Streaming Engineers! In a world where speed is everything, choosing the proper stream processing framework is crucial. Want to supercharge your apps with real-time data processing? Should you opt for the streamlined Kafka Streams, a lightweight library for building streaming applications, or the feature-rich Apache Flink, a powerful and flexible stream processing framework? Viktor Gamov, a principal developer advocate at Confluent with extensive experience in stream processing, will walk you through the nuts and bolts of these two leading technologies. Through live coding and practical examples, we'll cover: • Mastering State Management: Discover how each framework handles stateful computations and pick up optimization tips. • Fault Tolerance in Practice: See how Kafka Streams and Flink keep your applications running smoothly, even when things go wrong. • Scalability Showdown: Find out which tool scales better under heavy loads and complex tasks. • Integration Insights: Learn how to seamlessly fit these frameworks into your existing setup to boost productivity. We'll explore scenarios showcasing each option’s strengths and weaknesses, giving you the tools to choose the best fit for your next project. Whether you're into microservices, event-driven systems, or big data streaming, this talk is packed with practical knowledge that you can immediately apply to your projects, improving performance and efficiency.
Breakout Session
May 21, 2025 10:00
Autonomous agents are reshaping enterprise operations, but scaling them isn’t just about smarter AI—it’s about better infrastructure. Agents need real-time data, seamless tool integration, and shared outputs across systems. Rigid request/response models create bottlenecks, while event-driven architecture (EDA) unlocks the flexibility and scalability agents require. This session will show how EDA enables autonomous agents to thrive. Key takeaways include: - How EDA enables real-time, adaptive agent workflows and multi-agent problem solving. - Key design patterns like Orchestrator-Worker, Multi-Agent Collaboration, and Market-Based Competition. - Strategies for leveraging Kafka to handle scalability, fault tolerance, and low latency. - Lessons from microservices evolution to solve interoperability and context-sharing challenges. This talk is for engineers and architects building scalable AI systems. You’ll leave with actionable insights to design resilient, event-driven agents and future-proof your infrastructure for enterprise-scale AI.
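The Orchestrator-Worker pattern above maps directly onto Kafka primitives. A minimal, hypothetical worker sketch (topic and group names invented): the consumer group is the worker pool, and results are published as events any other agent can subscribe to:

    import java.time.Duration;
    import java.util.List;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringDeserializer;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class AgentWorker {
        public static void main(String[] args) {
            Properties c = new Properties();
            c.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            c.put(ConsumerConfig.GROUP_ID_CONFIG, "research-agents");   // the worker pool
            c.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            c.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            Properties p = new Properties();
            p.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            p.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            p.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            try (KafkaConsumer<String, String> tasks = new KafkaConsumer<>(c);
                 KafkaProducer<String, String> results = new KafkaProducer<>(p)) {
                tasks.subscribe(List.of("agent-tasks"));
                while (true) {
                    // each worker picks up a share of the task partitions
                    for (var record : tasks.poll(Duration.ofSeconds(1))) {
                        String answer = runAgent(record.value());   // LLM / tool call goes here
                        results.send(new ProducerRecord<>("agent-results", record.key(), answer));
                    }
                }
            }
        }
        static String runAgent(String task) { return "result for: " + task; }
    }

Adding workers is then a consumer-group rebalance, not an architecture change, which is exactly the scalability property the talk argues for.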
Breakout Session
May 21, 2025 10:00
Kafka is the backbone of modern data streaming architectures, but understanding what’s happening inside your clients has long been a challenge. KIP-714 changes the game by introducing a standardized and extensible way to expose client metrics, making observability accessible to everyone—not just Kafka experts. In this talk, we’ll explore why KIP-714 is a must-have for non-trivial systems, how it seamlessly integrates with popular observability stacks like OpenTelemetry, and what it means for debugging, performance tuning, and SLA monitoring. With real-world examples and a live demo, you’ll see how easy it is to connect Kafka clients to your telemetry and logging pipelines, unlocking deep insights with minimal effort. Whether you’re an engineer, SRE, or architect, you’ll walk away with practical knowledge on leveraging KIP-714 to make your Kafka-powered systems more transparent, resilient, and debuggable. No prior Kafka internals knowledge required—just a desire to see your data streams with clarity!
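On the client side, KIP-714 is nearly configuration-free. A minimal sketch (note the assumptions: the broker must separately run a metrics reporter implementing the ClientTelemetry interface, and a metrics subscription must exist, e.g. created with the kafka-client-metrics.sh tool shipped since Kafka 3.7):

    import java.util.Properties;
    import org.apache.kafka.clients.producer.ProducerConfig;

    public class PushMetricsConfig {
        static Properties producerProps() {
            Properties p = new Properties();
            p.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            // on by default in recent clients; shown explicitly for clarity
            p.put("enable.metrics.push", "true");
            return p;
        }
    }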
Breakout Session
May 21, 2025 10:00
At Wix, our Feature Store processes billions of events every day to power data-driven experiences - from real-time personalizations to machine learning model inferences. Our initial, Apache Storm–based design struggled under massive event volumes, resulting in significant data loss and complex maintenance challenges that limited our ability to scale. In this session, we'll share how we re-architected our online feature store with Apache Flink. You'll learn about the limitations of our previous design, the challenges we faced, and the principles that guided our shift to a high-performance online feature store. We'll illustrate how we combined Apache Spark, Apache Kafka, Aerospike and Apache Flink to achieve high-throughput, low latency feature computations and seamless real-time updates to over 2,500 features, without data loss. Expect a direct, architecture focused session where we’ll compare our old and new designs, sharing the lessons learned along the way, without the philosophical debates.
Breakout Session
May 21, 2025 10:00
This is the story of a team on the verge of becoming a victim of its own success, facing massive adoption of a technology and the challenge of maintaining decent service quality while keeping the infrastructure stable and reliable. Implementing multi-tenancy in Kafka is not too complex when the number of use cases sharing the cluster is low. A central team can operate the infrastructure, taking care of the heavy lifting and creating required assets on demand. This is true until adoption starts growing and the solution becomes a problem. You are a bottleneck, and every service request piles up until an agent can resolve it, increasing resolution times and frustration at the same pace. The number of mistakes made when everything is done by hand is also very high, creating toil, unexpected side effects, and operational complexity. In this talk, we'll explain how we reversed the trend by implementing a non-opinionated, vendor-agnostic self-service solution, fully delegating to our stakeholders the responsibility for maintaining assets (topics, permissions, schemas, connectors) and reducing resolution times for these activities by several orders of magnitude, from days to seconds. All of this while keeping the balance between governance and autonomy. We'll also explain how we implemented a standards-based documentation model using AsyncAPI specs, enabling data discovery and reusability and reducing duplication. The main takeaways of the talk will be: * Technical architecture, architectural decisions and tradeoffs * Operational model of the solution * DSL specification * Rollout strategy to reach Globally Available state * SLAs and adoption KPIs
Lightning Talk
May 21, 2025 11:00
Managing large-scale Kafka clusters is both a technical challenge and an art. At Trendyol, our Data Streaming team operates Kafka as the backbone of a vast event-driven ecosystem, ensuring stability and seamless client experiences. However, we faced recurring issues during broker restarts—applications experienced connectivity errors due to misconfigured topics and improper bootstrap server configurations. To address this, we leveraged Confluent Stretch Kafka across multiple data centers, enabling automatic leader elections without service disruptions. Additionally, we enforced topic creation and alter policies and built a custom Prometheus exporter to detect misconfigured topics in real time, allowing us to notify owners and take corrective actions proactively. Through rigorous alerting mechanisms and enforcement via our Internal Development Platform (IDP), we have successfully eliminated disruptions during broker restarts, enabling smooth cluster upgrades and chaos testing. This session will provide practical insights into architecting resilient Kafka deployments, enforcing best practices, and ensuring high availability in a production environment handling thousands of clients. Attendees will learn: how multi-DC Kafka clusters ensure client continuity; the impact of misconfigured replication factors and how to prevent them; how real-time monitoring and alerts reduce operational risks; and practical strategies to enforce resilient topic configurations.
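A detector for risky topic configurations of the kind described can be sketched against the Admin API (illustrative only; a real exporter would publish Prometheus metrics instead of printing):

    import java.util.Map;
    import java.util.Properties;
    import java.util.Set;
    import org.apache.kafka.clients.admin.Admin;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.TopicDescription;

    public class TopicConfigAuditor {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            try (Admin admin = Admin.create(props)) {
                Set<String> names = admin.listTopics().names().get();
                Map<String, TopicDescription> topics =
                        admin.describeTopics(names).allTopicNames().get();
                topics.values().forEach(t -> {
                    int rf = t.partitions().get(0).replicas().size();
                    if (rf < 3) {   // under-replicated by policy: broker restarts will hurt
                        System.out.printf("ALERT: %s has replication factor %d%n", t.name(), rf);
                    }
                });
            }
        }
    }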
Lightning Talk
May 21, 2025 11:00
Retrieval-Augmented Generation (RAG) has become a foundational paradigm that augments the capabilities of language models—small or large—by attaching information stored in vector databases to provide grounding data. While the concept is straightforward, maintaining up-to-date embeddings as data constantly evolves across various source systems remains a persistent challenge. This lightning talk explores how to build a real-time vector ingestion pipeline on top of Apache Flink and its extensive connector ecosystem to seamlessly keep vector stores fresh at all times. To eliminate the need for custom code while still preserving a reasonable level of configurability, a handful of composable user-defined functions (UDFs) are discussed to address loading, parsing, chunking, and embedding of data directly from within Flink's Table API or Flink SQL jobs. Easy-to-follow examples demonstrate how the discussed approach helps to significantly lower the entry barrier for RAG adoption, ensuring that retrieval remains consistent with your latest knowledge.
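A chunking UDF of the kind discussed might look like the following minimal sketch (the name, signature, and parameters are illustrative assumptions, not the speaker's library). It emits fixed-size character windows with overlap:

    import java.util.ArrayList;
    import java.util.List;
    import org.apache.flink.table.annotation.DataTypeHint;
    import org.apache.flink.table.functions.ScalarFunction;

    public class ChunkFunction extends ScalarFunction {
        public @DataTypeHint("ARRAY<STRING>") String[] eval(String text, int size, int overlap) {
            List<String> chunks = new ArrayList<>();
            int step = Math.max(1, size - overlap);   // stride between window starts
            for (int start = 0; start < text.length(); start += step) {
                chunks.add(text.substring(start, Math.min(text.length(), start + size)));
                if (start + size >= text.length()) break;   // final window reached the end
            }
            return chunks.toArray(new String[0]);
        }
    }

Registered via createTemporarySystemFunction("CHUNK", ChunkFunction.class), it composes in Flink SQL, e.g. SELECT CHUNK(body, 800, 100) FROM docs, with parsing and embedding UDFs chained the same way.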
Lightning Talk
May 21, 2025 11:00
This talk on partitions and data performance delves into the significant changes introduced in Apache Kafka with KRaft mode, which stands for Kafka Raft Metadata mode. Traditionally, Apache Kafka, a popular distributed event streaming platform, has relied on Apache ZooKeeper for managing and coordinating Kafka brokers and clusters. However, the dependency on ZooKeeper posed several limitations and complexities, particularly in the areas of scalability, operational simplicity, and performance. In an ambitious move to address these challenges, the Kafka community developed KRaft mode, essentially removing the dependency on ZooKeeper. We will discuss how KRaft mode simplifies the architecture by integrating metadata management directly into Kafka, thereby making the system more straightforward to manage and potentially enhancing overall performance. Key points highlighted: 1. Introduction of KRaft Mode: the motivation behind moving Kafka to KRaft mode, emphasizing the desire to eliminate external dependencies and streamline the operation of Kafka clusters. 2. Performance Impacts: the potential impacts of KRaft mode on partitions and data performance. Early benchmarking and testing suggest that KRaft could lead to performance improvements, particularly in reducing latency and enhancing throughput; however, the gains can vary based on deployment scenarios and workloads. 3. Operational Simplicity: by removing ZooKeeper, Kafka strives to reduce the operational burden. This simplification is anticipated to make it easier to deploy, manage, and scale Kafka clusters, which is particularly beneficial in large-scale environments. 4. Migration Considerations: considerations for users planning to migrate from ZooKeeper to KRaft mode, highlighting the importance of a thoughtful migration strategy to ensure system stability and data integrity.
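For orientation, a minimal combined-mode KRaft node boils down to a handful of server.properties entries (values are illustrative, not a production layout):

    process.roles=broker,controller
    node.id=1
    controller.quorum.voters=1@localhost:9093
    listeners=PLAINTEXT://localhost:9092,CONTROLLER://localhost:9093
    controller.listener.names=CONTROLLER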
Lightning Talk
May 21, 2025 11:00
Apache Kafka is all over the place! Now you can begin using brokers, topics, and clusters. If you, like many other users, prefer the terminal to graphical interfaces or web consoles, you should be familiar with these CLI tools to increase your productivity. There is a whole collection of applications that can assist you with everything from creating a cluster to managing your Kafka Connect connectors or Kafka users. Join us as we go over some of the most practical CLIs for Kafka-related tasks as well as some of the fundamental commands that will help you out. Starting with the scripts that are part of the Apache Kafka distribution, we'll move on to more general tools like kcat for Kafka and kcctl for Kafka Connect. Last but not least, if you are using Kubernetes, we will discuss tools for managing custom resources, such as kubectl and strimzi-kafka-cli.
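A few representative commands in the spirit of the talk (broker addresses and names are placeholders; flags shown are the commonly used ones):

    # create a topic with the stock Apache Kafka scripts
    bin/kafka-topics.sh --bootstrap-server localhost:9092 \
      --create --topic demo --partitions 3 --replication-factor 3

    # tail the topic with kcat in consumer mode, starting at the end
    kcat -b localhost:9092 -t demo -C -o end

    # list Kafka Connect connectors with kcctl
    kcctl get connectors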
Lightning Talk
May 21, 2025 11:00
Struggled with the complexity of designing Kafka Streams applications? Without sufficient up-front architecture work, it’s all too easy to stumble into misunderstandings, rework, or outright failure. Although standards like UML and the C4 model have guided software design for years, stream processing has lacked a visual framework - until now. KSTD (Kafka Streams Topology Design) introduces an open standard and component library for describing and visualising Kafka Streams topologies with Excalidraw. Simple principles ensure teams can keep diagrams simple yet include important details, build trust in their designs, and streamline the development lifecycle. You will learn how standardised diagrams support team alignment, and how KSTD fosters consistent and clear communication for Kafka Streams. Design up-front, avoid mistakes, save time, and build trust.
Lightning Talk
May 21, 2025 12:30
In today’s fast-paced world of real-time data processing, Apache Kafka has become essential for managing massive streams of information. A key performance metric is consumer lag—the number of messages waiting unprocessed in a consumer group. At first glance, rising lag appears to signal that consumers are falling behind. Yet, this metric alone can be misleading. Imagine a busy restaurant where orders pile up on the counter. It might be tempting to blame the chefs, but delays could also stem from late ingredient deliveries or a malfunctioning oven. Similarly, spikes in consumer lag might not indicate a failing consumer at all; they can result from external factors like sluggish downstream systems, temporary bottlenecks in external services, or sudden surges in data volume. This presentation challenges the conventional reliance on consumer lag as the sole indicator of performance. We will explore how integrating additional metrics—such as message ingestion rates, processing throughput, and the health of interconnected services—provides a more holistic view of your Kafka ecosystem. Through real-world case studies and practical insights, you’ll learn to diagnose issues more accurately and uncover hidden bottlenecks that might otherwise go unnoticed. Join us as we peel back the layers of Kafka’s consumer dynamics and move beyond a single metric. Discover strategies to optimize your data pipelines, ensuring they remain robust and agile amid evolving challenges.
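Lag itself is cheap to compute; the talk's point is that the number needs context. A minimal sketch with the Admin API (group and bootstrap values are hypothetical):

    import java.util.Map;
    import java.util.Properties;
    import java.util.stream.Collectors;
    import org.apache.kafka.clients.admin.Admin;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.ListOffsetsResult;
    import org.apache.kafka.clients.admin.OffsetSpec;
    import org.apache.kafka.clients.consumer.OffsetAndMetadata;
    import org.apache.kafka.common.TopicPartition;

    public class LagCheck {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            try (Admin admin = Admin.create(props)) {
                Map<TopicPartition, OffsetAndMetadata> committed = admin
                        .listConsumerGroupOffsets("orders-processor")
                        .partitionsToOffsetAndMetadata().get();
                Map<TopicPartition, OffsetSpec> latest = committed.keySet().stream()
                        .collect(Collectors.toMap(tp -> tp, tp -> OffsetSpec.latest()));
                Map<TopicPartition, ListOffsetsResult.ListOffsetsResultInfo> ends =
                        admin.listOffsets(latest).all().get();
                committed.forEach((tp, oam) -> {
                    long lag = ends.get(tp).offset() - oam.offset();
                    // the number alone says nothing about *why* it is growing
                    System.out.printf("%s lag=%d%n", tp, lag);
                });
            }
        }
    }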
Lightning Talk
May 21, 2025 12:30
Timers are a cornerstone of any software system, yet traditional implementations often rely on in-memory solutions or RDBMS dependencies. In this talk, I’ll present a unique approach that leverages Kafka alone to power timer functionality—eliminating the need for RDBMS and embracing a distributed architecture. Using Kafka Streams, I’ll demonstrate how to efficiently schedule delayed work at web scale, enabling resilient and scalable microservices.
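The usual core of such a design is a state store of due timestamps plus a wall-clock punctuator. A minimal, illustrative Processor API sketch (store name invented; a real implementation must also handle timers that share a due timestamp):

    import java.time.Duration;
    import org.apache.kafka.streams.processor.PunctuationType;
    import org.apache.kafka.streams.processor.api.Processor;
    import org.apache.kafka.streams.processor.api.ProcessorContext;
    import org.apache.kafka.streams.processor.api.Record;
    import org.apache.kafka.streams.state.KeyValueStore;

    public class TimerProcessor implements Processor<String, Long, String, String> {
        private KeyValueStore<Long, String> timers;   // due epoch-millis -> payload key
        private ProcessorContext<String, String> context;

        @Override
        public void init(ProcessorContext<String, String> context) {
            this.context = context;
            this.timers = context.getStateStore("timers");   // store attached in the topology
            context.schedule(Duration.ofSeconds(1), PunctuationType.WALL_CLOCK_TIME, now -> {
                try (var due = timers.range(0L, now)) {       // everything that has come due
                    while (due.hasNext()) {
                        var timer = due.next();
                        context.forward(new Record<>(timer.value, "fired", now));
                        timers.delete(timer.key);
                    }
                }
            });
        }

        @Override
        public void process(Record<String, Long> record) {
            timers.put(record.value(), record.key());   // value carries the requested due time
        }
    }

Because the store is changelogged to Kafka, pending timers survive restarts and rebalances, which is what removes the RDBMS from the picture.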
Lightning Talk
May 21, 2025 12:30
Data sharing across cloud service providers is emerging as a mission-critical need for large-scale enterprises and for those looking for cloud agnostic event streaming solutions. While this can be achieved with Kafka multi-region architecture for high availability of data, it still remains a challenge for clients to establish data contracts and evolve their schemas to be in-sync across Kafka clients. In this talk, we will discuss how Fidelity Investments designed a multi-cloud global registry for schemas using schema registry as a centralized repository for managing schemas enterprise-wide. We will also deep dive into the topology of our global schema registry service and demonstrate how it remains resilient over different failure scenarios (region/CSP).  We will review metrics that are monitored for deeper observability and benefits such as the simplification of data contracts between producers and consumers and the untangling of data sharing channels across organizational units. Whether you are an analyst or an architect, this session will improve your ability to discover, manage, and correlate event schemas across a wide range of personas.
Lightning Talk
May 21, 2025 12:30
Druid and Kafka have been best buddies for 10 years, courting and sparking their way around data analytics parties to excess. At the end of 2024, the Apache Druid community released a new query API, DART, giving them access to even more parties and fun times - but this time, where being able to execute complex queries quickly matters more than concurrency. Join to see Druid's DART engine get the slideware treatment, and a Kafka + DART-powered Druid + Grafana analytics pipeline working, complete with step-by-step instructions to make your own.
Lightning Talk
May 21, 2025 12:30
ShareChat is one of the largest social media platforms in India, with over 180 million monthly active users. We ran high-throughput real-time stream processing (>200K RPS) on a Node.js + Redis-based deduplication pipeline with a 24-hour window. In this talk, I'll walk you through how we transitioned to an Apache Flink-based solution, the challenges we faced, and the strategies that led to a 7x cost reduction. Topics Covered: 1. State Management at Scale: - Our early attempts to structure Flink state efficiently to handle massive-scale deduplication. - Lessons learned in making the job manageable and performant despite the huge state size. 2. Autoscaling Challenges: - How we leveraged the Flink Kubernetes Operator to enable autoscaling. - Why autoscaling initially increased duplication—and how we solved it. 3. When Async API Matters in Apache Flink: - Understanding the role of Async I/O in Flink. - How it impacts performance and resource efficiency in real-time streaming. 4. How We Achieved 7x Cost Savings
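For reference, the heart of a 24-hour dedup in Flink is keyed state with a TTL (this sketch is generic, not ShareChat's code): the first occurrence of a key passes through, replays are dropped, and TTL expiry keeps the state bounded:

    import org.apache.flink.api.common.state.StateTtlConfig;
    import org.apache.flink.api.common.state.ValueState;
    import org.apache.flink.api.common.state.ValueStateDescriptor;
    import org.apache.flink.api.common.time.Time;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
    import org.apache.flink.util.Collector;

    public class Dedup extends KeyedProcessFunction<String, String, String> {
        private transient ValueState<Boolean> seen;

        @Override
        public void open(Configuration parameters) {
            ValueStateDescriptor<Boolean> desc = new ValueStateDescriptor<>("seen", Boolean.class);
            desc.enableTimeToLive(StateTtlConfig.newBuilder(Time.hours(24)).build());
            seen = getRuntimeContext().getState(desc);
        }

        @Override
        public void processElement(String event, Context ctx, Collector<String> out) throws Exception {
            if (seen.value() == null) {   // first sighting of this key within 24h
                seen.update(true);
                out.collect(event);
            }                             // else: duplicate, drop silently
        }
    }

Keying the stream by event id spreads this state across the cluster; at >200K RPS the interesting problems are exactly the state-size and autoscaling issues the talk covers.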
Lightning Talk
May 21, 2025 12:30
You've been rocking Kafka Streams in production for a while, but guess what? Times have changed! Your Kafka skills have leveled up, and/or your business is pushing for a fresh twist... 🚀 Now, you need to revamp your entire Kafka Streams topology—without breaking everything! 😱 But how do you pull this off without disrupting consumers, while ensuring the latest data updates land correctly in your internal topics, and avoiding the headache of renaming your microservice or tweaking input/output topics? 🫨 Join me as we dive into the "remapping" functionality from Kstreamplify, Michelin's open-source library that adds extra capabilities to Kafka Streams. Through a simple, hands-on example, I'll show you how to make these changes smoothly. Grab a seat 🪑—let's make topology changes a breeze! 🌪️✨
Lightning Talk

Stream On: From Bottlenecks to Streamline with Kafka Streams Template

Tuesday, May 20, 2025
5:30 PM - 6:15 PM

How do you make 10TB of data per hour accessible, scalable, and easy to integrate for multiple internal consumers? In this talk, we’ll share how we overcame storage throughput limitations by migrating to Kafka Streams and developing a unified template application. Our solution not only eliminated bottlenecks but also empowered internal clients to build reliable Kafka Streams applications in just a few clicks—focusing solely on business logic without worrying about infrastructure complexity. We’ll dive into our architecture, implementation strategies, and key optimizations, covering performance tuning, monitoring, and how our approach accelerates adoption across teams. Whether you're managing massive data pipelines or seeking to streamline access for diverse stakeholders, this session will provide practical insights into leveraging Kafka Streams for seamless, scalable data flow.

Location: Breakout Room 6
Level: Intermediate
Audience: Data Engineer/Scientist, Developer, Executive (Technical)
Track: Apache Kafka

Mike Araujo

Principal Engineer, Medidata Solutions
