Current Bengaluru 2025
Session Archive
Check out our session archive to catch up on anything you missed or rewatch your favorites to make sure you hear all of the industry-changing insights from the best minds in data streaming.


How I built an entire data platform by myself in one year that thousands of data engineers use
Zach built an entire data platform by himself that thousands of data engineers use each year. In this talk, he walks through how he did it, along with the choices he made and the lessons he learned along the way!
Zach Wilson


Seamless SSO Configuration in Confluent Control Center for Enhanced Streaming Authentication with CFK
The session will demonstrate the steps required to set up Single Sign-On (SSO) in Confluent Control Center using Confluent for Kubernetes (CFK). Through a practical example, it will cover the technical aspects of configuring SSO with OpenID Connect (OIDC) in Confluent Control Center, focusing on the CFK resources that automate the setup. Attendees will gain insights into enhancing security and user management in data streaming environments and into preparing for an OAuth deployment in Confluent Platform.
Varsha Sharma, Shubham Goel
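As a rough illustration of the kind of automation the session covers (not material from the session itself), the sketch below patches OIDC settings onto a CFK-managed Control Center from Python. The Kubernetes client calls are real; the field names inside "authentication" are illustrative assumptions, not the exact CFK schema, which the session walks through.

```python
# Hypothetical sketch: patching OIDC/SSO settings onto a CFK-managed Control
# Center with the official Kubernetes Python client. The client calls are
# real; the field names inside "authentication" are illustrative assumptions,
# not the exact CFK schema.
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() when running in-cluster
api = client.CustomObjectsApi()

sso_patch = {
    "spec": {
        "authentication": {  # field names below are assumptions
            "type": "oidc",
            "oidc": {
                "issuer": "https://idp.example.com/realms/streaming",
                "clientId": "control-center",
                "clientSecretRef": "c3-oidc-credentials",  # a K8s Secret
            },
        }
    }
}

# CFK custom resources live under the platform.confluent.io API group.
api.patch_namespaced_custom_object(
    group="platform.confluent.io",
    version="v1beta1",
    namespace="confluent",
    plural="controlcenters",
    name="controlcenter",
    body=sso_patch,
)
```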


Streaming Aggregations at Scale using Kafka and Kafka Streams
Windowed aggregation on streams is a powerful concept for analytical data processing, especially where waiting hours or even minutes for data to be available is inconceivable. While most people think of aggregations as an analytical requirement, they also help trim data down to size and can be critical to scaling systems without ballooning costs.

We had such a use case at ShareChat, with hundreds of thousands of counter increments (updates) per second for everything from the number of views on a post to revenue numbers. Our databases could not keep up with the write volume, and there were frequent hot-spotting issues. Furthermore, the data was often inconsistent due to multiple writers, and taking locks further added to our misery. To solve this, we used Kafka and Kafka Streams to build an aggregation framework that handles hundreds of thousands of counter increments per second. Streaming aggregation let us batch updates, reducing the write throughput to our databases, and it solved our hot-spotting issues and eliminated the need for locks.

This talk discusses the challenges of building the platform, managing an in-house Kafka setup, and the lessons learned from tuning Kafka Streams. We also discuss how we optimised the solution to scale to 1M updates/sec with zero hiccups or manual intervention. Today, the offering forms an integral part of our core streaming platform, and the talk will be helpful for developers who have similar requirements for streaming aggregations or want to learn more about event-driven architectures using Kafka and Kafka Streams.
Shubham Dhal, Shivam Yadav
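The framework described above is built on Kafka Streams (Java). Purely to illustrate the core idea, the broker-free Python sketch below simulates tumbling-window counter aggregation: many per-key increments collapse into a single write per key per window, which is what relieves the database.

```python
# A minimal, broker-free sketch of tumbling-window counter aggregation.
# The real system uses Kafka Streams (Java); this simulation only shows the
# core idea: fold many per-key increments into one write per key per window.
from collections import defaultdict

WINDOW_MS = 60_000  # 1-minute tumbling windows (illustrative choice)

def window_start(ts_ms: int) -> int:
    """Align an event timestamp to the start of its tumbling window."""
    return ts_ms - (ts_ms % WINDOW_MS)

def aggregate(events):
    """events: iterable of (key, increment, ts_ms). Returns per-window totals."""
    windows = defaultdict(lambda: defaultdict(int))
    for key, inc, ts_ms in events:
        windows[window_start(ts_ms)][key] += inc
    return windows

# Simulate 6,000 raw increments for two counters inside one window.
events = [("post:42:views", 1, 1_000 + i) for i in range(5_000)]
events += [("post:42:revenue_cents", 7, 2_000 + i) for i in range(1_000)]

for win, counters in aggregate(events).items():
    for key, total in counters.items():
        # One idempotent DB write per (window, key) instead of thousands:
        print(f"window={win} key={key} total={total}")
```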


Continuous forecasting and anomaly detection with Flink SQL
Confluent's managed solution for Apache Flink is expanding its analytical capabilities with the introduction of the ML_FORECAST and ML_ANOMALY_DETECTION functions. Developers can now harness the power of established models like ARIMA for continuous forecasting and anomaly detection, all within the familiar SQL interface. This advancement eliminates the need for external ML services and enables continuous processing by embedding these analytical capabilities directly in your streaming pipeline.

In this 20-minute session, tailored for developers with stream processing experience, we'll explore how to integrate sophisticated time-series analysis into Flink SQL applications. We'll start by introducing the newly developed ML_FORECAST function, which brings ARIMA modeling capabilities to streaming data. We'll then demonstrate the ML_ANOMALY_DETECTION function and show how it can be combined with Kafka-sourced data streams for real-time anomaly detection. Finally, we'll build a complete streaming application that combines both functions to forecast metrics and detect anomalies continuously.

By the end of the session, attendees will understand how to leverage these powerful new functions to build production-ready continuous forecasting and anomaly detection systems using just Flink SQL.
Siddharth Bedekar
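ML_FORECAST and ML_ANOMALY_DETECTION are Confluent-managed SQL functions whose exact syntax the session covers. As a conceptual stand-in only (not the product's implementation), this small Python sketch shows the continuous forecast-then-flag loop they embed in a pipeline, using a naive rolling-mean forecast in place of ARIMA.

```python
# Conceptual stand-in for continuous forecasting plus anomaly detection:
# forecast each point from a rolling window, flag large deviations.
# WINDOW, K, and MIN_STD are illustrative choices, not product parameters.
from collections import deque
from statistics import mean, pstdev

WINDOW = 30     # rolling window size (assumption)
K = 4.0         # anomaly threshold in standard deviations
MIN_STD = 1.0   # floor so a flat series still has a tolerance band

def detect(stream):
    history = deque(maxlen=WINDOW)
    for t, value in stream:
        if len(history) == WINDOW:
            forecast = mean(history)          # naive one-step-ahead forecast
            band = max(pstdev(history), MIN_STD)
            if abs(value - forecast) > K * band:
                yield t, value, forecast      # anomaly
        history.append(value)

# A noisy-but-steady metric with one injected spike at t=60.
points = [(t, 100.0 + (t % 3) + (500.0 if t == 60 else 0.0)) for t in range(120)]
for t, v, f in detect(points):
    print(f"t={t}: value={v:.0f} vs forecast={f:.1f} -> anomaly")
```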


Kafka Superpowers for Your Jupyter Notebook and Python
Do you love Jupyter Notebooks and Python? This session shows you how to do all kinds of Kafka-related tasks directly in your Jupyter Notebook. The key to a seamless Kafka/Python/Jupyter Notebook experience is the open-source library "kafi", the "Swiss Army knife" for Kafka. We will show you how to cover all of the following use cases (and more) with "kafi" in your Jupyter Notebook, each with only a few lines of Python code:
* Kafka administration
* Schema Registry administration
* Kafka backups
* Simple stream processing
* Microservices/agents
* Building a bridge from Kafka to Pandas dataframes/files (e.g. turning a Kafka topic into a Pandas dataframe or Parquet file)
After this session, many Kafka use cases that might have seemed terrifying before will have become as easy as pie.
Ralph M. Debusmann
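For a sense of the boilerplate "kafi" hides, here is the last use case above, the topic-to-DataFrame/Parquet bridge, written directly against confluent-kafka and pandas. This is a point of comparison, not kafi's API; broker address, topic name, and group id are placeholders.

```python
# The "topic -> Pandas DataFrame -> Parquet" bridge, hand-rolled with
# confluent-kafka and pandas for comparison. This is not kafi's API.
import json
import pandas as pd
from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",   # placeholder broker
    "group.id": "notebook-bridge",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["orders"])               # placeholder topic

records = []
try:
    while len(records) < 1_000:              # cap the pull for notebook use
        msg = consumer.poll(timeout=1.0)
        if msg is None:
            break                            # topic drained for now
        if msg.error():
            continue
        records.append(json.loads(msg.value()))  # assumes JSON-encoded values
finally:
    consumer.close()

df = pd.DataFrame(records)                   # topic slice as a DataFrame
df.to_parquet("orders.parquet")              # ...or straight to a Parquet file
```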


Seamless Authentication to Confluent Cloud Using OAuth/OIDC with Okta as Identity Provider
Seamless authentication to Kafka can be achieved efficiently by integrating OAuth 2.0 and OpenID Connect (OIDC), enabling secure, token-based access to Kafka clusters. By leveraging these protocols, organizations can significantly strengthen their security posture while simplifying identity management. OAuth 2.0 provides a robust framework for token-based authentication, reducing reliance on long-term user credentials and mitigating the risks associated with credential exposure. This integration allows businesses to centralize authentication, improve access control, and ensure that only authorized users and applications can interact with Kafka resources.

With OAuth 2.0 and OIDC, organizations can also enforce role-based access control (RBAC), ensuring that users and applications have access only to the resources they need. This granularity in access management helps prevent unauthorized access and minimizes the potential attack surface.

The session will cover the key concepts of OAuth 2.0 and OIDC, along with practical steps for configuring them within Kafka. By the end, participants will understand how to implement OAuth 2.0 and OIDC to streamline authentication, improve security, and simplify Kafka client access management in enterprise environments, all while maintaining a high level of control and compliance.
Shivaprakash Akki
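As a concrete sketch of the client side of what the session describes, librdkafka-based clients (here confluent-kafka for Python, 1.9+) can fetch tokens themselves via the built-in OIDC client-credentials flow. All endpoint and ID values below are placeholders from your own Okta and cluster setup; the identity-pool extension applies when the target is Confluent Cloud.

```python
# Minimal token-based client auth sketch: confluent-kafka (librdkafka >= 1.9)
# fetches OAuth tokens itself via its built-in OIDC client-credentials
# support. Endpoint URL, client credentials, and IDs are placeholders.
from confluent_kafka import Producer

producer = Producer({
    "bootstrap.servers": "pkc-xxxxx.region.provider.confluent.cloud:9092",
    "security.protocol": "SASL_SSL",
    "sasl.mechanism": "OAUTHBEARER",
    "sasl.oauthbearer.method": "oidc",
    # Client-credentials grant against the Okta authorization server:
    "sasl.oauthbearer.token.endpoint.url":
        "https://dev-123456.okta.com/oauth2/default/v1/token",
    "sasl.oauthbearer.client.id": "my-kafka-client",
    "sasl.oauthbearer.client.secret": "<client-secret>",
    "sasl.oauthbearer.scope": "kafka",
    # Confluent Cloud routes the token to an identity pool for RBAC:
    "sasl.oauthbearer.extensions":
        "logicalCluster=lkc-xxxxx,identityPoolId=pool-xxxxx",
})

producer.produce("payments", key="order-1", value="hello, oauth")
producer.flush()
```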