Python Streaming Analytics Leveraging the Composable Data Stack
Breakout Session
Many modern streaming workloads do not need a heavyweight stream processor to power analytics and ML on Kafka data. This Breakout Session presents a composable architecture in which Kafka is the event backbone, and Rust, Apache Arrow, and Python-based analytics libraries form a focused, production-ready data stack. The core theme is simple: orchestrate vectorized operations and let the right tool own the right concern.
The talk targets an audience comfortable with data engineering and analytics platforms. It explains how I/O and serialization are handled in Rust, while CPU-bound operations such as aggregations, joins, and feature engineering run in Arrow-compatible libraries like Polars or pandas. Data moves between Rust and Python as zero-copy Arrow tables. This design removes repeated serialization and avoids ad hoc in-memory formats.
The material is highly relevant for teams building real-time analytics and ML features on Kafka. These teams want low-latency, high-throughput pipelines without a monolithic stream processing framework. Attendees will see how a columnar, vectorized execution model on top of Kafka can still feel familiar to data analysts. It keeps the workflow close to “just analytics on tables,” even in a continuous streaming environment.
Audience takeaways include several concrete patterns. Attendees will learn how to divide responsibilities between Kafka, Rust services, and Python analytics. They will see how zero-copy Arrow interchange simplifies cross-language pipelines. They will also learn how to keep the mental model simple while still meeting production performance and reliability requirements.
Arthur Andres
Tradewell Technologies Inc