Breakout Session

Real-time Adaptive Controls for Kafka Consumers

Modern Kafka consumers are equipped with hundreds of tunables. Many of them, such as worker pool sizes, autoscaling policies, throttlers, and circuit breakers, directly affect consumer resilience. Finding ideal initial values for these tunables requires deep technical expertise. Workloads also change over time, so stale parameters need regular re-tuning. As a consequence, configuration errors have become a source of operational toil and one of the major causes of overload and cascading service and system failures across the industry. Consumers should aim to expose a minimal configuration surface by dynamically adjusting parameters based on observations. Praveen will provide a deep dive into how CrowdStrike uses real-time Adaptive Controls (inspired by TCP congestion control) to dynamically tune these parameters for improved resiliency using real-time sampling of errors and latencies, removing the need for periodic adjustment. He will also discuss lessons learned from deploying the feature, without causing any incidents, to CrowdStrike’s massive production systems, which handle multiple trillions of events per day.
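
To make the TCP-inspired idea concrete, here is a minimal sketch of one such control loop: an additive-increase/multiplicative-decrease (AIMD) rule that resizes a consumer's worker pool from sampled errors and latencies. It is an illustration only, not CrowdStrike's implementation. The class name `AIMDLimit`, its thresholds, and the choice of a p95 latency signal are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class AIMDLimit:
    """Hypothetical AIMD controller for a worker-pool (concurrency) limit.

    All defaults below are illustrative assumptions, not recommended values.
    """
    limit: float = 10.0             # current limit, kept as a float internally
    min_limit: float = 1.0
    max_limit: float = 200.0
    increase: float = 1.0           # additive step when the window looks healthy
    backoff: float = 0.5            # multiplicative factor when overloaded
    latency_target_s: float = 0.2   # p95 latency above this counts as overload
    error_threshold: float = 0.05   # error rate above this counts as overload

    def update(self, latencies_s: list, error_count: int) -> int:
        """Consume one sampling window and return the new integer limit.

        latencies_s holds latencies (seconds) of successful calls in the
        window; error_count is the number of failed calls in the same window.
        """
        total = len(latencies_s) + error_count
        if total == 0:
            return int(self.limit)  # no signal: leave the limit unchanged

        error_rate = error_count / total
        p95 = 0.0
        if latencies_s:
            ordered = sorted(latencies_s)
            p95 = ordered[int(0.95 * (len(ordered) - 1))]

        overloaded = (error_rate > self.error_threshold
                      or p95 > self.latency_target_s)
        if overloaded:
            # Back off quickly, as TCP does on packet loss.
            self.limit = max(self.min_limit, self.limit * self.backoff)
        else:
            # Probe for more capacity slowly.
            self.limit = min(self.max_limit, self.limit + self.increase)
        return int(self.limit)
```

In this sketch, the consumer would call `update` once per sampling window and resize its pool to the returned value, so the only knobs left to an operator are the bounds and the targets rather than the pool size itself.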

Praveen Yedidi

CrowdStrike
