Breakout Session

Kafka Streams at Scale: Avoiding Common Pitfalls


At LittleHorse, we use Kafka Streams extensively for a use case that requires transactional semantics, high throughput, low latency, large amounts of state storage, and high availability. Seems like a tall order, eh?

In short, Kafka Streams has delivered everything we need to satisfy those hefty requirements. However, along the way we encountered many gotchas (or, perhaps I should say, "learning moments") that made us all the wiser on our journey to what we have today: a rock-solid application built entirely upon Kafka Streams.

In this talk, I will start with a very brief overview of Kafka Streams architecture and how LittleHorse uses it. I will then share with you some of the less-obvious pieces of Streams knowledge that have helped us on our way.

How long should your session.timeout.ms be? Should you use static membership? What causes rebalancing storms? Why is my application OOM'ing even though I'm not out of heap space on the JVM? Why is it a bad idea to make remote network calls in your processors? How do I keep track of my standby tasks? What happens when I add instances?

We will answer all of the above questions, and more, in this talk, so that you can run Kafka Streams applications successfully at scale.

Mateo Rojas

LittleHorse Enterprises LLC
