How We Replaced Node.js with Apache Flink for Real-Time Deduplication and Cut Costs by 7x
Lightning Talk
ShareChat is one of the largest social media platforms in India, with over 180 million monthly active users.
We were processing a high-throughput real-time stream (>200K RPS) with a Node.js and Redis-based deduplication service that used a 24-hour window.
In this talk, I'll walk you through how we transitioned to an Apache Flink-based solution, the challenges we faced, and the strategies that led to a 7x cost reduction.
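As a rough illustration of what deduplication with a 24-hour window means, here is a minimal in-memory sketch in Python. It is not the Node.js, Redis, or Flink implementation from the talk: the class name, the `event_id` key, and the plain dictionary standing in for the external or keyed state are all assumptions made for the example.

```python
from typing import Dict

WINDOW_SECONDS = 24 * 60 * 60  # the 24-hour window mentioned in the abstract


class WindowedDeduplicator:
    """Hypothetical helper: remembers when each event ID was first seen."""

    def __init__(self, window_seconds: int = WINDOW_SECONDS) -> None:
        self.window_seconds = window_seconds
        # Stand-in for Redis keys or Flink keyed state (an assumption).
        self._first_seen: Dict[str, float] = {}

    def is_duplicate(self, event_id: str, event_time: float) -> bool:
        """Return True if event_id was already seen within the window."""
        first_seen = self._first_seen.get(event_id)
        if first_seen is not None and event_time - first_seen < self.window_seconds:
            return True
        # New ID, or the earlier entry has aged out: start a fresh window.
        self._first_seen[event_id] = event_time
        return False

    def evict_expired(self, now: float) -> int:
        """Drop entries older than the window; return how many were removed."""
        expired = [
            key for key, seen_at in self._first_seen.items()
            if now - seen_at >= self.window_seconds
        ]
        for key in expired:
            del self._first_seen[key]
        return len(expired)
```

At the throughput quoted above, the interesting part is the size of this map: every distinct ID seen in the last 24 hours has to live somewhere, which is why the state layout and expiry strategy become the central design questions.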
Topics Covered:
1. State Management at Scale:
- Our early attempts to structure Flink state efficiently to handle massive-scale deduplication.
- Lessons learned in making the job manageable and performant despite the huge state size.
2. Autoscaling Challenges:
- How we leveraged the Flink Kubernetes Operator to enable autoscaling.
- Why autoscaling initially increased duplication, and how we solved it.
3. When Async API Matters in Apache Flink:
- Understanding the role of Async I/O in Flink.
- How it impacts performance and resource efficiency in real-time streaming.
4. How We Achieved 7x Cost Savings
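To make the Async I/O topic above more concrete, the sketch below contrasts one-at-a-time external lookups with bounded concurrent lookups using Python's `asyncio`. It only illustrates the general idea and is not Flink code: the function names, the `lookup` callable, and the `capacity` parameter (loosely analogous to the in-flight request limit in Flink's Async I/O operator) are assumptions made for the example.

```python
import asyncio
from typing import Awaitable, Callable, List


async def lookup_sequential(
    keys: List[str], lookup: Callable[[str], Awaitable[str]]
) -> List[str]:
    """Issue one lookup at a time; each call waits for the previous one."""
    return [await lookup(key) for key in keys]


async def lookup_concurrent(
    keys: List[str], lookup: Callable[[str], Awaitable[str]], capacity: int
) -> List[str]:
    """Issue lookups concurrently, with at most `capacity` in flight."""
    semaphore = asyncio.Semaphore(capacity)

    async def bounded(key: str) -> str:
        async with semaphore:  # blocks once `capacity` lookups are pending
            return await lookup(key)

    # gather() returns results in the same order as the input keys.
    return list(await asyncio.gather(*(bounded(key) for key in keys)))
```

When each lookup spends most of its time waiting on the network, the concurrent version keeps the worker busy instead of idle, which is the resource-efficiency argument behind asynchronous I/O in a streaming operator.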
Andrei Manakov
ShareChat