From Events to Insights: Kafka’s Role in Myntra’s Real-Time Data Revolution

Lightning Talk

In today’s fast-paced world, where actionable business insights drive competitive advantage, tapping into dynamic real-time streams marks the evolution of data-driven decision-making and is revolutionizing business intelligence.

Traditional batch-based data pipelines slowed down decision-making, delaying business insights and limiting our ability to respond in real time.

Join this session to learn how we at Myntra revamped our data infrastructure by transforming batch-based pipelines into a robust, real-time streaming architecture, reducing latency from hours to minutes.

This session will also delve into how we leveraged Kafka, Spark Structured Streaming, and Delta Lake to create a scalable, low-latency ingestion pipeline. By implementing exactly-once semantics and optimizing data flows, we achieved the reliability and scalability needed to power mission-critical use cases. We’ll also explore how this transformation addressed the inherent limitations of traditional batch systems, enabling data freshness, operational agility, and the delivery of actionable near real-time business insights. These advancements have redefined how Myntra supports its dynamic ecosystem.

The audience will gain actionable strategies for building real-time streaming pipelines, overcoming data freshness challenges, and unlocking the potential of near real-time insights to fuel innovation and growth at scale.

Key highlights:

1. Kafka-Centric Streaming Architecture: Delve into the architectural design where Kafka powers seamless integration between streaming and batch workflows, efficiently handling millions of events per minute.

2. Data Freshness & Completeness Challenges: Understand how Myntra ensures data freshness and completeness using write-ahead logs and micro-batch freshness propagation.

3. Operational Innovations with Delta and Spark: Explore how Apache Spark enabled efficient real-time ingestion, exactly-once semantics, and fault tolerance at high throughput.
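
The abstract does not include Myntra's implementation, so the following is only a minimal sketch of the general idea behind exactly-once micro-batch ingestion: record the planned offset range in a write-ahead log before processing, then make the sink skip any batch id it has already committed. It is a plain-Python simulation with no Kafka, Spark, or Delta Lake calls; `OffsetLog`, `IdempotentSink`, and `run_batch` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class OffsetLog:
    """Hypothetical write-ahead log: offset range planned for each batch id."""
    planned: dict = field(default_factory=dict)  # batch_id -> (start, end)

    def plan(self, batch_id, start, end):
        # Record the range before processing; a retry reuses the recorded range
        # so the replayed batch contains exactly the same records.
        return self.planned.setdefault(batch_id, (start, end))


@dataclass
class IdempotentSink:
    """Hypothetical table that remembers the last committed batch id."""
    rows: list = field(default_factory=list)
    last_committed: int = -1

    def write(self, batch_id, records):
        if batch_id <= self.last_committed:
            return False  # already committed: skip the duplicate replay
        # Assumption: a real sink makes these two steps one atomic commit.
        self.rows.extend(records)
        self.last_committed = batch_id
        return True


def run_batch(source, log, sink, batch_id, batch_size):
    """Process one micro-batch from a replayable, offset-addressed source."""
    start = batch_id * batch_size
    end = min(start + batch_size, len(source))
    start, end = log.plan(batch_id, start, end)
    return sink.write(batch_id, source[start:end])


events = [f"event-{i}" for i in range(6)]
log, sink = OffsetLog(), IdempotentSink()

run_batch(events, log, sink, batch_id=0, batch_size=3)
run_batch(events, log, sink, batch_id=1, batch_size=3)
# Simulate a failure after the write but before success was acknowledged:
# batch 1 is retried, and the sink ignores it.
replayed = run_batch(events, log, sink, batch_id=1, batch_size=3)
```

After the retry, `sink.rows` still holds each event once. The guarantee rests on three assumptions stated in the sketch: the source can be re-read by offset, the planned range is logged before processing, and the sink's write and batch-id update are atomic.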


Shrvan Warke

Myntra