Next-Gen RAG Architectures for Streaming Vector Data

Lightning Talk

Real-time retrieval-augmented generation (RAG) is poised to revolutionize how businesses leverage streaming vector data, but many current RAG architectures fall short of the demands of real-time use cases. These architectures, originally designed for batch workflows, struggle with latency that keeps applications like real-time personalization, financial analysis, and fleet optimization from reaching their full potential.

In this session, we’ll introduce an emerging real-time RAG reference architecture, originally designed by Uber, that is built specifically to handle the complexities of streaming vector data. We’ll explore how this architecture overcomes the limitations of traditional RAG systems by enabling real-time analysis on freshly created vector embeddings.

Attendees will leave this session with actionable insights into building and deploying real-time RAG systems, unlocking new possibilities for applications that demand both speed and accuracy in vector-driven analysis.

Chad Meley

StarTree

Bengaluru 2025

London 2025
