Evolution of Relevance Engineering to Context Engineering
Session Abstract
As search powers RAG and agentic systems, relevance goals shift from ranking documents to assembling effective context. This talk explores how traditional lexical, semantic, and hybrid relevance techniques change when their results feed LLMs, with lessons on chunking and snippet extraction, diversification, evaluation, and more.
Session Description
As RAG and agentic search mature, retrieval extends beyond being primarily a relevance concern to something that also drives latency, cost, and system reliability at scale. We’re very good at traditional search relevance: we know how to tune BM25, semantic, and hybrid search, and how to rerank aggressively. But when search results are used as context for LLMs, many of our familiar assumptions start to break down.
Once retrieval feeds a reasoning system instead of a human, the definition of “relevant” quietly changes. The goal is no longer to rank the best documents, but to assemble the right context. Chunking, snippet extraction, and diversification become critical relevance skills. On top of that, relevance misses carry real costs, cascading into additional tool calls, higher token usage, added latency, and hallucinations.
In this talk, we’ll explore how relevance shifts when the objective moves from “return the best documents” to “construct effective context for reasoning”. We’ll walk through how traditional lexical, semantic, and hybrid relevance techniques behave when used in RAG and agentic workflows, highlighting both where they still work and where they fail in subtle and surprising ways.
Along the way, we’ll cover chunking and snippet extraction, result diversification, and how evaluation needs to evolve when the ranked list is no longer the end product. The talk closes with lessons learned from real-world systems, common roadblocks teams encounter when making this transition, and concrete tips for adapting existing search pipelines to serve LLM-driven applications more effectively.