Lexical Love: Rediscovering the Power of Lexical Search in RAG
John Berryman • Location: Theater 5 • Haystack 2025
“Retrieval-Augmented Generation (RAG) is often built around semantic search, where documents are chunked, embedded as vectors, and retrieved based on their meaning. While this approach is powerful, it also comes with significant challenges—large indexes, clunky filtering mechanisms, and a lack of transparency in search results. Perhaps most critically, semantic search struggles with exact matches, making it difficult to retrieve specific IDs, phrases, or jargon words that weren’t present in the original model’s training data.
In this talk, we’ll explore the role of lexical search in RAG workflows, highlighting how it can solve many of these issues. We’ll start with an overview of how lexical search works, including indexing, analysis, and search techniques like Boolean queries, faceted search, and phrase matching. We’ll contrast it with semantic search, explaining when and why you might want to use lexical search instead of vector-based methods.
From there, we’ll walk through a practical implementation of lexical search in RAG. Using real-world examples, we’ll demonstrate how to index data, structure search queries to maximize relevance, and integrate lexical search into a RAG pipeline. We’ll also show how language models can interact with search results dynamically, refining queries and applying filters in response to user input.
Of course, lexical search isn’t a silver bullet. We’ll discuss its limitations and then briefly introduce some hybrid approaches: ways to combine the strengths of lexical and semantic search and possibly get the best of both worlds.
By the end of this session, you’ll have a clear understanding of how lexical search fits into RAG, when to use it, and how to implement it effectively. If you’re working with LLM applications and want to make search more precise, transparent, and adaptable, this talk is for you.”
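As a rough illustration of the indexing, analysis, Boolean, and phrase-matching ideas the abstract mentions, here is a minimal sketch of a positional inverted index in Python. It is not taken from the talk: the `analyze` function, the `TinyIndex` class, and the sample documents are all hypothetical, and a production engine would add scoring, richer analyzers, and far more efficient data structures.

```python
import re
from collections import defaultdict


def analyze(text):
    """Toy analyzer (assumption): lowercase, then split on non-alphanumerics."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TinyIndex:
    """Hypothetical positional inverted index: term -> {doc_id: [positions]}."""

    def __init__(self):
        self.postings = defaultdict(dict)

    def add(self, doc_id, text):
        # Record every position at which each analyzed term occurs.
        for pos, term in enumerate(analyze(text)):
            self.postings[term].setdefault(doc_id, []).append(pos)

    def search_and(self, query):
        # Boolean AND: documents that contain every query term.
        terms = analyze(query)
        if not terms:
            return set()
        return set.intersection(*(set(self.postings.get(t, {})) for t in terms))

    def search_phrase(self, phrase):
        # Phrase match: the terms must appear at consecutive positions.
        terms = analyze(phrase)
        hits = set()
        for doc_id in self.search_and(phrase):
            starts = self.postings[terms[0]][doc_id]
            if any(
                all(start + i in self.postings[t][doc_id] for i, t in enumerate(terms))
                for start in starts
            ):
                hits.add(doc_id)
        return hits


# Made-up sample documents, for illustration only.
index = TinyIndex()
index.add(1, "Error code ERR-4012 raised by the billing service")
index.add(2, "The billing service retries on error")
index.add(3, "Service error: billing code missing")

all_terms = index.search_and("billing error")      # docs containing both terms
exact_phrase = index.search_phrase("billing service")  # terms must be adjacent
exact_id = index.search_phrase("ERR-4012")          # exact identifier lookup
```

Because matching works on literal tokens, the identifier `ERR-4012` is found exactly, and it is easy to see why a document matched: the terms are either in its postings or they are not.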

John Berryman
Arcturus Labs
John Berryman is the founder and principal consultant of Arcturus Labs, where he specializes in AI application development (Agency and RAG). As an early engineer on GitHub Copilot, John contributed to the development of its completions and chat functionalities, working at the forefront of AI-assisted coding tools. John is coauthor of Prompt Engineering for LLMs (O'Reilly). Before his work on Copilot, John's focus was search technology. His diverse experience includes helping to develop a next-generation search system for the US Patent Office, building search and recommendations for Eventbrite, and contributing to GitHub's code search infrastructure. John is also coauthor of Relevant Search (Manning), a book that distills his expertise in the field. John feels fortunate to have worked at the intersection of cutting-edge AI applications and foundational search technologies, giving him the opportunity to contribute to innovation in both LLM applications and information retrieval.