Agentic Tuning: Search Relevance on Autopilot
Session Abstract
Search relevance tuning is notoriously difficult, often requiring a deep understanding of Lucene scoring, complex query DSLs, and iterative manual testing.
This session introduces Agentic Relevance Tuning—a framework that leverages LLM-based agents to automate the full search lifecycle, making search tuning faster, more accurate, and more accessible.
Session Description
Search relevance tuning is notoriously difficult, often requiring a deep understanding of Lucene scoring, complex query DSLs, and iterative manual testing. While tools like OpenSearch User Behavior Insights (UBI) and the Search Relevance Workbench provide the data and the environment for improvement, the leap from “analyzing data” to “deploying a fix” remains a significant hurdle for many.
This session introduces Agentic Relevance Tuning (ART)—a framework that leverages LLM-based agents to automate the full search lifecycle. We demonstrate how to move beyond manual experimentation by building a pipeline of specialized agents that identify issues, hypothesize improvements, and orchestrate offline and online evaluation.
Attendees will learn how to:
- Identify Opportunities: Use OpenSearch UBI to automatically detect relevance gaps through user signals.
- Automate Evaluation: Use offline “judge” agents with the Search Relevance Workbench to run automated benchmarks and identify winning hypotheses.
- Close the Loop: Transition from issue to resolution using a conversational interface that lowers the technical barrier for non-experts.
- Validate in Production: Deploy agent-orchestrated interleaved A/B testing to ensure real-world improvements.
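To make the offline “judge” step concrete, here is a minimal, self-contained sketch of the kind of scoring such an agent might run: computing NDCG@k for a baseline ranking versus an agent-proposed ranking against graded relevance judgments. The document names, rankings, and grades below are hypothetical, and this is one common metric a judge agent could use, not a prescribed implementation.

```python
import math

def ndcg_at_k(ranked_doc_ids, judgments, k=10):
    """NDCG@k for a ranked list, given graded relevance judgments (0..3)."""
    dcg = sum(
        (2 ** judgments.get(doc, 0) - 1) / math.log2(rank + 2)
        for rank, doc in enumerate(ranked_doc_ids[:k])
    )
    ideal = sorted(judgments.values(), reverse=True)[:k]
    idcg = sum((2 ** rel - 1) / math.log2(rank + 2) for rank, rel in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0

# Hypothetical judgments (doc id -> graded relevance) for one query.
judgments = {"d1": 3, "d2": 2, "d3": 0, "d4": 1}

baseline  = ["d3", "d1", "d4", "d2"]   # current query's ranking
candidate = ["d1", "d2", "d4", "d3"]   # agent-proposed ranking

# A judge agent keeps the hypothesis whose NDCG improves on the baseline.
print(ndcg_at_k(baseline, judgments))
print(ndcg_at_k(candidate, judgments))
```

Here the candidate ranking orders documents by their judged relevance, so it scores higher than the baseline and would be flagged as the winning hypothesis.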
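The online validation step can likewise be sketched. Below is a minimal implementation of team-draft interleaving (in the fewer-picks-drafts-next variant): the baseline and the tuned ranking take turns contributing results to a single merged list, and each shown result is credited to the system that supplied it, so user clicks can later be attributed to decide a winner. The rankings `a` and `b` are made-up examples.

```python
import random

def team_draft_interleave(ranking_a, ranking_b, seed=0):
    """Merge two rankings via team-draft interleaving; each result is
    credited to the team ('A' or 'B') that contributed it."""
    rng = random.Random(seed)
    lists = {"A": list(ranking_a), "B": list(ranking_b)}
    picks = {"A": 0, "B": 0}
    interleaved, credit, seen = [], [], set()

    def next_unseen(side):
        # Highest-ranked document from this side not yet shown.
        for doc in lists[side]:
            if doc not in seen:
                return doc
        return None

    while True:
        # The team with fewer picks drafts next; ties broken by coin flip.
        if picks["A"] < picks["B"]:
            order = ["A", "B"]
        elif picks["B"] < picks["A"]:
            order = ["B", "A"]
        else:
            order = rng.sample(["A", "B"], 2)
        doc = side = None
        for s in order:
            doc = next_unseen(s)
            if doc is not None:
                side = s
                break
        if doc is None:
            break  # both rankings exhausted
        seen.add(doc)
        interleaved.append(doc)
        credit.append(side)
        picks[side] += 1
    return interleaved, credit

# Hypothetical rankings: baseline query vs. agent-tuned variant.
a = ["d1", "d2", "d3"]
b = ["d2", "d4", "d1"]
docs, credit = team_draft_interleave(a, b, seed=42)
# Clicks on `docs` are attributed via `credit`; the team with more
# clicked results wins the comparison for this query.
```

An orchestrating agent would serve the interleaved list to a slice of live traffic and promote the tuned query only if it accumulates significantly more credited clicks.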
By combining the analytical power of modern search engines with the reasoning capabilities of agents, we can make search tuning faster, more accurate, and accessible to the entire business—not just the search experts.