Commoditizing Inference: Why Your Query Language Should Speak AI
Aurélien Foucret • Location: TUECHTIG • Haystack EU 2024
AI model inference is making its way into every modern search stack, powering semantic retrieval, result re-ranking, and text generation. In this talk, we’ll explore what it means to commoditize inference as a query-native primitive, just like filters or scoring functions. After a quick overview of Elasticsearch’s inference APIs, we’ll walk through how inference can be invoked directly from the query DSL. We’ll discuss the benefits of integrating these primitives into the query layer (simplicity, composability, and accessibility) as well as the trade-offs compared to managing inference in your application code. To finish, we’ll introduce new inference primitives in the Elasticsearch Query Language (ES|QL) — TEXT_EMBEDDING, COMPLETION, and RERANK — and show how they bridge the gap between the low-level control of application code and the declarative expressiveness of a DSL. The session will be practical and example-driven, with plenty of examples for search practitioners and analysts.
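To give a flavour of what "inference as a query-native primitive" looks like, here is a rough sketch of an ES|QL pipeline chaining retrieval, re-ranking, and generation. The index name and the inference endpoint IDs (`my-reranker`, `my-completion-model`) are hypothetical, and the exact syntax of these tech-preview commands may differ between Elasticsearch versions:

```esql
// Retrieve candidate documents with a lexical match,
// then re-order them with a semantic re-ranking model.
FROM articles
| WHERE MATCH(body, "vector databases")
| RERANK "vector databases" ON body WITH { "inference_id": "my-reranker" }
| LIMIT 5
// Ask a completion model to summarize each surviving hit.
| COMPLETION summary = CONCAT("Summarize in one sentence: ", body)
    WITH { "inference_id": "my-completion-model" }
| KEEP title, summary
```

The point of the talk is that each inference step above is just another pipeline stage, composable with filters, limits, and projections, rather than a round-trip managed in application code.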

Aurélien Foucret
Elastic

Aurélien Foucret is a Principal Software Engineer at Elastic, with expertise in search relevance, inference, and ES|QL. He joined Elastic in 2019 after over a decade focused on eCommerce search. Aurélien is passionate about building practical search experiences and has worked on projects such as Learning to Rank and introducing inference-based primitives in ES|QL. When he’s not improving search, he’s renovating his home, running, or perfecting sourdough bread.