Haystack US 2023

Talks from the Search & Relevance Community at the Haystack Conference!

The conference sessions were held at the Violet Crown movie theater in central Charlottesville and streamed live via Zoom.

This was our Event Safety Guide and Code of Conduct.

Day 1, Tuesday, April 25th, 2023

Time Track 1 Track 2
8:00-9:00am EDT Registration

Location: Entrance of the Violet Crown

9:00-9:15am EDT Welcome to Haystack!

Charlie Hull
Location: Theater 5

9:15-10:00am EDT Opening Keynote - Relevance in the Age of Generative Search

The search relevance landscape is rapidly shifting. Due to the rise of Transformers and Large Language Models (LLMs) and the more recent emergent capabilities shown with Foundation Models, public search engines are now rushing to integrate models like ChatGPT directly into the search experience as a form of "generative search". These models can perform abstractive question answering, summarization, and even new content generation, but when untethered from underlying search results, they often hallucinate bogus or misleading information. As search relevance practitioners, it's important for us to know how these technologies work and how to best integrate them into search experiences to drive accurate, relevant results. In this talk, we'll walk through code examples and strategies to integrate this emerging class of language models into our search applications, covering the limitations of the models amidst other relevance techniques as well as the amazing new capabilities they enable.

Trey Grainger
Location: Theater 5

10:15-11:00am EDT Learning to hybrid search: combining BM25, neural embeddings and customer behavior into an ultimate ranking ensemble

Traditional term search has good precision but lacks semantics. Neural search is good at semantics but misses customer behavior. The LTR approach adapts to customer behavior, but only if your baseline retrieval is good enough. The current hype about neural search can give the impression that it's the ultimate solution for all the problems of legacy term search and LTR. You only need to do a *very simple* thing of fine-tuning a neural network to notice all the dependencies between queries, documents and customer behavior on all the data you have. But what if instead of replacing A with B, you can combine the strengths of all the approaches? In this talk, we will take an example of an e-commerce search with Amazon's ESCI dataset and compare traditional text matching and LTR approaches with neural search methods on real data. We will show how combining multiple old and new approaches in a single hybrid system can deliver an even better result than each of them separately.

Roman Grebennikov
Location: Theater 5
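
The combination the talk describes can be sketched as a simple score-fusion step. The following is a minimal illustration (not the speaker's actual system), assuming we already have per-document BM25 and dense-retrieval scores and simply blend them after min-max normalization:

```python
def minmax(scores):
    """Normalize a {doc: score} map into the [0, 1] range."""
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0  # avoid division by zero when all scores are equal
    return {doc: (s - lo) / span for doc, s in scores.items()}

def hybrid_rank(bm25_scores, dense_scores, w_bm25=0.5, w_dense=0.5):
    """Rank documents by a weighted sum of normalized BM25 and dense scores."""
    docs = set(bm25_scores) | set(dense_scores)
    nb, nd = minmax(bm25_scores), minmax(dense_scores)
    fused = {d: w_bm25 * nb.get(d, 0.0) + w_dense * nd.get(d, 0.0) for d in docs}
    return sorted(docs, key=fused.get, reverse=True)
```

In a full ensemble, the weights (or the fused score itself) would typically be learned from customer behavior by an LTR model rather than hand-set.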

Talking to Non-Searchers about Search Relevance

One of the hardest things about getting into the search space is not actually search relevance. It’s educating all the “non-searchers” on what search relevance is, how it works, and why it’s important. Join us to learn how to talk through the parts of search with everyone from your CEO to external clients.

David Tippett & Stavros Macrakis
Location: Theater 7

11:15am-12:00pm EDT Creating Representative Query Sets for Offline Evaluation

The scaling of AI/ML teams at Getty Images has resulted in an increased demand for experimentation. As an organization, we seek to better understand the implications of an experiment before proceeding to a customer-facing A/B test, and also to reduce the list of A/B test candidates to something more manageable. Offline testing is a tool that allows us to understand the impacts of sort algorithms on our guardrail metrics, and in some ways to estimate the impact to high-level customer metrics such as conversion and interaction. For offline testing to be indicative of online results, query sets need to be constructed that are representative of customer activity across a spectrum of query attributes. In this presentation, I will discuss a simple method to construct minimal, randomly sampled query sets that are representative across many attributes.

Karel Bergmann
Location: Theater 5
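
One way to build a query set that stays representative across an attribute, not necessarily the method presented, is stratified random sampling; the query schema and `freq_band` attribute below are illustrative assumptions:

```python
import random

def stratified_sample(queries, attribute, per_stratum, seed=42):
    """Sample up to per_stratum queries from each distinct value of an attribute."""
    rng = random.Random(seed)  # fixed seed so the query set is reproducible
    strata = {}
    for q in queries:
        strata.setdefault(q[attribute], []).append(q)
    sample = []
    for bucket in strata.values():
        sample.extend(rng.sample(bucket, min(per_stratum, len(bucket))))
    return sample
```

Repeating this over several attributes (query frequency band, category, language, etc.) keeps the offline set small while still covering the spectrum of customer activity.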

How far is the Empire State Building from the Eiffel Tower? 0.22

... and both are 0.14 from the keyword "summit". This talk describes how we use semantic vectors to improve the discovery experience of our online travel marketplace. We apply such text-based embeddings for recommendations, ranking, and ad optimisation. The applied algorithms have evolved over time, starting with a slightly twisted application of word2vec and progressing to fine-tuning pre-trained BERT models on our own data. We will try to share helpful advice from the application side, like how we applied simple vector arithmetic to improve results.

Ansgar Gruene
Location: Theater 7
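
Distances like the 0.22 in the title are typically cosine distances between embedding vectors; a minimal, library-free version for illustration:

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity: 0.0 for identical direction, 1.0 for orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm
```

The actual distances depend entirely on the embedding model used; the values in the title come from the speaker's own models.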

12:00pm-1:30pm EDT Lunch

Find lunch at one of the many options available on Charlottesville’s Downtown Mall

Location: Your choice!

1:30-2:15pm EDT Dive into NLP with the Elastic Stack

Natural language processing has been changing radically since the advent of deep learning and new language models. How can we practically use these tools in our products, especially when we are more of a developer than a Data Scientist? We will see that we can already use many of the features brought by these models directly in Elasticsearch and its suite: - Semantic vectorization "on the fly" (via Ingest Pipeline) - Semantic search (semantic vector-based) - Named Entities Recognition - Multimodal search (e.g. retrieve images from text query) - Automatic classification We will quickly introduce these concepts and their practical applications. We'll show that we do not necessarily need to revolutionize the technical stack to interact more naturally with our users.

Lucian Precup & Pietro Mele
Location: Theater 5

Breaking Search Performance Limits with Domain-Specific Computing

The demand for advanced search is higher than ever: organizational data is growing exponentially, millions to billions of documents are becoming standard, and search queries are becoming more resource-intensive, leading to longer query times and higher latency. Yet it is becoming extremely difficult to achieve search at scale due to cloud computing costs, software limits, and real-time requirements. Furthermore, general-purpose CPUs are limited in processing power and parallelism, especially when dealing with complex algorithms such as TF/IDF, k/ANN, etc., which makes it nearly impossible to reach consistent real-time latencies of under 100ms. To solve this, a new approach is to program a dedicated chip, based on cloud FPGAs, which is highly parallelizable and designed for high-throughput / low-latency search workloads. This technological step function breaks today's limits and allows latencies that are 100x lower, at billion-scale, and at a fraction of existing hardware costs.

Ohad Levi
Location: Theater 7

2:30-3:15pm EDT Vector Search for Clinical Decisions

EBSCO's Clinical Decisions is committed to providing health care professionals with precise answers to their clinical questions. Achieving a balance between precision and recall can be especially difficult for complex queries. Practitioners expect highly relevant search results but also appreciate supplementary results when appropriate. Our existing Elasticsearch engine and knowledge graph were failing to achieve the desired results for these complex queries, typically expressed as long-form natural language. Join us as we outline our journey to deploy a high-quality vector search solution to production for EBSCO's DynaMed and Dynamic Health products. We will detail the domain, problem space, previous failed attempts, technology choices, model selection, relevance testing methodology, validation with stakeholders, and rollout to our existing customers.

Erica Lesyshyn & Max Irwin
Location: Theater 5

Top 8 search topics to teach your team members

Imagine you are (like) me, a seasoned search relevance engineer who embarks on a new journey. You start a project to implement your company's new killer search service. You go to the project's kickoff and start talking about all the essential search relevance features the teams need to implement. And then, nothing. That intelligent backend engineer, nothing. That funny product owner, nothing. The analytics engineer, indeed, nothing. Imagine you are an intelligent backend engineer and excited about a new project. You are going to build a new killer search service. This seasoned search relevance engineer starts blabbering about Elasticsearch, Solr, Vector Search, Hybrid Search, BM25, facets, and filters at the project kickoff. You have no clue what that person is telling you. You gaze at the person and hope to hear something familiar. Sounds familiar? Then this talk is for you. Learn about working together on search relevance projects.

Jettro Coenradie
Location: Theater 7

3:30-5:00pm EDT Lightning Talks

Quick discussions about anything around search relevance!


Location: Theater 5

5:30-6:30pm EDT Haystack Reception (included with registration)

All attendees are welcome. The location is Kardinal Hall. It is about a 10 minute walk from the conference venue.

Location: Kardinal Hall

6:30-8:00pm EDT Dinner (included with registration)

All attendees are welcome. The location is Kardinal Hall. It is about a 10 minute walk from the conference venue.

Location: Kardinal Hall

Day 2, Wednesday, April 26th, 2023

Time Track 1 Track 2
8:00-9:00am EDT Coffee

Location: Entrance of the Violet Crown

9:00-9:15am EDT Welcome Back

Location: Theater 5

9:15-10:00am EDT AMA with the authors of AI-Powered Search

Trey Grainger, Doug Turnbull, and Max Irwin will all be attending Haystack this year, and we will host a Q&A / Ask Me Anything on AI-powered search. The final chapter of our AI-powered search book (https://www.manning.com/books/ai-powered-search) will be released in ebook form right before Haystack (we're just polishing off some additional material on foundation models / LLMs), and the book will be coming out about a month later in print.

Trey Grainger & Doug Turnbull & Max Irwin
Location: Theater 5

10:15-11:00am EDT Better Semantic Search with Hybrid (Sparse-Dense) Search

Vector search has become increasingly popular, especially with the recent growth of dense embedding models. However, these models require large amounts of data for training and fine-tuning, which is problematic when data is scarce and domain-specific terminology is crucial. Before dense embedding models were widely used, keyword-based algorithms like TF-IDF and BM25, which produce sparse embeddings, were the go-to solutions. While these algorithms perform well, they don't allow us to query naturally, as we often don't know the exact terms we're looking for. On the other hand, dense embeddings allow us to search based on the intended "semantic meaning" rather than the exact term. Hybrid search aims to combine the strengths of sparse and dense embedding models. This approach has the potential to significantly improve vector search accuracy and usefulness in a wide range of situations. In this talk, we'll learn how we can leverage hybrid search to build better semantic search applications.

Roie Schwaber-Cohen
Location: Theater 5
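
One common way to merge sparse and dense result lists without having to reconcile their incompatible score scales is Reciprocal Rank Fusion (RRF); this sketch is one illustrative option, not necessarily the technique the talk covers:

```python
def rrf(rankings, k=60):
    """Fuse several ranked lists of doc ids via Reciprocal Rank Fusion.

    Each document earns 1 / (k + rank) from every list it appears in;
    k=60 is the constant from the original RRF paper.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)
```

Because RRF only looks at ranks, it works even when the sparse retriever emits BM25 scores in the tens and the dense retriever emits cosine similarities below 1.0.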

Exploiting Citation Networks in Large Corpora to Improve Relevance on Broad Queries

We at Lexum host and manage legal databases: largely unstructured text corpora comprised of millions of documents, each several thousand words long. In such corpora, broad search queries such as “eavesdropping” or “residential eviction enforcement” are difficult to rank. Thousands of documents discuss these topics intently, but which should the user see first? We assert that ranking by authority is intuitive and meets most users’ expectations. We have created an algorithm that analyzes a corpus’s citation network and identifies the most cited documents in the context of the user’s query. Heavily cited documents are inferred to be more authoritative. This approach can even rescue relevant documents that were initially missed because they do not contain the query’s terms. We will present the math behind our algorithm, our Lucene/Solr implementation, and how we put the algorithm into production by merging traditional ranking methods with this new ranking approach.

Marc-André Morissette
Location: Theater 7
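
The core idea, ranking retrieved documents by how often the rest of the corpus cites them, can be sketched in a toy form (the production algorithm described in the talk blends this with traditional ranking signals):

```python
def rank_by_citations(results, citation_graph):
    """Order retrieved doc ids by in-result citation counts, most cited first.

    citation_graph maps each doc id to the list of doc ids it cites.
    """
    counts = {doc: 0 for doc in results}
    for cited_list in citation_graph.values():
        for target in cited_list:
            if target in counts:
                counts[target] += 1
    return sorted(results, key=lambda d: counts[d], reverse=True)
```

A more authoritative (heavily cited) decision surfaces first even when several thousand documents all match the broad query terms.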

11:15am-12:00pm EDT Enterprise Search Relevance at Box: Simplicity

At Box, we serve millions of users across many thousands of enterprises. The set of enterprises served spans the scale from startups and non-profits to some of the largest corporations in the world. The documents in each enterprise are in different languages and media types, and require the strongest privacy and security controls. How do we deliver relevant search results, across such a diverse set of users, cultures, and documents in a scalable way? How do we preserve the privacy and security of our customers? This talk will focus on our current solution: a Solr search engine and a TensorFlow Ranking model to rerank the Top K results per query. Over the years, we have tuned our Solr query parameters using a mix of heuristics, data analysis and genetic algorithms. From this initial set of documents that could be returned to the user, we extract a number of numeric and binary features, add in recency information, and feed this into our global reranking algorithm.

Jay Franck
Location: Theater 5

A Cheap Trick for Semantic Question Answering for the GPU challenged

The ability to handle long question style queries is often de rigueur for modern search engines. Search giants such as Bing and Google are addressing this by building Large Language Models (LLMs) into their search pipelines. Unfortunately, this approach requires large investments in infrastructure and involves high operational costs. It can also lead to loss of confidence when the LLM hallucinates non-factual answers. A best practice for designing search pipelines is to make the search layer as cheap and fast as possible, and move heavyweight operations into the indexing layer. With that in mind, we present an approach that combines the use of LLMs during indexing to generate questions from passages, and matching them to incoming questions during search, using either text based or vector based matching. We believe this approach can provide good quality question answering capabilities for search applications and address the cost and confidence issues mentioned above.

Sujit Pal
Location: Theater 7
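
The indexing-time idea can be sketched as follows; `question_generator` stands in for the LLM call made once per passage, and token overlap stands in for the text- or vector-based matching the talk mentions:

```python
def build_question_index(passages, question_generator):
    """Map each generated question (as a token set) to its source passage id."""
    index = []
    for pid, passage in passages.items():
        for question in question_generator(passage):
            index.append((set(question.lower().split()), pid))
    return index

def best_passage(query, index):
    """Return the passage id whose generated question overlaps the query most."""
    tokens = set(query.lower().split())
    return max(index, key=lambda entry: len(tokens & entry[0]))[1]
```

The expensive LLM work happens entirely at indexing time; serving a query is just a cheap match against the pre-generated questions, which is the cost/latency trade-off the abstract argues for.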

12:00pm-1:30pm EDT Lunch

Find lunch at one of the many options available on Charlottesville’s Downtown Mall

Location: Your choice!

1:30-2:30pm EDT Women of Search present Building Recommendation Systems with Vector Search

Erika will give an update from the Women of Search group formed in Relevance Slack and will then deliver her talk: "Bad search and recommendation is a major loss for e-commerce businesses. Haystack 2022 keynote speaker Dmitry Kan stated that “nearly $300 billion is lost each year from bad online search experiences.” With such a high impact, search and recommendation is an important area to focus on. To begin, we need to represent items and users. They are both typically represented as vectors and indexed for fast computation. Ref2Vec is a feature in Weaviate that converts users to vectors. Ref2Vec presents a graph-structured interface for connecting users and their online interactions to create a digital fingerprint. We can construct a bipartite graph between users and products and represent the user as the average representation of “liked” products. We then achieve recommendation by searching with the user vector as the query. Listeners will gain an understanding of how vector search impacts recommendation and learn how to build a recommendation system."

Erika Cardenas
Location: Theater 5
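
The user-as-average-of-liked-items representation described above can be sketched in a few lines (illustrative only; Weaviate's Ref2Vec performs this inside the database and at much larger scale):

```python
def user_vector(liked_item_vectors):
    """Average the vectors of a user's liked items into a single user vector."""
    n, dim = len(liked_item_vectors), len(liked_item_vectors[0])
    return [sum(vec[i] for vec in liked_item_vectors) / n for i in range(dim)]

def recommend(uvec, catalog, top_k=3):
    """Return the catalog item ids closest to the user vector by dot product."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))
    return sorted(catalog, key=lambda item: dot(uvec, catalog[item]),
                  reverse=True)[:top_k]
```

Recommendation then reduces to an ordinary vector search with the user vector as the query, which is exactly the framing the talk builds on.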

2:45-3:30pm EDT Elasticsearch: What if your database was a search engine?

Elasticsearch is a search and data analysis engine for indexing and visualizing data. It is also among the top 10 data management systems in terms of popularity among the community. Its NoSQL features certainly help: the ability to store very large volumes of data, the flexibility of its schema, the rich API, read performance, as well as some relational features like joins and nested documents. In this talk, we'll go over the features that turn Elasticsearch into a real database and explain how they're used in different use cases.

Benjamin Dauvissat
Location: Theater 5

Populating and leveraging semantic knowledge graphs to supercharge search

Detecting, acquiring and organizing knowledge from text using automated NLP methods enables us to search for “things”, rather than merely “strings.” We will show how to recognize conceptual entities and relationships from documents written in English. We then insert these objects into a semantic knowledge graph. Doing so unlocks new options. We may use this system offline to help build synonym lists or help us enrich documents. Using it online, we may do query expansion or relaxation, tune boosts, choose an optimal query handler, or perform user intent recognition. We will show that it is useful to load additional text to teach the system about the existence of additional things and how they relate to common concepts in our domain. Before, obscure long-tail queries had no chance of matching our domain focused corpus; now, we reduce the frequency of zero result searches, capturing new business. Subject matter experts may explore the graph and make additions and corrections.

Chris Morley
Location: Theater 7
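
Online query expansion against such a graph can be sketched as a simple lookup; the graph contents below are made-up examples, not data from the talk:

```python
def expand_query(terms, knowledge_graph):
    """Append related concepts from the knowledge graph to the query terms."""
    expanded = list(terms)
    for term in terms:
        for related in knowledge_graph.get(term, []):
            if related not in expanded:
                expanded.append(related)
    return expanded
```

Expanding an obscure long-tail term into related concepts the corpus actually contains is what turns a zero-result query into a match.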

3:45-4:30pm EDT Stop Hallucinations and Half-Truths in Generative Search

You integrated a Large Language Model into your search system to interpret, answer questions, and summarize over search results. Congratulations, you are now running Generative Search! Then, disaster strikes: The LLM shows users a hallucinated, non-factual answer, even though your relevance system worked perfectly and you grounded the model with search results! The trust painstakingly built up by your search stack is gone in an instant, and users may abandon your platform. Your generative search story doesn’t need to end like this. Many strategies, both pre- and post-generation, can mitigate the risk of showing users answers that are false or partly true with respect to search results. In this talk we will explore examples of the problem, understand the root causes, and dive into proven solutions. Techniques covered include reranking, user warnings, fact-checking systems, and LLM usage patterns, prompting, and fine-tuning.

Colin Harman
Location: Theater 5

Selecting the RIGHT Measures for Your Search Product

We spend a lot of time, energy and money trying to improve the quality of our search products. But how do you know whether your search improvements are paying off? If you don’t have search measures, you really should. If you do have search measures, you really should make sure they are the right ones. This talk will encourage you to rethink how you approach search measurement and optimization.

Tito Sierra
Location: Theater 7

4:30-4:45pm EDT Closing

Location: Theater 5