Search Engines: Combining Inverted and ANN Indexes for Scale

Anubhav Bindlish • Location: TUECHTIG • Haystack EU 2023

Search engines have traditionally employed inverted indexes to quickly filter documents. With the rise of vector embeddings and large language models, search engines are now adding approximate nearest neighbor (ANN) indexes.

Combining inverted and ANN indexes in the same system introduces several implementation challenges, including:

How to handle the large amount of RAM required to hold vector data and index structures
How to distribute an ANN graph across multiple shards and avoid expensive reindexing
How to update vector embeddings or metadata quickly
How to avoid contention between heavy indexing and vector search

We will discuss these challenges and how to design a system that can efficiently leverage multiple indexes in parallel for hybrid search. We'll also discuss how combining traditional and new approaches to search in one system can yield better results than using two separate database solutions.
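
As a rough illustration of what hybrid search over the two index types means, here is a minimal Python sketch. It is not taken from the talk or from Rockset: the `HybridIndex` class, its method names, and the sample documents are all hypothetical, and an exhaustive cosine-similarity scan stands in for a real ANN index purely to keep the example short.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class HybridIndex:
    """Hypothetical toy index: inverted index for term filters plus a vector store."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> set of document ids
        self.vectors = {}                 # document id -> embedding

    def add(self, doc_id, terms, vector):
        for term in terms:
            self.postings[term].add(doc_id)
        self.vectors[doc_id] = vector

    def search(self, required_terms, query_vector, k=2):
        # Step 1: the inverted index narrows the candidate set to documents
        # that contain every required term.
        term_sets = [self.postings.get(t, set()) for t in required_terms]
        candidates = set.intersection(*term_sets) if term_sets else set(self.vectors)
        # Step 2: rank the candidates by vector similarity. This is an
        # exhaustive scan (an assumption for brevity), not a real ANN lookup.
        ranked = sorted(
            candidates,
            key=lambda d: cosine(query_vector, self.vectors[d]),
            reverse=True,
        )
        return ranked[:k]


index = HybridIndex()
index.add("d1", ["search", "vector"], [1.0, 0.0])
index.add("d2", ["search"], [0.9, 0.1])
index.add("d3", ["vector"], [0.0, 1.0])

results = index.search(["search"], [1.0, 0.0], k=2)
print(results)
```

Filtering before ranking is only one possible ordering. A production system might instead query the ANN index first and apply the filter afterwards, or interleave the two, depending on how selective the filter is.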


Anubhav Bindlish

Rockset

Anubhav joined Rockset as a software engineer in 2021 and has been working in the data indexing and query execution space. Prior to this, he worked at Meta Platforms (Facebook) for 5 years, where he was on the Integrity Infrastructure team building a platform that employed ML rules to keep bad actors off Facebook.