Enterprise Search Relevance at Box: Simplicity

Jay Franck • Location: Theater 5 • Haystack 2023

At Box, we serve millions of users across many thousands of enterprises. These enterprises range from startups and non-profits to some of the largest corporations in the world. Their documents span many languages and media types, and require the strongest privacy and security controls. How do we deliver relevant search results across such a diverse set of users, cultures, and documents in a scalable way? How do we preserve our customers' privacy and security? This talk will focus on our current solution: a Solr search engine and a TensorFlow Ranking model that reranks the Top K results per query. Over the years, we have tuned our Solr query parameters using a mix of heuristics, data analysis, and genetic algorithms. From the initial set of documents that Solr could return to the user, we extract a number of numeric and binary features, add in recency information, and feed these into our global reranking algorithm.
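The retrieve-then-rerank flow described above can be illustrated with a minimal Python sketch. It is not Box's implementation: the `Candidate` fields, the 30-day recency decay, and the fixed linear `WEIGHTS` are all hypothetical stand-ins, and the weighted sum simply takes the place of a trained TensorFlow Ranking model so the example runs without external dependencies.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """One first-pass search hit (all field names are hypothetical)."""
    doc_id: str
    solr_score: float   # numeric feature: first-pass retrieval score
    title_match: bool   # binary feature: query matched the title
    age_days: float     # days since the document was last modified


def features(c: Candidate) -> List[float]:
    """Build the feature vector: numeric, binary, and recency features."""
    # Assumed exponential decay with a 30-day scale; the talk does not
    # specify how recency is encoded.
    recency = math.exp(-c.age_days / 30.0)
    return [c.solr_score, 1.0 if c.title_match else 0.0, recency]


# Hypothetical weights standing in for a trained ranking model.
WEIGHTS = [0.5, 1.0, 2.0]


def model_score(x: List[float]) -> float:
    """Stand-in scorer: a plain weighted sum of the features."""
    return sum(w * v for w, v in zip(WEIGHTS, x))


def rerank(candidates: List[Candidate], k: int) -> List[Candidate]:
    """Rerank only the top-k first-pass hits; keep the tail order as is."""
    first_pass = sorted(candidates, key=lambda c: c.solr_score, reverse=True)
    top_k, tail = first_pass[:k], first_pass[k:]
    reranked = sorted(top_k, key=lambda c: model_score(features(c)), reverse=True)
    return reranked + tail
```

Restricting the second-stage model to the Top K keeps the cost per query bounded regardless of how many documents match, which is the usual motivation for this two-stage design.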


Jay Franck

Box

After exploring the Universe with Big Data as an Astronomer, Jay moved to the exciting field of Machine Learning. His first role was at PlayStation building Relevance systems for games in the store, as well as Game Help and Activities. His most recent role is at Box, improving Search Relevance for Customers across industries and countries. He values simple, cost-effective solutions over complicated 'state-of-the-art' systems.