Tim Allison • Location: Theater 4 • Haystack 2020
A carefully built, well-applied and intuitively surfaced taxonomy can help us reach the holy grail of search: increased recall and precision in results. This talk looks at how we construct taxonomies at LexisNexis that match content volume and user need, providing topics granular enough to be actionable, insightful and helpful in decision-making. It covers both rule-based and machine learning classification approaches to content enrichment and demonstrates how we combine them to achieve compelling and differentiating accuracy. It looks at the various ways we have surfaced taxonomy in our products, both to get our users to the comprehensive information they want faster and, increasingly, to drive powerful analytics. Finally, we look to the future and consider how we can increase usage of our taxonomies, both explicitly and ‘hands-free’, in our products to continue to boost relevance, improve findability and better understand query intent.
Tim has been working in natural language processing since 2002. In the last 5+ years, his focus has shifted to content/metadata extraction (and evaluation), advanced search and relevance tuning. Tim is the founder of Rhapsode Consulting LLC, and he currently works as a data scientist at NASA's Jet Propulsion Laboratory. Tim is a member of the Apache Software Foundation (ASF), the chair/VP of Apache Tika, and a committer on Apache OpenNLP (2020), Apache Lucene/Solr (2018), Apache PDFBox (2016) and Apache POI (2013). Tim holds a Ph.D. in Classical Studies, and in a former life, he was a professor of Latin and Greek.