The Relevance of Solr's Semantic Knowledge Graph
Trey Grainger • Haystack 2018
The Semantic Knowledge Graph is an Apache Solr plugin that can be used to discover and rank the relationships between arbitrary queries or terms within the search index. It is a relevancy Swiss Army knife, able to discover related terms and concepts, disambiguate different meanings of terms given their context, clean up noise in datasets, discover previously unknown relationships between entities across documents and fields, rank lists of keywords based upon conceptual cohesion to reduce noise, summarize documents by extracting their most significant terms, generate recommendations and personalized search, and power numerous other applications involving anomaly detection, significance/relationship discovery, and semantic search. This talk will walk you through how to set up and use this plugin in concert with other open source tools (probabilistic query parser, SolrTextTagger for entity extraction) to parse, interpret, and model the true intent of user searches far more accurately than traditional keyword-based search approaches.
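To make "discover and rank related terms" concrete, here is a minimal sketch in plain Python. It is not the plugin's API and does not talk to Solr. It assumes a toy in-memory corpus and scores each candidate term by how over-represented it is in the documents matching a query term (the foreground) compared with the whole corpus (the background). The function name `related_terms`, the sample documents, and the z-score formula are all assumptions chosen for illustration; the plugin's actual scoring may differ.

```python
import math
from collections import Counter


def related_terms(docs, query_term, top_n=3):
    """Rank terms by over-representation in docs that contain query_term.

    Hypothetical helper for illustration only, not part of Solr.
    """
    # Treat each document as a set of lowercase whitespace-separated terms.
    doc_sets = [set(d.lower().split()) for d in docs]
    foreground = [s for s in doc_sets if query_term in s]
    n_fg, n_bg = len(foreground), len(doc_sets)
    if n_fg == 0:
        return []

    bg_counts = Counter(t for s in doc_sets for t in s)
    fg_counts = Counter(t for s in foreground for t in s)

    scores = {}
    for term, fg in fg_counts.items():
        if term == query_term:
            continue
        p = bg_counts[term] / n_bg  # background probability of the term
        if p >= 1.0:
            continue  # term appears in every document: no signal
        expected = n_fg * p  # foreground count expected by chance
        # Assumed scoring: binomial z-score of observed vs. expected count.
        scores[term] = (fg - expected) / math.sqrt(n_fg * p * (1 - p))

    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


docs = [
    "java developer spring hibernate",
    "java engineer spring",
    "nurse hospital care",
    "java coffee barista",
    "nurse care clinic",
    "barista coffee cafe",
]
for term, score in related_terms(docs, "java"):
    print(f"{term}: {score:.2f}")
```

In this toy corpus, "spring" ranks first for "java" because it occurs only alongside it, while "coffee" scores zero because it is no more common in the "java" documents than in the corpus overall. That same foreground-versus-background contrast is what lets context separate the different senses of an ambiguous term.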
View the Slides
Trey is the SVP of Engineering at Lucidworks, where he leads engineering efforts around both Apache Lucene/Solr and Lucidworks' commercial product offerings. Trey is also the co-author of the book Solr in Action, as well as a published researcher and frequent public speaker on data science topics related to search, analytics, recommendation systems, and natural language processing. Trey previously served as Director of Engineering at CareerBuilder, developing semantic search, recommendations, and data analytics products powering billions of searches a month across billions of documents. Trey holds degrees from Georgia Tech (MBA in Management of Technology) and Furman University (Bachelor's in CS, Business, Philosophy), and has also completed Master's-level work in Information Retrieval and Web Search from Stanford University.