Real-Time Entity Resolution with Elasticsearch

Dave Moore • Haystack 2018

10,000 people share my full name. How can I search for information about me and not the other 9,999? Entity resolution gives you the power to disambiguate this search. It's a way to find different records about the same thing, while excluding similar records about different things. And it has an amazing ability to track the changes of an identity, like when you change your name or address.

Disambiguation is vital to anything that needs to run a case study: customer intelligence, patient identification, fraud investigation, and so on. You need to know everything about one thing - and only that thing. This is hard when your data is a mess, or when those names and addresses just won't stop changing.

Elasticsearch can resolve entities in real time with clever uses of structured search. I'll give a light introduction to the concept, dive into best practices, and bring it to life with examples that have worked in the field - from people to companies to devices. You'll learn when to use real-time entity resolution and how to do it right.
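
To make "structured search" a little more concrete, here is a minimal sketch of one way such a query could be assembled. It is not taken from the talk. It assumes hypothetical exact-match fields (`name`, `dob`, `phone`, `email`) and a hypothetical list of "matchers", each naming a combination of fields that is considered enough to identify the same entity. The helper `build_resolution_query` is illustrative only. It returns a plain dictionary in the Elasticsearch Query DSL (a `bool` query whose `should` clauses each require every field of one matcher to agree), which could then be sent as the body of a search request.

```python
import json


def build_resolution_query(known, matchers):
    """Build a bool query that matches a record when it agrees with the
    known attribute values on every field of at least one matcher.

    known    -- dict of attribute values we already hold for the entity
    matchers -- list of field-name tuples (hypothetical matching rules)
    """
    should = []
    for fields in matchers:
        # Skip any matcher that needs a value we do not have yet.
        if not all(field in known for field in fields):
            continue
        # All fields of this matcher must match exactly (assumes the
        # fields are indexed for exact-value lookups).
        should.append(
            {"bool": {"filter": [{"term": {f: known[f]}} for f in fields]}}
        )
    if not should:
        raise ValueError("no matcher can be evaluated with the known values")
    # A record qualifies if at least one matcher is fully satisfied.
    return {"query": {"bool": {"should": should, "minimum_should_match": 1}}}


# Hypothetical example values, for illustration only.
example_known = {"name": "dave moore", "phone": "555-0100"}
example_matchers = [("name", "dob"), ("name", "phone"), ("email",)]
print(json.dumps(build_resolution_query(example_known, example_matchers), indent=2))
```

One possible way to follow an identity as it changes, again as an assumption rather than the speaker's stated method, is to run a query like this repeatedly: each round adds any new names, addresses, or other values found in the matching records to `known` and searches again until nothing new turns up.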

Dave Moore is a solutions architect at Elastic, where he helps people succeed with real-time search and analytics at scale. In his past life he provided expertise on identification technologies to federal and enterprise customers. Using Hadoop and Spark, he designed and implemented large-scale entity resolution systems, including the patient identification system used by one of the largest health care companies in the world. Now he is applying that expertise with the Elastic Stack.