Elasticsearch provides a great feature called search-as-you-type, which lets us implement a search box that behaves like Google's. In a recent project we had exactly this requirement: our database held a large number of name, state, city, company, and person records, and we had to support ad-hoc queries over them. We chose Elasticsearch (ES) as the tool for this use case.
Elasticsearch provides many features out of the box, and the one I like most is autocomplete. I remember the days when we implemented this feature with AJAX, running wildcard queries such as %ELASTIC% against the database to get suggestions. ES takes a different approach: by using appropriate analyzers while indexing, we can shape the indexed data so that we don't have to run phrase queries at all, and plain exact-match queries are enough to implement the feature.
For example, say we have a movie database with the entries below:
Reservoir Dogs
Airplane
Doctor Zhivago
The Deer Hunter
The Lord of the Rings
With the standard analyzer, the inverted index contains one token per whole word of each title (reservoir, dogs, airplane, and so on).
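To make that index concrete, here is a minimal sketch that builds it in plain Python. It only approximates the standard analyzer (split on non-word characters, then lowercase), and the document IDs 1 to 5 are made up for illustration.

```python
import re
from collections import defaultdict

# Hypothetical document IDs; the titles are the ones listed above.
MOVIES = {
    1: "Reservoir Dogs",
    2: "Airplane",
    3: "Doctor Zhivago",
    4: "The Deer Hunter",
    5: "The Lord of the Rings",
}


def standard_tokens(text):
    """Rough stand-in for the standard analyzer: whole words, lowercased."""
    return [token.lower() for token in re.findall(r"\w+", text)]


def build_index(docs):
    """Map every token to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, title in docs.items():
        for token in standard_tokens(title):
            index[token].add(doc_id)
    return index


index = build_index(MOVIES)
for token in sorted(index):
    print(token, sorted(index[token]))
```

In this sketch the token the points at documents 4 and 5, while there is no entry at all for th, which is exactly the problem for autocomplete.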
Now suppose everything is indexed with the standard analyzer. When we type Th, we get no suggestions, because there is no token Th in the inverted index. The only way around that would be to query for all words starting with the typed prefix (such as T*), which is inefficient. When we type The, a match query returns two suggestions, The Deer Hunter and The Lord of the Rings, but we wanted these suggestions to pop up as soon as we typed Th.
To support this, Elasticsearch provides n-gram analyzers. For search-as-you-type we use a specialized form of n-grams called edge n-grams, which are anchored to the beginning of the word. Edge n-gramming the word quick would result in this (taken from the ES guide):
q
qu
qui
quic
quick
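The same expansion can be sketched in a few lines of Python. The function name edge_ngrams and its min_gram and max_gram parameters are my own choices for illustration; this is not ES code, just the idea behind it.

```python
def edge_ngrams(word, min_gram=1, max_gram=20):
    """Return the prefixes of `word` whose length is between min_gram and max_gram."""
    longest = min(len(word), max_gram)
    return [word[:size] for size in range(min_gram, longest + 1)]


print(edge_ngrams("quick"))  # ['q', 'qu', 'qui', 'quic', 'quick']
```

Capping the length with max_gram keeps very long words from producing a huge number of tokens.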
Using this analyzer, when we index the movie document The Deer Hunter, we get the following n-grams:
T
Th
The
D
De
Dee
Deer
H
Hu
Hun
Hunt
Hunte
Hunter
These n-grams are the actual tokens stored in our inverted index. When we type T, we issue an exact-match query for the term T, and it returns the two matching document IDs, along with their contents if requested.
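Here is a small simulation of that lookup, again in plain Python rather than ES itself. It assumes the tokens are lowercased at index time and that the typed text is lowercased too, so the stored terms are t, th, the rather than T, Th, The. The document IDs and the helper names (index_tokens, suggest) are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical document IDs; the titles are the ones from the example above.
MOVIES = {
    1: "Reservoir Dogs",
    2: "Airplane",
    3: "Doctor Zhivago",
    4: "The Deer Hunter",
    5: "The Lord of the Rings",
}


def edge_ngrams(word, min_gram=1, max_gram=20):
    """Return the prefixes of `word` whose length is between min_gram and max_gram."""
    longest = min(len(word), max_gram)
    return [word[:size] for size in range(min_gram, longest + 1)]


def index_tokens(title):
    """Index-time analysis: split into lowercased words, then edge n-gram each word."""
    tokens = []
    for word in re.findall(r"\w+", title.lower()):
        tokens.extend(edge_ngrams(word))
    return tokens


def build_index(docs):
    """Map every edge n-gram to the set of document IDs that produced it."""
    index = defaultdict(set)
    for doc_id, title in docs.items():
        for token in index_tokens(title):
            index[token].add(doc_id)
    return index


def suggest(index, docs, typed):
    """Exact term lookup of the typed text (a single word in this sketch)."""
    doc_ids = index.get(typed.lower(), set())
    return [docs[doc_id] for doc_id in sorted(doc_ids)]


index = build_index(MOVIES)
print(suggest(index, MOVIES, "T"))   # ['The Deer Hunter', 'The Lord of the Rings']
print(suggest(index, MOVIES, "Th"))  # ['The Deer Hunter', 'The Lord of the Rings']
print(suggest(index, MOVIES, "Do"))  # ['Reservoir Dogs', 'Doctor Zhivago']
```

Note that the lookup is a plain dictionary access on one term. All the prefix work was done once, at index time, which is the trade this approach makes: a larger index in exchange for cheap queries.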
For a full implementation of the behavior above, and the mapping to use, see the link below.
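As a rough idea of what such a mapping might involve, the sketch below builds an index-creation body as a Python dict and prints it as JSON. The names autocomplete, autocomplete_filter, and title, and the gram sizes 1 and 20, are illustrative assumptions, and the mappings layout assumes a recent ES version without mapping types. Treat it as a starting point to check against the documentation for your version, not as the exact configuration from the project described here.

```python
import json

# Illustrative index body; analyzer, filter, and field names are assumptions.
index_body = {
    "settings": {
        "analysis": {
            "filter": {
                # Token filter that expands each word into its edge n-grams.
                "autocomplete_filter": {
                    "type": "edge_ngram",
                    "min_gram": 1,
                    "max_gram": 20,
                }
            },
            "analyzer": {
                # Index-time analyzer: split into words, lowercase, then edge n-gram.
                "autocomplete": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "autocomplete_filter"],
                }
            },
        }
    },
    "mappings": {
        "properties": {
            "title": {
                "type": "text",
                "analyzer": "autocomplete",
                # Search with the plain analyzer so the typed text is not n-grammed again.
                "search_analyzer": "standard",
            }
        }
    },
}

print(json.dumps(index_body, indent=2))
```

Using a different analyzer at search time matters here: if the typed text were also split into edge n-grams, typing The would search for t, th, and the, and match far more documents than intended.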