"Quick"), remove terms (eg stopwords like
"the"etc) or add terms (eg synonyms like
english analyzer comes with a set of English stopwords (common words like
the, which don't have much impact on relevance), which it removes, and it is able to stem English words because it understands the rules of English grammar.

The
english analyzer would produce the following:

set, shape, semi, transpar, call, set_tran, 5

Note how
"set_trans" have been stemmed to their root form.
date field contains an exact value: the single term
_all field is a full-text field, so the analysis process has
2014, it matches all 12 tweets, because all of them contain the term
2014-09-15, it first analyzes the query string to produce a query which matches any of the terms
15. This also matches all 12 tweets, because all of them contain the term
2014-09-15, it looks for that exact date, and finds one tweet only:
2014, it finds no documents because none contain that exact date:
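The four cases above can be mimicked with a toy inverted index. This is only a sketch of the idea, not how Elasticsearch is implemented: the three documents, their ids, and the helper names `search_exact` and `search_fulltext` are made up for illustration, and the "analysis" is just a split on word characters.

```python
import re
from collections import defaultdict

# Hypothetical documents: the ids and dates are invented for this sketch.
docs = {
    1: "2014-09-15",
    2: "2014-09-16",
    3: "2014-10-01",
}

exact_index = defaultdict(set)     # date-like field: the whole value is one term
fulltext_index = defaultdict(set)  # _all-like field: one entry per analyzed term

for doc_id, date in docs.items():
    exact_index[date].add(doc_id)
    for term in re.findall(r"\w+", date.lower()):
        fulltext_index[term].add(doc_id)


def search_exact(value):
    """Match only documents whose stored value is exactly `value`."""
    return set(exact_index.get(value, set()))


def search_fulltext(query):
    """Analyze the query like the field, then match ANY resulting term."""
    hits = set()
    for term in re.findall(r"\w+", query.lower()):
        hits |= fulltext_index.get(term, set())
    return hits


print(search_fulltext("2014"))        # every document contains the term 2014
print(search_fulltext("2014-09-15"))  # 2014 OR 09 OR 15 still matches all of them
print(search_exact("2014-09-15"))     # only the document with that exact date
print(search_exact("2014"))           # nothing: no document has that exact value
```

The full-text lookups return every document because each one contains the term 2014, while the exact lookups only succeed on a complete, unanalyzed value.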
analyze API to see how text is analyzed. Specify which analyzer to use in the query string parameters, and the text to analyze in the body:
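As a rough sketch of that request shape, the snippet below builds (but does not send) such a request with Python's standard library. The host `http://localhost:9200`, the `standard` analyzer name, the sample text, and the helper `build_analyze_request` are all assumptions for illustration. Later Elasticsearch releases expect the analyzer and text in a JSON body instead, so check the documentation for the version you run.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_analyze_request(text, analyzer="standard", host="http://localhost:9200"):
    """Build a request for the analyze API without sending it.

    The host and the default analyzer are assumed values, not taken from the text.
    """
    query = urlencode({"analyzer": analyzer})  # analyzer goes in the query string
    url = f"{host}/_analyze?{query}"
    # The raw text to analyze goes in the request body.
    return Request(url, data=text.encode("utf-8"), method="GET")


req = build_analyze_request("Text to analyze")
print(req.get_method(), req.full_url)
print(req.data)
```

Passing the request to `urllib.request.urlopen` would send it to a running node; that step is left out so the sketch runs without a cluster.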
token is the actual term that will be stored in the index. The
position indicates the order in which the terms appeared in the original text. The
end_offset indicate the character positions that the original word occupied in the original string.
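To make those fields concrete, here is a minimal sketch of a toy analyzer that emits the same kind of record for each term. It is a stand-in, not the real analyzer: it only splits on word characters and lowercases, the function name `toy_analyze` is hypothetical, and counting positions from 1 is a choice made for this sketch.

```python
import re


def toy_analyze(text):
    """Split text into lowercase word tokens with position and offsets.

    A rough stand-in for a real analyzer: no stopword removal, no stemming.
    """
    tokens = []
    for position, match in enumerate(re.finditer(r"\w+", text), start=1):
        tokens.append({
            "token": match.group().lower(),  # the term that would be indexed
            "start_offset": match.start(),   # index of the word's first character
            "end_offset": match.end(),       # index just past its last character
            "position": position,            # order of the term in the text
        })
    return tokens


for tok in toy_analyze("Text to analyze"):
    print(tok)
```

Slicing the original string with `text[start_offset:end_offset]` gives back the original word, which is what makes the offsets useful for tasks such as highlighting.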
analyze API is a really useful tool for understanding what is happening inside Elasticsearch indices, and we will talk more about it as we progress.
string field and analyzes it with the