Elasticsearch Full Text Queries vs Term Level Queries

When a new document is indexed in Elasticsearch, its fields are analyzed and stored in an inverted index for faster searching. "text" fields go through the analysis process (tokenizing and, with the default analyzer, lowercasing) before they are stored, while "keyword" fields and other exact-match field types are stored as is.

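Full text queries analyze the query string before executing, by default with the same analyzer that was applied to the field at index time. Term-level queries operate on the exact terms stored in the inverted index and do not analyze the query value. The one exception is a "keyword" field that has the normalizer property set: the term-level query value is normalized in the same way before matching.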
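To make the difference concrete, here is a minimal Python sketch of which terms end up in the index and in the query. It is a toy model, not Elasticsearch code: analyze() only approximates the default analyzer (split into words, lowercase), and the names indexed_terms and query_terms are made up for this illustration.

```python
import re

def analyze(text):
    # Rough stand-in for the default analyzer: word tokens, lowercased.
    return [tok.lower() for tok in re.findall(r"\w+", text)]

def indexed_terms(value, field_type, normalizer=None):
    """Terms written to the (toy) inverted index for one field value."""
    if field_type == "text":
        return analyze(value)
    # "keyword": one term, stored as is unless a normalizer is configured.
    return [normalizer(value) if normalizer else value]

def query_terms(query_string, kind, normalizer=None):
    """Terms a query looks up: analyzed for full text, untouched for term-level."""
    if kind == "match":
        return analyze(query_string)
    # Term-level: only a keyword normalizer (if any) touches the value.
    return [normalizer(query_string) if normalizer else query_string]

print(indexed_terms("Quick Brown Fox!", "text"))                # ['quick', 'brown', 'fox']
print(indexed_terms("Quick Brown Fox!", "keyword"))             # ['Quick Brown Fox!']
print(indexed_terms("Quick Brown Fox!", "keyword", str.lower))  # ['quick brown fox!']
print(query_terms("Quick Fox", "match"))                        # ['quick', 'fox']
print(query_terms("Quick Fox", "term"))                         # ['Quick Fox']
```

A query only finds a document when the terms it looks up are among the terms that were indexed, which is why the two query families behave differently on the same field.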

Full text queries are usually run on full text fields, such as the body of an email. Term-level queries are usually used for structured data such as numbers, dates, and enums, rather than full text fields.

 

Example (Code)

This example uses the accounts JSON sample data imported from this link.

Consider the following two queries:

Case 1 - Full text query

GET /accounts/_search
{
  "query": {
    "match": {
      "city":"Hamilton"
    }
  }
}

This returns the record whose city is Hamilton.

 

Case 2 - Term Query

GET /accounts/_search
{
  "query": {
    "term": {
      "city":"Hamilton"
    }
  }
}

Note:

  1. This will not return anything. The "city" field used in the query is a "text" field. All "text" fields are stored in the inverted index after the analysis process (the default analyzer lowercases text), so the stored term is "hamilton". Full text queries (e.g. match queries) put the query string through the same analysis process before matching. Term queries do not analyze the query value, so "Hamilton" is compared as is and nothing matches.

  2. In this case, if you change the city value to lowercase ("city":"hamilton"), the term query will return the document, because that is how the value is actually stored in the inverted index.

  3. Unlike "text" fields, "keyword" fields are not analyzed. With the default dynamic mapping, a "keyword" subfield is created for every string field that is mapped as "text", and it can be used for exact matches. Just change "city" to "city.keyword" in the above example and it will return the result.
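The three notes above can be replayed with a small Python toy model of an inverted index. This is only a sketch under stated assumptions: the two documents are hypothetical, analyze() approximates the default analyzer, and term() and match() are illustrative helpers, not Elasticsearch client calls.

```python
import re
from collections import defaultdict

def analyze(text):
    # Approximation of the default analyzer: word tokens, lowercased.
    return [tok.lower() for tok in re.findall(r"\w+", text)]

# Hypothetical sample documents (not the real accounts data).
docs = {1: {"city": "Hamilton"}, 2: {"city": "Brogan"}}

# field name -> term -> set of document ids
index = defaultdict(lambda: defaultdict(set))
for doc_id, doc in docs.items():
    for tok in analyze(doc["city"]):                 # "text" field: analyzed
        index["city"][tok].add(doc_id)
    index["city.keyword"][doc["city"]].add(doc_id)   # "keyword" subfield: as is

def term(field, value):
    # Term query: look the value up exactly as given.
    return set(index[field].get(value, set()))

def match(field, query):
    # Match query: analyze the query string, then OR the resulting terms.
    hits = set()
    for tok in analyze(query):
        hits |= index[field].get(tok, set())
    return hits

print(match("city", "Hamilton"))          # {1}
print(term("city", "Hamilton"))           # set()
print(term("city", "hamilton"))           # {1}
print(term("city.keyword", "Hamilton"))   # {1}
```

The stored term for the "text" field is "hamilton", so only a lowercase term lookup or an analyzed match lookup finds it, while the "keyword" subfield keeps the original casing.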

 

Example (Query APIs)

The queries that come under full text query group are:

  1. match - Searches the analyzed text. The terms in the "match" query string are ORed by default. We can change the operator to AND.

  2. match_phrase - Searches the analyzed text. The terms in the "match_phrase" query string are searched as a phrase, i.e. they must appear in the same order, next to each other.

  3. match_phrase_prefix

  4. multi_match - Run same query on multiple fields.

  5. common (common terms query)

  6. query_string

  7. simple_query_string

Note: See referenced pages for explanations.
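As a rough illustration of how match with the OR and AND operators differs from match_phrase, here is a Python toy model. The documents are hypothetical, analyze() approximates the default analyzer, and scoring, slop, and the other options of the real queries are deliberately left out.

```python
import re

def analyze(text):
    # Approximation of the default analyzer: word tokens, lowercased.
    return [tok.lower() for tok in re.findall(r"\w+", text)]

# Hypothetical documents: id -> value of one full text field.
docs = {
    1: "Quick brown fox",
    2: "Quick red fox",
    3: "Brown bear",
}

def match(query, operator="or"):
    # "or": any query term may match; "and": every query term must match.
    q_terms = analyze(query)
    combine = all if operator == "and" else any
    return {doc_id for doc_id, text in docs.items()
            if combine(t in analyze(text) for t in q_terms)}

def match_phrase(query):
    # All query terms must appear next to each other, in the same order.
    q_terms = analyze(query)
    hits = set()
    for doc_id, text in docs.items():
        toks = analyze(text)
        for i in range(len(toks) - len(q_terms) + 1):
            if toks[i:i + len(q_terms)] == q_terms:
                hits.add(doc_id)
                break
    return hits

print(match("brown fox"))             # {1, 2, 3}
print(match("brown fox", "and"))      # {1}
print(match_phrase("quick brown"))    # {1}
print(match_phrase("brown quick"))    # set()
```

In every case the query string is analyzed first, which is what puts these queries in the full text group.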

 

The queries that come under term level query group are:

  1. term - Look for exact values.

  2. terms - Look for documents where the field matches any of the exact values in a list.

  3. terms_set

  4. range - Look for numbers or dates in a given range.

  5. exists - Find documents where a field exists.

  6. prefix

  7. wildcard

  8. regexp

  9. fuzzy

  10. type

  11. ids

Note: See referenced pages for explanations.
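The following Python sketch mimics what a few of the term-level queries above check (term, terms, range, exists, prefix). It is a toy model over hypothetical documents with made-up helper names. No analysis happens anywhere, which is the point of this query group.

```python
# Hypothetical structured documents (field names are illustrative).
docs = {
    1: {"age": 32, "state": "IL", "employer": "Pyrami"},
    2: {"age": 28, "state": "TN"},
    3: {"age": 39, "state": "IL", "employer": "Quility"},
}

def term(field, value):
    # Exact value match.
    return {d for d, doc in docs.items() if field in doc and doc[field] == value}

def terms(field, values):
    # Matches if the field equals any value in the list.
    return {d for d, doc in docs.items() if field in doc and doc[field] in values}

def range_query(field, gte=None, lte=None):
    # Inclusive lower and upper bounds; either bound may be omitted.
    hits = set()
    for d, doc in docs.items():
        if field not in doc:
            continue
        value = doc[field]
        if gte is not None and value < gte:
            continue
        if lte is not None and value > lte:
            continue
        hits.add(d)
    return hits

def exists(field):
    # Documents that have a value for the field.
    return {d for d, doc in docs.items() if doc.get(field) is not None}

def prefix(field, value):
    # Documents whose exact stored value starts with the given prefix.
    return {d for d, doc in docs.items()
            if field in doc and str(doc[field]).startswith(value)}

print(term("state", "IL"))                  # {1, 3}
print(terms("state", ["TN", "CA"]))         # {2}
print(range_query("age", gte=30, lte=35))   # {1}
print(exists("employer"))                   # {1, 3}
print(prefix("employer", "Py"))             # {1}
```

Because values are compared exactly as stored, term("state", "il") would return nothing in this model, just like the term query on the analyzed "city" field in the earlier example.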
