[Recipes] Elasticsearch Query DSL - Pagination and Sorting

Problem: 

Use Elasticsearch Query DSL to perform pagination and sorting within the query context.

Solution Summary: 

Elasticsearch provides the "from" and "size" parameters for pagination. A search returns up to "size" hits, starting at the zero-based offset given by "from" (from=0 starts at the first hit, from=1 at the second, and so on).

If "from" is not specified, it defaults to 0. If "size" is not specified, it defaults to 10.

Sorting is done with the "sort" element. Use "asc" for ascending order and "desc" for descending order.

Prerequisites: 

Set up the accounts index from accounts.json as explained here.

Solution Steps: 

Case 1 – Search that returns documents 11 through 20 (Pagination)

GET /accounts/_search
{
  "query": { "match_all": {} },
  "from": 10,
  "size": 10
}

Note: 

  1. The match_all query simply matches all documents in the specified index.

  2. If from is not specified, it defaults to 0. If size is not specified, it defaults to 10.
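The relationship between a page number and "from"/"size" can be sketched in a few lines of Python. This is only an illustration: the helper names page_request and covered_positions are made up for this recipe and are not part of any Elasticsearch client. It builds the request body only and does not talk to a cluster.

```python
def page_request(page, page_size=10):
    """Build a search body for a 1-based page number (hypothetical helper)."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    return {
        "query": {"match_all": {}},
        "from": (page - 1) * page_size,  # zero-based offset of the first hit
        "size": page_size,               # maximum number of hits to return
    }


def covered_positions(body):
    """Return the 1-based (first, last) result positions a body asks for."""
    start = body.get("from", 0)   # "from" defaults to 0 when omitted
    size = body.get("size", 10)   # "size" defaults to 10 when omitted
    return start + 1, start + size


body = page_request(page=2)
print(body["from"], body["size"])                        # 10 10
print(covered_positions(body))                           # (11, 20)
print(covered_positions({"query": {"match_all": {}}}))   # (1, 10)
```

Page 2 with a page size of 10 gives the same "from": 10, "size": 10 body used in Case 1.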

 

Case 2 - Sort the results by account balance in descending order (Sorting)

GET /accounts/_search
{
  "query": { "match_all": {} },
  "sort": { "balance": { "order": "desc" } }
}

Note:

  1. The sort expression can be written in short hand form as: "sort": { "balance": "desc" }. 

  2. For ascending order, we can use asc instead of desc.

  3. The above query returns only 10 records, as the default size is 10.

  4. You c
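To make the shorthand and the long form of the sort expression concrete, here is a small Python sketch that expands the shorthand into the long form. The function normalize_sort is a hypothetical helper written for this recipe; Elasticsearch accepts both forms directly, so this is not something you need in practice.

```python
def normalize_sort(sort):
    """Expand {"field": "desc"} into {"field": {"order": "desc"}}."""
    normalized = {}
    for field, spec in sort.items():
        if isinstance(spec, str):
            spec = {"order": spec}  # shorthand: the value is just the order
        order = spec.get("order")
        if order is not None and order not in ("asc", "desc"):
            raise ValueError(f"unknown sort order for {field!r}: {order!r}")
        normalized[field] = dict(spec)
    return normalized


short_form = {"balance": "desc"}
long_form = {"balance": {"order": "desc"}}

print(normalize_sort(short_form))                               # {'balance': {'order': 'desc'}}
print(normalize_sort(short_form) == normalize_sort(long_form))  # True
```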

 

Case 3 - Sorting on more than one field

GET /accounts/_search
{
  "query": { "match_all": {} },
  "sort": [
    { "age": { "order": "asc" } },
    { "balance": { "order": "desc" } }
  ]
}

Note: 

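  1. You can also sort on multi-value fields; the sort "mode" option (for example min, max or avg) selects which of the values is used for sorting.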
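The effect of sorting on more than one field can be imitated locally in Python. This is a sketch only: the four sample documents are invented for illustration and are not taken from accounts.json, and apply_sort is a hypothetical helper that takes the sort clauses as an ordered list. The first clause is the primary sort key; later clauses only break ties.

```python
accounts = [  # made-up sample documents
    {"account_number": 1, "age": 32, "balance": 39225},
    {"account_number": 2, "age": 28, "balance": 28838},
    {"account_number": 3, "age": 28, "balance": 44947},
    {"account_number": 4, "age": 32, "balance": 27658},
]


def apply_sort(docs, sort_clauses):
    """Order docs by a list of {"field": {"order": ...}} clauses."""
    result = list(docs)
    # Python's sort is stable, so sorting by the last clause first and the
    # first clause last makes the first clause the primary key.
    for clause in reversed(sort_clauses):
        (field, spec), = clause.items()
        result.sort(key=lambda d: d[field], reverse=(spec["order"] == "desc"))
    return result


by_age_then_balance = apply_sort(
    accounts,
    [{"age": {"order": "asc"}}, {"balance": {"order": "desc"}}],
)
print([d["account_number"] for d in by_age_then_balance])  # [3, 2, 1, 4]

by_balance_then_age = apply_sort(
    accounts,
    [{"balance": {"order": "desc"}}, {"age": {"order": "asc"}}],
)
print([d["account_number"] for d in by_balance_then_age])  # [3, 1, 2, 4]
```

Swapping the two clauses changes the result, which is why the order of the sort clauses matters.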

 

Additional Notes

  1. Elasticsearch queries are stateless and do not use cursors the way relational databases do. So if documents matching the query criteria are added or removed between two page requests, you might see documents that were already returned, or miss some documents entirely.

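  2. Deep pagination (for example, high "from" values) can hurt performance, because every shard has to collect and sort its top from + size hits before the coordinating node merges them. By default, Elasticsearch rejects requests where from + size exceeds 10,000 (the index.max_result_window setting).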

  3. It is good practice to put an upper bound on the number of results your searches request.

  4. Analyzed "text" fields cannot be used for sorting by default, because they are stored as individual terms in the inverted index and not as entire strings.

    1. If you try, you will get an exception: illegal_argument_exception: Fielddata is disabled on text fields by default. Set fielddata=true on [city] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead.
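The first two notes can be illustrated with a small in-memory simulation in Python. Nothing here calls Elasticsearch: fetch_page and hits_to_sort are hypothetical helpers, the documents are invented, and the shard count of 5 is an arbitrary assumption.

```python
def fetch_page(index, start, size):
    """Mimic a stateless from/size search sorted by balance, descending."""
    ranked = sorted(index, key=lambda d: d["balance"], reverse=True)
    return [d["id"] for d in ranked[start:start + size]]


def hits_to_sort(start, size, shards):
    """Rough count of hits handled for one request.

    Each shard collects its top (from + size) hits, and the coordinating
    node then merges the candidates from all shards.
    """
    per_shard = start + size
    return per_shard, per_shard * shards


index = [  # made-up documents
    {"id": "a", "balance": 500},
    {"id": "b", "balance": 400},
    {"id": "c", "balance": 300},
    {"id": "d", "balance": 200},
]

page1 = fetch_page(index, 0, 2)
index.append({"id": "e", "balance": 450})  # indexed between the two requests
page2 = fetch_page(index, 2, 2)

print(page1)  # ['a', 'b']
print(page2)  # ['b', 'c']  -> "b" is repeated and "e" is never returned

print(hits_to_sort(0, 10, shards=5))     # (10, 50)
print(hits_to_sort(9990, 10, shards=5))  # (10000, 50000)
```

The new document "e" lands on page 1 after page 1 was already fetched, which pushes "b" onto page 2. The second part shows how the work grows with "from" even though the page size stays at 10.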

 

TODO

  1. Sort on multi-value fields.

  2. Create two indexes and write a search query (GET /_search) to search across indexes.
