Problem:
Use Elasticsearch Query DSL to perform pagination and sorting within the query context.
Solution Summary:
Elasticsearch provides the "from" and "size" parameters for pagination. A search returns up to "size" hits, starting at the zero-based offset given by "from" (from=0 starts at the first hit, from=1 at the second, and so on).
If "from" is not specified, it defaults to 0. If "size" is not specified, it defaults to 10.
Sorting is done with the "sort" element. Use "asc" for ascending order and "desc" for descending order.
Prerequisites:
Set up accounts index from accounts.json as explained here.
Solution Steps:
Case 1 – Search that returns documents 11 through 20 (Pagination)
GET /accounts/_search
{
"query": { "match_all": {} },
"from": 10,
"size": 10
}
Note:
-
The match_all query simply matches all documents in the specified index.
-
If from is not specified, it defaults to 0. If size is not specified, it defaults to 10.
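The offset arithmetic behind Case 1 can be sketched in Python. This is only an illustration, not part of any Elasticsearch client: the helper name page_request and its 1-based page parameter are assumptions made for this example, and only "from", "size" and "match_all" come from the query above.

```python
import json


def page_request(page, size=10):
    """Build a match_all search body for a 1-based page number (hypothetical helper)."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    return {
        "query": {"match_all": {}},
        "from": (page - 1) * size,  # zero-based offset of the first hit on the page
        "size": size,
    }


# Page 2 with 10 hits per page covers documents 11 through 20, as in Case 1.
print(json.dumps(page_request(2)))
```

The printed body can be pasted after GET /accounts/_search. Keeping the page number 1-based in the helper and converting it to a zero-based "from" in one place avoids off-by-one mistakes.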
Case 2 - Sort the results by account balance in descending order (Sorting)
GET /accounts/_search
{
"query": { "match_all": {} },
"sort": { "balance": { "order": "desc" } }
}
Note:
-
The sort expression can be written in shorthand form as: "sort": { "balance": "desc" }.
-
For ascending order, use asc instead of desc.
-
The query above returns only 10 records because the default size is 10.
-
You c
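To make the equivalence between the shorthand and the full sort expression concrete, here is a small Python sketch. Elasticsearch accepts both forms itself, so the expand_sort helper is purely hypothetical and exists only to show how one form maps to the other.

```python
def expand_sort(sort):
    """Expand shorthand {"field": "desc"} entries to {"field": {"order": "desc"}}."""
    expanded = {}
    for field, spec in sort.items():
        if isinstance(spec, str):
            if spec not in ("asc", "desc"):
                raise ValueError(f"unknown sort order: {spec!r}")
            spec = {"order": spec}  # wrap the bare order in the full form
        expanded[field] = spec
    return expanded


short = {"balance": "desc"}
full = {"balance": {"order": "desc"}}

# Both spellings describe the same sort; the full form passes through unchanged.
assert expand_sort(short) == full
assert expand_sort(full) == full
```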
Case 3 - Sorting on more than one field
GET /accounts/_search
{
"query": { "match_all": {} },
"sort": [
{ "age": { "order": "asc" } },
{ "balance": { "order": "desc" } }
]
}
Note:
-
You can also sort on multi-value fields.
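The effect of sorting on two fields can be illustrated locally in Python. The four sample documents below are made up for this sketch (they are not taken from accounts.json), and sort_like_case_3 is a hypothetical helper that mimics the ordering of Case 3: age ascending first, then balance descending among documents with the same age.

```python
# Made-up sample documents using field names from the accounts index.
accounts = [
    {"account_number": 1, "age": 30, "balance": 1500},
    {"account_number": 2, "age": 25, "balance": 900},
    {"account_number": 3, "age": 30, "balance": 4200},
    {"account_number": 4, "age": 25, "balance": 2600},
]


def sort_like_case_3(docs):
    """Order by age ascending, breaking ties by balance descending."""
    # Negating the numeric balance reverses its order inside an ascending sort.
    return sorted(docs, key=lambda d: (d["age"], -d["balance"]))


ordered = [d["account_number"] for d in sort_like_case_3(accounts)]
print(ordered)  # [4, 2, 3, 1]
```

The second sort field only matters when the first one ties: accounts 4 and 2 share age 25, so the higher balance (account 4) comes first.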
Additional Notes
-
Queries paginated with "from" and "size" are stateless and do not use cursors the way relational databases do. If documents matching the query are added or removed between requests, you might see documents that were already returned, or miss some documents altogether.
-
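Deep pagination (e.g. high "from" values) can affect performance, because every shard has to collect and sort its top "from" + "size" hits before the requested page can be returned. By default, Elasticsearch rejects requests where "from" + "size" exceeds 10,000 (the index.max_result_window setting).
-
It is good practice to put an upper bound on your searches.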
-
Analyzed "text" fields cannot be used for sorting by default, because they are stored as individual terms in the inverted index rather than as entire strings.
-
If you try, you will get an exception: illegal_argument_exception: Fielddata is disabled on text fields by default. Set fielddata=true on [city] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead.
-
TODO
-
Sort on multi-value fields.
-
Create two indexes and write a search query (GET /_search) to search across indexes.
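Returning to the deep-pagination note above, the cost and the upper bound can be sketched with a little arithmetic in Python. The value 10,000 is the Elasticsearch default for index.max_result_window. The helper names and the shard count of 5 are assumptions chosen for illustration, and the merge figure is a rough upper bound rather than a measurement.

```python
MAX_RESULT_WINDOW = 10_000  # Elasticsearch default for index.max_result_window


def check_window(from_, size, limit=MAX_RESULT_WINDOW):
    """Return True when from + size stays within the result window limit."""
    return from_ + size <= limit


def hits_collected(from_, size, shards):
    """Rough upper bound on the hits merged for one page across all shards."""
    # Each shard may contribute up to from + size candidate hits.
    return (from_ + size) * shards


# The last page inside the default window versus one hit past it.
print(check_window(9_990, 10))
print(check_window(9_991, 10))

# Assumed 5 shards: up to 50,000 candidates are merged to return 10 hits.
print(hits_collected(9_990, 10, shards=5))
```

Validating the window in application code before sending a request is one way to apply the "upper bound" advice, since it fails early instead of relying on the server to reject the query.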