Problem:
Divide the entire range of values into a series of intervals.
Solution Summary:
A histogram divides the entire range of values into a series of intervals (buckets) and counts the documents that fall into each one.
We can use "min_doc_count" to specify the minimum number of documents that must be present in a bucket for it to be returned.
We can use "extended_bounds" with its "min" and "max" properties to set a lower and upper limit for the buckets. You will also need to set "min_doc_count" to 0 to see buckets with no documents.
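A date histogram is similar to a histogram, but "interval" takes a calendar expression: year, quarter, month, week, day, hour, minute or second. Elasticsearch 7.2 and later replace "interval" on date_histogram with "calendar_interval" and "fixed_interval".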
Prerequisites:
Set up the accounts index from accounts.json as explained in the link.
Solution Steps:
Case 1 - Simple Histogram For Age
GET accounts/_search
{
  "aggs": {
    "age_distribution": {
      "histogram": {
        "field": "age",
        "interval": 10
      }
    }
  },
  "size": 0
}
Response contains:
...
"aggregations": {
  "age_distribution": {
    "buckets": [
      {
        "key": 20,
        "doc_count": 451
      },
      {
        "key": 30,
        "doc_count": 504
      },
      {
        "key": 40,
        "doc_count": 45
      }
    ]
  }
}
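To make the bucketing rule concrete, here is a minimal Python sketch of what the histogram does with each value: it rounds the value down to the nearest multiple of the interval and counts documents per key. The function name and the sample ages are hypothetical and are not taken from accounts.json; this only illustrates the arithmetic, not how Elasticsearch is implemented.

```python
from collections import Counter


def histogram(values, interval):
    """Count values per bucket, keyed by the bucket's lower bound."""
    # Each value is rounded down to the nearest multiple of the interval.
    counts = Counter((v // interval) * interval for v in values)
    # Return buckets sorted by key, in the same shape as the response.
    return [{"key": k, "doc_count": counts[k]} for k in sorted(counts)]


# Hypothetical sample ages, for illustration only.
ages = [20, 25, 29, 30, 31, 39, 40]
buckets = histogram(ages, 10)
print(buckets)
```

With an interval of 10, ages 20 to 29 share the key 20, ages 30 to 39 share the key 30, and so on.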
Case 2 - Do not return buckets with fewer than 100 records
We will use "min_doc_count" to set the minimum number of documents a bucket must contain to be returned. We will also use an interval of 2.
GET accounts/_search
{
  "aggs": {
    "age_distribution": {
      "histogram": {
        "field": "age",
        "interval": 2,
        "min_doc_count": 100
      }
    }
  },
  "size": 0
}
Response contains:
...
"aggregations": {
  "age_distribution": {
    "buckets": [
      {
        "key": 30,
        "doc_count": 108
      },
      {
        "key": 32,
        "doc_count": 102
      },
      {
        "key": 34,
        "doc_count": 101
      }
    ]
  }
}
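The effect of "min_doc_count" can be sketched in Python as a filter applied after counting: buckets are built as usual, and those below the threshold are dropped from the result. The function name, sample ages and threshold below are hypothetical, chosen only to keep the example small.

```python
from collections import Counter


def histogram_min_count(values, interval, min_doc_count):
    """Bucket values, then keep only buckets with enough documents."""
    counts = Counter((v // interval) * interval for v in values)
    return [
        {"key": key, "doc_count": n}
        for key, n in sorted(counts.items())
        if n >= min_doc_count  # buckets below the threshold are omitted
    ]


# Hypothetical sample ages, for illustration only.
ages = [30, 30, 31, 32, 33, 33, 33, 36]
buckets = histogram_min_count(ages, 2, 3)
print(buckets)
```

The bucket with key 36 holds a single value here, so it does not appear in the output.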
Case 3 - Extend Boundaries
We will use "extended_bounds" to make the histogram cover the range 10 to 50, and set "min_doc_count" to 0 so that the empty buckets appear in the response.
GET accounts/_search
{
  "aggs": {
    "age_distribution": {
      "histogram": {
        "field": "age",
        "interval": 10,
        "min_doc_count": 0,
        "extended_bounds": {
          "min": 10,
          "max": 50
        }
      }
    }
  },
  "size": 0
}
Response contains:
...
"aggregations": {
  "age_distribution": {
    "buckets": [
      {
        "key": 10,
        "doc_count": 0
      },
      {
        "key": 20,
        "doc_count": 451
      },
      {
        "key": 30,
        "doc_count": 504
      },
      {
        "key": 40,
        "doc_count": 45
      },
      {
        "key": 50,
        "doc_count": 0
      }
    ]
  }
}
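As a rough Python sketch of this behaviour, "extended_bounds" can be thought of as widening the range of bucket keys and filling any missing bucket with a count of 0. It extends the range but does not filter out data that lies outside the bounds. The function name and sample ages are hypothetical, and the sketch assumes integer values and an integer interval.

```python
from collections import Counter


def histogram_extended(values, interval, bounds_min, bounds_max):
    """Bucket values and pad the result with empty buckets up to the bounds."""
    counts = Counter((v // interval) * interval for v in values)
    # Candidate keys: buckets that hold data plus the buckets containing the bounds.
    keys = list(counts) + [
        (bounds_min // interval) * interval,
        (bounds_max // interval) * interval,
    ]
    first, last = min(keys), max(keys)
    # Emit every bucket between first and last; a missing key counts as 0.
    return [
        {"key": key, "doc_count": counts[key]}
        for key in range(first, last + interval, interval)
    ]


# Hypothetical sample ages, for illustration only.
ages = [20, 25, 31, 38, 40]
buckets = histogram_extended(ages, 10, 10, 50)
print(buckets)
```

The buckets with keys 10 and 50 contain no values, yet they are still listed with a "doc_count" of 0, as in the response.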
Case 4 - Date Histogram
GET accounts/_search
{
  "aggs": {
    "opening_date_distribution": {
      "date_histogram": {
        "field": "opening_date",
        "interval": "quarter"
      }
    }
  },
  "size": 0
}
Response contains:
...
"aggregations": {
  "opening_date_distribution": {
    "buckets": [
      {
        "key_as_string": "2018/01/01 00:00:00",
        "key": 1514764800000,
        "doc_count": 6
      },
      {
        "key_as_string": "2018/04/01 00:00:00",
        "key": 1522540800000,
        "doc_count": 4
      }
    ]
  }
}
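The "key" values in this response are the start of each quarter expressed as milliseconds since the Unix epoch. The following Python sketch shows how a date maps to its quarter bucket under that reading. The function name and the sample opening date are hypothetical, and the sketch assumes all dates are in UTC.

```python
from datetime import datetime, timezone


def quarter_bucket_key(dt):
    """Return the start of dt's calendar quarter as epoch milliseconds (UTC)."""
    # Quarters start in months 1, 4, 7 and 10.
    first_month = 3 * ((dt.month - 1) // 3) + 1
    start = datetime(dt.year, first_month, 1, tzinfo=timezone.utc)
    return int(start.timestamp() * 1000)


# Hypothetical opening date, for illustration only.
opened = datetime(2018, 5, 17, 9, 30, tzinfo=timezone.utc)
key = quarter_bucket_key(opened)
print(key)  # 1522540800000, the bucket shown as "2018/04/01 00:00:00"
```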
TODO
- Explore the use of "offset" with histogram and date_histogram.
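As a starting point for that exploration, the histogram aggregation computes each bucket key as floor((value - offset) / interval) * interval + offset, so an offset shifts where the buckets begin. The Python sketch below applies that formula; the function name and sample ages are hypothetical.

```python
def bucket_key(value, interval, offset=0):
    """Round value down to the start of its bucket, with buckets shifted by offset."""
    return ((value - offset) // interval) * interval + offset


# Hypothetical sample ages, for illustration only.
ages = [20, 24, 25, 34, 35, 40]
keys = sorted({bucket_key(age, 10, offset=5) for age in ages})
print(keys)  # [15, 25, 35]
```

With an interval of 10 and an offset of 5, the buckets cover 15 to 24, 25 to 34 and 35 to 44 instead of 20 to 29, 30 to 39 and 40 to 49.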