[Recipes] Bucket Aggregation with Custom Bucket Rules

Problem: 

Define custom bucket rules to put documents into different buckets.

Solution Summary: 

We can use the filters aggregation to define a custom rule (a query) for each bucket.
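As a rough illustration of the request shape, the sketch below builds a filters aggregation body as a plain Python dictionary. The helper name build_filters_agg and its parameters are hypothetical, introduced only for this example; the resulting dictionary would still have to be sent to the _search endpoint by a client of your choice.

```python
import json


def build_filters_agg(agg_name, rules, sub_aggs=None):
    """Build a search body with one named-filters aggregation.

    agg_name: name given to the aggregation (hypothetical helper argument).
    rules: mapping of bucket name -> query clause that defines the bucket.
    sub_aggs: optional sub-aggregations to run inside every bucket.
    """
    agg = {"filters": {"filters": dict(rules)}}
    if sub_aggs:
        agg["aggs"] = dict(sub_aggs)
    # "size": 0 suppresses the hits so only aggregation results come back.
    return {"aggs": {agg_name: agg}, "size": 0}


body = build_filters_agg(
    "terms_gender_filter",
    {
        "male_bucket": {"match": {"gender": "M"}},
        "female_bucket": {"match": {"gender": "F"}},
    },
)
print(json.dumps(body, indent=2))
```

Each key under the inner "filters" object becomes a bucket name in the response, so the rules mapping is the only part that changes from case to case.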

Prerequisites: 

Set up the accounts index from accounts.json as explained in the link.

Solution Steps: 

Case 0 - Buckets for Male and Female without Filters

GET accounts/_search 
{
  "aggs" : {
    "terms_gender" : {
      "terms" : {
        "field" : "gender.keyword"
      }
    }
  },
  "size": 0
}

Note: This is a normal terms bucket aggregation, shown for reference. We will first reproduce the same bucketing with "filters" (Case 1) and then define more complex buckets (Case 2).

Response:

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1000,
    "max_score": 0,
    "hits": []
  },

  "aggregations": {
    "terms_gender": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "M",
          "doc_count": 507
        },
        {
          "key": "F",
          "doc_count": 493
        }
      ]
    }
  }
}

Note: The initial part of the response (everything before "aggregations") is the same for all cases except for the "took" time, so it is omitted from the responses shown below.

 

Case 1 - Buckets for Male and Female with Filters

GET accounts/_search 
{
  "aggs" : {
    "terms_gender_filter" : {
      "filters": {
        "filters": {
          "male_bucket": {
            "match" : {
              "gender": "M"
            }
          },
          "female_bucket" : {
            "match": {
              "gender" : "F"
            }
          }
        }
      }
    }
  },
  "size": 0
}

Response contains:

...

"aggregations": {
    "terms_gender_filter": {
      "buckets": {
        "male_bucket": {
          "doc_count": 507
        },
        "female_bucket": {
          "doc_count": 493
        }
      }
    }
  }

}
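Named filters come back as an object keyed by bucket name rather than as a list, which changes how the response is read. A minimal sketch, assuming the response has already been parsed into a Python dictionary; the helper bucket_counts is hypothetical, and the counts mirror the Case 1 response:

```python
# Aggregation part of a Case 1 style response (counts as reported above).
response = {
    "aggregations": {
        "terms_gender_filter": {
            "buckets": {
                "male_bucket": {"doc_count": 507},
                "female_bucket": {"doc_count": 493},
            }
        }
    }
}


def bucket_counts(response, agg_name):
    """Return {bucket_name: doc_count} for a named-filters aggregation."""
    buckets = response["aggregations"][agg_name]["buckets"]
    # Buckets are keyed by the names used in the request, unlike the
    # terms aggregation in Case 0, which returns a list of {"key": ...} items.
    return {name: bucket["doc_count"] for name, bucket in buckets.items()}


counts = bucket_counts(response, "terms_gender_filter")
print(counts)
print(sum(counts.values()))
```

Because the two rules here do not overlap, the counts add up to the total number of hits from Case 0.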

 

Case 2 - One Bucket with Male, One Bucket with Male and Female

GET accounts/_search 
{
  "aggs" : {
    "terms_gender_filter" : {
      "filters": {
        "filters": {
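          "male_bucket": {
            "match" : {
              "gender": "M"
            }
          },
          "male_female_bucket" : {
            "match": {
              "gender" : "M F"
            }
          }
        }
      }
    }
  },
  "size": 0
}

Note: The match query splits "M F" into two terms and combines them with OR by default, so "male_female_bucket" matches every document whose gender is M or F. Buckets may overlap: a male document is counted in both buckets.

Response contains:

...

"aggregations": {
    "terms_gender_filter": {
      "buckets": {
        "male_bucket": {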
          "doc_count": 507
        },
        "male_female_bucket": {
          "doc_count": 1000
        }
      }
    }
  }
}
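The overlap between the two buckets can be reproduced with a toy model in plain Python. This is only a sketch of the bucketing behaviour, not how Elasticsearch evaluates queries: match_any is a hypothetical stand-in for a match query with the default OR operator, and the sample documents are made up.

```python
def match_any(field_value, query_text):
    """Toy stand-in for a match query with the default OR operator:
    true if the field shares at least one lowercased token with the query."""
    field_tokens = set(field_value.lower().split())
    query_tokens = set(query_text.lower().split())
    return bool(field_tokens & query_tokens)


def filters_agg(docs, rules):
    """Count documents per bucket. Each rule is checked independently,
    so one document may be counted in several buckets."""
    return {
        name: sum(1 for doc in docs if match_any(doc["gender"], text))
        for name, text in rules.items()
    }


# Made-up sample documents, not the real accounts data.
docs = [
    {"gender": "M"},
    {"gender": "F"},
    {"gender": "M"},
    {"gender": "F"},
    {"gender": "M"},
]

counts = filters_agg(docs, {"male_bucket": "M", "male_female_bucket": "M F"})
print(counts)
```

The three male documents appear in both buckets, which is why the bucket counts in this case add up to more than the number of documents.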

 

Case 3 - Running Sub Aggregations with Custom Bucket Rule

GET accounts/_search 
{
  "aggs" : {
    "terms_gender_filter" : {
      "filters": {
        "filters": {
          "male_bucket": {
            "match" : {
              "gender": "M"
            }
          },
          "male_female_bucket" : {
            "match": {
              "gender" : "M F"
            }
          }
        }
      },
      "aggs" : {
        "avg_balance" : {
          "avg": {
            "field": "balance"
          }
        }
      }
    }
  },
  "size": 0
}

Note: The sub-aggregation "avg_balance" is defined in an "aggs" block inside "terms_gender_filter" (next to "filters"), which in turn sits inside the top-level "aggs". It is computed separately for each bucket.

Response contains:

"aggregations": {
    "terms_gender_filter": {
      "buckets": {
        "male_bucket": {
          "doc_count": 507,
          "avg_balance": {
            "value": 25803.800788954635
          }
        },
        "male_female_bucket": {
          "doc_count": 1000,
          "avg_balance": {
            "value": 25714.837
          }
        }
      }
    }
  }
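Since the per-bucket averages come with document counts, they can be combined after the fact. The worked example below recovers the average balance of the female accounts from the two buckets above; it assumes, as the Case 0 counts suggest, that every document has gender M or F, so the combined bucket covers the whole index.

```python
# Values taken from the Case 3 response.
male_count, male_avg = 507, 25803.800788954635
all_count, all_avg = 1000, 25714.837

# avg * doc_count gives back the sum of balances in each bucket.
male_sum = male_avg * male_count
all_sum = all_avg * all_count

# The combined bucket holds every document, so the remainder is the female share.
female_count = all_count - male_count
female_avg = (all_sum - male_sum) / female_count

print(female_count)
print(round(female_avg, 2))
```

The derived count matches the female bucket from Case 1. Adding a separate "F" rule to the request would return the same average directly, without the arithmetic.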
