DynamoDB Global Secondary Index (GSI)

  1. An index with a partition key, or a partition-and-sort key, that can be different from those on the table.
  2. It is "global" because queries on the index can span all items in a table, across all partitions.

  3. GSI behavior is similar to that of a DynamoDB table: You can query a GSI using its partition key element, with conditional filters on the GSI sort key element.

  4. GSIs support non-unique attributes, which increases query flexibility by enabling queries against any non-key attribute in the table.

  5. GSIs are useful for tracking relationships between attributes that have a lot of different values.

    1. For example, in a table for Customer with CustomerID as the primary partition key, ZipCode can be the partition key for a GSI to efficiently query for all customers with a given zip code.

  6. A GSI does not need to have a sort key element; it can have only a partition key, and the values of that key need not be unique.

    1. Unlike the primary key on a table, a GSI does not require the indexed attributes to be unique.

  7. GSIs associated with a table can be created/updated/deleted at any time, even after table creation.

  8. A query on a GSI can only return attributes that were specified to be included (projected) in the GSI at creation time.

    1. Applications that need additional data from the table can retrieve the primary key from the GSI and then use either the GetItem or BatchGetItem API to retrieve the desired attributes from the table.

    2. As GSIs are eventually consistent, applications that use this pattern have to accommodate item deletion (from the table) in between the calls to the GSI and GetItem/BatchGetItem.

  9. GSIs are updated automatically when the table is updated, but asynchronously.

    1. Hence, GSIs support only eventual consistency.

  10. GSIs manage throughput independently of the table they are based on.

    1. When you enable Auto Scaling for a new or existing table from the console, you can optionally choose to apply the same settings to GSIs, or provision different throughput for tables and global secondary indexes manually.
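
The Customer/ZipCode example and the "query the GSI, then fetch from the table" pattern from the list above can be sketched as follows. This is only a sketch: it builds request parameters in the shape that boto3's low-level `create_table` and `query` calls accept, but makes no AWS call. The index name `ZipCodeIndex`, the attribute `Name`, the capacity numbers and the helper names are assumptions for illustration, and the table lookup is simulated with an in-memory dict.

```python
# Hypothetical names: "Customer", "CustomerID" and "ZipCode" come from the notes;
# "ZipCodeIndex", "Name" and all capacity values are made up for illustration.

def build_create_table_params():
    """CreateTable parameters for a Customer table with one GSI on ZipCode."""
    return {
        "TableName": "Customer",
        "AttributeDefinitions": [
            {"AttributeName": "CustomerID", "AttributeType": "S"},
            {"AttributeName": "ZipCode", "AttributeType": "S"},
        ],
        "KeySchema": [{"AttributeName": "CustomerID", "KeyType": "HASH"}],
        "ProvisionedThroughput": {"ReadCapacityUnits": 10, "WriteCapacityUnits": 10},
        "GlobalSecondaryIndexes": [
            {
                "IndexName": "ZipCodeIndex",
                # Partition key only; a GSI does not need a sort key.
                "KeySchema": [{"AttributeName": "ZipCode", "KeyType": "HASH"}],
                # Only key attributes are projected, so other attributes
                # need a second read against the table.
                "Projection": {"ProjectionType": "KEYS_ONLY"},
                # The GSI has its own throughput, separate from the table.
                "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
            }
        ],
    }


def build_zip_query_params(zip_code):
    """Query parameters that target the GSI instead of the table."""
    return {
        "TableName": "Customer",
        "IndexName": "ZipCodeIndex",
        "KeyConditionExpression": "ZipCode = :z",
        "ExpressionAttributeValues": {":z": {"S": zip_code}},
    }


def fetch_customers(customer_ids, get_item):
    """Second step of the pattern: look up each key returned by the GSI.

    The GSI is eventually consistent, so a key it returned may already be
    deleted from the table; such keys are skipped instead of raising.
    """
    found = []
    for customer_id in customer_ids:
        item = get_item(customer_id)
        if item is not None:
            found.append(item)
    return found


# In-memory stand-in for the table: "c2" was deleted after the index was read.
table = {"c1": {"CustomerID": "c1", "ZipCode": "560001", "Name": "Asha"}}
stale_index_result = ["c1", "c2"]
customers = fetch_customers(stale_index_result, table.get)
```

With a real client, `get_item` would wrap a GetItem (or BatchGetItem) call and return `None` when the response carries no item.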

 

Global Secondary Index - Provisioned Throughput

Different throughput for tables and global secondary indexes may be required in the following scenarios:

  • A GSI that contains a small fraction of the table items needs a much lower write throughput compared to the table.

  • A GSI that is used for infrequent item lookups needs a much lower read throughput, compared to the table.

  • A GSI used by a read-heavy background task may need high read throughput for a few hours per day.

A query to a GSI consumes read capacity units, based on the size of the items examined by the query.
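
As a rough worked example of the read cost, the sketch below estimates the read capacity units for one GSI query. It assumes the usual DynamoDB accounting: the sizes of the examined index entries (index keys plus projected attributes) are summed, rounded up to the next 4 KB, and charged at 0.5 RCU per 4 KB because GSI reads are eventually consistent. The function name is hypothetical, and empty result sets are not modelled.

```python
import math

READ_UNIT_BYTES = 4096          # one read unit covers up to 4 KB
EVENTUALLY_CONSISTENT = 0.5     # eventually consistent reads cost half a unit


def gsi_query_rcu(index_entry_sizes_bytes):
    """Estimate RCUs for one query from the sizes of the examined index entries."""
    total_bytes = sum(index_entry_sizes_bytes)
    # The total is rounded up once for the whole query, not once per entry.
    return math.ceil(total_bytes / READ_UNIT_BYTES) * EVENTUALLY_CONSISTENT


# Ten index entries of 2,000 bytes each: 20,000 bytes -> 5 blocks of 4 KB.
estimate = gsi_query_rcu([2000] * 10)
```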

You are charged for the aggregate provisioned throughput for a table and its GSIs by the hour.

You are also charged for the data storage taken up by the GSI as well as standard data transfer (external) fees.  

Storage costs for a GSI are based on the total number of bytes stored in that GSI. This includes the GSI key and projected attributes and values, and an overhead of 100 bytes for indexing purposes.
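
The storage rule can be turned into a small calculation. This is a sketch that assumes the 100-byte overhead applies to each entry in the index; the function name and the sample sizes are made up.

```python
INDEX_OVERHEAD_BYTES = 100  # indexing overhead, assumed to apply per index entry


def gsi_storage_bytes(entry_sizes_bytes):
    """Total billable bytes for a GSI.

    Each size in entry_sizes_bytes is the key plus projected attributes
    (names and values) of one index entry.
    """
    return sum(size + INDEX_OVERHEAD_BYTES for size in entry_sizes_bytes)


# Two entries of 200 and 300 bytes: (200 + 100) + (300 + 100) bytes.
total = gsi_storage_bytes([200, 300])
```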

If a GSI's provisioned write throughput is exhausted, then subsequent writes to the table will be throttled, even if the table has available write capacity units.

Tables with GSIs have the same daily limits on the number of throughput change operations as normal tables.

Performance considerations of the primary key of a DynamoDB table also apply to GSI keys: A GSI assumes a relatively random access pattern across all its keys. To get the most out of secondary index provisioned throughput, you should select a GSI partition key attribute that has a large number of distinct values, and a GSI sort key attribute that is requested fairly uniformly, as randomly as possible.
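
One informal way to compare candidate GSI partition keys is to look at how a sample of their values is distributed. The helper below is a hypothetical sketch, not a DynamoDB feature: it reports the number of distinct values and the share taken by the most common one, where a high share suggests a hot key.

```python
from collections import Counter


def key_distribution(values):
    """Return (distinct value count, share of the most common value)."""
    counts = Counter(values)
    top_share = max(counts.values()) / len(values)
    return len(counts), top_share


# Sample attribute values for two candidate keys (made-up data).
by_country = key_distribution(["IN", "IN", "IN", "US"])
by_zip = key_distribution(["560001", "560002", "560003", "560004"])
```

Here `by_country` has 2 distinct values with 75% of the sample on one of them, while `by_zip` spreads the sample evenly over 4 values, which makes it the better partition key candidate on this sample.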

With Auto Scaling, it is recommended that you apply the same settings to GSIs as to the table.

When you provision manually and add a new index, it is highly recommended that you provision additional write throughput for the index creation, separate from the throughput of the index itself. Once the index creation process is complete, dial back the additional write throughput you provisioned for it.

You can dial up or dial down the provisioned write throughput for index creation at any time during the creation process.

To change your GSI's provisioned throughput capacity, you can use the DynamoDB console or the UpdateTable API. To update Auto Scaling policy settings, use the PutScalingPolicy API.
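
For the UpdateTable route, the request names the index inside `GlobalSecondaryIndexUpdates`. The sketch below only builds the parameters in the shape boto3's `update_table` accepts and does not call AWS; the function name, the index name `ZipCodeIndex` and the capacity values are assumptions for illustration.

```python
def build_gsi_throughput_update(table_name, index_name, read_units, write_units):
    """UpdateTable parameters that change only the GSI's provisioned throughput."""
    return {
        "TableName": table_name,
        "GlobalSecondaryIndexUpdates": [
            {
                "Update": {
                    "IndexName": index_name,
                    "ProvisionedThroughput": {
                        "ReadCapacityUnits": read_units,
                        "WriteCapacityUnits": write_units,
                    },
                }
            }
        ],
    }


# Hypothetical table and index names; the table's own throughput is untouched.
update_params = build_gsi_throughput_update("Customer", "ZipCodeIndex", 20, 5)
```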

 

Global Secondary Index - CloudWatch Metrics

Tables with GSIs provide aggregate metrics for the table and GSIs, as well as breakouts of metrics for the table and each GSI.

Reports for individual GSIs support a subset of the CloudWatch metrics that are supported by a table, including:

  • Read Capacity (Provisioned Read Capacity, Consumed Read Capacity)

  • Write Capacity (Provisioned Write Capacity, Consumed Write Capacity)

  • Throttled read events

  • Throttled write events
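
To read one of these per-index metrics programmatically, the request has to carry both the table and the index as dimensions. The sketch below builds parameters in the shape boto3's CloudWatch `get_metric_statistics` accepts, without calling AWS. The function name, the table and index names, the 5-minute period and the one-hour window are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def build_gsi_metric_request(table_name, index_name, metric_name, end_time, hours=1):
    """GetMetricStatistics parameters for one metric of one GSI."""
    return {
        "Namespace": "AWS/DynamoDB",
        "MetricName": metric_name,
        # Both dimensions together select the breakout for a single GSI.
        "Dimensions": [
            {"Name": "TableName", "Value": table_name},
            {"Name": "GlobalSecondaryIndexName", "Value": index_name},
        ],
        "StartTime": end_time - timedelta(hours=hours),
        "EndTime": end_time,
        "Period": 300,            # seconds per data point
        "Statistics": ["Sum"],
    }


# Hypothetical names; a fixed end time keeps the example deterministic.
end = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
request = build_gsi_metric_request(
    "Customer", "ZipCodeIndex", "ConsumedReadCapacityUnits", end
)
```

The same helper works for `ConsumedWriteCapacityUnits`, `ReadThrottleEvents` and `WriteThrottleEvents` by changing `metric_name`.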
