Amazon DynamoDB is a managed NoSQL database service that provides fast, predictable performance with seamless scalability. Its main purpose is to offload the administrative burden of operating and scaling a distributed database, so you don't have to worry about hardware provisioning, software setup and maintenance, or scalability.
Even though DynamoDB offers those advantages and lets you store unstructured data with minimal effort, the way you query that data can come at a significant cost.
With DynamoDB, you not only pay for reads and writes (through read capacity units, or RCUs, and write capacity units, or WCUs), but you also pay for additional features to be able to query your data.
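To make the capacity model concrete: one RCU covers a single strongly consistent read per second of an item up to 4 KB (an eventually consistent read costs half), and one WCU covers a single write per second of an item up to 1 KB. A minimal sketch of that arithmetic (the item sizes are made up for illustration):

```python
import math

def rcus_per_read(item_size_kb: float, strongly_consistent: bool = True) -> float:
    """RCUs consumed by one read: items are billed in 4 KB chunks;
    an eventually consistent read costs half as much."""
    units = math.ceil(item_size_kb / 4)
    return units if strongly_consistent else units / 2

def wcus_per_write(item_size_kb: float) -> int:
    """WCUs consumed by one write: items are billed in 1 KB chunks."""
    return math.ceil(item_size_kb)

# A hypothetical 9 KB item:
print(rcus_per_read(9))         # 3 RCUs, strongly consistent
print(rcus_per_read(9, False))  # 1.5 RCUs, eventually consistent
print(wcus_per_write(9))        # 9 WCUs
```

Writes are nine times as expensive as reads for this item, which is why write-heavy access patterns deserve extra scrutiny.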
You pay for global secondary indexes (GSIs for short), which behave like separate DynamoDB tables containing a subset of attributes from the source table. Each GSI has an alternate primary key to support additional query patterns.
As your data set grows, your GSIs grow with it, and you end up paying for additional indexes whenever you need more than one attribute as a key to query your data as expected.
You might end up using multiple DynamoDB tables, each with its own GSIs, just to be able to query different sets of data the way you want.
These tables and their GSIs all add to your bill, but you can achieve similar outcomes at a lower cost by applying Single Table Design to your DynamoDB table.
The Single Table Design method requires only one DynamoDB table and one GSI. To be able to query your data, you store every record with two important keys: a partition key (PK) and a sort key (SK).
DynamoDB requires the partition key, while the sort key doubles as the key for your GSI so you can query by it as well.
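A sketch of what such a table definition could look like, shaped like the parameters boto3's create_table call expects (the table name "app-table" and index name "GSI1" are placeholders; the GSI inverts the key pair, using SK as its partition key, which is one common way to query in the other direction):

```python
# Sketch of a single-table definition with one inverted GSI.
# Names are placeholders, not prescribed by DynamoDB.
table_definition = {
    "TableName": "app-table",
    "AttributeDefinitions": [
        {"AttributeName": "PK", "AttributeType": "S"},
        {"AttributeName": "SK", "AttributeType": "S"},
    ],
    "KeySchema": [
        {"AttributeName": "PK", "KeyType": "HASH"},   # partition key
        {"AttributeName": "SK", "KeyType": "RANGE"},  # sort key
    ],
    "GlobalSecondaryIndexes": [
        {
            "IndexName": "GSI1",
            "KeySchema": [
                {"AttributeName": "SK", "KeyType": "HASH"},
                {"AttributeName": "PK", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "ALL"},
        }
    ],
    "BillingMode": "PAY_PER_REQUEST",
}
```

Only PK and SK appear in AttributeDefinitions; all other attributes are schemaless and can vary per item.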
Instead of creating multiple physical DynamoDB tables and assigning GSIs to each one, you create virtual tables within the same DynamoDB table and differentiate them using a Table attribute. Since DynamoDB is a NoSQL database, you can insert this data without worrying about schema validation issues.
Example of a regular DynamoDB record:
{ Key: "1234", Title: "test data", CreatedBy: "user-123", Content: "this is a test post" }
The same record under Single Table Design:
{ PK: "Post#1234", SK: "User#user-123", Title: "test data", Content: "this is a test post", Table: "Posts" }
The key difference here is that instead of adding a GSI just to be able to query by Key and CreatedBy, you now query on the single PK-SK GSI. Hence, you can focus on designing those keys for your queries instead of creating multiple GSIs (keep in mind that the default limit is 20 GSIs per DynamoDB table).
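As a rough illustration of the two access patterns this layout enables (a pure-Python simulation over an in-memory list, not real DynamoDB calls; in practice these would be Query operations against the base table and the GSI):

```python
# Single-table items: posts keyed by PK, with the creating user in SK.
items = [
    {"PK": "Post#1234", "SK": "User#user-123", "Title": "test data", "Table": "Posts"},
    {"PK": "Post#5678", "SK": "User#user-123", "Title": "second post", "Table": "Posts"},
    {"PK": "Post#9999", "SK": "User#user-456", "Title": "other post", "Table": "Posts"},
]

def query_base_table(pk: str) -> list:
    """Base-table access pattern: fetch items by partition key."""
    return [item for item in items if item["PK"] == pk]

def query_gsi_by_sk(sk: str) -> list:
    """GSI access pattern: fetch items by sort-key value
    (the inverted index lets you query by user)."""
    return [item for item in items if item["SK"] == sk]

# All posts created by user-123:
print([i["PK"] for i in query_gsi_by_sk("User#user-123")])
```

One table and one index cover both "get this post" and "get all posts by this user", which would otherwise have required a dedicated GSI on CreatedBy.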
One issue with utilizing Single Table Design is that you are required to write additional application code to realize those cost savings on your DynamoDB table. While there are ready-made open-source solutions that do a good portion of the required job, none of them will provide a ready solution that fits everyone's needs.
It also requires you to think up front about how you will structure your data in DynamoDB to be able to query it as expected. This becomes less cumbersome if you structure your code properly, ideally organizing it into modules and treating each module as a virtual table in DynamoDB.
You might consider using composite partition keys and composite sort keys to increase your querying ability rather than creating new GSIs.
Example of a composite key:
Post#12345#User#user-1234
You would write additional code to split and combine the individual values that make up a composite key.
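A minimal sketch of such split/combine helpers (the function names are illustrative, and it assumes the individual values never contain the "#" separator):

```python
def make_composite_key(*parts: str) -> str:
    """Join entity-type/ID pairs into a composite key string."""
    return "#".join(parts)

def split_composite_key(key: str) -> dict:
    """Split 'Type#id#Type#id' back into a {type: id} mapping.
    Assumes the values themselves never contain '#'."""
    tokens = key.split("#")
    return dict(zip(tokens[0::2], tokens[1::2]))

key = make_composite_key("Post", "12345", "User", "user-1234")
print(key)                       # Post#12345#User#user-1234
print(split_composite_key(key))  # {'Post': '12345', 'User': 'user-1234'}
```

Because sort keys compare lexicographically, composite keys like this also let you query by prefix (e.g. everything under "Post#12345#") without any extra index.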
If you have a large number of relationships and key pairs to query in your datasets, then it is recommended to use a SQL database like Amazon Aurora, which preserves the relationships between your datasets and lets you query as much as you want for a monthly price, without worrying about read and write capacity units.
If your data is not structured but you have a lot of records to query, enough to exceed DynamoDB's allowed read capacity units, then consider running ETL jobs on your data with AWS Glue and loading the results into Amazon Redshift to query them there.