When you design a new table in AWS DynamoDB, you choose between two kinds of primary key. Beginners often wonder whether to use only a partition key or a composite partition key when creating a table. A composite partition key is also called a composite primary key (the term used in the AWS documentation) or a hash-range key.
In this post, you will learn about some of the following:
- Is the partition key the same as the primary key of a table?
- What is a composite partition key or composite primary key?
- When to use a partition key vs a composite partition key?
This post presumes that you understand the concept of partitions and how DynamoDB stores table items in one or more partitions based on the partition key. Here is a good explanation on Choosing the right DynamoDB partition key.
Is the partition key the same as the primary key of a table?
If the table has only a partition key, that key must be unique across the whole table. In other words, no two items in the table can have the same partition key value. In such scenarios, the partition key plays the same role as the primary key in a traditional RDBMS. The partition key of an item is also called the hash key or hash attribute.
For example, a user table can have only a partition key, such as the user's email address. Any item in the user table can then be accessed directly by providing the email address of the user.
Other examples of good partition keys are employee number, customer id etc.
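As a rough illustration of the uniqueness rule above, here is a minimal in-memory sketch in Python. It is not the DynamoDB API; the names `users`, `put_user` and `get_user` are made up for this example. It only mimics the behavior that writing an item with an existing partition key value replaces the earlier item.

```python
# In-memory stand-in for a table whose primary key is only a partition key.
# Illustration only: this is not the DynamoDB API, and all names are hypothetical.
users = {}  # partition key value (email) -> item


def put_user(item):
    """Store an item; an existing item with the same email is replaced."""
    users[item["email"]] = item


def get_user(email):
    """Return the item stored under this email, or None if there is none."""
    return users.get(email)


put_user({"email": "aks@gmail.com", "name": "First"})
put_user({"email": "aks@gmail.com", "name": "Second"})  # same key: replaces the item

print(len(users))                         # 1
print(get_user("aks@gmail.com")["name"])  # Second
```

Because the email is the whole key, a single lookup by email is enough to find the item.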
What is a composite partition key or composite primary key?
A composite partition key is also referred to as a composite primary key or hash-range key. When the primary key of a table consists of both a partition key and a sort key, it is called a composite partition key. In other words, a composite partition key comprises two attributes: the partition key and the sort key. The sort key of an item is also called the range key or range attribute.
With a composite partition key, DynamoDB hashes the item's partition key to determine the partition in which the item is stored, and places the item within that partition in order of its sort key. All items with the same partition key are stored together, in sorted order by sort key value.
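The placement described above can be sketched in plain Python. This is only a model under stated assumptions: DynamoDB's internal hash function and partition count are not exposed, so MD5 and `NUM_PARTITIONS = 4` are stand-ins, and `partition_for` and `put_item` are hypothetical helper names.

```python
import bisect
import hashlib

NUM_PARTITIONS = 4  # assumption: DynamoDB decides the real partition count itself


def partition_for(partition_key):
    """Map a partition key to a partition number using a stand-in hash (MD5)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


# One dict per partition: partition key -> list of (sort_key, item), kept sorted.
partitions = [dict() for _ in range(NUM_PARTITIONS)]


def put_item(partition_key, sort_key, item):
    """Place an item in its partition, keeping items ordered by sort key."""
    rows = partitions[partition_for(partition_key)].setdefault(partition_key, [])
    sort_keys = [key for key, _ in rows]
    position = bisect.bisect_left(sort_keys, sort_key)
    if position < len(rows) and rows[position][0] == sort_key:
        rows[position] = (sort_key, item)  # same composite key: replace the item
    else:
        rows.insert(position, (sort_key, item))


# Inserted out of order on purpose; the stored order follows the sort key.
put_item("aks@gmail.com", "microsoft", {"from_date": "2015-08-10"})
put_item("aks@gmail.com", "ge electric", {"from_date": "2013-01-25"})

stored = partitions[partition_for("aks@gmail.com")]["aks@gmail.com"]
print([sort_key for sort_key, _ in stored])  # ['ge electric', 'microsoft']
```

Both items land in the same partition because they share a partition key, and they sit next to each other ordered by sort key.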
Let’s take a look at the following example, which represents a table that stores users’ job history.
email_id | company | from_date | to_date |
---|---|---|---|
aks@gmail.com | ge electric | 2013-01-25 | 2015-05-01 |
aks@gmail.com | microsoft | 2015-08-10 | 2016-03-05 |
abc@gmail.com | | 2011-04-13 | 2014-03-05 |
abc@gmail.com | | 2014-04-01 | 2017-01-01 |
In the above example, email_id is the partition key and company is the sort key. You may note some of the following:
- There can be multiple items with the same partition key. For example, there are two items with the partition key aks@gmail.com.
- Given the previous point, items that share a partition key must have different sort key values. Here, each company must be different for a given email_id.
- The combination of partition key and sort key must be unique across the table.
- You can get a particular item by providing both email_id and company.
- You can get the entire job history for a user by providing just the email_id.
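The two access patterns in the list above can be sketched with a small in-memory model. Again, this is not the DynamoDB API; `job_history`, `put_job`, `get_job` and `query_jobs` are hypothetical names, and the data comes from the two complete rows of the example table.

```python
# In-memory stand-in for a table keyed by (email_id, company).
# Illustration only: not the DynamoDB API, all names are hypothetical.
job_history = {}  # (partition key, sort key) -> item


def put_job(item):
    """Store an item under its composite key (email_id, company)."""
    job_history[(item["email_id"], item["company"])] = item


def get_job(email_id, company):
    """Fetch one item; this needs both the partition key and the sort key."""
    return job_history.get((email_id, company))


def query_jobs(email_id):
    """Fetch every item for one partition key, ordered by sort key."""
    rows = [item for (email, _), item in job_history.items() if email == email_id]
    return sorted(rows, key=lambda item: item["company"])


put_job({"email_id": "aks@gmail.com", "company": "microsoft",
         "from_date": "2015-08-10", "to_date": "2016-03-05"})
put_job({"email_id": "aks@gmail.com", "company": "ge electric",
         "from_date": "2013-01-25", "to_date": "2015-05-01"})

print(get_job("aks@gmail.com", "microsoft")["from_date"])          # 2015-08-10
print([job["company"] for job in query_jobs("aks@gmail.com")])     # ['ge electric', 'microsoft']
```

Note that the model scans every key to answer `query_jobs`, which is only acceptable in a toy example; the point is the shape of the two lookups, not their cost.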
When to use a partition key vs a composite partition key or composite primary key?
The following guidelines can help you decide when to use only a partition key and when to use both a partition key and a sort key.
- When every item in the table is distinct and can be identified by a single id, use just a partition key. Examples are an account table or a user table.
- When there can be multiple entries related to a particular entity, or in other words several related items in the table, use a composite primary key (also called a composite partition key or hash-range key). For example, an order status table can use order_id as the partition key and date_time as the sort key, with status as an additional attribute. Another example is an author table with author name as the partition key and book title as the sort key.
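To make the two choices concrete, the sketch below builds the request parameters that the DynamoDB CreateTable operation expects (`KeySchema` with `HASH` and `RANGE` key types, plus matching `AttributeDefinitions`). It does not call AWS. The helper name `create_table_params`, the table names, the string type for every key attribute and the on-demand billing mode are all assumptions made for the example.

```python
def create_table_params(table_name, partition_key, sort_key=None):
    """Build keyword arguments for a DynamoDB CreateTable request.

    Assumption: every key attribute is declared as a string ("S")
    to keep the sketch short.
    """
    key_schema = [{"AttributeName": partition_key, "KeyType": "HASH"}]
    attribute_definitions = [{"AttributeName": partition_key, "AttributeType": "S"}]
    if sort_key is not None:
        # A sort key turns the primary key into a composite (hash-range) key.
        key_schema.append({"AttributeName": sort_key, "KeyType": "RANGE"})
        attribute_definitions.append({"AttributeName": sort_key, "AttributeType": "S"})
    return {
        "TableName": table_name,
        "KeySchema": key_schema,
        "AttributeDefinitions": attribute_definitions,
        "BillingMode": "PAY_PER_REQUEST",  # assumption: on-demand capacity
    }


# Partition key only: each user is identified by a single id.
user_table = create_table_params("user", "email")

# Composite key: many status entries per order, ordered by date_time.
order_status_table = create_table_params("order_status", "order_id", "date_time")

print([key["KeyType"] for key in user_table["KeySchema"]])          # ['HASH']
print([key["KeyType"] for key in order_status_table["KeySchema"]])  # ['HASH', 'RANGE']

# With boto3 installed and AWS credentials configured, the result could be passed on:
#   import boto3
#   boto3.client("dynamodb").create_table(**order_status_table)
```

Non-key attributes such as status are not declared at table creation; only the attributes used in the key schema appear in `AttributeDefinitions`.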
Summary
In this post, you learned when to use only a partition key and when to create a composite partition key (composite primary key) in a DynamoDB table.
Did you find this article useful? Do you have any questions or suggestions about the difference between a partition key and a composite partition key in DynamoDB? Leave a comment with your questions and I shall do my best to address them.