Skip to main content

Amazon DynamoDB Data Modeling

Course

Intro Video

Photo of Mark Richman

Mark Richman

AWS Training Architect II in Content

Length

09:22:42

Difficulty

Advanced

Videos

36

Hands-on Labs

5

Course Details

Welcome to the Amazon DynamoDB Data Modeling course.

In this course, you will learn:

NoSQL concepts and DynamoDB's features How to optimize DynamoDB for: Runtime performance Cost DynamoDB data modeling patterns 1:N, N:M, hierarchical, GSI overloading, and more Write sharding, sparse indexes, and materialized aggregations Strategies to migrate data from RDBMS to DynamoDB

You'll further solidify your learning with real-world examples and hands-on labs.

By the end of this course, you'll have the necessary skills and knowledge to effectively build scalable, cost-efficient, high-performance applications using DynamoDB.

Thank you for taking the course — let's get started!

Syllabus

Introduction

Introduction

Course Introduction

00:03:10

Lesson Description:

Welcome to the Amazon DynamoDB Data Modeling course. In this course, you will learn:NoSQL concepts and DynamoDB's features How to optimize DynamoDB for:Runtime performance CostDynamoDB data modeling patterns1:N, N:M, hierarchical, GSI overloading, and more Write sharding, sparse indexes, and materialized aggregations Strategies to migrate data from RDBMS to DynamoDBYou'll further solidify your learning with real-world examples and hands-on labs. By the end of this course, you'll have the necessary skills and knowledge to effectively build scalable, cost-efficient, high-performance applications using DynamoDB.

About the Training Architect

00:00:58

Lesson Description:

A little bit about me — your instructor for this course! My name is Mark Richman, and I'll be your training architect for this course. I'm really excited to be helping you on your training journey. I'd love to hear from you, so please feel free to reach out to me in our Slack community or on social media. I'm @mrichman on Twitter, GitHub, and just about everywhere else.

NoSQL Concepts

00:06:33

Lesson Description:

In this lesson, we discuss relational databases and their inherent faults. Next, we explain the "sparse data" problem and how NoSQL — and DynamoDB in particular — solves this problem. Finally, we define the goals of data modeling and how it impacts the performance and cost profile of your DynamoDB-based applications.

Understanding DynamoDB Data Structures

DynamoDB and Tables

00:11:32

Lesson Description:

In this lesson, we cover what DynamoDB is, how it combines the best of wide column and key-value data stores, and some basic terminology. We also discuss the two capacity modes (On-Demand vs. Provisioned) and how billing is calculated.

Items and Attributes

00:02:51

Lesson Description:

In this brief lesson, we are going to reintroduce some DynamoDB semantics you should already be comfortable with: items and attributes.Reserved Words in DynamoDBItem Size Limits

Partition and Sort Keys

00:03:23

Lesson Description:

In this lesson, we discuss partition and sort keys. We cover their purpose and how they relate to performance with respect to partitions.

Data Types

00:02:58

Lesson Description:

In this lesson, we discuss the available data types in DynamoDB.

Indexes

00:09:32

Lesson Description:

Indexes in DynamoDB are very different from those in relational databases. Indexes store your data in different ways to facilitate different access patterns. In this lesson, we discuss global secondary indexes (GSIs) and local secondary indexes (LSIs) — including how to create them and their features and limitations — and walk through a usage example in the AWS Management Console.

Demo: Creating a Table

00:07:01

Lesson Description:

In this lesson, we learn how to create a table in DynamoDB. We will supply partition and sort keys and create a few items. We also see to how perform query and scan operations on our table, as well as how to apply filter expressions.

Basics of DynamoDB Data Modeling

Partition Keys

Partition Keys

00:06:48

Lesson Description:

This lesson introduces the technical implementation of partition keys, including what they are and how they are used with DynamoDB's hashing function to store and retrieve data. We'll also look at how partitions influence performance and support our applications' access patterns. We'll see how partitions work to:Help distribute our workloads evenly Spread reads and writes across partitions for performance What partition key design choices can cause performance issues like throttling

Common Patterns

NoSQL Data Modeling Concepts

00:04:10

Lesson Description:

With NoSQL databases and DynamoDB in particular, we take a very different approach to data modeling. In this lesson, we look at some data modeling concepts you'll need to understand to become successful with DynamoDB.

One-to-Many Relationships

00:08:46

Lesson Description:

In this lesson, we take a look at modeling one-to-many relationships within a single DynamoDB table.GitHub

Adjacency Lists and Many-to-Many Relationships

00:10:59

Lesson Description:

Adjacency lists are a design pattern that provide a way to represent graph data (nodes and edges) in DynamoDB. We'll use this pattern to implement many-to-many relationships in a single DynamoDB table.Adjacency List Design Pattern

Hierarchical Data

00:14:52

Lesson Description:

In this lesson, we're going to model hierarchical data in DynamoDB, all while searching for diapers at our local Target store.

GSI Overloading

00:08:30

Lesson Description:

In this lesson, we take a look at a technique called GSI overloading, which allows us to hold many different types of data items at one time — all in a single DynamoDB table.

Write Sharding

00:12:47

Lesson Description:

In this lesson, we cover a technique called write sharding. One way to better distribute writes across a partition key space in DynamoDB is to expand the space. For write-heavy items, you can avoid hot keys by applying a random suffix to partition keys.GitHub

Item Versioning

00:10:54

Lesson Description:

In this lesson, we explore a technique for storing version information for DynamoDB items. This is useful for a version control system, like a source code repository, or any other system that tracks changes to artifacts.TransactWriteItems APIACID Transactions

Sparse Indexes

00:04:42

Lesson Description:

In this lesson, we talk about sparse indexes and how we can use them to speed up queries and save some money.

Materialized Aggregations

00:09:14

Lesson Description:

In this lesson, we cover how to use the data coming from DynamoDB streams to aggregate our voting data in real time and use Lambda to scale up without any operational overhead.GitHub

Hands-on Labs are real live environments that put you in a real scenario to practice what you have learned without any other extra charge or account to manage.

00:45:00

Hands-on Labs are real live environments that put you in a real scenario to practice what you have learned without any other extra charge or account to manage.

00:45:00

Hands-on Labs are real live environments that put you in a real scenario to practice what you have learned without any other extra charge or account to manage.

00:45:00

Analyzing Data Workloads

Qualifying Your Workload’s Requirements

Simple Case Study

00:02:04

Lesson Description:

The next several lessons are going to cover a process of identifying and defining data access requirements of an application. This lesson provides an overview of a simple case study to illustrate the techniques used in this module of the course.

Defining an Entity–Relationship Model

00:05:10

Lesson Description:

In this lesson, we learn about the entity–relationship modeling process, showing how to map data entities and the relationships between those entities.

Documenting Access Patterns

00:04:15

Lesson Description:

To start designing a DynamoDB table that performs well and minimizes cost, we need to first identify the access patterns required by the applications it needs to support. In this lesson, we walk through the process of identifying key access patterns your DynamoDB tables will support.

Data Lifecycle

00:07:21

Lesson Description:

In this lesson, we talk about the various phases of the data lifecycle — the flow of data from creation and storage to when that data ultimately becomes obsolete and is destroyed.

Interpreting Access Patterns

Optimizing for Read/Write Workloads

00:04:04

Lesson Description:

In this lesson, we talk about read-heavy and write-heavy workloads, as well as what the data model considerations are for each case.

Volume, Velocity, and Variety

00:18:10

Lesson Description:

In the world of big data, you'll inevitably hear about the various "Vs" — these are the dimensions we consider when modeling and managing our data.GitHubBig O notationMutagen

Scenario 1 - User Profile Microservice

Scenario 1 - User Profile Microservice

Scenario

00:04:01

Lesson Description:

In this section of the course, we're going to step through a process of identifying and defining data access requirements of an application. Our example application is a microservice that provides flight information to travel agents.Sample data

Entities and Access Patterns

00:02:04

Lesson Description:

In this brief lesson, we will identify the access patterns in our application. From these access patterns, we can extract the domain entities, as well as the various queries that will inform our indexes.

Data Model

00:21:16

Lesson Description:

In this lesson, we will:Explore our source CSV flight data. Define a DynamoDB data model to suit our access patterns. Deploy a SAM application for our microservice. Load data into DynamoDB. Test our microservice via the API Gateway endpoint.Installing the AWS SAM CLI

Hands-on Labs are real live environments that put you in a real scenario to practice what you have learned without any other extra charge or account to manage.

02:00:00

Scenario 2 - Relational to DynamoDB Migration

Scenario 2 - Relational to DynamoDB Migration

Scenario

00:02:44

Lesson Description:

In this section of the course, we'll discuss a strategy for migrating from an RDBMS to DynamoDB that takes into account all the concepts we've covered so far. We'll migrate a sample database from MySQL into DynamoDB using the AWS Database Migration Service (DMS).MySQL Employees Sample Database AWS Database Migration Service (DMS)

Planning Phase

00:05:04

Lesson Description:

The first part of the planning phase is to identify the goals of the data migration. We will review a number of potential problems you might encounter when migrating to DynamoDB, along with their impact to your migration strategy.AWS Snowball DynamoDB Encryption at Rest

Data Analysis Phase

00:05:36

Lesson Description:

The purpose of the data analysis phase is to understand the composition of the source data and to identify the data access patterns used by the application. This information is required input into the data modeling phase. It is also essential for understanding the cost and performance of running a workload on DynamoDB.

Data Modeling Phase

00:13:11

Lesson Description:

In this phase, the table, partition and sort keys, and secondary indexes will be defined. The data model produced in this phase must support the data access patterns described in the data analysis phase.

Testing Phase

00:03:01

Lesson Description:

The testing phase is the most important part of the migration strategy. It is during this phase that the entire migration process will be tested end-to-end. In this lesson, we'll cover the basic tests that a comprehensive test plan should minimally contain.

Data Migration Phase

00:18:09

Lesson Description:

In this lesson, we'll work through the data migration phase. Using the AWS Database Migration Service (DMS), we'll extract our source data from MySQL, transform it into the structure supporting our new DynamoDB data model, and load that data into DynamoDB. Access Patterns and Supporting IndexesFind employee by employee IDBase table, query on emp_noFind all employees for a given last nameGSI PK: last_nameFind all employees for a given titleGSI PK: titleFind all employees for a given departmentGSI PK: department_nameFind all employees with salary >= 130,000GSI PK: salary_to_date; SK: salary Query where salary_to_date=9999-01-01 and salary>=130000Find all current department managersGSI PK: department_manager; SK: manager_name Query GSI once for each department_nameEmployees DynamoDB Table Definition

./create.py
MySQL Employees Sample Database AMI ID: amzn_lin2_mysql57_employees (ami-0aa8df173b9f8c0d3)Minimum t3a.2xlarge Minimum 10 GB EBS gp2 Security group:SSH (22/tcp) MySQL (3306/tcp)Connect via ssh -i <keypair.pem> ec2-user@<public ip>. MySQL read-only user: * Username: cloud_user * Password: bettertogether
mysql -ucloud_user -pbettertogether employees

mysql> show tables;
+----------------------+
| Tables_in_employees  |
+----------------------+
| current_dept_emp     |
| departments          |
| dept_emp             |
| dept_emp_latest_date |
| dept_manager         |
| employees            |
| salaries             |
| titles               |
+----------------------+
8 rows in set (0.00 sec)
MySQL root user (local only): * Username: root * Password: Str0ngpass!
mysql -uroot -pStr0ngpass! employees
Create Materialized View in MySQL Note: You must connect to MySQL as root in order to create tables or views.
CREATE VIEW dms_source AS
SELECT
  employees.*,
  titles.title,
  titles.from_date AS title_from_date,
  titles.to_date AS title_to_date,
  salaries.salary,
  salaries.from_date AS salary_from_date,
  salaries.to_date AS salary_to_date,
  dept_emp.dept_no AS department_number,
  departments.dept_name AS department_name,
  dept_emp.from_date AS dept_from_date,
  dept_emp.to_date AS dept_to_date,
  CONCAT_WS(' ', manager.first_name, manager.last_name) AS manager_name,
  manager.emp_no AS manager_emp_no
FROM employees
  INNER JOIN titles ON employees.emp_no = titles.emp_no
  INNER JOIN salaries ON employees.emp_no = salaries.emp_no
  INNER JOIN dept_emp ON employees.emp_no = dept_emp.emp_no
  INNER JOIN departments ON dept_emp.dept_no = departments.dept_no
  INNER JOIN dept_manager dm ON dm.dept_no = departments.dept_no
  INNER JOIN employees manager ON manager.emp_no = dm.emp_no;
CREATE TABLE dms_source_materialized SELECT * FROM dms_source;
The CREATE TABLE command should take approximately three minutes to complete.
mysql> CREATE TABLE dms_source_materialized SELECT * FROM dms_source;
Query OK, 13849320 rows affected (2 min 51.05 sec)
Records: 13849320  Duplicates: 0  Warnings: 0

mysql> select count(emp_no) from dms_source_materialized;
+---------------+
| count(emp_no) |
+---------------+
|      13849320 |
+---------------+
1 row in set (8.69 sec)
Create DMS Access Role Name: DMSDynamoDBFullAccess Permissions: AmazonDynamoDBFullAccess Note the service access role ARN. DMS Replication Instance Name: instance1 Class: dms.t2.medium Engine version: 3.3.1 (or latest) Create Source Endpoint Name: employees-mysql Engine: MySQL Username: cloud_user Create Target Endpoint Name: employees-dynamodb DMS Task Name: employeestest Rules: rules.json Testing the Access Patterns
./query.py
Can't See DMS Task CloudWatch Logs?Create an IAM role with trusted entity DMS. Attach the AmazonDMSCloudWatchLogsRole managed policy. Save the role as dms-cloudwatch-logs-role.Tip: Increasing DMS Task Logging Level to Debug https://aws.amazon.com/premiumsupport/knowledge-center/dms-enable-debug-logging/
aws dms modify-replication-task 
  --replication-task-arn <task-arn> 
  --replication-task-settings file://task-settings-debug.json

Evolving a Data Model

00:07:53

Lesson Description:

So, what happens when our DynamoDB table is in a live production environment and we need to support a new access pattern? What's the developer experience here, and how do we update our application without interruption?

Hands-on Labs are real live environments that put you in a real scenario to practice what you have learned without any other extra charge or account to manage.

01:00:00

What's Next?

What's Next?

What's Next?

00:01:52

Lesson Description:

Thanks for taking this course! Please don't forget to press the thumbs-up button on all the lessons you liked! In this lesson, we look at some other courses you may enjoy next. Related CoursesAmazon DynamoDB Deep Dive AWS Certified SysOps Administrator - Associate (SOA-C01) Automating AWS with Lambda, Python, and Boto3

Take this course and learn a new skill today.

Transform your learning with our all access plan.

Start 7-Day Free Trial