Using Data Pipeline to Export a Table from DynamoDB

Hands-On Lab

 

Craig Arcuri

AWS Training Architect II in Content

Length: 01:00:00

Difficulty: Advanced

Introduction

In this hands-on lab, the student performs two tasks related to DynamoDB: using Data Pipeline to export a table from DynamoDB, and configuring DynamoDB auto scaling. The lab comes provisioned with a populated DynamoDB table. The student walks through setting up a Data Pipeline that exports this table to an S3 bucket. Configuring and running the pipeline can take up to 15 minutes, so while waiting for the process to complete, the student configures the DynamoDB table for auto scaling. After the Data Pipeline job completes, the student can verify it succeeded by viewing the exported contents of the DynamoDB table in an S3 folder.

Solution

Please log in to the live environment with the cloud_user credentials provided.

Make sure you are using us-east-1 as your region throughout the lab.
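
If you would also like to follow along programmatically, the optional Python (boto3) sketches in this lab assume a session pinned to us-east-1 and the pre-provisioned LinuxAcademy table. A minimal sanity check, assuming the cloud_user credentials are configured locally (for example with aws configure), might look like this:

    import boto3

    # All of the sketches in this lab assume the us-east-1 region.
    session = boto3.Session(region_name="us-east-1")

    # Confirm which identity the credentials resolve to.
    print(session.client("sts").get_caller_identity()["Arn"])

    # Confirm the pre-provisioned table exists and check its status.
    table = session.client("dynamodb").describe_table(TableName="LinuxAcademy")["Table"]
    print(table["TableStatus"], table.get("ItemCount"))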

Create a Data Pipeline

  1. From the AWS Management Console, navigate to Data Pipeline.
    • Click Get started now.
    • Enter the Data Pipeline name:
    • Name: DynamoExport
    • Choose the template Export DynamoDB Table to S3.
    • Enter the Source DynamoDB table name:
    • Source DynamoDB table name: LinuxAcademy
    • For the output folder, click the folder icon to the right of the field and select the provided S3 bucket from the list.
    • Select Run on pipeline activation.
    • For the S3 location for logs, click the folder icon to the right of the field and select the same S3 bucket.
    • For IAM roles, select Custom; choose EMR_Default for the first role and the provided role for the second.
    • Click Edit In Architect.
    • In Resources, change the instance type to m4.large.
    • Add a custom field for subnetId (its value will be pasted in shortly).
    • Open a new browser tab and navigate to the VPC service.
    • Click Subnets.
    • Copy one of the subnet IDs from the list (any of them will work).
    • Back on the previous tab, paste the copied subnet ID into the newly added subnetId field.
    • On the Activities tab, change Resize Cluster Before Running to False.
    • Click Save.
    • Click Activate.

It may take 10-15 minutes for the Data Pipeline to provision its resources and finish running.
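
While you wait, you can also poll the pipeline from the SDK instead of refreshing the console. A minimal boto3 sketch, assuming the pipeline was named DynamoExport as above; it simply prints the pipeline's status fields so you can watch for a FINISHED state:

    import boto3

    datapipeline = boto3.client("datapipeline", region_name="us-east-1")

    # Look up the pipeline created above by its name.
    pipelines = datapipeline.list_pipelines()["pipelineIdList"]
    pipeline_id = next(p["id"] for p in pipelines if p["name"] == "DynamoExport")

    # Print the pipeline's status fields; watch for a FINISHED state.
    description = datapipeline.describe_pipelines(pipelineIds=[pipeline_id])
    for field in description["pipelineDescriptionList"][0]["fields"]:
        print(field["key"], field.get("stringValue"))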

DynamoDB Auto Scaling

  1. Navigate to the DynamoDB service.
  2. Click Tables.
  3. Click the LinuxAcademy table name.
  4. On the Capacity tab, check the boxes to enable auto scaling for Read capacity and Write capacity.
    • Check the checkbox for Same settings as read capacity.
    • Set Target utilization to 50%.
  5. Select the checkbox for Existing role with pre-defined policies.
    • Type cfst and select the Scaling Role option.
  6. Click Save.
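
Behind the scenes, the console performs these steps through the Application Auto Scaling service. A rough boto3 equivalent for the read side of the steps above, with illustrative minimum and maximum capacities (those two values are assumptions, not settings from the lab):

    import boto3

    autoscaling = boto3.client("application-autoscaling", region_name="us-east-1")

    # Register the table's read capacity as a scalable target.
    # MinCapacity/MaxCapacity here are illustrative assumptions.
    autoscaling.register_scalable_target(
        ServiceNamespace="dynamodb",
        ResourceId="table/LinuxAcademy",
        ScalableDimension="dynamodb:table:ReadCapacityUnits",
        MinCapacity=1,
        MaxCapacity=40,
    )

    # Target-tracking policy at 50% utilization, matching the console setting.
    autoscaling.put_scaling_policy(
        PolicyName="LinuxAcademy-read-scaling",
        ServiceNamespace="dynamodb",
        ResourceId="table/LinuxAcademy",
        ScalableDimension="dynamodb:table:ReadCapacityUnits",
        PolicyType="TargetTrackingScaling",
        TargetTrackingScalingPolicyConfiguration={
            "TargetValue": 50.0,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "DynamoDBReadCapacityUtilization"
            },
        },
    )

    # Write capacity is analogous: use dynamodb:table:WriteCapacityUnits and
    # the DynamoDBWriteCapacityUtilization predefined metric.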

Configure SNS Topic

  1. Right-click the existing DynamoDB tab in your browser and choose Duplicate to open a new tab.
  2. In the new tab, navigate to the Simple Notification Service.
  3. Click Get Started.
  4. Click Create topic.
    • Name: DynamoASG
    • Click Create topic.
  5. Back in the DynamoDB service tab, click the Alarms tab.
  6. Click Create alarm.
    • SNS Topic: DynamoASG
    • Whenever: Consumed read capacity
    • Is: >= 50
    • Click Create Alarm.
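
For reference, the same topic and alarm can also be created with boto3. A sketch, assuming a Sum statistic over a 5-minute period (the exact statistic and period used by the console alarm wizard are not specified in the steps above):

    import boto3

    REGION = "us-east-1"
    sns = boto3.client("sns", region_name=REGION)
    cloudwatch = boto3.client("cloudwatch", region_name=REGION)

    # Create (or look up) the notification topic.
    topic_arn = sns.create_topic(Name="DynamoASG")["TopicArn"]

    # Alarm when consumed read capacity on the table reaches 50 or more.
    cloudwatch.put_metric_alarm(
        AlarmName="LinuxAcademy-consumed-read-capacity",
        Namespace="AWS/DynamoDB",
        MetricName="ConsumedReadCapacityUnits",
        Dimensions=[{"Name": "TableName", "Value": "LinuxAcademy"}],
        Statistic="Sum",               # assumption: Sum over the period
        Period=300,                    # assumption: 5-minute period
        EvaluationPeriods=1,
        Threshold=50,
        ComparisonOperator="GreaterThanOrEqualToThreshold",
        AlarmActions=[topic_arn],
    )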

Demonstrate That Auto Scaling Is Not Enabled Automatically

  1. Click Create new table.
    • Table name: testASG
    • Primary key: testASG
    • Click Create.
  2. Notice that Auto Scaling is not enabled by default on the new table. We can, however, enable Auto Scaling after the table is created.
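
The same thing can be confirmed from the API: a newly created table has no scalable targets registered with Application Auto Scaling until auto scaling is explicitly enabled. A small boto3 check for the table created above:

    import boto3

    autoscaling = boto3.client("application-autoscaling", region_name="us-east-1")

    # A fresh table has no scaling targets registered until auto scaling
    # is explicitly enabled on it.
    targets = autoscaling.describe_scalable_targets(
        ServiceNamespace="dynamodb",
        ResourceIds=["table/testASG"],
    )
    print(targets["ScalableTargets"])   # expected: [] for the new table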

Check for S3 Folder from Data Pipeline Export

Once our Data Pipeline has changed to a FINISHED state:

  1. From the AWS Management Console, navigate to S3.
    • Click the S3 bucket.
    • Verify that two folders have been created (the exported table data and the pipeline logs).
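
You can also confirm the output from the SDK. A sketch that lists the bucket's top-level folders, assuming a placeholder bucket name (substitute the lab's actual bucket):

    import boto3

    s3 = boto3.client("s3", region_name="us-east-1")
    BUCKET = "your-lab-bucket-name"   # placeholder: use the bucket from the lab

    # List top-level "folders" (key prefixes); expect the export output
    # folder and the logs folder created by the pipeline.
    response = s3.list_objects_v2(Bucket=BUCKET, Delimiter="/")
    for prefix in response.get("CommonPrefixes", []):
        print(prefix["Prefix"])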

Conclusion

Congratulations, you've completed this hands-on lab!