
Auto Scaling and High Availability

Hands-On Lab



EC2 is a great platform for hosting websites and web apps, but what happens when an app becomes too popular for its existing instance? It can obviously be replaced by a larger instance, but what if demand is variable? That's where autoscaling comes in. With autoscaling, your app can be scaled out with extra servers to split the load, and once demand decreases, it can return to the normal number of servers, so you aren't left paying for excess resources. Having more servers available to receive requests also keeps the app running if one or more servers fail. This is high availability, and it is crucial for modern web apps. This Learning Activity walks you through configuring autoscaling and high availability, teaching the important methods for maintaining a dynamic and resilient app.



This lab covers the following objectives:

  • Scaling policy types (simple scaling and step scaling policy types)
  • Adjustment types
  • Instance warm-up time
  • Scaling based on CloudWatch metrics and alarms

Creating Alarms Based on Metrics

Among other uses, alarms allow us to modify our infrastructure depending on application demands.

For example, if we have a CPU-intensive application and expect demand to grow or shrink, we can create alarms based on our instances' CPU metrics to scale out or in. If CPU usage is over 70% for 5 minutes, we add an instance to our Auto Scaling group; if it is below 40% for 5 minutes, we remove one. In addition to improving our application's availability and performance, this type of setup is cost effective because it removes unnecessary resources.
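The alarm logic described above can be sketched in a few lines. This is a hypothetical simplification, not the actual CloudWatch implementation: a datapoint breaches when it crosses the threshold, and the alarm fires only after the configured number of consecutive breaching periods.

```python
def alarm_state(datapoints, threshold, comparison, evaluation_periods):
    """Return 'ALARM' if the last `evaluation_periods` datapoints all
    breach the threshold, else 'OK'."""
    breaches = {
        ">=": lambda v: v >= threshold,
        "<=": lambda v: v <= threshold,
    }[comparison]
    recent = datapoints[-evaluation_periods:]
    if len(recent) == evaluation_periods and all(breaches(v) for v in recent):
        return "ALARM"
    return "OK"

# Average CPU over successive 5-minute periods:
cpu = [35, 52, 74]
print(alarm_state(cpu, threshold=70, comparison=">=", evaluation_periods=1))  # ALARM
print(alarm_state(cpu, threshold=40, comparison="<=", evaluation_periods=1))  # OK
```

With 1 evaluation period of 5 minutes, a single high datapoint is enough to trigger the scale-up alarm; raising the evaluation periods makes the alarm less sensitive to brief spikes.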

These alarms rely on CloudWatch and can be created/managed from the CloudWatch dashboard but can also be managed from the Auto Scaling dashboard for convenience. Let's begin by creating the alarms described above from the Auto Scaling dashboard:

  • Type ec2 into the AWS Services search bar and navigate to the EC2 dashboard.
  • Click the Auto Scaling Groups link under the Auto Scaling label in the navigation list on the left side.

You will see a pre-configured Auto Scaling group listed here. We will now add a scaling policy to this Auto Scaling group and create an alarm.

  • In the details pane below, click the Scaling Policies tab.
  • Click the Add Policy button.
  • At the bottom, click the Create a Scaling policy with Steps link.
  • For the Name, type scale-up.
  • Click the Create new alarm link to the right of the Execute policy when setting.

Let's create an alarm based off the example described above:

  • Uncheck the Send a notification to setting for this lab.
  • Take note of the available metrics for the Whenever setting. For this example, we will use the Average of CPU Utilization.
  • Set Is to >= 70 Percent.
  • Set For at least to 1 consecutive period(s) of 5 Minutes.
  • The default Name of alarm will suffice for this lab, so click the Create Alarm button to continue.

You will see a window informing you that the alarm was created successfully. Dismiss it with the Close button. The new alarm has now been automatically selected for the Execute policy when setting. Let's finish configuring this policy:

  • Set Take the action to Add 1 instances when 70 <= CPUUtilization < +infinity

For this lab, we will be creating a simple scaling policy, but take note of the Add step option. A simple scaling policy is the most basic option: it scales up or down by a fixed amount and allows a pre-determined warm-up time. A step scaling policy, however, lets us respond to an alarm breach more aggressively. The settings we've defined so far respond the same to a small surge that barely triggers the alarm as to a large surge (90% CPU utilization, for instance). Adding steps allows the response to match the severity of the breach (adding even more instances for a large surge, for example). We will stick to a simple scaling policy for this lab.
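The difference between the two policy types can be sketched as follows. This is an illustrative model, not AWS code: a simple policy always applies one fixed adjustment, while a step policy picks an adjustment based on how far the metric is past the threshold. The step bounds below are examples, not values from this lab.

```python
def simple_scaling(metric):
    # A simple policy ignores the metric's magnitude: always add 1 instance.
    return 1

def step_scaling(metric, steps):
    """steps: list of (lower_bound, upper_bound, adjustment), where bounds
    are absolute metric values and upper_bound may be None for +infinity."""
    for lower, upper, adjustment in steps:
        if metric >= lower and (upper is None or metric < upper):
            return adjustment
    return 0  # metric falls outside every step: no action

steps = [
    (70, 90, 1),    # 70 <= CPUUtilization < 90  -> add 1 instance
    (90, None, 3),  # 90 <= CPUUtilization       -> add 3 instances
]
print(step_scaling(72, steps))  # 1 -- barely over the threshold
print(step_scaling(95, steps))  # 3 -- large surge, more aggressive response
```

A small breach and a large breach get proportionate responses with steps, whereas `simple_scaling` returns 1 for both.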

  • Click the Create a simple scaling policy link at the bottom.

In either case, we must define an amount of time the policy should wait before deciding to add another instance. Since new instances take some time to start running, they need a while before they can alleviate some of the CPU load. If the scaling policy didn't wait for the new instances to come online, it would still see the high CPU load and spin up even more instances (drastically increasing costs). Let's use a fairly common time of 300 seconds:

  • Set And then wait to 300 seconds.
  • We are done configuring this particular policy, so click the Create button to create and add it.
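The wait behavior configured above can be sketched like this. It is a hypothetical model of the cooldown concept, not the Auto Scaling service itself: after a scaling activity, further actions are suppressed until the cooldown elapses, so instances that are still booting aren't double-counted.

```python
def should_scale(alarm_firing, last_scale_time, now, cooldown=300):
    """Allow a new scaling action only if the alarm is firing and at
    least `cooldown` seconds have passed since the last action."""
    if not alarm_firing:
        return False
    return (now - last_scale_time) >= cooldown

print(should_scale(True, last_scale_time=0, now=120))  # False: still cooling down
print(should_scale(True, last_scale_time=0, now=360))  # True: cooldown elapsed
```

If the cooldown were 0, a sustained CPU spike would add an instance on every evaluation until the first new instance finally came online.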

We can create another policy to scale down when possible to save cost:

  • Click the Add policy button.
  • For the name, type scale-down.
  • Click the Create new alarm link.
  • Uncheck the notification option at the top.
  • Configure the settings:
    • Whenever: Average of CPU Utilization
    • Is: <= 40 Percent
    • For at least 1 consecutive period(s) of 5 Minutes
  • The default name is fine, so click the Create Alarm button.
  • Click the Create a simple scaling policy link at the bottom.
  • We want to set the Take the action option to Remove 1 percent of group.
  • For the And then wait setting, type 200 seconds.
  • Click the Create button to create and add this scaling policy.
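For reference, the same scale-down policy and alarm can be created from the AWS CLI. This is a hedged sketch: the group name lab-asg is a placeholder for your pre-configured Auto Scaling group, and the PolicyARN placeholder must be filled in from the first command's output.

```shell
# Create the simple scaling policy (prints a PolicyARN on success).
# "Remove 1 percent of group" maps to PercentChangeInCapacity with -1.
aws autoscaling put-scaling-policy \
  --auto-scaling-group-name lab-asg \
  --policy-name scale-down \
  --adjustment-type PercentChangeInCapacity \
  --scaling-adjustment -1 \
  --cooldown 200

# Create the CloudWatch alarm and point its action at that PolicyARN.
aws cloudwatch put-metric-alarm \
  --alarm-name lab-scale-down-alarm \
  --namespace AWS/EC2 \
  --metric-name CPUUtilization \
  --statistic Average \
  --period 300 \
  --evaluation-periods 1 \
  --threshold 40 \
  --comparison-operator LessThanOrEqualToThreshold \
  --dimensions Name=AutoScalingGroupName,Value=lab-asg \
  --alarm-actions <PolicyARN-from-previous-command>
```

These commands require valid AWS credentials and an existing Auto Scaling group, so they are shown for orientation rather than as part of the lab steps.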