AIOps Essentials (Autoscaling Kubernetes with Prometheus Metrics)

John Marx

Training Architect

Length

05:29:03

Difficulty

Beginner

Videos

22

Hands-on Labs

4

Course Details

This course establishes a baseline for AIOps by using Prometheus to manage the time series metrics produced by Node Exporter and cAdvisor. It guides the student through the fundamental concepts required for AIOps and the use of streaming metrics to influence autoscaling. The course culminates in integrating the Prometheus rules with the Kubernetes API server to scale nodes in an active Kubernetes cluster.

Interactive Diagram: https://interactive.linuxacademy.com/diagrams/AIOpsEssentials.html

Syllabus

AIOps Essentials (Autoscaling Kubernetes)

Introduction

Introduction to Author

00:00:36

Lesson Description:

This is a video introducing the course author.

Introduction to This Course

00:05:34

Lesson Description:

This introduction goes over what is and is not covered in this brief five-hour course. Three courses are suggested for students who want more in-depth coverage of Kubernetes, Prometheus, or Python. The introduction also demonstrates the interactive study guide and shows the proof-of-concept architecture used in this course.

Autoscaling a Cluster vs. Scaling an Infrastructure

00:07:05

Lesson Description:

This lesson identifies the differences between Kubernetes Autoscaling techniques and what is needed in a hybrid cloud or multicloud context. A brief explanation of Kubernetes Autoscaling is contrasted with cloud orchestration and the need for AIOps to govern scaling across multiple cloud environments.

The Case for AIOps

00:06:12

Lesson Description:

This lesson describes in further detail how Agile and DevOps have created the need for deployment automation. It discusses deploying workloads to on-premises data centers (clouds) as well as to hybrid and multicloud architectures.

Machine Learning and Predictive Analytics

00:03:04

Lesson Description:

This video provides a brief definition of what Machine Learning is and how it is applied in this course for an AIOps use case.

Monitoring and Metrics

Prometheus

00:01:55

Lesson Description:

This video introduces Prometheus, one of the software modules we will use in this course.

Prometheus Node Exporter

00:01:21

Lesson Description:

This lesson introduces the Node Exporter component to the student and explains its role in the overall architecture.

Google cAdvisor

00:01:39

Lesson Description:

This lesson introduces Google's cAdvisor module and explains its role in the architecture for this course.

Prometheus Node Exporter and cAdvisor Demo for Lab Prep

00:03:10

Lesson Description:

This brief video informs the student of some resources available to them if they are new to Linux Academy's lab environment.

Hands-on Labs are real live environments that put you in a real scenario to practice what you have learned without any other extra charge or account to manage.

01:00:00

Exporting Metrics For AIOps

Data Taxonomy

00:07:21

Lesson Description:

This video discusses the need to establish a Data Taxonomy for log and metrics aggregation. The hierarchy of cloud infrastructures, business contexts, and cluster architectures is covered as a potential means of classifying and categorizing diverse input streams.

Relabeling With Prometheus

00:04:46

Lesson Description:

This video covers the use of Prometheus Relabeling to add metadata to time series data and create the taxonomy required for enterprise aggregation. Two scrape configurations are reviewed: one for EC2 and one for Kubernetes.

Aggregating Time Series Data

00:05:13

Lesson Description:

This lesson covers aggregating data with Prometheus. This is known as "Federation" within Prometheus. A sample architecture is reviewed and a sample configuration file is given.

Using the Prometheus API

00:02:04

Lesson Description:

This lesson briefly introduces how a Python client may be used to pull metrics from the Prometheus API. It covers the concepts only; a specific example is worked through hands-on in the lab.
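As a rough illustration of the idea (not the lab's actual client), the sketch below builds an instant-query URL for the Prometheus HTTP API and parses a response in the documented JSON shape. The base URL, the helper names, and the sample payload are assumptions made for this example; no network call is made.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url, promql):
    """Return the instant-query URL (/api/v1/query) for a PromQL expression."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def parse_instant_vector(payload):
    """Map each series' 'instance' label to its sample value as a float."""
    body = json.loads(payload)
    if body.get("status") != "success":
        raise ValueError(f"query failed: {body.get('error', 'unknown error')}")
    return {
        series["metric"].get("instance", ""): float(series["value"][1])
        for series in body["data"]["result"]
    }


# Hypothetical response body, hand-written to match the API's instant-vector format.
SAMPLE_RESPONSE = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"instance": "node-1:9100"}, "value": [1700000000, "0.42"]},
            {"metric": {"instance": "node-2:9100"}, "value": [1700000000, "0.87"]},
        ],
    },
})

url = build_query_url("http://localhost:9090", "up")
loads = parse_instant_vector(SAMPLE_RESPONSE)
```

In a real client, the URL would be fetched with an HTTP library and the response body passed to `parse_instant_vector`.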

Hands-on Labs are real live environments that put you in a real scenario to practice what you have learned without any other extra charge or account to manage.

01:00:00

Alerts and Triggers

The Problem With Noise

00:05:25

Lesson Description:

This lesson introduces our discussion of alerts and triggers. It covers the challenges of using alerts in elastic infrastructures and explains why, at enterprise scale, alerts and manual intervention are no longer a feasible way to scale capacity.

Using Rules In Prometheus

00:02:50

Lesson Description:

This lesson covers Prometheus Recording Rules and shows a sample of their use.

Using Dashboards for Alerting

00:02:55

Lesson Description:

This lesson covers the use of dashboards and provides Grafana as an example. It also covers an architecture that might be employed to further refine Prometheus metrics prior to storage.

Using Linear Regression With Kubernetes

Machine Learning Fundamentals

00:08:51

Lesson Description:

This video explains the Machine Learning concepts that are relevant to the lab and proof-of-concept that this course covers.

Using Python to Predict Scale

00:03:26

Lesson Description:

This lesson discusses why Python is a particularly useful language for Machine Learning. The third-party Python libraries used in this course are also discussed.
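To make the idea of predicting scale concrete, here is a minimal sketch of a least-squares linear fit used to extrapolate load and estimate a node count. It uses only the standard library rather than the third-party libraries the course discusses, and the function names, sample data, and per-node capacity are all hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired samples")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(slope, intercept, x):
    """Evaluate the fitted line at x."""
    return slope * x + intercept


def nodes_needed(predicted_load, per_node_capacity):
    """Round up to the number of nodes required, never fewer than one."""
    return max(1, math.ceil(predicted_load / per_node_capacity))


# Hypothetical samples: minutes elapsed vs. total CPU cores in use.
minutes = [0, 1, 2, 3]
cores_used = [1.0, 3.0, 5.0, 7.0]

slope, intercept = fit_line(minutes, cores_used)
forecast = predict(slope, intercept, 5)  # extrapolate two minutes ahead
target_nodes = nodes_needed(forecast, per_node_capacity=4)
```

A straight-line fit only suits load that trends roughly linearly over the window; the capacity figure would come from the cluster's actual node size.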

Hands-on Labs are real live environments that put you in a real scenario to practice what you have learned without any other extra charge or account to manage.

01:00:00

Scaling an Infrastructure

Scaling Nodes in a Kubernetes Cluster

00:08:18

Lesson Description:

This lesson covers the topic of scaling capacity in a Kubernetes cloud. The use of automated installers and configuration management tooling is introduced.

Scaling a Hybrid Cloud With ML

00:04:05

Lesson Description:

In this lesson, we review the architecture of our proof-of-concept and explain how it might be expanded to accommodate more complex hybrid cloud architectures.

Hands-on Labs are real live environments that put you in a real scenario to practice what you have learned without any other extra charge or account to manage.

01:00:00

Summation

Conclusion and Next Steps

00:01:43

Lesson Description:

This brief summation encourages the student to pursue further study and to get involved in the open-source AIOps community.

Credits and Resources

00:01:20

Lesson Description:

This video mentions two books that were used to create this course content and may prove useful to the student for further study.
