Prometheus Deep Dive

Course

Will Boyd

DevOps Team Lead in Content

Length: 24:00:00

Difficulty: Advanced

Videos: 43

Hands-on Labs: 26

Course Details

This course will provide an in-depth look at the Prometheus open-source monitoring and alerting tool. We will discuss how to install, configure, and run the various components of the Prometheus ecosystem. We will talk about how to monitor systems and applications with Prometheus, how to query Prometheus data, and how to build visual representations of metric data. We will also cover advanced topics such as high availability, federation, and the use of Prometheus client libraries to add monitoring capabilities to your own code. This course is designed to provide you with an in-depth knowledge of Prometheus that will allow you to succeed with Prometheus in the real world.

Syllabus

Getting Started

Course Introduction

00:02:02

Lesson Description:

In this video, we will introduce the Prometheus Deep Dive course. We will briefly talk about what Prometheus is at a high level, and we will discuss the topics that will be covered in this course.

Relevant Documentation: Prometheus Documentation

About the Training Architect

00:00:27

Lesson Description:

This video will introduce Will Boyd, the training architect who will be guiding you through the Prometheus Deep Dive course.

Prometheus Basics

What Is Prometheus?

00:02:54

Lesson Description:

Prometheus can be a powerful addition to your DevOps infrastructure. In this lesson, we start with the basics, talking about what Prometheus is and a brief overview of what it can be used for. We also introduce the history and background of Prometheus.

Relevant Documentation: Prometheus Overview

Prometheus Architecture — Bird's-Eye View

00:04:24

Lesson Description:

Prometheus consists of a variety of different components that all play different roles. In this lesson, we talk about the architecture of a Prometheus system. We introduce the various components of Prometheus, which will be discussed in more detail throughout the course. We will also discuss the architecture diagram included in the Prometheus documentation.

Relevant Documentation: Prometheus Overview

Prometheus Use Cases — Strengths and Limitations

00:06:24

Lesson Description:

When using Prometheus, it is important to understand its strengths and the kinds of use cases it can address. It is also important to understand the limitations of Prometheus and when it might not be the best tool. In this lesson, we provide scenarios to help clarify some of the use cases where Prometheus can be a good fit. We also talk about some situations where an alternative tool might be better.

Relevant Documentation: Prometheus Overview

Installation and Configuration

Building a Prometheus Server

00:07:58

Lesson Description:

There are several ways you can go about installing and running Prometheus. In this lesson, we briefly discuss a few available options. We then walk through the process of installing Prometheus on an Ubuntu server using pre-compiled binaries. Pre-compiled Prometheus binaries are available for download at prometheus.io/download/. Prometheus source code can be found at github.com/prometheus/prometheus.

Relevant Documentation: Prometheus Installation

Lesson Reference

Create a Cloud Playground server:
Distribution: Ubuntu 18.04 Bionic Beaver LTS
Size: Small
Tag: Prometheus
Create a user, group, and directories for Prometheus:

sudo useradd -M -r -s /bin/false prometheus
sudo mkdir /etc/prometheus /var/lib/prometheus
Download and extract the pre-compiled binaries:
wget https://github.com/prometheus/prometheus/releases/download/v2.16.0/prometheus-2.16.0.linux-amd64.tar.gz
tar xzf prometheus-2.16.0.linux-amd64.tar.gz prometheus-2.16.0.linux-amd64/
Move the files from the downloaded archive to the appropriate locations and set ownership:
sudo cp prometheus-2.16.0.linux-amd64/{prometheus,promtool} /usr/local/bin/
sudo chown prometheus:prometheus /usr/local/bin/{prometheus,promtool}
sudo cp -r prometheus-2.16.0.linux-amd64/{consoles,console_libraries} /etc/prometheus/
sudo cp prometheus-2.16.0.linux-amd64/prometheus.yml /etc/prometheus/prometheus.yml
sudo chown -R prometheus:prometheus /etc/prometheus
sudo chown prometheus:prometheus /var/lib/prometheus
Briefly test your setup by running Prometheus in the foreground:
prometheus --config.file=/etc/prometheus/prometheus.yml
Create a systemd unit file for Prometheus:
sudo vi /etc/systemd/system/prometheus.service
Define the Prometheus service in the unit file:
[Unit]
Description=Prometheus Time Series Collection and Processing Server
Wants=network-online.target
After=network-online.target

[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/prometheus \
    --config.file /etc/prometheus/prometheus.yml \
    --storage.tsdb.path /var/lib/prometheus/ \
    --web.console.templates=/etc/prometheus/consoles \
    --web.console.libraries=/etc/prometheus/console_libraries

[Install]
WantedBy=multi-user.target
Start and enable the Prometheus service:
sudo systemctl daemon-reload
sudo systemctl start prometheus
sudo systemctl enable prometheus
Make an HTTP request to Prometheus to verify it is able to respond:
curl localhost:9090
You can also access Prometheus in a browser using the server's public IP address: http://<PROMETHEUS_SERVER_PUBLIC_IP>:9090.

Configuring Prometheus

00:07:33

Lesson Description:

Prometheus has a wide variety of configuration options that can allow you to customize its behavior to meet your needs. While there are too many options to cover all of them in detail, it is useful to be aware of how you can go about configuring Prometheus. In this lesson, we discuss the Prometheus configuration file and where to find detailed information on configuration options in the Prometheus documentation. We also demonstrate the process of making a configuration change to a Prometheus server.

Relevant Documentation: Prometheus Configuration; Example Prometheus Config File

Lesson Reference

Log in to your Prometheus server.
Edit the Prometheus configuration file:

sudo vi /etc/prometheus/prometheus.yml
Locate the global.scrape_interval setting and change it to 10s:
global:
  scrape_interval: 10s
Reload the Prometheus configuration:
sudo killall -HUP prometheus
Another way to reload the configuration is to simply restart Prometheus (you do not need to do this if you used the killall -HUP method):
sudo systemctl restart prometheus
Query the Prometheus API to verify your changes took effect:
curl localhost:9090/api/v1/status/config
You should see global:\n  scrape_interval: 10s in the output. The API returns the configuration as a JSON string, so line breaks appear as \n.
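The verification step above can also be scripted. The following is a minimal sketch, assuming the response body has the shape {"status": "success", "data": {"yaml": "..."}}; the function name and the shortened sample body are hypothetical, and a simple regular expression stands in for a real YAML parser.

```python
import json
import re
from typing import Optional


def scrape_interval_from_status(body: str) -> Optional[str]:
    """Return the first scrape_interval found in a status/config response body."""
    payload = json.loads(body)
    yaml_text = payload["data"]["yaml"]
    # Deliberately simple: takes the first scrape_interval line, which is
    # assumed here to be the one in the global section.
    match = re.search(r"^\s*scrape_interval:\s*(\S+)", yaml_text, flags=re.MULTILINE)
    return match.group(1) if match else None


# Hypothetical, shortened response body used only for illustration.
sample_body = json.dumps({
    "status": "success",
    "data": {"yaml": "global:\n  scrape_interval: 10s\n  evaluation_interval: 15s\n"},
})

print(scrape_interval_from_status(sample_body))
```

In practice the body would come from the curl request shown above rather than from a hard-coded sample.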

Configuring an Exporter

00:12:31

Lesson Description:

In order to fully utilize Prometheus, you will need to configure exporters. Exporters are sources of metric data that Prometheus periodically collects. In this lesson, we set up monitoring for a Linux server. We will install Node Exporter on the server and configure Prometheus to scrape metrics from that exporter. This will enable us to query Prometheus for the new Linux server's metric data.

Relevant Documentation: Scrape Config; Monitoring a Linux Host

Lesson Reference

Configure a New Server to Be Monitored

Create a new Cloud Playground server:
Distribution: Ubuntu 18.04 Bionic Beaver LTS
Size: Small
Tag: Linux Server
Log in to the new server. We will configure this new server for Prometheus monitoring using Node Exporter.
Create a user for Node Exporter:

sudo useradd -M -r -s /bin/false node_exporter
Download and extract the Node Exporter binary:
wget https://github.com/prometheus/node_exporter/releases/download/v0.18.1/node_exporter-0.18.1.linux-amd64.tar.gz
tar xvfz node_exporter-0.18.1.linux-amd64.tar.gz
Copy the Node Exporter binary to the appropriate location and set ownership:
sudo cp node_exporter-0.18.1.linux-amd64/node_exporter /usr/local/bin/
sudo chown node_exporter:node_exporter /usr/local/bin/node_exporter
Create a systemd unit file for Node Exporter:
sudo vi /etc/systemd/system/node_exporter.service
Define the Node Exporter service in the unit file:
[Unit]
Description=Prometheus Node Exporter
Wants=network-online.target
After=network-online.target

[Service]
User=node_exporter
Group=node_exporter
Type=simple
ExecStart=/usr/local/bin/node_exporter

[Install]
WantedBy=multi-user.target
Start and enable the node_exporter service:
sudo systemctl daemon-reload
sudo systemctl start node_exporter
sudo systemctl enable node_exporter
You can retrieve the metrics served by Node Exporter like so:
curl localhost:9100/metrics

Configure Prometheus to Scrape Metrics

Log in to your Prometheus server and configure Prometheus to scrape metrics from the new server. Edit the Prometheus config file:
sudo vi /etc/prometheus/prometheus.yml
Locate the scrape_configs section and add a new entry under that section. You will need to supply the private IP address of your new playground server for targets.
...

  - job_name: 'Linux Server'
    static_configs:
    - targets: ['<PRIVATE_IP_ADDRESS_OF_NEW_SERVER>:9100']

...
Reload the Prometheus config:
sudo killall -HUP prometheus
Navigate to the Prometheus expression browser in your web browser using the public IP address of your Prometheus server: <PROMETHEUS_SERVER_PUBLIC_IP>:9090. Run some queries to retrieve metric data about your new server:
up
node_filesystem_avail_bytes

Prometheus Data Model

What Is Time-Series Data?

00:03:46

Lesson Description:

All Prometheus data is fundamentally stored in the form of a time series. In this lesson, we discuss what time-series data is and how the concept applies in the context of Prometheus. This will provide you with a good foundational understanding of how metric data functions in Prometheus.

Relevant Documentation: Wikipedia: Time Series; Prometheus Data Model

Metrics and Labels

00:06:59

Lesson Description:

Prometheus uses a combination of metrics and labels to uniquely identify each set of time-series data. In this lesson, we discuss metric names and labels. We also briefly demonstrate a few simple queries using metric names and labels to give you a general idea of how they are used.

Relevant Documentation: Prometheus Metric Names and Labels

Lesson Reference

Access the Prometheus expression browser for your Prometheus server in a web browser. Be sure to use the public IP address of your Prometheus server:

http://<PROMETHEUS_SERVER_PUBLIC_IP>:9090
Run a simple query using a metric name:
node_cpu_seconds_total
Add a label to the query to retrieve usage data only for a specific CPU:
node_cpu_seconds_total{cpu="0"}
Add additional labels to the query to retrieve an even narrower dataset:
node_cpu_seconds_total{cpu="0",mode="idle"}
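To make the idea of "metric name plus labels identifies a series" concrete, here is a small sketch in Python. It is an illustration only, not how Prometheus stores data: the `series_key` and `select` helpers and the sample values are hypothetical.

```python
def series_key(metric, labels):
    """A series is identified by its metric name plus its full set of labels."""
    return (metric, frozenset(labels.items()))


# Hypothetical sample values keyed by series identity.
store = {
    series_key("node_cpu_seconds_total", {"cpu": "0", "mode": "idle"}): 1200.5,
    series_key("node_cpu_seconds_total", {"cpu": "0", "mode": "user"}): 310.2,
    series_key("node_cpu_seconds_total", {"cpu": "1", "mode": "idle"}): 1180.0,
}


def select(store, metric, **wanted):
    """Return the series with this metric name whose labels include every wanted pair."""
    wanted_pairs = set(wanted.items())
    return {
        key: value
        for key, value in store.items()
        if key[0] == metric and wanted_pairs <= key[1]
    }


print(len(select(store, "node_cpu_seconds_total")))
print(len(select(store, "node_cpu_seconds_total", cpu="0")))
print(select(store, "node_cpu_seconds_total", cpu="0", mode="idle"))
```

Each additional label in the selector narrows the result, mirroring the three queries above.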

Metric Types

00:12:34

Lesson Description:

Prometheus exporters use a handful of metric types to represent different kinds of data. In this lesson, we will discuss what metric types are. We will also discuss and demonstrate examples of the four Prometheus metric types.

Relevant Documentation: Metric Types; Histograms and Summaries

Lesson Reference

Access the Prometheus expression browser for your Prometheus server in a web browser. Be sure to use the public IP address of your Prometheus server:

http://<PROMETHEUS_SERVER_PUBLIC_IP>:9090
Query a range of data on a counter. Note the values never decrease:
node_cpu_seconds_total[5m]
Query a range of data on a gauge. Note the values can both increase and decrease:
node_memory_MemAvailable_bytes[5m]
Query a histogram's buckets:
prometheus_http_request_duration_seconds_bucket{handler="/metrics"}
Check the histogram's sum metric:
prometheus_http_request_duration_seconds_sum{handler="/metrics"}
Query the histogram's count metric:
prometheus_http_request_duration_seconds_count{handler="/metrics"}
Query a summary's quantile values:
go_gc_duration_seconds{job="Linux Server"}
Query the summary's sum metric:
go_gc_duration_seconds_sum{job="Linux Server"}
Query the summary's count metric:
go_gc_duration_seconds_count{job="Linux Server"}
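The histogram queries above return three related kinds of series: cumulative buckets, a sum, and a count. The sketch below shows how those values relate to a stream of observations. The `Histogram` class, the bucket bounds, and the observed values are hypothetical and chosen only for illustration; this is not a client library implementation.

```python
import math


class Histogram:
    """Toy histogram with cumulative buckets, a running sum, and a count."""

    def __init__(self, upper_bounds):
        # A final +Inf bucket always exists and counts every observation.
        self.upper_bounds = sorted(upper_bounds) + [math.inf]
        self.bucket_counts = [0] * len(self.upper_bounds)
        self.total = 0.0
        self.count = 0

    def observe(self, value):
        self.total += value
        self.count += 1
        # Buckets are cumulative: every bucket whose bound is >= value is incremented.
        for index, bound in enumerate(self.upper_bounds):
            if value <= bound:
                self.bucket_counts[index] += 1


hist = Histogram([0.1, 0.5, 1.0])
for seconds in (0.05, 0.2, 0.7, 3.0):
    hist.observe(seconds)

print(hist.bucket_counts)  # one cumulative count per bucket, ending with +Inf
print(hist.total, hist.count)
```

Because buckets are cumulative, the +Inf bucket always equals the count, which is why bucket values never decrease as the bound grows.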

Querying

Introduction to Prometheus Querying

00:01:30

Lesson Description:

In this section, we will talk about performing queries using the Prometheus Query Language (PromQL). This lesson introduces the topic for the section and provides a brief overview of what Prometheus Query Language is and how it is used.

Relevant Documentation: Querying Prometheus

Query Basics

00:07:51

Lesson Description:

Prometheus Query Language provides a robust interface for working with your metric data. In this lesson, we will introduce the basic concepts of writing Prometheus queries. We will demonstrate these concepts by writing queries using the Prometheus expression browser.

Relevant Documentation: Querying Prometheus

Lesson Reference

Access the Prometheus expression browser for your Prometheus server in a web browser. Be sure to use the public IP address of your Prometheus server:

http://<PROMETHEUS_SERVER_PUBLIC_IP>:9090
Run a simple query using a time-series selector:
node_cpu_seconds_total
Use a time-series selector with a label:
node_cpu_seconds_total{cpu="0"}
Run some queries to experiment with various types of label matching:
node_cpu_seconds_total{cpu="0"}
node_cpu_seconds_total{cpu!="0"}
node_cpu_seconds_total{mode=~"s.*"}
node_cpu_seconds_total{mode=~"user|system"}
node_cpu_seconds_total{mode!~"user|system"}
Use a range vector selector to select time-series values over a period of two minutes:
node_cpu_seconds_total{cpu="0"}[2m]
Select data from the past using an offset modifier:
node_cpu_seconds_total{cpu="0"} offset 1h
Combine a range vector selector with an offset modifier:
node_cpu_seconds_total{cpu="0"}[5m] offset 1h
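The four label matching operators used above can be sketched in a few lines of Python. This is an approximation for illustration: Prometheus uses RE2 syntax with fully anchored matches, while this sketch uses Python's `re.fullmatch`, which behaves the same for simple patterns like these. The `matches` helper and the list of modes are hypothetical.

```python
import re


def matches(label_value, op, pattern):
    """Evaluate one label matcher against a single label value."""
    if op == "=":
        return label_value == pattern
    if op == "!=":
        return label_value != pattern
    if op == "=~":
        # Regex matchers must match the whole value, not just a substring.
        return re.fullmatch(pattern, label_value) is not None
    if op == "!~":
        return re.fullmatch(pattern, label_value) is None
    raise ValueError(f"unknown matcher operator: {op}")


# Hypothetical set of values for the mode label.
modes = ["idle", "iowait", "irq", "nice", "softirq", "steal", "system", "user"]

print([m for m in modes if matches(m, "=~", "s.*")])
print([m for m in modes if matches(m, "=~", "user|system")])
print([m for m in modes if matches(m, "!~", "user|system")])
```

Note that `s.*` selects only values that start with "s", because the pattern must cover the entire label value.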

Query Operators

00:12:24

Lesson Description:

PromQL includes a wide variety of operators you can use to manipulate data in your queries. In this lesson, we provide an overview of what the various types of operators can do. We also demonstrate how to use operators by building some queries.

Relevant Documentation: Operators

Lesson Reference

Access the Prometheus expression browser for your Prometheus server in a web browser. Be sure to use the public IP address of your Prometheus server:

http://<PROMETHEUS_SERVER_PUBLIC_IP>:9090
Use an arithmetic operator:
node_cpu_seconds_total * 2
Experiment with combining datasets using an arithmetic operator:
node_cpu_seconds_total{mode="system"} + ignoring(mode) node_cpu_seconds_total{mode="user"}
node_cpu_seconds_total{mode="system"} + on(cpu) node_cpu_seconds_total{mode="user"}
Use a comparison operator to filter results:
node_cpu_seconds_total == 0
Use the bool keyword to display the results of a comparison:
node_cpu_seconds_total == bool 0
Experiment with combining datasets using logical set operators:
node_cpu_seconds_total
node_cpu_guest_seconds_total
node_cpu_seconds_total and node_cpu_guest_seconds_total
node_cpu_seconds_total or node_cpu_guest_seconds_total
Use an aggregation operator:
node_cpu_seconds_total{mode="idle"}
avg(node_cpu_seconds_total{mode="idle"})
sum(node_cpu_seconds_total{mode="idle"})
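The `ignoring(mode)` and `on(cpu)` queries above rely on vector matching: samples from the left and right sides are paired by comparing a chosen subset of their labels. The sketch below illustrates one-to-one matching only. The `add_vectors` helper and the sample values are hypothetical, and real data would also carry labels such as job and instance.

```python
def add_vectors(left, right, on=None, ignoring=()):
    """Add two instant vectors, pairing samples by a subset of their labels."""

    def match_key(labels):
        if on is not None:
            kept = {k: v for k, v in labels.items() if k in on}
        else:
            kept = {k: v for k, v in labels.items() if k not in ignoring}
        return frozenset(kept.items())

    right_by_key = {match_key(labels): value for labels, value in right}
    result = {}
    for labels, value in left:
        key = match_key(labels)
        # Samples without a partner on the other side are dropped.
        if key in right_by_key:
            result[key] = value + right_by_key[key]
    return result


# Hypothetical samples: (labels, value) pairs for two modes.
system = [({"cpu": "0", "mode": "system"}, 10.0), ({"cpu": "1", "mode": "system"}, 12.0)]
user = [({"cpu": "0", "mode": "user"}, 30.0), ({"cpu": "1", "mode": "user"}, 25.0)]

print(add_vectors(system, user, ignoring=("mode",)))
print(add_vectors(system, user, on=("cpu",)))
```

Without `ignoring` or `on`, the differing mode label would prevent any pair from matching, so the result would be empty.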

Query Functions

00:04:16

Lesson Description:

Prometheus functions provide a wide range of built-in functionality that can greatly simplify the process of writing queries. In this lesson, we introduce Prometheus query functions. We will provide a few examples of functions and demonstrate their usage by executing some queries that include those functions.

Relevant Documentation: Functions

Lesson Reference

Access the Prometheus expression browser for your Prometheus server in a web browser. Be sure to use the public IP address of your Prometheus server:

http://<PROMETHEUS_SERVER_PUBLIC_IP>:9090
Use the clamp_max function to clamp result values to a set maximum:
clamp_max(node_cpu_seconds_total{cpu="0"}, 1000)
Use the rate function to get the rate of increase in total CPU seconds:
rate(node_cpu_seconds_total[1h])
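As a rough illustration of what a per-second rate over a counter means, here is a simplified sketch. It is not the actual `rate` implementation: the real function also extrapolates to the edges of the time window, which this version omits. The `simple_rate` helper and the sample data are hypothetical.

```python
def simple_rate(samples):
    """Per-second increase over (timestamp_seconds, counter_value) samples, oldest first."""
    if len(samples) < 2:
        return None
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        return None
    increase = 0.0
    for (_, previous), (_, current) in zip(samples, samples[1:]):
        # A drop in value means the counter was reset, so count up from zero.
        increase += current if current < previous else current - previous
    return increase / elapsed


# Hypothetical counter samples taken every 15 seconds, with a reset at t=45.
samples = [(0, 100.0), (15, 115.0), (30, 130.0), (45, 5.0), (60, 20.0)]
print(simple_rate(samples))
```

Handling resets this way is what lets a rate stay meaningful across process restarts, where a counter starts again from zero.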

Prometheus HTTP API

00:06:08

Lesson Description:

One way to execute queries is through the Prometheus HTTP API. This API is a great way to interact with Prometheus from the context of your own custom tools and applications. In this lesson, we introduce the HTTP API and demonstrate the process of running some basic queries via the API.

Relevant Documentation: HTTP API

Lesson Reference

Log in to your Prometheus server.
Make an HTTP request to the /api/v1/query endpoint to execute a query via the HTTP API:

curl 'localhost:9090/api/v1/query?query=node_cpu_seconds_total'
For queries that contain special characters, such as braces or quotes, URL-encode the query. Wrap the parameter in single quotes so the shell preserves the inner double quotes:
curl localhost:9090/api/v1/query --data-urlencode 'query=node_cpu_seconds_total{cpu="0"}'
Use the /api/v1/query_range endpoint to query a range vector. Supply start and end parameters to specify the start and end of the time-range. The step parameter determines how detailed the results will be. A step duration of 1m will return values for every one minute during the specified time range:
start=$(date -u --date '-5 min' +'%Y-%m-%dT%H:%M:%SZ')
end=$(date -u +'%Y-%m-%dT%H:%M:%SZ')
curl "localhost:9090/api/v1/query_range?query=node_cpu_seconds_total&start=$start&end=$end&step=1m"
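When calling the API from your own tools, the URL encoding can be handled by a standard library instead of by hand. The following sketch builds request URLs for the two endpoints used above with Python's `urllib.parse.urlencode`. The helper names, the base URL, and the timestamps are assumptions for illustration, and the sketch only constructs URLs without sending a request.

```python
from urllib.parse import urlencode


def instant_query_url(base_url, promql):
    """Build a URL for the /api/v1/query endpoint with the query URL-encoded."""
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"


def range_query_url(base_url, promql, start, end, step):
    """Build a URL for the /api/v1/query_range endpoint."""
    params = {"query": promql, "start": start, "end": end, "step": step}
    return f"{base_url}/api/v1/query_range?{urlencode(params)}"


base = "http://localhost:9090"  # assumed address of the Prometheus server
print(instant_query_url(base, 'node_cpu_seconds_total{cpu="0"}'))
print(range_query_url(base, "node_cpu_seconds_total",
                      "2020-03-01T00:00:00Z", "2020-03-01T00:05:00Z", "1m"))
```

The braces, equals sign, and quotes in the selector are percent-encoded, which is the same job `--data-urlencode` does for curl.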

Introduction to Visualization

Prometheus Visualization Methods

00:01:51

Lesson Description:

Prometheus is great at collecting metric data, but raw data is not necessarily useful in real-world scenarios. Visualization allows you to transform and display your data in useful ways that can drive decisions and action. In this section, we discuss a variety of methods you can use to visualize your metric data.

Relevant Documentation: Expression Browser; Grafana; Console Templates

Native Visualization Methods

Expression Browser

00:04:38

Lesson Description:

One of the easiest ways to view your Prometheus metric data is to use the Prometheus expression browser. It allows you to run simple ad-hoc queries and view the results. In this lesson, we discuss what the expression browser is and demonstrate how it can be used.

Relevant Documentation: Expression Browser

Lesson Reference

Access the expression browser in a web browser at http://<PROMETHEUS_SERVER_PUBLIC_IP>:9090/graph.
Run a simple query to retrieve some time-series data:

node_cpu_seconds_total[5m]
You should see several metrics, each with multiple time-series entries.
Experiment with the Moment controls to change the moment in time for your query and view results at different points in time.
Switch to the Graph view. Write a query that shows current CPU usage in different modes:
rate(node_cpu_seconds_total[5m])
Feel free to examine and experiment with the displayed graph to gain insight into your data!

Console Templates

00:07:02

Lesson Description:

While the expression browser provides a great way to run ad-hoc queries, it is often necessary to have a more permanent view of your data you can quickly access at any time. Console templates allow you to display your data in a customizable way that can be easily accessed in a web browser. In this lesson, we talk about console templates and demonstrate the creation process.

Relevant Documentation: Console Templates

Lesson Reference

Log in to your Prometheus server and create a console template file. The consoles directory is owned by the prometheus user, so use sudo:

sudo vi /etc/prometheus/consoles/disk-io.html
Implement a basic console template that displays some data:
{{template "head" .}}
{{template "prom_content_head" .}}

<h1>Disk IO Rate</h1>
<br />
Current Disk IO: {{ template "prom_query_drilldown" (args "rate(node_disk_io_time_seconds_total{job='Linux Server'}[5m])") }}

{{template "prom_content_tail" .}}
{{template "tail"}}
View your console template in a web browser:
http://<PROMETHEUS_SERVER_PUBLIC_IP>:9090/consoles/disk-io.html

Console Template Graph Library

00:04:23

Lesson Description:

Console templates are a great way to provide a customized and permanent view of your important metric data. Often, you may wish to display your data in the form of a graph. Graphs make it easy to gain insight into your data at a glance. Prometheus console templates include a graph library that simplifies the process of displaying basic graphs within your templates. In this lesson, we will discuss the graph library and demonstrate the process of adding a graph to a console template.

Relevant Documentation: Graph Library

Lesson Reference

Log in to your Prometheus server and edit your console template file:

sudo vi /etc/prometheus/consoles/disk-io.html
Add a graph to your console template:
{{template "head" .}}
{{template "prom_content_head" .}}

<h1>Disk IO Rate</h1>
<br />
Current Disk IO: {{ template "prom_query_drilldown" (args "rate(node_disk_io_time_seconds_total{job='Linux Server'}[5m])") }}
<br />
<br />
<div id="diskIoGraph"></div>
<script>
new PromConsole.Graph({
  node: document.querySelector("#diskIoGraph"),
  expr: "rate(node_disk_io_time_seconds_total{job='Linux Server'}[5m])"
})
</script>

{{template "prom_content_tail" .}}
{{template "tail"}}
View your console template in a web browser:
http://<PROMETHEUS_SERVER_PUBLIC_IP>:9090/consoles/disk-io.html

Grafana

What Is Grafana?

00:01:00

Lesson Description:

Prometheus is a great tool for collecting metrics, but its visualization capabilities are limited. Grafana can serve as a useful addition to your monitoring ecosystem, providing advanced and customizable visualizations and dashboards on top of your Prometheus metric data. In this lesson, we introduce Grafana and discuss how it fits together with Prometheus.

Relevant Documentation: Grafana; What is Grafana?

Installing and Configuring Grafana

00:06:30

Lesson Description:

Grafana is a great tool for visualizing your Prometheus data. In this lesson, we demonstrate the process of installing Grafana and configuring it to pull metrics from a Prometheus server.

Relevant Documentation: Install on Debian or Ubuntu

Lesson Reference

Create a Grafana server. Recommended Cloud Playground settings:
Distribution: Ubuntu 18.04 Bionic Beaver LTS
Size: Small
Tag: Grafana
Log in to your new server.
Install some required packages:

sudo apt-get install -y apt-transport-https software-properties-common wget
Add the GPG key for the Grafana OSS repository, and then add the repository:
wget -q -O - https://packages.grafana.com/gpg.key | sudo apt-key add -
sudo add-apt-repository "deb https://packages.grafana.com/oss/deb stable main"
Install the Grafana package:
sudo apt-get update
sudo apt-get install grafana=6.6.2
Enable and start the grafana-server service:
sudo systemctl enable grafana-server
sudo systemctl start grafana-server
Make sure the service is in the Active (running) state:
sudo systemctl status grafana-server
You can also verify Grafana is working by accessing it in a web browser at http://<GRAFANA_SERVER_PUBLIC_IP>:3000.
Log in to Grafana with the username admin and password admin. Reset the password when prompted.
Click Add data source. Select Prometheus.
For the URL, enter http://<PROMETHEUS_SERVER_PRIVATE_IP>:9090. Be sure to supply the unique private IP address of your Prometheus server.
Click Save & Test. You should see a banner that says Data source is working.
Test your setup by running a query to get some Prometheus data. Click the Explore icon on the left.
In the PromQL Query input, enter a simple query, such as up. Execute the query. You should see some data appear.

Building Prometheus Dashboards in Grafana

00:05:54

Lesson Description:

Now that you have Grafana installed and configured, you are ready to use it to visualize your Prometheus data. In this lesson, we demonstrate the process of setting up a Grafana dashboard to display metrics related to a Linux server monitored by Prometheus.

Relevant Documentation: Using Grafana with Prometheus

Lesson Reference

Access Grafana in a browser at http://<GRAFANA_SERVER_PUBLIC_IP>:3000. Log in if necessary.

Create a New Dashboard

Click the Create button on the left, and then select Dashboard.
Click the Save Dashboard button near the top right. For the dashboard name, enter Linux Server and then save.

Add a Server Status Panel

Click the Add Panel button near the top right, and then Add Query.
For the PromQL query, enter up{job="Linux Server"}. The job label must match the job_name configured for the server in prometheus.yml.
Click the Visualization icon. Click the visualization type dropdown that currently says Graph, and change it to Singlestat.
Under Value Mappings, enter two value to text mappings:
1 -> Up
0 -> Down
Click the General icon, and change the panel title to Server Status.
Click the back button in the top left. You should see your dashboard, and the Server Status panel should say Up.

Add a Disk IO Rate Panel

Click the Add Panel button near the top right, and then Add Query.
For the PromQL query, enter rate(node_disk_io_time_seconds_total{job="Linux Server"}[5m]).
Click the General icon, and change the panel title to Disk IO Rate.
Click the back button in the top left. You should see your dashboard, and there should be a graph showing disk IO rate.
Rearrange your panels by dragging and dropping them if desired.
Click the Save Dashboard button near the top right, and then Save to save your changes.

Exporters

Introduction to Exporters

00:01:13

Lesson Description:

Prometheus exporters allow you to collect metrics from a wide variety of systems and applications so they can be scraped by a Prometheus server. In this section, we will discuss different kinds of exporters and dive more deeply into the process of scraping metrics from various sources. This lesson briefly recaps what Prometheus exporters are in preparation for this deeper dive.

Relevant Documentation: Exporters

Application Monitoring

00:09:53

Lesson Description:

Prometheus can monitor a wide variety of applications using various exporters. In this lesson, we explore the process of monitoring applications and demonstrate how to set up monitoring for the Apache web server application.

Relevant Documentation: Exporters; Apache Exporter for Prometheus

Lesson Reference

Configure the Apache Server to Provide Metrics

Log in to your Linux server.
Install Apache:

sudo apt-get update
sudo apt-get install -y apache2
Make a request to Apache to verify it is up and running:
curl localhost:80
Download and install the Apache Exporter binary:
sudo useradd -M -r -s /bin/false apache_exporter
wget https://github.com/Lusitaniae/apache_exporter/releases/download/v0.7.0/apache_exporter-0.7.0.linux-amd64.tar.gz
tar xvfz apache_exporter-0.7.0.linux-amd64.tar.gz
sudo cp apache_exporter-0.7.0.linux-amd64/apache_exporter /usr/local/bin/
sudo chown apache_exporter:apache_exporter /usr/local/bin/apache_exporter
Set up a systemd service for Apache Exporter:
sudo vi /etc/systemd/system/apache_exporter.service
[Unit]
Description=Prometheus Apache Exporter
Wants=network-online.target
After=network-online.target

[Service]
User=apache_exporter
Group=apache_exporter
Type=simple
ExecStart=/usr/local/bin/apache_exporter

[Install]
WantedBy=multi-user.target
Start and enable the apache_exporter service:
sudo systemctl enable apache_exporter
sudo systemctl start apache_exporter
Make sure Apache Exporter is working:
sudo systemctl status apache_exporter
curl localhost:9117/metrics

Configure Prometheus to Scrape Metrics from Apache

Log in to your Prometheus server.
Edit the Prometheus config:
sudo vi /etc/prometheus/prometheus.yml
Under the scrape_configs section, add a scrape configuration for the Apache Exporter. Use the private IP address of your Linux/Apache server for the target:
  - job_name: 'Apache'
    static_configs:
    - targets: ['<APACHE_SERVER_PRIVATE_IP>:9117']
Restart Prometheus to load the new configuration:
sudo systemctl restart prometheus
Use the expression browser to verify you can see Apache metrics in Prometheus. You can access the expression browser in a web browser at http://<PROMETHEUS_SERVER_PUBLIC_IP>:9090. Run a query to view some Apache metric data:
apache_workers

Jobs and Instances

00:07:00

Lesson Description:

Prometheus scrapes use the concepts of jobs and instances to identify specific exporters. In this lesson, we discuss the relationship between jobs, instances, and exporters. We also explore how jobs and instances show up in Prometheus data.

Relevant Documentation: Jobs and Instances

Lesson Reference

You can access the Prometheus expression browser in a web browser. Use the public IP address of your Prometheus server:

http://<PROMETHEUS_SERVER_PUBLIC_IP>:9090
Run a query to retrieve some data that includes metrics from multiple jobs:
up
Query data for a specific job:
up{job="Linux Server"}
Query data for a specific instance:
up{instance="localhost:9090"}
Query some additional metadata about Prometheus scrapes:
scrape_duration_seconds

Prometheus Pushgateway

Introduction to Pushgateway

00:03:04

Lesson Description:

Normally, Prometheus uses a pull-based model to gather metrics. However, in some circumstances a push-based model is needed. Prometheus Pushgateway provides a solution for these cases. In this lesson, we introduce Prometheus Pushgateway and discuss its role in the Prometheus ecosystem.

Relevant Documentation: Pushing Metrics; Prometheus Pushgateway; When to Use Pushgateway

Installing Pushgateway

00:07:40

Lesson Description:

Prometheus Pushgateway acts as a middleman between Prometheus and short-lived processes that require a push-based method for getting metrics into Prometheus. In this lesson, we demonstrate the process of installing Pushgateway. Then, we configure our Prometheus server to scrape metrics from Pushgateway.

Relevant Documentation: Pushing Metrics; Prometheus Pushgateway

Lesson Reference

Install Pushgateway

Log in to the Prometheus server.
Create a user and group for Pushgateway:

sudo useradd -M -r -s /bin/false pushgateway
Download and install the Pushgateway binary:
wget https://github.com/prometheus/pushgateway/releases/download/v1.2.0/pushgateway-1.2.0.linux-amd64.tar.gz
tar xvfz pushgateway-1.2.0.linux-amd64.tar.gz
sudo cp pushgateway-1.2.0.linux-amd64/pushgateway /usr/local/bin/
sudo chown pushgateway:pushgateway /usr/local/bin/pushgateway
Create a systemd unit file for Pushgateway:
sudo vi /etc/systemd/system/pushgateway.service
[Unit]
Description=Prometheus Pushgateway
Wants=network-online.target
After=network-online.target

[Service]
User=pushgateway
Group=pushgateway
Type=simple
ExecStart=/usr/local/bin/pushgateway

[Install]
WantedBy=multi-user.target
Start and enable the pushgateway service:
sudo systemctl enable pushgateway
sudo systemctl start pushgateway
Verify the service is running and it is serving metrics:
sudo systemctl status pushgateway
curl localhost:9091/metrics

Configure Prometheus to Scrape Metrics from Pushgateway

Edit the Prometheus config:
sudo vi /etc/prometheus/prometheus.yml
Under the scrape_configs section, add a scrape configuration for Pushgateway. Be sure to set honor_labels: true:
  - job_name: 'Pushgateway'
    honor_labels: true
    static_configs:
    - targets: ['localhost:9091']
Restart Prometheus to load the new configuration:
sudo systemctl restart prometheus
Use the expression browser to verify you can see Pushgateway metrics in Prometheus. You can access the expression browser in a web browser at http://<PROMETHEUS_SERVER_PUBLIC_IP>:9090. Run a query to view some Pushgateway metric data:
pushgateway_build_info
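The honor_labels: true setting matters because pushed metrics carry their own job and instance labels, which collide with the labels Prometheus attaches to the scrape target. The sketch below models how such conflicts are resolved: with honor_labels disabled, the scraped label is renamed with an exported_ prefix, and with it enabled, the scraped label wins. The `apply_target_labels` helper and the label values are hypothetical.

```python
def apply_target_labels(scraped, target, honor_labels):
    """Merge labels from scraped data with the labels of the scrape target."""
    merged = dict(scraped)
    for name, value in target.items():
        if name in scraped and honor_labels:
            # Keep the label that came with the scraped data.
            continue
        if name in scraped:
            # Preserve the conflicting scraped label under an exported_ prefix.
            merged["exported_" + name] = scraped[name]
        merged[name] = value
    return merged


# Hypothetical labels: what a batch job pushed vs. the Pushgateway scrape target.
pushed = {"job": "my_job", "instance": "my_instance"}
target = {"job": "Pushgateway", "instance": "localhost:9091"}

print(apply_target_labels(pushed, target, honor_labels=True))
print(apply_target_labels(pushed, target, honor_labels=False))
```

With honor_labels enabled, queries can select pushed metrics by the job name the pushing process chose rather than by the Pushgateway's own job name.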

Pushing Data to Pushgateway

00:06:51

Lesson Description:

Prometheus Pushgateway provides a push-based method for sending metrics to Prometheus for special cases that need such a method. In this lesson, we explore the process of pushing metrics to Prometheus via Pushgateway. We demonstrate the process of sending metric data through Pushgateway using the Pushgateway API.

Relevant Documentation: Pushing Metrics; Prometheus Pushgateway

Lesson Reference

Log in to your Prometheus/Pushgateway server.
Make a curl request to send some metrics to the Pushgateway API:

echo "value_of_pi 3.14" | curl --data-binary @- http://localhost:9091/metrics/job/my_job
Query the Pushgateway metrics endpoint to see the metric you submitted:
curl localhost:9091/metrics
Make another curl request with a more complex set of metrics, this time specifying an instance:
cat << EOF | curl --data-binary @- http://localhost:9091/metrics/job/my_job/instance/my_instance
# TYPE temperature gauge
temperature{location="room1"} 31
temperature{location="room2"} 33
# TYPE my_metric gauge
# HELP my_metric An example.
my_metric 5
EOF
Check the Pushgateway metrics endpoint again:
curl localhost:9091/metrics
Look for your newly pushed metrics in Prometheus using the expression browser (http://<PROMETHEUS_SERVER_PUBLIC_IP>:9090). Run some queries to locate your data:
value_of_pi
temperature
my_metric
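
In the curl requests above, the URL path after /metrics acts as a grouping key: a job name followed by optional label/value pairs such as instance/my_instance. The following is a minimal sketch of how those URLs are assembled. The helper name pushgateway_url is hypothetical and not part of Pushgateway; the commented curl lines assume a Pushgateway listening on localhost:9091.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a Pushgateway URL from a base address, a job
# name, and optional label/value pairs that extend the grouping key.
pushgateway_url() {
  base="$1"
  job="$2"
  shift 2
  url="${base}/metrics/job/${job}"
  # Remaining arguments are consumed two at a time as label, value.
  while [ "$#" -ge 2 ]; do
    url="${url}/$1/$2"
    shift 2
  done
  printf '%s\n' "$url"
}

# Print the URL used for the second push in this lesson.
pushgateway_url "http://localhost:9091" my_job instance my_instance

# Against a running Pushgateway (not executed here):
#   echo "value_of_pi 3.14" | curl --data-binary @- "$(pushgateway_url http://localhost:9091 my_job)"
#   curl -X DELETE "$(pushgateway_url http://localhost:9091 my_job instance my_instance)"
```

The DELETE request removes every metric stored under that grouping key, which can be handy for cleaning up the test data pushed in this lesson.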

Recording Rules

Introduction to Recording Rules

00:02:04

Lesson Description:

Prometheus recording rules provide an additional layer of control over your data, allowing you to periodically pre-calculate new metrics using your own PromQL expressions. In this lesson, we discuss recording rules and how they are used within Prometheus.

Relevant Documentation: Recording Rules

Configuring Recording Rules

00:05:13

Lesson Description:

Recording rules are a useful tool for pre-calculating and storing the results of customizable PromQL expressions. They allow you to use Prometheus more efficiently by creating new time-series metrics to store calculated data rather than re-calculating every time a query is run. In this lesson, we explore the process of setting up recording rules, and we demonstrate the process of creating a recording rule.

Relevant Documentation: Recording Rules

Lesson Reference

Log in to your Prometheus server. Create a new directory to store rule files:

sudo mkdir -p /etc/prometheus/rules
Edit the Prometheus configuration:
sudo vi /etc/prometheus/prometheus.yml
Locate the rule_files: section and add the rules directory as a new entry:
...

rule_files:
  - "/etc/prometheus/rules/*"

...
Create a new rule file:
sudo vi /etc/prometheus/rules/linux_server_rules.yml
Implement a rule to calculate and store CPU usage data:
groups:
- name: linux_server
  interval: 15s
  rules:
  - record: linux_server:cpu_usage
    expr: sum(rate(node_cpu_seconds_total{job="Linux Server",mode!='idle'}[5m])) * 100 / 2
Restart Prometheus to load the configuration changes:
sudo systemctl restart prometheus
Access the expression browser at http://<PROMETHEUS_SERVER_PUBLIC_IP>:9090/graph. Execute a query to view the data calculated by your recording rule:
linux_server:cpu_usage
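
A syntax error in a rule file can keep Prometheus from starting after the restart, so it may be worth validating the file first. The sketch below writes the same rule to a scratch file and checks it with promtool, which ships in the Prometheus tarball. The scratch-file approach is an assumption for illustration; you could equally point promtool at /etc/prometheus/rules/linux_server_rules.yml. The division by 2 in the expression appears to assume a two-core server, so treat that reading as an assumption as well.

```shell
#!/usr/bin/env bash
# Write the recording rule to a scratch file so it can be validated
# before it is copied into /etc/prometheus/rules.
rule_file="$(mktemp)"
cat > "$rule_file" << 'EOF'
groups:
- name: linux_server
  interval: 15s
  rules:
  - record: linux_server:cpu_usage
    expr: sum(rate(node_cpu_seconds_total{job="Linux Server",mode!='idle'}[5m])) * 100 / 2
EOF

# Run the check only when promtool is installed on this machine.
if command -v promtool > /dev/null 2>&1; then
  promtool check rules "$rule_file"
else
  echo "promtool not found; skipping validation of $rule_file"
fi
```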

Alertmanager Setup and Configuration

What Is Alertmanager?

00:02:40

Lesson Description:

Prometheus metrics can provide powerful insights into your systems that can help you make decisions, but it is often not enough to simply be able to look at your data. In the real world, you will likely need real-time alerts to let you know when something is happening that you might need to address. Prometheus Alertmanager can be very useful in this regard. In this lesson, we will introduce Alertmanager and discuss its role in your monitoring and alerting ecosystem.

Relevant Documentation: Alertmanager, Alerting Overview

Installing Alertmanager

00:09:10

Lesson Description:

Prometheus Alertmanager provides useful functionality for managing alerts triggered by Prometheus metric data. However, before you can use Alertmanager, you need to get it up and running. In this lesson, we will install Alertmanager and configure our Prometheus server to connect to it.

Relevant Documentation: Alertmanager, Alertmanager GitHub

Lesson Reference

Install and Configure Alertmanager and amtool
Log in to your Prometheus server. Create a user and group for Alertmanager:

sudo useradd -M -r -s /bin/false alertmanager
Download and install the Alertmanager binaries, move the files into the appropriate locations, and set ownership:
wget https://github.com/prometheus/alertmanager/releases/download/v0.20.0/alertmanager-0.20.0.linux-amd64.tar.gz
tar xvfz alertmanager-0.20.0.linux-amd64.tar.gz
sudo cp alertmanager-0.20.0.linux-amd64/{alertmanager,amtool} /usr/local/bin/
sudo chown alertmanager:alertmanager /usr/local/bin/{alertmanager,amtool}
sudo mkdir -p /etc/alertmanager
sudo cp alertmanager-0.20.0.linux-amd64/alertmanager.yml /etc/alertmanager
sudo chown -R alertmanager:alertmanager /etc/alertmanager
Create a data directory for Alertmanager:
sudo mkdir -p /var/lib/alertmanager
sudo chown alertmanager:alertmanager /var/lib/alertmanager
Create a configuration file for amtool:
sudo mkdir -p /etc/amtool
sudo vi /etc/amtool/config.yml
Enter the following content in the amtool config file:
alertmanager.url: http://localhost:9093
Create a systemd unit file for Alertmanager:
sudo vi /etc/systemd/system/alertmanager.service
[Unit]
Description=Prometheus Alertmanager
Wants=network-online.target
After=network-online.target

[Service]
User=alertmanager
Group=alertmanager
Type=simple
ExecStart=/usr/local/bin/alertmanager \
  --config.file /etc/alertmanager/alertmanager.yml \
  --storage.path /var/lib/alertmanager/

[Install]
WantedBy=multi-user.target
Start and enable the alertmanager service:
sudo systemctl enable alertmanager
sudo systemctl start alertmanager
Verify the service is running and you can reach it:
sudo systemctl status alertmanager
curl localhost:9093
You can also access Alertmanager in a web browser at http://<PROMETHEUS_SERVER_PUBLIC_IP>:9093. Verify amtool is able to connect to Alertmanager and retrieve the current configuration:
amtool config show
Configure Prometheus to Connect to Alertmanager
Edit the Prometheus config:
sudo vi /etc/prometheus/prometheus.yml
Under alerting, add your Alertmanager as a target:
alerting:
  alertmanagers:
  - static_configs:
    - targets: ["localhost:9093"]
Restart Prometheus to reload the configuration:
sudo systemctl restart prometheus
Verify Prometheus is able to reach the Alertmanager. Access Prometheus in a web browser at http://<PROMETHEUS_SERVER_PUBLIC_IP>:9090. Click Status > Runtime & Build Information. Verify your Alertmanager instance appears under the Alertmanagers section.

Alertmanager Configuration

00:04:49

Lesson Description:

Alertmanager includes a number of global configuration options you can use to customize its behavior. To manage an Alertmanager instance, you need to be familiar with Alertmanager's configuration file and the process of changing it. In this lesson, we will discuss the process of making changes to your Alertmanager configuration and demonstrate this process by implementing a simple configuration change.

Relevant Documentation: Alertmanager Configuration

Lesson Reference

Log in to the Prometheus server. Edit the Alertmanager configuration file:

sudo vi /etc/alertmanager/alertmanager.yml
Set global.resolve_timeout to 10m:
global:

  ...

  resolve_timeout: 10m

...
Restart Alertmanager to load the new configuration:
sudo systemctl restart alertmanager
Note: If you wish, you can load the new configuration without restarting Alertmanager:
sudo killall -HUP alertmanager
Verify your configuration is valid by ensuring Alertmanager is running. Access Alertmanager in your browser at http://<PROMETHEUS_SERVER_PUBLIC_IP>:9093. Once you can see Alertmanager in your browser, click Status and look for your new resolve_timeout configuration in the Config section.
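
Rather than discovering a bad configuration after the reload, you can check the file first with amtool check-config, which was installed alongside Alertmanager earlier. The sketch below validates and then sends the same SIGHUP shown above. The function name reload_alertmanager is hypothetical, and the default path assumes the layout used in this course.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reload Alertmanager only if its config validates.
reload_alertmanager() {
  config="${1:-/etc/alertmanager/alertmanager.yml}"
  if amtool check-config "$config"; then
    # Same reload signal used in the lesson steps.
    sudo killall -HUP alertmanager
  else
    echo "Config invalid; not reloading." >&2
    return 1
  fi
}

# Example (not executed here; needs amtool and a running Alertmanager):
#   reload_alertmanager
```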

High Availability and Alertmanager

00:10:34

Lesson Description:

Alertmanager is a useful tool for handling Prometheus alerts, but what if your Alertmanager instance goes down? In such a scenario, you could miss out on critical alerts that need to be addressed. Luckily, Alertmanager can run in a multi-instance cluster, making it more highly available. In this lesson, we will explore what a highly available Alertmanager configuration looks like by setting up an additional Alertmanager instance to run in a cluster with our existing Alertmanager.

Relevant Documentation: Alertmanager - High Availability, Alertmanager GitHub - High Availability

Lesson Reference

Install Alertmanager on a New Server
Set up a new server. You may wish to give it a tag of Prometheus 2 if you plan to use this same server for an additional Prometheus instance in the future. Cloud Playground settings:
Distribution: Ubuntu 18.04 Bionic Beaver LTS
Size: Small
Tag: Prometheus 2
Log in to the new server. Create a user and group for Alertmanager:

sudo useradd -M -r -s /bin/false alertmanager
Download and install the Alertmanager binaries, move the files into the appropriate locations, and set ownership:
wget https://github.com/prometheus/alertmanager/releases/download/v0.20.0/alertmanager-0.20.0.linux-amd64.tar.gz
tar xvfz alertmanager-0.20.0.linux-amd64.tar.gz
sudo cp alertmanager-0.20.0.linux-amd64/{alertmanager,amtool} /usr/local/bin/
sudo chown alertmanager:alertmanager /usr/local/bin/{alertmanager,amtool}
sudo mkdir -p /etc/alertmanager
sudo cp alertmanager-0.20.0.linux-amd64/alertmanager.yml /etc/alertmanager
sudo chown -R alertmanager:alertmanager /etc/alertmanager
Create a data directory for Alertmanager:
sudo mkdir -p /var/lib/alertmanager
sudo chown alertmanager:alertmanager /var/lib/alertmanager
Create a configuration file for amtool:
sudo mkdir -p /etc/amtool
sudo vi /etc/amtool/config.yml
Enter the following content in the amtool config file:
alertmanager.url: http://localhost:9093
Create a systemd unit file for Alertmanager:
sudo vi /etc/systemd/system/alertmanager.service
For the --cluster.peer flag, enter the private IP address of your first Prometheus/Alertmanager server:
[Unit]
Description=Prometheus Alertmanager
Wants=network-online.target
After=network-online.target

[Service]
User=alertmanager
Group=alertmanager
Type=simple
ExecStart=/usr/local/bin/alertmanager \
  --config.file /etc/alertmanager/alertmanager.yml \
  --storage.path /var/lib/alertmanager/ \
  --cluster.peer=<ALERTMANAGER_1_SERVER_PRIVATE_IP>:9094

[Install]
WantedBy=multi-user.target
Start and enable the alertmanager service:
sudo systemctl enable alertmanager
sudo systemctl start alertmanager
Verify the service is running and you can reach it:
sudo systemctl status alertmanager
curl localhost:9093
Configure Your Existing Alertmanager Instance to Run in a Cluster
Log in to your first Prometheus/Alertmanager server. Edit the Alertmanager unit file:
sudo vi /etc/systemd/system/alertmanager.service
Add a --cluster.peer flag to the ExecStart section. Include the private IP address of your second Alertmanager server:
...

ExecStart=/usr/local/bin/alertmanager \
  --config.file /etc/alertmanager/alertmanager.yml \
  --storage.path /var/lib/alertmanager/ \
  --cluster.peer=<ALERTMANAGER_2_SERVER_PRIVATE_IP>:9094

...
Reload and restart the alertmanager service:
sudo systemctl daemon-reload
sudo systemctl restart alertmanager
To verify your cluster is working, access both instances in a browser at http://<PUBLIC_IP>:9093. On one instance, click Silences and create a new silence. Click Silences on the other instance, and verify the silence you created appears.
Configure Prometheus to Connect to Both Alertmanager Instances
Log in to the Prometheus server. Edit the Prometheus configuration file:
sudo vi /etc/prometheus/prometheus.yml
Add the new Alertmanager (<ALERTMANAGER_2_PRIVATE_IP>:9093) to the list of Alertmanager targets.
alerting:
  alertmanagers:
  - static_configs:
    - targets: ["localhost:9093", "<ALERTMANAGER_2_PRIVATE_IP>:9093"]
Restart Prometheus to reload the config:
sudo systemctl restart prometheus
Access Prometheus Server in a browser at http://<PROMETHEUS_SERVER_PUBLIC_IP>:9090. Click Status > Runtime & Build Information. Verify both of your Alertmanager instances appear under the Alertmanagers section.
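
With two instances, each unit file carries a single --cluster.peer flag pointing at the other server. The flag can be repeated, so a larger cluster lists every other member. The sketch below generates those flags from a list of addresses. The helper name cluster_peer_flags and the example IP addresses are hypothetical; port 9094 is the peering port used in the unit files above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one --cluster.peer flag per peer address.
cluster_peer_flags() {
  port=9094   # cluster peering port used in the unit files in this lesson
  flags=""
  for ip in "$@"; do
    # Add a separating space only when flags already has content.
    flags="${flags}${flags:+ }--cluster.peer=${ip}:${port}"
  done
  printf '%s\n' "$flags"
}

# Flags for an instance whose peers are two other (made-up) servers.
cluster_peer_flags 10.0.1.11 10.0.1.12
```

The output can be appended to the ExecStart line of each instance, with every instance listing the private IP addresses of the others.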

Prometheus Alerts

Alerting Rules

00:09:40

Lesson Description:

Before we can manage alerts with Prometheus Alertmanager, we must first issue the alerts from Prometheus itself. This can be done using alerting rules. In this lesson, we will explore alerting rules and demonstrate the process of creating a new alert in Prometheus.

Relevant Documentation: Alerting Rules

Lesson Reference

Create an Alert
Log in to the Prometheus server. Edit the Prometheus config file:

sudo vi /etc/prometheus/prometheus.yml
Add a path for rules files to the Prometheus config:
rule_files:
- "/etc/prometheus/rules/*.yml"
Create the rules directory:
sudo mkdir -p /etc/prometheus/rules/
Create a new rules file for your alerting rule:
sudo vi /etc/prometheus/rules/my-alerts.yml
Implement an alerting rule to issue an alert when the Linux server goes down:
groups:
- name: linux-server
  rules:
  - alert: LinuxServerDown
    expr: up{job="Linux Server"} == 0
    labels:
      severity: critical
    annotations:
      summary: Linux Server Down
Restart Prometheus to reload the configuration:
sudo systemctl restart prometheus
Access Prometheus in a browser at http://<PROMETHEUS_SERVER_PUBLIC_IP>:9090. Click Alerts. You should see your LinuxServerDown alert listed.
Test the Alert
Log in to your Linux server that is being monitored by Prometheus and stop the node_exporter service to simulate the server going down (alternatively, you can just stop the server itself in Cloud Playground, but it may take longer):
sudo systemctl stop node_exporter
After a few moments, check the Alertmanager page in your browser (http://<PROMETHEUS_SERVER_PUBLIC_IP>:9093). If the alert is firing, you should see the LinuxServerDown alert appear. Start node_exporter again and watch the alert disappear in Alertmanager:
sudo systemctl start node_exporter
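
The rule above fires as soon as a single evaluation sees up == 0. Alerting rules also accept a for clause, which keeps the alert pending until the condition has held for the given duration. The sketch below writes a variant of the rule with for: 1m to a scratch file and validates it with promtool when available. The one-minute value and the scratch-file approach are assumptions for illustration, not settings from the lesson.

```shell
#!/usr/bin/env bash
# Write a variant of the LinuxServerDown rule to a scratch file.
alert_file="$(mktemp)"
cat > "$alert_file" << 'EOF'
groups:
- name: linux-server
  rules:
  - alert: LinuxServerDown
    expr: up{job="Linux Server"} == 0
    # Assumed value: stay pending for one minute before firing.
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: Linux Server Down
EOF

# Run the check only when promtool is installed on this machine.
if command -v promtool > /dev/null 2>&1; then
  promtool check rules "$alert_file"
else
  echo "promtool not found; skipping validation of $alert_file"
fi
```

With a for clause in place, a brief scrape failure shows the alert as pending on the Alerts page rather than sending it to Alertmanager immediately.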

Managing Alerts

00:13:59

Lesson Description:

Once you have alerts firing in Prometheus, you can use Alertmanager to manage your alerts effectively. In this lesson, we will discuss and demonstrate some useful features of Alertmanager that can make your alerts more effective.

Relevant Documentation: Alertmanager, Alertmanager Configuration

Lesson Reference

Set Up Test Alerts
Log in to your Prometheus server. Create an alerting rules file with some test alerts:

sudo vi /etc/prometheus/rules/test-alerts.yml
groups:
- name: test-alerts
  rules:
  - alert: Server1Down
    expr: 1
    labels:
      severity: critical
      service: linuxserver
    annotations:
      summary: Server 1 Down
  - alert: Server2Down
    expr: 1
    labels:
      severity: critical
      service: linuxserver
    annotations:
      summary: Server 2 Down
  - alert: DownstreamServiceDown
    expr: 1
    labels:
      severity: critical
    annotations:
      summary: A downstream service is broken.
Restart Prometheus to load the new alerting rules:
sudo systemctl restart prometheus
If you access Alertmanager in a browser at http://<PROMETHEUS_SERVER_PUBLIC_IP>:9093, you should see these three alerts active.
Set Up Routing to Group Similar Alerts
Edit your Alertmanager config:
sudo vi /etc/alertmanager/alertmanager.yml
Add a new routing node to group the Server.*Down alerts:
route:

  ...

  routes:
  - receiver: 'web.hook'
    group_by: ['service']
    match_re:
      alertname: 'Server.*Down'
Load the new configuration:
sudo killall -HUP alertmanager
Refresh Alertmanager in your browser. The two Server.*Down alerts should be combined into one group.
Set Up an Inhibition to Suppress the DownstreamServiceDown Alert
Edit your Alertmanager config:
sudo vi /etc/alertmanager/alertmanager.yml
Add an inhibit rule to suppress the DownstreamServiceDown alert when either of the Server.*Down alerts is active:
inhibit_rules:

  ...

  - source_match_re:
      alertname: 'Server.*Down'
    target_match:
      alertname: 'DownstreamServiceDown'
Load the new configuration:
sudo killall -HUP alertmanager
Refresh Alertmanager in your browser. The DownstreamServiceDown alert should no longer appear. You can click the Inhibited box to make it visible again.
Temporarily Silence an Alert
Access Alertmanager in a web browser at http://<PROMETHEUS_SERVER_PUBLIC_IP>:9093. Expand the service="linuxserver" group. Locate the alert with alertname="Server1Down", and click the Silence button for that alert. Fill out the Creator and Comment fields, and then click Create. If you return to the main Alertmanager page, the Server1Down alert should no longer appear.
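
The same silence can also be created from the command line with amtool, which reads the Alertmanager URL from the /etc/amtool/config.yml file created during installation. The following is a minimal sketch. The wrapper name silence_alert, the comment text, and the one-hour default duration are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around "amtool silence add": silence one alert by
# name for a given duration (default: one hour).
silence_alert() {
  alertname="$1"
  duration="${2:-1h}"
  amtool silence add "alertname=${alertname}" \
    --author="$(whoami)" \
    --comment="Silenced during maintenance" \
    --duration="${duration}"
}

# Examples (not executed here; they need amtool and a running Alertmanager):
#   silence_alert Server1Down 2h
#   amtool silence query
```

amtool silence query lists the active silences, and amtool silence expire followed by a silence ID removes one early.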

Using Multiple Prometheus Servers

High Availability

00:09:33

Lesson Description:

A single Prometheus server could become a single point of failure if it goes down, causing you to be unable to access critical metric data when you need it. The simplicity of Prometheus' design makes it relatively easy to create new Prometheus servers for high availability. In this lesson, we will discuss what high availability looks like for Prometheus, and we will demonstrate it by building a second Prometheus server.

Relevant Documentation: Can Prometheus be made highly available?

Lesson Reference

If you already built a second server for an additional Alertmanager instance, feel free to use that same server for this lesson. Otherwise, you will need to create a second server with the following settings:
Distribution: Ubuntu 18.04 Bionic Beaver LTS
Size: Small
Tag: Prometheus 2
Install Prometheus on the Second Server
Log in to your second server. Create a user, group, and directories for Prometheus:

sudo useradd -M -r -s /bin/false prometheus
sudo mkdir /etc/prometheus /var/lib/prometheus
Download and extract the pre-compiled binaries:
wget https://github.com/prometheus/prometheus/releases/download/v2.16.0/prometheus-2.16.0.linux-amd64.tar.gz
tar xzf prometheus-2.16.0.linux-amd64.tar.gz prometheus-2.16.0.linux-amd64/
Move the files from the downloaded archive to the appropriate locations and set ownership:
sudo cp prometheus-2.16.0.linux-amd64/{prometheus,promtool} /usr/local/bin/
sudo chown prometheus:prometheus /usr/local/bin/{prometheus,promtool}
sudo cp -r prometheus-2.16.0.linux-amd64/{consoles,console_libraries} /etc/prometheus/
sudo cp prometheus-2.16.0.linux-amd64/prometheus.yml /etc/prometheus/prometheus.yml
sudo chown -R prometheus:prometheus /etc/prometheus
sudo chown prometheus:prometheus /var/lib/prometheus
Briefly test your setup by running Prometheus in the foreground:
prometheus --config.file=/etc/prometheus/prometheus.yml
Create a systemd unit file for Prometheus:
sudo vi /etc/systemd/system/prometheus.service
Define the Prometheus service in the unit file:
[Unit]
Description=Prometheus Time Series Collection and Processing Server
Wants=network-online.target
After=network-online.target

[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/prometheus \
    --config.file /etc/prometheus/prometheus.yml \
    --storage.tsdb.path /var/lib/prometheus/ \
    --web.console.templates=/etc/prometheus/consoles \
    --web.console.libraries=/etc/prometheus/console_libraries

[Install]
WantedBy=multi-user.target
Copy All Prometheus Configurations from Server 1 to Server 2
Log in to your first Prometheus server. Edit your Prometheus config to change any localhost references that might not work properly on the new server:
sudo vi /etc/prometheus/prometheus.yml
Change all references to localhost, except for the target of the prometheus job, to the private IP address of your first Prometheus server. Copy prometheus.yml to Prometheus Server 2:
scp /etc/prometheus/prometheus.yml cloud_user@<PROMETHEUS_SERVER_2_PRIVATE_IP>:/home/cloud_user
Copy your rules files to Prometheus Server 2:
scp /etc/prometheus/rules/* cloud_user@<PROMETHEUS_SERVER_2_PRIVATE_IP>:/home/cloud_user
Log in to Prometheus Server 2 and move prometheus.yml to the appropriate location:
sudo mv ~/prometheus.yml /etc/prometheus/prometheus.yml
Create the rules directory, and then move the rules files to the appropriate location:
sudo mkdir -p /etc/prometheus/rules
sudo mv ~/*.yml /etc/prometheus/rules
Start the Second Prometheus Instance
On Prometheus Server 2, start and enable Prometheus:
sudo systemctl enable prometheus
sudo systemctl start prometheus
Access Prometheus Server 2 in a browser at http://<PROMETHEUS_SERVER_2_PUBLIC_IP>:9090. Run a query to verify it is working:
up
You should see up data for all jobs that were previously set up on Prometheus Server 1. You can also click Alerts to verify the alerts from Prometheus Server 1 appear.

Federation

00:09:26

Lesson Description:

Prometheus supports the ability to pull metric data from one Prometheus server to another. This allows you to have local Prometheus servers monitoring a small set of applications and services, while also passing that data to other Prometheus servers for aggregation and/or centralization. This process is known as federation. In this lesson, we will discuss how federation can be used with Prometheus, and we will demonstrate how to federate data between Prometheus servers.

Relevant Documentation: Federation, Scaling and Federating Prometheus

Lesson Reference

To federate data, you will need to build a new Prometheus server with the following settings:
Distribution: Ubuntu 18.04 Bionic Beaver LTS
Size: Small
Tag: Federal Prometheus Server
Note: You may need to delete an existing server to make room for this new server. You can delete either the Grafana or Prometheus 2 server.
Install Prometheus on the Federal Prometheus Server
Log in to your new Federal Prometheus Server. Create a user, group, and directories for Prometheus:

sudo useradd -M -r -s /bin/false prometheus
sudo mkdir /etc/prometheus /var/lib/prometheus
Download and extract the pre-compiled binaries:
wget https://github.com/prometheus/prometheus/releases/download/v2.16.0/prometheus-2.16.0.linux-amd64.tar.gz
tar xzf prometheus-2.16.0.linux-amd64.tar.gz prometheus-2.16.0.linux-amd64/
Move the files from the downloaded archive to the appropriate locations and set ownership:
sudo cp prometheus-2.16.0.linux-amd64/{prometheus,promtool} /usr/local/bin/
sudo chown prometheus:prometheus /usr/local/bin/{prometheus,promtool}
sudo cp -r prometheus-2.16.0.linux-amd64/{consoles,console_libraries} /etc/prometheus/
sudo cp prometheus-2.16.0.linux-amd64/prometheus.yml /etc/prometheus/prometheus.yml
sudo chown -R prometheus:prometheus /etc/prometheus
sudo chown prometheus:prometheus /var/lib/prometheus
Briefly test your setup by running Prometheus in the foreground:
prometheus --config.file=/etc/prometheus/prometheus.yml
Create a systemd unit file for Prometheus:
sudo vi /etc/systemd/system/prometheus.service
Define the Prometheus service in the unit file:
[Unit]
Description=Prometheus Time Series Collection and Processing Server
Wants=network-online.target
After=network-online.target

[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/prometheus \
    --config.file /etc/prometheus/prometheus.yml \
    --storage.tsdb.path /var/lib/prometheus/ \
    --web.console.templates=/etc/prometheus/consoles \
    --web.console.libraries=/etc/prometheus/console_libraries

[Install]
WantedBy=multi-user.target
Add Configuration to Federate Data from Another Prometheus Server
Edit the Prometheus config on your Federal Prometheus Server:
sudo vi /etc/prometheus/prometheus.yml
Add the /federate endpoint on your first Prometheus server as a new scrape target. Make sure you use the private IP address of your first Prometheus server for the target:
scrape_configs:

  ...

  - job_name: 'federate'
    scrape_interval: 15s
    honor_labels: true
    metrics_path: '/federate'
    params:
      'match[]':
        - '{job!~"prometheus"}'
    static_configs:
    - targets:
      - '<PROMETHEUS_SERVER_1_PRIVATE_IP>:9090'
Start and enable Prometheus:
sudo systemctl enable prometheus
sudo systemctl start prometheus
Access your Federal Prometheus Server in a browser at http://<FEDERAL_PROMETHEUS_SERVER_PUBLIC_IP>:9090. Run a query to pull some data about your jobs:
up
You should see data about jobs run by your first Prometheus server.
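
If no data shows up, it can help to query the /federate endpoint by hand and see what the source server returns for a given match[] selector. The sketch below wraps that request in a small function. The name federate_query is hypothetical, and the commented example assumes it is run on the first Prometheus server, where localhost:9090 is reachable.

```shell
#!/usr/bin/env bash
# Hypothetical helper: fetch the series matching a selector from the
# /federate endpoint of a Prometheus server given as host:port.
federate_query() {
  server="$1"
  selector="$2"
  # -G sends the URL-encoded match[] parameter as a query string.
  curl -s -G --data-urlencode "match[]=${selector}" "http://${server}/federate"
}

# Example (not executed here; needs a running Prometheus server):
#   federate_query localhost:9090 '{job!~"prometheus"}'
```

The response is plain text in the same exposition format that Prometheus scrapes, so the federate job defined above receives exactly what this request prints.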

Security

Prometheus Security Assumptions

00:04:45

Lesson Description:

Prometheus primarily relies on external configuration for security. Therefore, it is important to understand the security assumptions made by Prometheus in order to implement Prometheus in a secure manner. In this lesson, we will discuss some of these security assumptions and their implications when it comes to using Prometheus in the real world.

Relevant Documentation: Prometheus Security Model

Client Libraries

Introduction to Prometheus Client Libraries

00:01:45

Lesson Description:

Prometheus is a great tool for collecting data about your applications, but in order for it to collect metrics, the data must first be made available by the application itself. When building your own custom applications, you can instrument your code to collect metric data and expose it to Prometheus. The Prometheus client libraries make this process significantly easier. In this lesson, we will introduce and discuss these Prometheus client libraries.

Relevant Documentation: Client Libraries, Writing Client Libraries

Using the Prometheus Java Client Library

00:11:00

Lesson Description:

The Prometheus client libraries can greatly simplify the process of collecting and exposing metrics in your custom code. In this lesson, we will demonstrate how to use one of these client libraries: the Prometheus client library for Java. We will add instrumentation using this client library to some simple Java code and show how Prometheus can be configured to scrape the resulting metrics.

Relevant Documentation: Java Client Library, Client Libraries

Lesson Reference

Log in to the Prometheus server.
Note: You can clone the Java code to your local machine and work with it in the IDE or text editor of your choice, but if you want to pull metrics into Prometheus, you will likely need to copy the project to the Prometheus server for that step.
Clone the example project from GitHub:

git clone https://github.com/linuxacademy/content-prometheusdd-java-client-lib-example.git
Change directory into the project:
cd content-prometheusdd-java-client-lib-example
Install Java:
sudo apt-get update
sudo apt-get install -y openjdk-8-jdk
Run the project. You should see it begin counting, printing the current count to the console:
./gradlew run
You can stop the application with Ctrl+C. Edit the project's build.gradle:
vi build.gradle
Add the Prometheus Java client library dependencies:
dependencies {
    implementation 'io.prometheus:simpleclient:0.8.1'
    implementation 'io.prometheus:simpleclient_httpserver:0.8.1'

    ...
}
Edit the project's Main class:
vi src/main/java/com/linuxacademy/prometheusdd/clientlibexample/Main.java
Add a counter to expose the current count as a Prometheus metric called current_count. Also, create an HTTPServer to provide an endpoint Prometheus can scrape metrics from:
package com.linuxacademy.prometheusdd.clientlibexample;

import io.prometheus.client.Counter;
import io.prometheus.client.exporter.HTTPServer;
import java.io.IOException;

public class Main {

    static final Counter currentCount = Counter.build()
        .name("current_count").help("Current count.").register();

    public static void main(String[] args) throws InterruptedException {
        try {
            HTTPServer server = new HTTPServer(8081);
        } catch (IOException e) {
            System.out.println("Failed to start metrics endpoint.");
            e.printStackTrace();
        }

        System.out.println("Counting to 1000...");
        for (int i = 0; i <= 1000; i++) {
            System.out.println(i);
            currentCount.inc();
            Thread.sleep(1000);
        }
        System.out.println("Done counting!");
    }

}
Run your code:
./gradlew run
Leave the code running. Check your metrics endpoint in a browser at http://<PROMETHEUS_SERVER_PUBLIC_IP>:8081. You should see your current_count metric. Edit your Prometheus configuration file to add your application as a new scrape target:
sudo vi /etc/prometheus/prometheus.yml
Add a scrape config to scrape metrics for your app:
scrape_configs:

  ...

  - job_name: 'Java Counter Example'
    static_configs:
    - targets: ['localhost:8081']
Restart Prometheus to reload the config:
sudo systemctl restart prometheus
Note: If your Java app stops (e.g., because it finished counting), you will need to run it again for Prometheus to scrape metrics from it. Access Prometheus in a browser at http://<PROMETHEUS_SERVER_PUBLIC_IP>:9090. Run a query to see your current_count metric:
current_count
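
Before involving Prometheus, you can confirm the counter from the command line by reading the metrics endpoint directly. The sketch below extracts a single sample value from exposition-format text. The helper name metric_value is hypothetical, the sample text is an illustration of what the endpoint might return rather than captured output, and the helper only handles samples without labels.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read exposition-format text on stdin and print the
# value of the first label-less sample whose name matches exactly.
metric_value() {
  awk -v name="$1" '$1 == name { print $2; exit }'
}

# Illustrative sample of the text served on port 8081 (not real output).
sample='# HELP current_count Current count.
# TYPE current_count counter
current_count 42.0'

printf '%s\n' "$sample" | metric_value current_count

# Against the running Java app (not executed here):
#   curl -s localhost:8081/metrics | metric_value current_count
```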

Final Steps

What's Next?

00:01:09

Lesson Description:

Congratulations on finishing the Prometheus Deep Dive course! In this video, we will briefly discuss some of the other courses you may be interested in.
