Monitor systems for vital characteristics

Hands-On Lab

 

Michael Christian

Course Development Director in Content

Length

01:30:00

Difficulty

Intermediate

In this exercise, you will configure monitoring on a system with Performance Co-Pilot (PCP).

Introduction

You've been asked to configure a system to provide live and historical metrics of its CPU load, disk I/O, and network traffic.

Solution

Start by logging in to the lab server using the credentials provided on the hands-on lab page:

ssh cloud_user@PUBLIC_IP_ADDRESS

Become the root user:

sudo su -

Install Performance Co-Pilot

Install pcp and pcp-system-tools:

yum -y install pcp pcp-system-tools

Enable and start the pmcd and pmlogger services:

systemctl enable pmcd pmlogger && systemctl start pmcd pmlogger
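
As an optional check before moving on, the sketch below reports whether both services are running. The function name check_pcp_services is made up for illustration; it relies only on systemctl is-active.

```shell
# check_pcp_services: print one status line per PCP service.
# Hypothetical helper, not part of PCP itself.
check_pcp_services() {
    for svc in pmcd pmlogger; do
        if systemctl is-active --quiet "$svc"; then
            echo "$svc: active"
        else
            echo "$svc: NOT active"
        fi
    done
}
```

Run check_pcp_services as root after starting the services; both lines should read "active" before you take any baselines.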

Take a baseline of CPU load

Take a baseline of the kernel.all.load metric for 10 seconds and put this into the file /home/cloud_user/kernel.all.load.txt.

You can do this using the pmval or pmrep command:

pmval -T 10s kernel.all.load > /home/cloud_user/kernel.all.load.txt

Or:

pmrep -T 10s kernel.all.load > /home/cloud_user/kernel.all.load.txt

View the contents of this file to confirm the command ran as intended:

cat /home/cloud_user/kernel.all.load.txt

Take a baseline of disk I/O

Take a baseline of the disk.partitions.total_rawactive metric for 10 seconds and put this into the file /home/cloud_user/disk.partitions.total_rawactive.txt.

You can do this using the pmval or pmrep command:

pmval -T 10s disk.partitions.total_rawactive > /home/cloud_user/disk.partitions.total_rawactive.txt

Or:

pmrep -T 10s disk.partitions.total_rawactive > /home/cloud_user/disk.partitions.total_rawactive.txt
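
Both baselines follow the same pattern, so they can be captured in one loop. This is a minimal sketch, not part of the lab solution: capture_baselines and the PMREP variable are hypothetical names, and the output files follow the METRIC.txt naming used above.

```shell
# PMREP can be overridden (for example with a stub) to dry-run the loop.
PMREP=${PMREP:-pmrep}

# capture_baselines OUTDIR METRIC...
# Sample each metric for 10 seconds and write it to OUTDIR/METRIC.txt.
capture_baselines() {
    outdir=$1
    shift
    for metric in "$@"; do
        "$PMREP" -T 10s "$metric" > "$outdir/$metric.txt" || return 1
    done
}
```

For this lab the call would be: capture_baselines /home/cloud_user kernel.all.load disk.partitions.total_rawactive. The metrics are sampled one after another, so two metrics take about 20 seconds.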

Generate some disk I/O and CPU load

By now, pmlogger has been running for a few minutes. Generate some load so that we can look at it in the archive.


Before and after each of the commands that generate load, make a note of the system time. You can do so using the command:

date

Generate some CPU load

Run the following command to generate some CPU load for 1 minute:

date && timeout -sHUP 1m openssl speed

Generate some disk I/O

Run the following command to generate some disk I/O:

date && fallocate -l 1G /home/cloud_user/bigfile && shred -zvu -n 1 /home/cloud_user/bigfile

Make a note of the start and end times of the commands above (each one prints its start time; run date again when it finishes). We'll need them to know when to look for the increases in resource usage.
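
If you would rather not run date by hand, a small wrapper can print both timestamps. The sketch below assumes a hypothetical helper named timed_run. It records the end time even when the command exits non-zero, which matters here because timeout returns a non-zero status when it stops openssl.

```shell
# timed_run COMMAND [ARGS...]
# Print a timestamp before and after COMMAND, whatever its exit status,
# then return that status to the caller.
timed_run() {
    echo "start: $(date '+%Y-%m-%d %H:%M:%S')"
    "$@"
    status=$?
    echo "end:   $(date '+%Y-%m-%d %H:%M:%S')"
    return "$status"
}
```

For example: timed_run timeout -sHUP 1m openssl speed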

Verify the CPU and disk load in the pcp archive file

Find the archive file that pmlogger is currently writing to in the pcp summary output:

pcp | grep logger
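
To capture just the archive path, the summary line can be parsed. This sketch assumes the pmlogger line has the form "pmlogger: primary logger: PATH", which may differ between PCP versions; current_archive is a hypothetical helper name.

```shell
# current_archive: read `pcp` summary output on stdin and print the
# path of the primary logger's archive (the last field on that line).
current_archive() {
    awk '/pmlogger:/ && /primary logger:/ { print $NF; exit }'
}
```

Usage: pcp | current_archive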

Look in the archive log directory and make note of the archive files:

ls -lh /var/log/pcp/pmlogger/ip-10-0-1-10.ec2.internal/

Depending on how long you've taken to do these tasks, the archive log may have rolled over to a new file. The format of the filename is YYYYMMDD.HH.MM. Using your notes of when you ran the CPU and disk I/O commands, determine which file to use.
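
Because the names are fixed-width timestamps, they sort in time order, and the archive covering a given moment is the one with the latest name at or before it. The sketch below illustrates that selection. pick_archive is a hypothetical helper, and it assumes the names passed in are bare YYYYMMDD.HH.MM basenames without file extensions.

```shell
# pick_archive TARGET NAME...
# TARGET and each NAME use the YYYYMMDD.HH.MM format.
# Print the latest NAME that is not later than TARGET, or nothing
# if every archive started after TARGET.
pick_archive() {
    target=$1
    shift
    printf '%s\n' "$@" | LC_ALL=C sort |
        awk -v t="$target" '
            # Appending "" forces a string comparison.
            ($0 "") <= (t "") { best = $0 }
            END { if (best != "") print best }'
}
```

For a load test noted at 10:30 on 1 January 2024, the target would be 20240101.10.30. The current time in the same format comes from date +%Y%m%d.%H.%M.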


Display the kernel.all.load values from the selected archive log in 1-minute increments:

> Note: You can use pmval or pmrep here. With these particular metrics, I find pmrep easier to read.

pmrep -t 1m -a /var/log/pcp/pmlogger/ip-10-0-1-10.ec2.internal/<FILE> kernel.all.load

Display the disk.partitions.total_rawactive values from the selected archive log in 1-minute increments:

pmrep -t 1m -a /var/log/pcp/pmlogger/ip-10-0-1-10.ec2.internal/<FILE> disk.partitions.total_rawactive
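
To narrow the report to the period you noted, PCP tools accept a start time (-S) and end time (-T). The sketch below assumes the "@HH:MM" absolute-time form is accepted by your pmrep version; archive_window and the PMREP variable are hypothetical names.

```shell
# PMREP can be overridden (for example with a stub) to dry-run the call.
PMREP=${PMREP:-pmrep}

# archive_window ARCHIVE START END METRIC...
# Replay METRIC values from ARCHIVE between START and END (HH:MM),
# in 1-minute steps.
archive_window() {
    archive=$1
    start=$2
    end=$3
    shift 3
    "$PMREP" -a "$archive" -S "@$start" -T "@$end" -t 1m "$@"
}
```

For example, with the archive path stored in a variable named ARCHIVE: archive_window "$ARCHIVE" 14:05 14:10 kernel.all.load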

Conclusion

Congratulations, you've completed this hands-on lab!