The COVID-19 ("Ms. Vy") situation is becoming increasingly complex and has turned our lives upside down, but that is no reason to give up or abandon everything. Some people will return to where they were, while others will come out of it transformed: younger, more energetic, more creative and more battle-hardened. Let's be determined to overcome any epidemic!
Back to today's main topic: capacity planning. Many components of a Linux operating system can affect system performance, and actively monitoring those components is the only way to protect your system. In this article I will cover the tools and utilities that make it easy to monitor the system.
1. CPU Monitoring
It is important to determine not only whether the CPU(s) are overloaded, but also what is causing the overload. For example, is it a user process or a system process? Or, when working in a virtual machine environment, is the hypervisor the cause? Answering these questions will help you resolve system performance issues.
1.1. Basic CPU Load Information
uptime tells us how long the system has been running, and gives a quick look at the number of users on the system and the load average over the last 1, 5 and 15 minutes.
# uptime
06:39:36 up 214 days, 21:35, 1 user, load average: 0.00, 0.01, 0.00
The load average is often confusing for administrators. This value is designed to describe CPU load, so it has to be read relative to the number of CPUs.
On a single-CPU system, for example:
* load average = 0.50 ⇒ 50% of the CPU was used during that period.
* load average = 1.50 ⇒ the CPU is overtasked; process requests are stuck in the queue because the CPU is busy handling other processes.
On a system with 2 CPUs, for example:
* load average = 0.50 ⇒ 25% of the CPUs were used during that period.
* load average = 1.50 ⇒ 75% of the CPUs were used during that period.
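The arithmetic in the examples above can be sketched in a few lines of Python. This is only an illustration: the function name load_percent is made up for this article, and it simply divides the load average by the number of CPUs.

```python
import os


def load_percent(load_average: float, cpu_count: int) -> float:
    """Return a load average as a percentage of total CPU capacity."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load_average / cpu_count * 100.0


# On Unix-like systems, print the same three figures that uptime reports.
if __name__ == "__main__" and hasattr(os, "getloadavg"):
    cpus = os.cpu_count() or 1
    for minutes, load in zip((1, 5, 15), os.getloadavg()):
        percent = load_percent(load, cpus)
        print(f"{minutes:>2} min: load {load:.2f} = {percent:.0f}% of {cpus} CPU(s)")
```

A result above 100% means more work is queued than the CPUs can handle, as in the single-CPU example with a load average of 1.50.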
1.2. Detailed CPU Load Information
Another useful tool for monitoring CPUs is iostat. This tool is also used for monitoring disk I/O.
The first part of the output provides brief information about the system, including:
Kernel version : 4.4.16-1.el7.elrepo.x86_64
Hostname : techzones
Date of report : March 21, 2020
Kernel type : x86_64
Number of CPUs : 24 CPUs
The next section reports statistics related to the CPUs.
%user – The percentage of CPU time spent running applications at user level (processes run by normal user accounts).
%nice – The percentage of CPU time spent running user-level processes whose priority was changed with the nice command.
%system – The percentage of CPU time spent running at the system (kernel) level.
%iowait – The percentage of time the CPU was idle while waiting for a disk I/O operation to complete.
%steal – This value only applies to virtual CPUs. In some cases, a virtual CPU must wait while the hypervisor services other virtual CPUs. This value is the percentage of time the virtual CPU spent waiting for the hypervisor.
%idle – The percentage of time the CPU was idle and not processing any request.
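To make the percentages above more concrete, here is a rough sketch of how they can be derived from two readings of the "cpu" line in /proc/stat. The helper names and the sample counter values are invented for illustration, and the sketch assumes a kernel that exposes the steal counter. It reports irq and softirq as separate entries, so its figures may not match iostat exactly.

```python
# Counter names in the order they appear on the "cpu" line of /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line: str) -> dict:
    """Turn the aggregate 'cpu' line of /proc/stat into a dict of counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def cpu_percentages(before: dict, after: dict) -> dict:
    """Share of elapsed CPU time spent in each state between two samples."""
    deltas = {name: after[name] - before[name] for name in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        raise ValueError("samples must be taken at different times")
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


# Hypothetical samples; on a real system read /proc/stat twice, a few seconds apart.
first = parse_cpu_line("cpu  100 0 50 800 40 0 0 10")
second = parse_cpu_line("cpu  200 0 100 1500 140 0 0 60")
usage = cpu_percentages(first, second)
print({name: round(value, 1) for name, value in usage.items()})
```

The counters in /proc/stat only ever increase, which is why two samples are needed: the percentages describe the interval between them, not the time since boot.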
Here is another iostat example, this time with a high iowait value:
In this example, %iowait is higher than normal. If processes are running slowly or services are slow to respond, this value is worth investigating.
In practice, iostat has 2 very commonly used arguments.
interval : The time in seconds between reports.
count : The number of reports to produce.
In addition to iostat, there is also the sar command. By default, however, sar displays information collected at 10-minute intervals.
2. Memory Monitoring
A system equipped with a high-speed CPU can still slow to a crawl if there is a memory problem. One thing to note: when you monitor memory usage, you need to look at both RAM and swap space.
2.1. Basic Memory Usage Information
When it comes to memory, the first command that comes to mind is free.
- Mem: This line describes RAM.
- Swap: This line describes swap space (virtual memory).
The remaining columns:
total – The total memory capacity of the system.
used – The amount of memory currently in use.
free – The amount of unused memory.
shared – The amount of memory used by tmpfs. tmpfs is a filesystem that keeps its contents in memory rather than on a hard disk.
buff/cache – Memory used for buffers and caches, which are temporary storage locations.
available – An estimate of how much memory is available for starting new processes.
By default, the values are displayed in kilobytes. You can display exact values in bytes with the -b / --bytes option, in megabytes with -m / --mega, or in gigabytes with -g / --giga (the long option names differ between versions of free).
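As a small sketch of how the columns of free fit together, the Python below parses the command's text output into a dictionary and computes how much of RAM and swap is in use. The sample output is made up for illustration (it is not from a real machine), and parse_free and used_percent are hypothetical helper names.

```python
# Illustrative output of `free`; the numbers are invented for this example.
SAMPLE = """\
              total        used        free      shared  buff/cache   available
Mem:        8000000     3000000     1000000      200000     4000000     4500000
Swap:       2000000      500000     1500000
"""


def parse_free(text: str) -> dict:
    """Map each row label (Mem, Swap) to a dict of column name -> value."""
    lines = [line for line in text.splitlines() if line.strip()]
    headers = lines[0].split()
    report = {}
    for line in lines[1:]:
        label, *values = line.split()
        # The Swap row has fewer columns; zip stops at the shorter sequence.
        report[label.rstrip(":")] = dict(zip(headers, (int(v) for v in values)))
    return report


def used_percent(row: dict) -> float:
    """Percentage of the row's total that is currently in use."""
    return 100.0 * row["used"] / row["total"] if row["total"] else 0.0


report = parse_free(SAMPLE)
print("RAM used:  %.1f%%" % used_percent(report["Mem"]))
print("Swap used: %.1f%%" % used_percent(report["Swap"]))
```

Checking both rows matters because a machine can show plenty of "available" RAM at one moment and still be swapping heavily.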
2.2. Detailed Memory Usage Information
For more detail we can use the vmstat command, which reports the following fields:
- r: Number of processes running or waiting to run.
- b: Number of processes in uninterruptible sleep.
- swpd: The amount of virtual memory used.
- free: The amount of free memory.
- buff: The amount of memory used as buffers.
- cache: The amount of memory used as cache.
- bi: Blocks received from a block device (blocks/s).
- bo: Blocks sent to a block device (blocks/s).
- in: Number of interrupts per second, including the clock interrupt.
- cs: Number of context switches per second.
CPU: percentages of total CPU time.
- us: Time spent running non-kernel code (user time, including nice time).
- sy: Time spent running kernel code.
- id: Idle time. Prior to Linux 2.5.41, this includes I/O wait time.
- wa: Time spent waiting for I/O. Prior to Linux 2.5.41, this is included in idle time and shown as 0.
- st: Time stolen from a virtual machine. Prior to Linux 2.6.11, unknown.
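The fields listed above can be turned into named values with a short parser, which is handy when you want to log or alert on them. The sketch below works on vmstat's text output; the sample is invented for illustration and parse_vmstat is a hypothetical name.

```python
# Illustrative vmstat output; the numbers are invented for this example.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 2  0      0 812000  10400 506000    0    0     5    12  110  230  3  1 95  1  0
"""


def parse_vmstat(text: str) -> list:
    """Return one dict per data row, keyed by the short column names."""
    headers = None
    rows = []
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0] == "r":
            # The column header row starts with the "r" field.
            headers = tokens
        elif headers and all(token.isdigit() for token in tokens):
            rows.append(dict(zip(headers, (int(token) for token in tokens))))
    return rows


rows = parse_vmstat(SAMPLE)
latest = rows[-1]
print("runnable:", latest["r"], "blocked:", latest["b"], "idle %:", latest["id"])
```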
3. Disk I/O Monitoring
To monitor disk I/O we can use the iostat command with the -d option.
sda – 1st SATA drive.
dm-0 – 1st LVM logical volume.
dm-9 – 2nd LVM logical volume.
If the iostat output is still unclear, the columns are explained below:
tps – The number of transfers per second issued to the device.
kB_read/s – The number of kilobytes read from the device per second.
kB_wrtn/s – The number of kilobytes written to the device per second.
kB_read – The total number of kilobytes read from the device.
kB_wrtn – The total number of kilobytes written to the device.
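To show where figures like these come from, the sketch below derives approximate tps and kB/s values from two readings of a device line in /proc/diskstats, where sectors are counted in 512-byte units. The sample lines, the 10-second interval and the helper names are assumptions made for illustration, and the results are an approximation rather than a reimplementation of iostat.

```python
SECTOR_BYTES = 512  # /proc/diskstats counts sectors in 512-byte units


def parse_diskstats_line(line: str) -> dict:
    """Extract the counters needed for throughput from one device line."""
    fields = line.split()
    return {
        "name": fields[2],
        "reads": int(fields[3]),            # reads completed
        "sectors_read": int(fields[5]),
        "writes": int(fields[7]),           # writes completed
        "sectors_written": int(fields[9]),
    }


def disk_rates(before: dict, after: dict, interval_s: float) -> dict:
    """Approximate tps, kB_read/s and kB_wrtn/s between two samples."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    transfers = (after["reads"] - before["reads"]) + (after["writes"] - before["writes"])
    kb_read = (after["sectors_read"] - before["sectors_read"]) * SECTOR_BYTES / 1024
    kb_written = (after["sectors_written"] - before["sectors_written"]) * SECTOR_BYTES / 1024
    return {
        "tps": transfers / interval_s,
        "kB_read/s": kb_read / interval_s,
        "kB_wrtn/s": kb_written / interval_s,
    }


# Hypothetical samples for sda, taken 10 seconds apart.
first = parse_diskstats_line("   8       0 sda 1000 0 20000 500 2000 0 40000 900 0 700 1400")
second = parse_diskstats_line("   8       0 sda 1100 0 24000 550 2300 0 48000 990 0 760 1540")
rates = disk_rates(first, second, 10.0)
print(first["name"], rates)
```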
Listing Open Files
Here are some useful lsof techniques:
Display the processes using the specified network port. For example, with the SSH port: lsof -i TCP:22
List the processes with open IPv4 connections: lsof -i 4
List open network connections: lsof -i
List the files opened by the process with the specified PID: lsof -p 100
List the open files in a directory. For example, /usr/bin: lsof +d /usr/bin
4. Network I/O Monitoring
To show more detailed information about network I/O, we use the netstat command.
This is not a network monitoring method in itself; it only provides useful information about the routes.
Here are some useful netstat options:
-l – Display network sockets in a listening state.
-lt – Display TCP sockets in a listening state.
-lu – Display UDP sockets in a listening state.
-p – Display the program name and PID in the output.
-n – Show numeric addresses instead of resolving names, which speeds up netstat and avoids delays when DNS is not responding.
-c – Refresh the output continuously.
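For a feel of what the listening-socket options report, the sketch below extracts listening TCP ports from text in the format of /proc/net/tcp, where the port is written in hexadecimal and state 0A means LISTEN. It is a simplified illustration, not how netstat itself is implemented; the sample table is made up and listening_tcp_ports is a hypothetical name.

```python
LISTEN = "0A"  # TCP state code for LISTEN in /proc/net/tcp

# Illustrative content in the format of /proc/net/tcp (values are invented).
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 11111 1
   1: 0100007F:0CEA 00000000:0000 0A 00000000:00000000 00:00000000 00000000    27        0 22222 1
   2: 0A00000A:0016 1400000A:C350 01 00000000:00000000 02:000A0000 00000000     0        0 33333 2
"""


def listening_tcp_ports(text: str) -> list:
    """Return the sorted TCP port numbers that are in the LISTEN state."""
    ports = set()
    for line in text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4 or fields[3] != LISTEN:
            continue
        # local_address looks like HEXADDR:HEXPORT
        _address, port_hex = fields[1].rsplit(":", 1)
        ports.add(int(port_hex, 16))
    return sorted(ports)


ports = listening_tcp_ports(SAMPLE)
print(ports)
```

On a real system the same function could be given the contents of /proc/net/tcp; the third sample row is an established connection, so it is skipped.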
5. Additional Monitoring Tools
To display basic system information such as uptime, load average, number of running processes, CPU statistics and memory information, we use the top command.
A monitoring system is indispensable when you deploy to staging as well as production. This article only introduces the basic commands and concepts; when deploying and operating a system in an end-user environment, you will face many more complex issues. In addition, using an automated monitoring system such as Check_mk, Zabbix or Prometheus will save a lot on manpower and infrastructure costs.