Linux capacity planning

Tram Ho

Introduction =))

The COVID situation keeps getting more complicated and has turned our lives upside down, but that is no reason to give up or drop everything. Some people will stay where they were, while others will come out of it transformed: younger, more energetic, more creative and readier for a fight. Determined to beat the epidemic!

Back to today's main topic: capacity planning. Many components of a Linux system can affect its performance. Actively monitoring those components is the most reliable way to protect your system. In this article I cover the tools and utilities that make this monitoring easy.

1. CPU Monitoring

It is important to determine not only whether the CPUs are overloaded, but also what is causing the overload. For example, is it a user process or a system process? Or, in a virtual machine environment, is the hypervisor the cause? Answering questions like these will help you resolve system performance issues.

1.1. Basic CPU Load Information

uptime tells us how long the system has been running. It also gives a quick look at the number of users logged in and the load average over the last 1, 5 and 15 minutes.

~# uptime
06:39:36 up 214 days, 21:35, 1 user, load average: 0.00, 0.01, 0.00

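Load average often confuses administrators. The value describes demand on the CPU: the average number of processes that were running or waiting to run over the given interval (on Linux it also counts processes in uninterruptible sleep, such as those waiting on disk I/O).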

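On a single-CPU system, for example, a load average of 1.00 means the CPU is roughly fully used, and anything above 1.00 means processes are waiting for CPU time.

On a system with 2 or more CPUs, compare the load average with the number of CPUs. For example, with 2 CPUs a load average of 2.00 means both are roughly fully used, while 1.00 means about half of the capacity is idle.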
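The comparison between load average and CPU count can be sketched in a few lines of Python. This is only a rough rule of thumb, and the function names `load_per_cpu` and `describe` are made up for this illustration; `os.getloadavg()` and `os.cpu_count()` are standard library calls.

```python
import os


def load_per_cpu(load_average, cpu_count):
    """Return the load average divided by the number of CPUs."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load_average / cpu_count


def describe(load_average, cpu_count):
    """Classify a load average relative to the CPU count (rule of thumb only)."""
    ratio = load_per_cpu(load_average, cpu_count)
    if ratio < 1.0:
        return "spare capacity"
    if ratio == 1.0:
        return "fully used"
    return "overloaded"


if __name__ == "__main__":
    # Read the same three values that uptime prints.
    cpus = os.cpu_count() or 1
    for label, value in zip(("1 min", "5 min", "15 min"), os.getloadavg()):
        print(f"{label}: load {value:.2f} on {cpus} CPUs -> {describe(value, cpus)}")
```

Because the Linux load average also includes processes waiting on I/O, a high ratio is a hint to investigate rather than proof that the CPU is the bottleneck.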

1.2. Detailed CPU Load Information

Another useful tool for monitoring CPUs is iostat. It is also used for monitoring disk I/O.

The first part of the output provides brief information about the system including:

Kernel version : 4.4.16-1.el7.elrepo.x86_64

Hostname : techzones

Date of report : March 21, 2020

Kernel type : x86_64

Number of CPUs : 24 CPUs

The next section is a report on statistics related to CPUs.

%user – The percentage of CPU time spent running applications at user level (processes run by normal user accounts).

%nice – The percentage of CPU time spent running user-level processes whose priority was changed with the nice command.

%system – The percentage of CPU time spent running at kernel level.

%iowait – The percentage of time the CPU was idle while the system was waiting for a disk I/O operation to complete.

%steal – This value only applies to virtual CPUs. In some cases, a virtual CPU must wait while the hypervisor services other virtual CPUs. This value is the percentage of time the virtual CPU spent waiting in this way.

%idle – The percentage of time the CPU was not processing any request, in other words sitting around doing nothing.
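To make these percentages concrete: the kernel exposes cumulative per-state CPU time counters on the aggregate `cpu` line of `/proc/stat`, and a percentage is each state's share of the time elapsed between two samples. The sketch below is an illustration under that assumption, not how iostat itself is implemented; the helper names are hypothetical, and iostat may group some states (such as interrupt time) differently.

```python
# Order of the first eight counters on the aggregate "cpu" line of /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Turn a 'cpu ...' line from /proc/stat into a dict of tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def cpu_percentages(before, after):
    """Share of elapsed CPU time spent in each state between two samples."""
    deltas = {name: after[name] - before[name] for name in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        raise ValueError("no CPU time elapsed between the samples")
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


if __name__ == "__main__":
    import time

    def read_cpu_line():
        with open("/proc/stat") as stat_file:
            return stat_file.readline()

    first = parse_cpu_line(read_cpu_line())
    time.sleep(1)
    second = parse_cpu_line(read_cpu_line())
    for name, pct in cpu_percentages(first, second).items():
        print(f"%{name}: {pct:.2f}")
```

Taking the difference between two samples matters: a single reading of the counters only gives the average since boot.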

Here is another iostat example, this time with a high iowait value:

In this example, %iowait is higher than normal. If processes are running slowly or services are slow to respond, this value is worth investigating.

In practice, iostat is usually run with two arguments.

interval : The time in seconds between reports.

count : The number of reports to print.

For example, iostat 5 3 prints three reports, five seconds apart.

Besides iostat, there is one more command worth mentioning: sar. By default, however, sar shows data that was collected at 10-minute intervals.

2. Memory Monitoring

A system with a fast CPU can still grind to a halt if it has a memory problem. One thing to note: when you monitor memory usage, look at both RAM and swap space.

2.1. Basic Memory Usage Information

When it comes to memory, the first command that comes to mind is free.

  • Mem: This line describes RAM
  • Swap: This line describes swap space

The columns:

total – The total memory capacity of the system.

used – The amount of memory currently in use.

free – The amount of memory that is not used at all.

shared – The amount of memory used (mostly) by tmpfs. tmpfs is a filesystem that lives in memory rather than on disk.

buff / cache – Memory used by kernel buffers and the page cache. The kernel can reclaim most of it when applications need memory.

available – An estimate of how much memory is available for starting new processes without swapping.

By default, the values are displayed in kibibytes. You can display exact byte counts with -b / --bytes, or switch to megabytes with -m or gigabytes with -g (the matching long options are named differently across procps versions, so check free --help).
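The numbers that free prints come from `/proc/meminfo`. As a minimal sketch, the following parses text in that format and derives two figures worth watching: the share of RAM still available and the amount of swap in use. The function names are hypothetical, and the code assumes a kernel recent enough to report `MemAvailable`.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (mostly kB)."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields:
            info[name.strip()] = int(fields[0])
    return info


def memory_summary(info):
    """Return the available-RAM percentage and the swap currently in use."""
    swap_used = info["SwapTotal"] - info["SwapFree"]
    return {
        "available_pct": 100.0 * info["MemAvailable"] / info["MemTotal"],
        "swap_used_kb": swap_used,
    }


if __name__ == "__main__":
    with open("/proc/meminfo") as meminfo_file:
        summary = memory_summary(parse_meminfo(meminfo_file.read()))
    print(f"available RAM: {summary['available_pct']:.1f}%")
    print(f"swap in use: {summary['swap_used_kb']} kB")
```

Watching the available figure rather than free avoids false alarms, since memory held by the page cache can usually be reclaimed.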

2.2. Detailed Memory Usage Information

The tool for this is vmstat.


Procs

  • r: Number of processes running or waiting to run.
  • b: Number of processes in uninterruptible sleep.

Memory

  • swpd: The amount of virtual memory (swap) used.
  • free: The amount of idle memory.
  • buff: The amount of memory used as buffers.
  • cache: The amount of memory used as cache.

IO

  • bi: Blocks received from a block device (blocks/s).
  • bo: Blocks sent to a block device (blocks/s).

System

  • in: Number of interrupts per second, including the clock.
  • cs: Number of context switches per second.

CPU: percentages of total CPU time.

  • us: Time spent running non-kernel code (user time, including nice time).
  • sy: Time spent running kernel code.
  • id: Idle time. Prior to Linux 2.5.41, this includes IO-wait time.
  • wa: Time spent waiting for IO. Prior to Linux 2.5.41, included in idle.
  • st: Time stolen from a virtual machine. Prior to Linux 2.6.11, unknown.
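If you want to act on these fields in a script, the vmstat output can be mapped to the column names above. This is a minimal sketch: the function names are hypothetical, the sample numbers in `SAMPLE` are made up for illustration, and it assumes the default vmstat layout where the last two non-empty lines are the column header and one row of integer values.

```python
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 3  1      0 812340  10240 402112    0    0    12    30  150  300 20  5 70  5  0
"""


def parse_vmstat(output):
    """Map vmstat column names to the values of the last reported row."""
    lines = [line for line in output.splitlines() if line.strip()]
    if len(lines) < 2:
        raise ValueError("expected a header line and at least one data line")
    header = lines[-2].split()
    values = [int(value) for value in lines[-1].split()]
    if len(header) != len(values):
        raise ValueError("header and data line have different column counts")
    return dict(zip(header, values))


def run_queue_exceeds_cpus(row, cpu_count):
    """True when more processes want to run than there are CPUs."""
    return row["r"] > cpu_count


if __name__ == "__main__":
    row = parse_vmstat(SAMPLE)
    print("run queue:", row["r"], "blocked:", row["b"], "iowait %:", row["wa"])
    print("run queue exceeds 2 CPUs:", run_queue_exceeds_cpus(row, 2))
```

On a live system the same function could be fed the output of `subprocess.run(["vmstat", "1", "2"], capture_output=True, text=True).stdout`, where the second row reflects current activity instead of the average since boot.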

3. Disk I/O Monitoring

To monitor disk I/O we can use the iostat command with the -d option.

sda – 1st SATA drive.

dm-0 – 1st LVM logical volume.

dm-9 – 2nd LVM logical volume.

If the columns in the iostat output are still unclear, here is what each one means:

tps – The number of transfers (I/O requests) per second issued to the device.

kB_read/s – The number of kilobytes read from the device per second.

kB_wrtn/s – The number of kilobytes written to the device per second.

kB_read – The total number of kilobytes read from the device.

kB_wrtn – The total number of kilobytes written to the device.
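The per-second columns are rates derived from cumulative counters, which the kernel exposes in `/proc/diskstats`. The sketch below shows that arithmetic under a few assumptions: the function names are hypothetical, sectors are counted as 512 bytes, and tps is simplified to completed reads plus completed writes (the real iostat may count other request types too).

```python
SECTOR_BYTES = 512  # /proc/diskstats counts sectors in 512-byte units


def parse_diskstats(text):
    """Extract read/write counters per device from /proc/diskstats-style text."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 10:
            continue
        stats[fields[2]] = {
            "reads": int(fields[3]),            # reads completed
            "sectors_read": int(fields[5]),
            "writes": int(fields[7]),           # writes completed
            "sectors_written": int(fields[9]),
        }
    return stats


def disk_rates(before, after, interval_s):
    """Compute tps, kB_read/s and kB_wrtn/s between two snapshots."""
    rates = {}
    for name, now in after.items():
        prev = before.get(name)
        if prev is None:
            continue  # device appeared between the two snapshots
        delta = {key: now[key] - prev[key] for key in now}
        rates[name] = {
            "tps": (delta["reads"] + delta["writes"]) / interval_s,
            "kB_read/s": delta["sectors_read"] * SECTOR_BYTES / 1024 / interval_s,
            "kB_wrtn/s": delta["sectors_written"] * SECTOR_BYTES / 1024 / interval_s,
        }
    return rates
```

For instance, a device that completed 100 reads and 50 writes over a 10-second interval would show a tps of 15 under this simplified definition.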

Listing Open Files

Here are some handy ways to use lsof:

List the processes using a given network port. For example, the SSH port: lsof -i TCP:22

List the processes with open IPv4 connections: lsof -i 4

List all open network connections: lsof -i

List the files opened by a given process ID: lsof -p 100

List the open files in a directory. For example, /usr/bin: lsof +d /usr/bin
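For a single process, similar information to `lsof -p` is visible under `/proc/<pid>/fd`, where each entry is a symbolic link to the open file. The following is a minimal, Linux-only sketch; `open_files` is a hypothetical helper name, and reading another user's process normally requires root.

```python
import os


def open_files(pid):
    """Return {file descriptor: target path} for a process, via /proc/<pid>/fd."""
    fd_dir = f"/proc/{pid}/fd"
    targets = {}
    for entry in os.listdir(fd_dir):
        try:
            targets[int(entry)] = os.readlink(os.path.join(fd_dir, entry))
        except OSError:
            # The descriptor was closed while we were listing; skip it.
            continue
    return targets


if __name__ == "__main__":
    # Show what the current Python process has open.
    for fd, target in sorted(open_files(os.getpid()).items()):
        print(fd, target)
```

Sockets and pipes show up with targets such as `socket:[...]` rather than file paths, which is where lsof's extra decoding becomes useful.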

4. Network I/O Monitoring

For more detailed information about network I/O, we can use the netstat command.

This is not network monitoring as such; it only gives us useful information about routes.

Here are some handy netstat options:

-l – Display network sockets in the listening state.

-lt – Display TCP sockets in the listening state.

-lu – Display UDP sockets in the listening state.

-p – Display the program name and PID in the output.

-n – Show numeric addresses instead of resolving names. This speeds up netstat and avoids delays when DNS is not responding.

-c – Refresh the output continuously.
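The listening TCP sockets that `netstat -lt` reports can also be read from `/proc/net/tcp`, where addresses and ports are hex-encoded and state `0A` means LISTEN. The sketch below decodes IPv4 entries only; the function names are hypothetical, and the address decoding assumes a little-endian machine such as x86.

```python
import socket
import struct

LISTEN_STATE = "0A"  # TCP_LISTEN in /proc/net/tcp


def decode_ipv4_endpoint(hex_endpoint):
    """Decode 'HEXIP:HEXPORT' (as found in /proc/net/tcp) into (ip, port)."""
    hex_ip, hex_port = hex_endpoint.split(":")
    # Assumption: the address was written by a little-endian host.
    ip = socket.inet_ntoa(struct.pack("<I", int(hex_ip, 16)))
    return ip, int(hex_port, 16)


def listening_tcp_sockets(proc_net_tcp_text):
    """Return (ip, port) pairs for IPv4 TCP sockets in the LISTEN state."""
    sockets = []
    for line in proc_net_tcp_text.splitlines():
        fields = line.split()
        # Data rows start with a slot number such as '0:'; skip the header.
        if len(fields) < 4 or not fields[0].endswith(":"):
            continue
        if fields[3] == LISTEN_STATE:
            sockets.append(decode_ipv4_endpoint(fields[1]))
    return sockets


if __name__ == "__main__":
    with open("/proc/net/tcp") as tcp_file:
        for ip, port in listening_tcp_sockets(tcp_file.read()):
            print(f"{ip}:{port}")
```

IPv6 sockets live in `/proc/net/tcp6` with a longer address format, which this sketch does not handle.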

5. Additional Monitoring Tools

To display basic system information such as uptime, load average, number of running processes, CPU statistics and memory information in one place, we use the top command.

6. Summary

Monitoring is indispensable when you deploy a system to staging or production. This article only introduces the basic commands and concepts; when you deploy and operate a system for real end users, you will face many more complex issues. On top of these commands, an automated monitoring system such as Check_mk, Zabbix or Prometheus can save a great deal of manpower and infrastructure cost.



Source : Viblo