Nixpert: Understanding “Top” Command - Linux Process Monitoring

Top

Linux Top command is a performance monitoring program which is used frequently by many system administrators to monitor Linux performance and it is available under many Linux/Unix like operating systems. The top command used to display all the running and active real-time processes in ordered list and updates it regularly. It display CPU usage, Memory usage, Swap Memory, Cache Size, Buffer Size, Process PID, User, Commands and much more. It also shows high memory and cpu utilization of a running processess. The top command is much userful for system administrator to monitor and take correct action when required. Let’s see top command in action.

There are a number of variants of top, but in the rest of this article, we will talk about the most common variant — the one that comes with the “procps” package. You can verify this by running:

[root@server ~]# top -v

top: procps version 3.2.8

Understanding top’s interface: the summary area

As we have previously seen, top’s output is divided into two different sections. In this part of the article, we’re going to focus on the elements in half of the output. This region is also called the “summary area”.

System time, uptime and user sessions:

At the very top left of the screen (as marked in the screenshot above), top displays the current time. This is followed by the system uptime, which tells us the time for which the system has been running. For instance, in our example, the current time is "12:48:02" and the system has been running for 1 hour and 21 mins. Next comes the number of active user sessions. In this example, there is one active user sessions. These sessions may be either made on a TTY (physically on the system, either through the command line or a desktop environment) or a PTY (such as a terminal emulator window or over SSH).

If you want to get more details about the active user sessions, use the "who" command.

Tasks:

The “Tasks” section shows statistics regarding the processes running on your system. The “total” value is simply the total number of processes. For example, in the above screenshot, there are 27 processes running. To understand the rest of the values, we need a little bit of background on how the Linux kernel handles processes.

Processes perform a mix of I/O-bound work (such as reading disks) and CPU-bound work (such as performing arithmetic operations). The CPU is idle when a process performs I/O, so OSes switch to executing other processes during this time. In addition, the OS allows a given process to execute for a very small amount of time, and then it switches over to another process. This is how OSes appear as if they were “multitasking”. Doing all this requires us to keep track of the “state” of a process. In Linux, a process may be in of these states:

Runnable (R): A process in this state is either executing on the CPU, or it is present on the run queue, ready to be executed.
Interruptible sleep (S): Processes in this state are waiting for an event to complete.
Uninterruptible sleep (D): In this case, a process is waiting for an I/O operation to complete.
Stopped (T): These processes have been stopped by a job control signal (such as by pressing Ctrl+Z) or because they are being traced.
Zombie (Z): The kernel maintains various data structures in memory known as "process table" to keep track of processes. A process may create a number of child processes, and they may exit while the parent is still around. However, these data structures must be kept around until the parent obtains the status of the child processes. Such terminated processes whose records are still around in process tables are called zombies.

Processes in the D and S states are in “sleeping”, and those in the T state are in “stopped”. The number of zombies are shown as the “zombie” value.

CPU usage:

The CPU usage section shows the percentage of CPU time spent on various tasks. The us value is the time the CPU spends executing processes in userspace. Similarly, the sy value is the time spent on running kernelspace processes.

Linux uses a “nice” value to determine the priority of a process. A process with a high “nice” value is “nicer” to other processes, and gets a low priority. Similarly, processes with a lower “nice” gets higher priority. As we shall see later, the default “nice” value can be changed. The time spent on executing processes with a manually set “nice” appear as the ni value.

This is followed by id, which is the time the CPU remains idle. Most operating systems put the CPU on a power saving mode when it is idle. Next comes the wa value, which is the time the CPU spends waiting for I/O to complete.

Interrupts are signals to the processor about an event that requires immediate attention. Hardware interrupts are typically used by peripherals to tell the system about events, such as a keypress on a keyboard. On the other hand, software interrupts are generated due to specific instructions executed on the processor. In either case, the OS handles them, and the time spent on handling hardware and software interrupts are given by hi and si respectively.

In a virtualized environment, a part of the CPU resources are given to each virtual machine (VM). The OS detects when it has work to do, but it cannot perform them because the CPU is busy on some other VM. The amount of time lost in this way is the “steal” time, shown as st.

Load average:

The load average section represents the average “load” over one, five and fifteen minutes.

System Load/CPU load is a measurement of CPU over or under-utilization in a Linux system; the number of processes which are being executed by the CPU or in waiting state.

Memory usage:

The “memory” section shows information regarding the memory usage of the system. The lines marked “Mem” and “Swap” show information about RAM and swap space respectively. Simply put, a swap space is a part of the hard disk that is used like RAM. When the RAM usage gets nearly full, infrequently used regions of the RAM are written into the swap space, ready to be retrieved later when needed. However, because accessing disks are slow, relying too much on swapping can harm system performance.

As you would naturally expect, the “total”, “free” and “used” values have their usual meanings. The “avail mem” value is the amount of memory that can be allocated to processes without causing more swapping.

The Linux kernel also tries to reduce disk access times in various ways. It maintains a “disk cache” in RAM, where frequently used regions of the disk are stored. In addition, disk writes are stored to a “disk buffer”, and the kernel eventually writes them out to the disk. The total memory consumed by them is the “buff/cache” value. It might sound like a bad thing, but it really isn’t — memory used by the cache will be allocated to processes if needed.

Understanding top’s interface: the task area

The summary area is comparatively simpler, and it contains a list of processes. In this section, we will learn about the different columns shown in top’s default output.

This is the process ID, a unique positive integer that identifies a process.

USER

This is the “effective” username (which maps to a user ID) of the user who started the process. Linux assigns a real user ID and an effective user ID to processes; the latter allows a process to act on behalf of another user. (For example, a non-root user can elevate to root in order to install a package.)

PR and NI

The “NI” field shows the “nice” value of a process. The “PR” field shows the scheduling priority of the process from the perspective of the kernel. The nice value affects the priority of a process.

VIRT, RES, SHR and %MEM

These three fields are related with to memory consumption of the processes. “VIRT” is the total amount of memory consumed by a process. This includes the program’s code, the data stored by the process in memory, as well as any regions of memory that have been swapped to the disk. “RES” is the memory consumed by the process in RAM, and “%MEM” expresses this value as a percentage of the total RAM available. Finally, “SHR” is the amount of memory shared with other processes.

As we have seen before, a process may be in various states. This field shows the process state in the single-letter form.

TIME+

This is the total CPU time used by the process since it started, precise to the hundredths of a second.

COMMAND

The COMMAND column shows the name of the processes.

Top command usage examples

So far, we have discussed about top’s interface. However, it can also manage processes, and you can control various aspects of top’s output. In this section, we’re going to take at a few examples.

In most of the examples below, you have to press a key while top is running. Keep in mind that these key-presses are case sensitive — so if you press “k” while Caps Lock is on, you have actually pressed a “K”, and the command won’t work, or do something else entirely.

Killing processes

If you want to kill a process, simply press ‘k’ when top is running. This will bring up a prompt, which will ask for the process ID of the process and press enter.

Next, enter the signal using which the process should be killed. If you leave this blank, top uses a SIGTERM, which allows processes to terminate gracefully. If you want to kill a process forcefully, you can type in SIGKILL here. You can also type in the signal number here. For example, the number for SIGTERM is 15 and SIGKILL is 9.

If you leave the process ID blank and hit enter directly, it will terminate the topmost process in the list. As we’ve mentioned previously, you can scroll using the arrow keys, and change the process you want to kill in this way.

Sorting the process list

One of the most frequent reasons to use a tool like top is to find out which process is consuming the most resources. You can press the following keys to sort the list:

‘M’ to sort by memory usage
‘P’ to sort by CPU usage
‘N’ to sort by process ID
‘T’ to sort by the running time

By default, top displays all results in descending order. However, you can switch to ascending order by pressing ‘R’.

You can also sort the list with the -o switch. For example, if you want to sort processes by CPU usage, you can do so with:

top -o %CPU

You can sort the list by any of the attributes in the summary area in the same way.

Showing a list of threads instead of processes

We have previously touched upon how Linux switches between processes. Unfortunately, processes do not share memory or other resources, making such switches rather slow. Linux, like many other operating systems, supports a “lightweight” alternative, called a “thread”. They are part of a process and share certain regions of memory and other resources, but they can be run concurrently like processes.

By default, top shows a list of processes in its output. If you want to list the threads instead, press ‘H’ when top is running. Notice that the “Tasks” line says “Threads” instead, and shows the number of threads instead of processes.

You may have noticed how none of the attributes in the process list changed. How is that possible, given that processes differ from threads? Inside the Linux kernel, threads and processes are handled using the same data structures. Thus, every thread has its own ID, state and so on.

If you want to switch back to the process view, press ‘H’ again. In addition, you can use the -H switch to display threads by default.

top -H

Showing full paths

By default, top does not show the full path to the program, or make a distinction between kernelspace processes and userspace processes. If you need this information, press ‘c’ while top is running. Press ‘c’ again to go back to the default.

Kernelspace processes are marked with square brackets around them. As an example, in the above screenshot there are two kernel processes, kthreadd and khelper. On most Linux installations, there will usually be a few more of them.

Alternatively, you can also start top with the -c argument:

top -c

Listing processes from a user

To list processes from a certain user, press ‘u’ when top is running. Then, type in the username, or leave it blank to display processes for all users.

Alternatively, you can run the top command with the -u switch. In this example, we have listed all processes from the root user.

top -u root

Nixpert

Understanding “Top” Command - Linux Process Monitoring

Top