LPI 010-160 – Processes and Process Data Part 2
- Identifying Running Processes
Alright, let’s dig a little bit deeper into processes. Now, before processes can be managed, you have to be able to identify them. We talked about that with the process ID numbers, but where are you going to find these ID numbers? Well, there are two utilities to help identify processes: ps and top. With either one, you can search for processes in various ways, such as by their name or by the resources they’re using. Another reason for identifying processes is to know exactly how much memory they are consuming at any given time. To check this, you can use the free command, which tells you how much memory is being used. Now, the simplest tool for identifying processes is the ps command, which produces a process listing on your screen. Given the large number of options that ps has, different users have different favorite ways of using the program.
For example, if you type ps ax, you’re going to get the information you usually want: the PID values and the command names, including command-line options, for all the processes running on the computer. Now, if we want to add a little bit to this command, we can use ps aux, and by adding that u, we’re going to add user names, CPU loads, and a few other tidbits that are helpful. The sheer scope of the information produced, however, can be overwhelming. So I do recommend you play with ps a little bit, using its help output and reading the man page for ps.
Now, one way to help narrow down this scope is to pipe the results through a program called grep. This is going to eliminate lines that don’t match the search criteria you specify. So for example, if I wanted to know the process ID for the gedit process that was running on my system, I could type in ps ax | grep gedit. And so the output, instead of listing all of the processes on the screen, is just going to show me the information for the gedit process. The output might have two lines that contain the word gedit. The first one is the process for gedit, and you’ll see its PID is 27946.
The second one is actually the process for the grep command that I just ran, and its PID is 27950. You’ll notice that it came up because I ran grep with the search parameter gedit, so the word gedit appears in its own command line. So as we can see in this example, gedit had the PID value of 27946. This is really the important piece of information from the ps command, because that PID value can be used to change the process’s priority or even terminate that process. Now, although ps can return process priority and CPU use information, the program’s output is usually sorted by PID number, so we’re going to have PID 1 at the top, going down into higher numbers. Also, ps can provide information for a single moment in time only. It really is just a snapshot of when you ran that command.
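Here's a small sketch of that gedit search. Since real ps output differs on every machine, it runs against a made-up listing: the sample_ps_ax function is hypothetical, and only the two PIDs come from the example. The find_pid helper is also just an illustration; it adds grep -v grep to drop the line for the grep command itself.

```shell
# Hypothetical stand-in for `ps ax` output (only the two PIDs come from the example).
sample_ps_ax() {
cat <<'EOF'
  PID TTY      STAT   TIME COMMAND
    1 ?        Ss     0:02 /sbin/init
27946 pts/8    Sl     0:00 gedit
27950 pts/8    S+     0:00 grep gedit
EOF
}

# find_pid NAME: read a process listing on stdin and print the PID column
# of lines matching NAME, skipping the grep process itself.
# Caveat: `grep -v grep` also hides any real process with "grep" in its name.
find_pid() {
    grep "$1" | grep -v grep | awk '{print $1}'
}

# Plain search: shows both the gedit line and the grep line.
sample_ps_ax | grep gedit

# Filtered search: prints only the PID of gedit.
sample_ps_ax | find_pid gedit
```

On a real system you would replace sample_ps_ax with ps ax, for example ps ax | grep gedit | grep -v grep.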
If I run ps right now, wait five minutes, and run it again, I’m going to get different information, because other processes are running or they’re using more or less memory. So if you want to locate the CPU or memory hogging processes very quickly, or you need to study how resources are used over time, then you might want to use the top command instead. Top is a program that’s essentially an interactive version of ps. Now, by default, top is going to sort its entries by CPU usage, so it’s very much like the Task Manager in Windows. It also updates its display every few seconds. So it’s basically the same thing as running ps, hitting Enter, running ps again a couple of seconds later, and doing it again and again. That’s why we call top an interactive, always-updating view of ps.
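To make the difference in ordering concrete, here's a rough sketch that takes a single ps aux style snapshot and reorders it by CPU use, the way top does by default. The listing and every value in it are invented for illustration, and by_cpu is a hypothetical helper name.

```shell
# Hypothetical `ps aux` snapshot; all values are invented for illustration.
sample_ps_aux() {
cat <<'EOF'
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.1  19356  1544 ?        Ss   09:00   0:02 /sbin/init
alice    27946  1.5  2.3 512340 47120 pts/8    Sl   10:12   0:03 gedit
alice    28010 97.2  0.4  90112  8200 pts/9    R    10:15  12:41 ./stuck-loop
bob      28100 12.0 31.8 904400 651200 ?       Sl   10:20   1:10 image-editor
EOF
}

# by_cpu: drop the header line, then sort numerically on column 3 (%CPU),
# highest first.
by_cpu() {
    tail -n +2 | sort -k3,3 -nr
}

# Show the three busiest processes in the snapshot.
sample_ps_aux | by_cpu | head -n 3
```

This is still only a snapshot. What top adds is repeating the measurement every few seconds so you can watch the ordering change.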
Now, you need to be familiar with the purposes and the normal habits of the programs running on your system. This is where top is really helpful. It’s going to help you figure out which applications are CPU hungry and using too many resources, which is often an indication of an application that is misbehaving. Now, the legitimate needs of different programs vary so much that it’s impossible for me to give you a simple rule of thumb for judging whether a process is consuming too much CPU time. For example, if I’m running a heavy graphics game in Linux, that’s going to take up a lot more CPU time than running something like Notepad or gedit.
So whenever you have top running, you can enter any of several single-letter commands. You can get the entire list of these from the man page for top. Some of these commands are going to prompt you for additional information. And again, if you want to get comfortable with top, read the man page, open up your Linux VM, and start playing with it to get used to it.
Now, one of the pieces of information that top gives you is what’s known as the load average. This is a measure of the demand for CPU time by the different applications. How resource hungry are they? Now, the load average can be useful in detecting what we call runaway processes. For example, say we have a system that normally has a load average of 0.5, but all of a sudden it has a load average of 2.5, which is five times more. Well, that means we probably have some processes that are resource hungry and eating up those CPU cycles. Maybe it’s a process that became hung or unresponsive. Now, if we have a hung process, this means it’s sitting there consuming a lot of CPU time, but it’s stuck in a loop. By using top, we can locate these processes, and then, if necessary, we can kill them or stop them by using their PID number.
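The load average comparison from that example can be sketched as a small check. The is_load_high helper and the factor of 3 are assumptions made for illustration, not a standard threshold; what counts as too high depends on the system.

```shell
# is_load_high CURRENT BASELINE FACTOR
# Succeeds (exit status 0) when CURRENT is at least FACTOR times BASELINE.
# awk does the comparison because the shell has no floating-point arithmetic.
is_load_high() {
    awk -v cur="$1" -v base="$2" -v factor="$3" \
        'BEGIN { exit !(cur >= base * factor) }'
}

# The figures from the example: normally 0.5, suddenly 2.5.
if is_load_high 2.5 0.5 3; then
    echo "load is unusually high; check top for runaway processes"
fi

# On Linux, the first field of /proc/loadavg is the current 1-minute load average.
if [ -r /proc/loadavg ]; then
    read -r one_minute rest < /proc/loadavg
    echo "current 1-minute load average: $one_minute"
fi
```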
- Measuring Memory Use
Now, in the last video we talked about ps and top and how we can use those to identify processes that are consuming a lot of resources. Specifically, we used top to figure out which ones are using the most CPU time. Now, if we want to find out which ones are using the most memory, we can do that by sorting the list a different way. So we would run top and then press the M key (a capital M, so Shift+M). By pressing M, we’re going to sort everything by the memory used instead of the CPU used. That way we can identify which processes are consuming the most memory. Now, just like with CPU time, we can’t assume that a process is a problem just because it’s at the top of this list, because there are a lot of programs that legitimately consume a lot of memory. For example, you might have a program that consumes a lot of memory because it’s doing a lot of graphic editing or large file editing.
Or you might have a program that uses very little because it’s something like a text editor. Again, it depends on what the program is doing. But if you have a program that consumes too much memory, either because of inefficient coding or because of a memory leak, this can cause a problem. A memory leak is a type of bug in which the program requests memory from the kernel and fails to return it to the kernel when it’s done. So with a memory leak, the program borrows some memory, then borrows some more, and some more, and it keeps using more and more memory until it becomes a big problem. Using top and pressing the M key will help you identify programs that may be the source of one of these memory leaks.
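One rough way to picture the "borrow some more and some more" pattern is to look at several memory readings for the same process and see whether they only ever go up. The readings below are invented, and always_growing is a hypothetical helper. Steady growth is only a hint of a leak, not proof, since some programs legitimately grow as they load more data.

```shell
# Hypothetical resident memory readings (in KiB) for one process, taken a
# minute apart. On a real system each reading could come from:
#   ps -o rss= -p PID
samples="47120 51800 56400 61250 66010"

# always_growing "N1 N2 N3 ...": succeeds when every reading is larger
# than the one before it.
always_growing() {
    echo "$1" | tr ' ' '\n' |
        awk 'NR > 1 && $1 <= prev { bad = 1 } { prev = $1 } END { exit bad + 0 }'
}

if always_growing "$samples"; then
    echo "memory use rose in every sample; possible leak"
fi
```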
Now, as a short-term solution, the user can terminate that program and launch it again. Terminating the program releases all of its memory back to the kernel to hand out again. The problem may recur, though, if that memory leak is built into the code. But if the memory leak is small enough, at least you can kill that process, restart it, and get back to doing useful work in the meantime. Now, if you want to study the computer’s overall memory use instead of just looking at a single process, you can do this using the free command. The free program generates a report on the computer’s total memory status. There are two lines that are especially important when you start looking at this display.
They are the Mem line and the Swap line. Now, the Mem line is going to reveal the total RAM statistics: the total memory in the computer (minus whatever is being used by the motherboard and the kernel), the amount of memory that’s being used, and the amount of memory that is free. Most of the computer’s memory is in use in a normal state, since Linux puts otherwise unused memory to use as buffers and caches, which help speed up disk access. Therefore, the Mem line itself isn’t really the most useful, because it counts everything. Instead, you want to examine the -/+ buffers/cache line, because this shows the total memory that’s actively being used by the computer’s programs at this time.
Now, the Swap line is going to reveal how much swap space is being used in Linux. Swap space is disk space that’s set aside and used to supplement the memory. So if you start running out of physical memory inside the computer, it’ll start writing things to and from your disk in this swap area. This acts as virtual memory. Linux uses swap space whenever it runs out of real memory, or RAM, or when it determines that RAM is better used for buffers or caches than for holding currently inactive programs. Swap space use is generally going to be low, but if it rises, you’re going to start seeing performance problems. This means the system has run out of real memory and is starting to use the hard disk as a substitute.
But because hard disks are much slower than real memory, it’s going to slow down your performance. Now, in the long run, to fix this, you would increase the computer’s RAM by adding more physical memory to the system. This reduces the need to go to and from the swap space. If you start suffering from performance problems because of excessive swap use, you can terminate some memory hogging programs, and this can help in the short term, but for the long term, adding more memory is a better solution. Memory leaks, which we talked about earlier, can also lead to these types of problems, because they use up the physical memory and force the system to rely on swap space. So by terminating those leaking programs, you can restore your system to normal performance.
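Here's a sketch of reading the Mem and Swap lines from this lesson. It assumes the older free layout that includes the -/+ buffers/cache line (newer versions of free arrange their columns differently), and it runs against an invented report so the arithmetic can be checked by hand. The helper names are hypothetical.

```shell
# Hypothetical `free` report in the older layout; all figures (KiB) are invented.
sample_free() {
cat <<'EOF'
             total       used       free     shared    buffers     cached
Mem:       7914112    7512400     401712          0     312480    4850960
-/+ buffers/cache:    2348960    5565152
Swap:      2097148      10240    2086908
EOF
}

# Memory actively used by programs = used - buffers - cached, which is
# columns 3, 6, and 7 of the Mem: line. It should match the "used" figure
# on the -/+ buffers/cache line.
programs_used_kib() {
    awk '$1 == "Mem:" { print $3 - $6 - $7 }'
}

# Swap currently in use is column 3 of the Swap: line.
swap_used_kib() {
    awk '$1 == "Swap:" { print $3 }'
}

echo "used by programs (KiB): $(sample_free | programs_used_kib)"
echo "swap in use (KiB):      $(sample_free | swap_used_kib)"
```

In this made-up report the Mem line looks almost full, yet programs account for well under half of it and swap is barely touched, which is the normal, healthy picture described above.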
- Log Files
In this lesson, let’s talk about log files. Many programs that run in the background are known as daemons. These are essentially servers and other programs that run and perform functions for you without you ever seeing them. And because of this, they write information about their normal operations to log files. These log files serve as a record, or notes. Now, if you consult these log files, you can gain important information for diagnosing problems with these daemons, these programs that are running in the background.
Now, the first step in doing this is to locate those log files. Linux stores most of its log files in the /var/log directory. Here are some common log files that you’re going to find on a Linux system. You have boot.log, which is a file that summarizes the services that were started up late in the boot process through the SysV startup scripts. Next we have cups.
This directory holds log files that are related to the Linux printing system. Next we have gdm. This is a directory that holds log files related to the GNOME Display Manager, or GDM, which handles graphical user logins on many systems. We also have messages or syslog, which is a general-purpose log file that contains messages from many different daemons that don’t have their own dedicated log files. We also have secure, which is the file where you’ll find security-related messages.
This includes notices of when a user uses su, sudo, or other tools to elevate their privileges up to root level. Next we have Xorg.0.log, which gives you information on the most recent startup of your X Window System. So if you’re using a graphical user interface, you’re going to be running that on top of X, and this log file will give you information about that. All right, so that’s a lot of log files we talked about, and there are many, many other ones on your Linux system. But the key thing you need to think about with log files is that they’re not going to be there forever. These log files are frequently rotated. This means that your oldest log files are going to be deleted, and the current one is either overwritten or renamed with a date or number attached, and then a fresh log file is created. The reason for this is that if you didn’t rotate them, you would fill up your entire hard disk with logs and you could crash your system.
So for example, if a log rotation happened on July 1, you might see /var/log/messages become /var/log/messages-20190701. Or it might be renamed to /var/log/messages.1.gz or something similar. The old file can be renamed, deleted, or overwritten, and then a new /var/log/messages file gets created. Now, this practice keeps your log files from growing out of control in size and taking over your system. Now, most of your log files are plain text files, so they can be checked using any tool that can read a text file.
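The rename step from that July 1 rotation example can be sketched in a throwaway directory. This only illustrates the rename-and-recreate idea. On real systems rotation is normally handled by a dedicated tool such as logrotate, and rotate_log here is a hypothetical helper.

```shell
# rotate_log FILE DATESTAMP
# Rename FILE to FILE-DATESTAMP, then create a fresh, empty FILE.
rotate_log() {
    mv "$1" "$1-$2" && : > "$1"
}

# Work in a temporary directory so the real /var/log is never touched.
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT

echo "old entry" > "$dir/messages"
rotate_log "$dir/messages" 20190701

# The directory now holds an empty messages file and messages-20190701.
ls "$dir"
```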
You might open a log with gedit, pico, or vi, or simply display it on the screen. Some programs create their own specialized log files, but most rely on a utility that is generically known as the system log daemon to do this job. This program’s process name is generally something like syslog or syslogd, for syslog daemon. Now, like other daemons, it’s started during the boot process by your system startup scripts, and it runs in the background. There are several system log daemon packages available, so again, with Linux you can choose which one you like. Some of these provide a separate tool known as klog or klogd, which handles logging messages from your kernel separately from your other programs.
Now, the behavior of the log daemon can be modified by editing its configuration file, including the files to which it logs, the names of those files, what types of messages it logs, and all sorts of other stuff. The name of this file depends on the specific daemon in use, but generally it ends in .conf, for configuration. So it might be something like /etc/syslog.conf, and this would be the configuration file for the syslog system. Now, once it’s up and running, a log daemon accepts messages from other processes by using a technique known as system messaging. It then sorts through those messages and directs them to a suitable log file depending on the message’s source and a priority code.
- The Kernel Ring Buffer
Let’s take a look at the kernel ring buffer. It can be thought of as a log file for the kernel itself. However, unlike other log files, it’s stored in memory rather than in a file on disk. Like regular log files, though, its contents continue to change as the computer runs. If you want to examine the kernel ring buffer, you can type in dmesg, that’s D-M-E-S-G.
Now, by doing so, you’re going to produce an overwhelming amount of output on your screen. If you want to page through it with less so you can see one screen at a time, just type in dmesg | less. Alternatively, if the information you need is associated with a particular string, you can use grep to search for it. So, for example, to find the kernel ring buffer messages about the first hard disk, which is known as /dev/sda, you can type the following command: dmesg | grep sda. That finds anything in the kernel ring buffer that contains the letters sda and displays it on the screen.
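As an illustration of that search, the sketch below runs the same grep against a few made-up kernel messages. The sample_dmesg function and its contents are hypothetical; real messages vary with the hardware and kernel version.

```shell
# Hypothetical stand-in for `dmesg` output; the messages are invented examples.
sample_dmesg() {
cat <<'EOF'
[    1.204512] usb 1-1: new high-speed USB device number 2 using ehci-pci
[    1.918233] sd 0:0:0:0: [sda] 976773168 512-byte logical blocks: (500 GB/465 GiB)
[    1.918391] sd 0:0:0:0: [sda] Write Protect is off
[    1.931004]  sda: sda1 sda2
[    2.410877] e1000e 0000:00:19.0 eth0: registered PHC clock
EOF
}

# Keep only the lines that mention the first hard disk.
sample_dmesg | grep sda

# On a real system the equivalent commands are:
#   dmesg | grep sda
#   dmesg | less
# (on some systems reading the kernel ring buffer requires root privileges)
```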
The kernel ring buffer messages can be particularly difficult to understand, but they can also be invaluable in diagnosing hardware and driver problems, which is why we use them. The kernel’s job, if you remember, is to interface with all the different pieces of hardware in the system, so that type of information is useful. But it’s all located in this cryptic kernel ring buffer.
So the kernel ring buffer can be searched if a hardware device is behaving badly and you’re trying to troubleshoot it and figure out whether it’s a hardware problem or a driver problem. Even if a message is difficult to understand, you can take that message from the kernel ring buffer, enter it into a web search engine, something like Google or Bing, and try to figure out what it means. Or you can find a knowledgeable colleague and ask them for some advice, too.
Now, some distributions place a copy of the kernel ring buffer into the /var/log/dmesg file, or some other file like that inside your logs, when the system first boots up. This file can then be consulted if the computer has been running for a long time. As the computer runs, the earliest entries in the kernel ring buffer may have been rotated out of memory, but that log file would still be there to find them.
Now, if your distribution doesn’t create the kernel ring buffer log file by default, you can do this manually by editing your rc.local file, which is typically located at /etc/rc.local. This file is one of the defaults that runs commands and scripts when you start the computer. So if you add this line to the end: dmesg > /var/log/dmesg, it will create the file dmesg in your log folder. It takes the output of the dmesg command, which is the kernel ring buffer, and instead of outputting it to the screen, it writes it into this file so you can look it up later and have a copy of it.