SentiBlue Blog: HowTo Handle Mass Output - Making "grep" command useful

This is part of a series of tutorials. The main thread starts here: http://blog.sentiblue.net/2013/08/howto-handle-massive-standard-output.html

The grep command filters a large amount of output down to the lines you want, or strips out the lines you don't want.

Revisiting the /usr/sbin/dmidecode command that I asked you guys to run: it shows pages and pages of output. The output begins with the BIOS, then the Motherboard, Chassis, Processors, Cache, etc.

1. Search for a single line

I'd like to find the Manufacturer of a server

# /usr/sbin/dmidecode | grep Manufacturer
        Manufacturer: HP
        Manufacturer: HP
        Manufacturer: AMD
        Manufacturer: AMD

What in the world??? Did these two companies jointly manufacture the server? No, they didn't! HP manufactured the server, but the CPUs are made by AMD.

They both show here because we simply asked for lines that contain "Manufacturer". If you read about 1 page down on the full report, you'll see two sections related to the two CPUs made by AMD.

2. Search for blocks of information

Let's use the "grep" command to get information about JUST the server itself. Knowing that this section begins with the line "System Information" (the motherboard has its own "Base Board Information" section), here's the command:

# /usr/sbin/dmidecode | grep -A6 'System Information'
   System Information
       Manufacturer: HP
       Product Name: ProLiant DL385 G2
       Version: Not Specified
       Serial Number: *******
       UUID: 34313431-3039-5553-4537-31364E324248
       Wake-up Type: Power Switch

Here the switch "-A6" tells grep to print the line matching "System Information" plus the 6 lines after it. I have masked the Serial Number to avoid revealing the identity of the server hardware.

Similarly, you can specify -Bn (where "n" is any number) to print the matching line PLUS the n lines above it. The A stands for after and B stands for before.
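
If you'd like to try -B without root access or real hardware, here is a minimal sketch that feeds grep a few made-up lines. The function name sample_dmidecode and its contents are my own assumptions for illustration, loosely modelled on the output above; they are not real dmidecode output.

```shell
# Hypothetical stand-in for dmidecode (made-up sample data).
sample_dmidecode() {
    cat <<'EOF'
BIOS Information
        Vendor: HP
System Information
        Manufacturer: HP
        Product Name: ProLiant DL385 G2
        Serial Number: *******
Processor Information
        Manufacturer: AMD
EOF
}

# Print the line containing "Serial Number" plus the 3 lines before it.
sample_dmidecode | grep -B3 'Serial Number'
```

With this sample, the output starts at the "System Information" line and ends at the "Serial Number" line.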

3. Search for multiple criteria

Out of the above output, I'm only interested in the Manufacturer, Product Name and Serial Number. We can pipe the result into a second grep command and use the "-e" option to search for multiple criteria.

# /usr/sbin/dmidecode | grep -A6 'System Information' | grep -e Manufacturer -e 'Product Name' -e Serial
        Manufacturer: HP
        Product Name: ProLiant DL385 G2
        Serial Number: *********

You see, here we only get the lines that we wanted.

Note: The "-e" option specifies a search pattern (a regular expression). One advantage of this option is that we can repeat it as many times as we want, which lets us grep for multiple criteria in the same command. A line is printed if it matches any one of the patterns.
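
Here is a small sketch of the same idea on made-up data, so you can run it anywhere. The function sample_block is a hypothetical stand-in for the "System Information" block. The second command uses -E with alternation, which is not what I used above but should give the same lines.

```shell
# Made-up sample lines standing in for the "System Information" block.
sample_block() {
    cat <<'EOF'
System Information
        Manufacturer: HP
        Product Name: ProLiant DL385 G2
        Version: Not Specified
        Serial Number: *******
EOF
}

# Each -e adds one pattern; a line is printed if it matches ANY of them.
sample_block | grep -e Manufacturer -e 'Product Name' -e Serial

# Same lines with one extended regular expression (-E) and alternation.
sample_block | grep -E 'Manufacturer|Product Name|Serial'
```

Which form to use is mostly a matter of taste: repeated -e keeps each criterion separate, while -E keeps the command shorter.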

4. Count items that match a search criterion

Remember in the first section you had HP twice and AMD twice? That makes 4 lines containing Manufacturer. What if I wanted to see how many lines contain a certain string? Here's the syntax:

# /usr/sbin/dmidecode | grep -c Manufacturer
4

The answer is 4 just as expected.

Remember, this was a command (dmidecode) being piped to the grep command. If you want to search for similar information in a file, it's even simpler.

Say you wanted to see how many lines contain the string "Processor" in /proc/cpuinfo; you can simply do

# grep -c Processor /proc/cpuinfo
8

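This tells me there are 8 lines containing the string "Processor" in the file /proc/cpuinfo. Keep in mind that grep is case-sensitive by default, so lines that only contain the lowercase "processor" are not counted.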
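
One thing worth knowing about -c is that it counts matching lines, not how many times the string occurs. A minimal sketch with a throwaway file follows; the file content is made up and is not a real /proc/cpuinfo.

```shell
# Create a throwaway file with made-up content.
tmpfile=$(mktemp)
printf '%s\n' 'cpu cpu cpu' 'cpu' 'memory' > "$tmpfile"

# -c counts matching LINES: two lines contain "cpu", so this prints 2,
# even though the string itself occurs four times in the file.
count=$(grep -c cpu "$tmpfile")
echo "$count"

# Remove the temporary file again.
rm -f "$tmpfile"
```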

5. Excluding criteria

There are cases where we look at a file or a listing but are not interested in certain lines. A useful scenario is checking a list of processes. Say we want to see how many db2 database instances are running on a server; we do

# ps -eaf | grep db2sysc
db2_1 22599 22597 28 Aug18 ?        17:51:07 db2sysc 0
db2_2 22887 22754 1 Aug18 ?        05:31:44 db2sysc 0
db2_3 23183 23181 1 Aug18 ?        05:00:14 db2sysc 0
root     5767 5490 0 13:27 pts/5    00:00:00 grep db2sysc

The report shows 3 instances of db2, but also shows the VERY process that executed the grep command looking for db2sysc.

If we want to get rid of that last line, we can pipe the command to another grep using the -v switch, which prints only the lines that do NOT match:

# ps -eaf | grep db2sysc | grep -v grep
db2_1 22599 22597 28 Aug18 ?        17:51:07 db2sysc 0
db2_2 22887 22754 1 Aug18 ?        05:31:44 db2sysc 0
db2_3 23183 23181 1 Aug18 ?        05:00:14 db2sysc 0
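
This was just an example to show you how to remove a line from the output. A more convenient way to get the same output is to enclose the first character of the criteria in square brackets. The pattern [d]b2sysc still matches the text "db2sysc", but the grep process's own command line now contains "[d]b2sysc", which the pattern does not match, so grep no longer lists itself. Quote the pattern so the shell doesn't try to expand the brackets as a filename wildcard.

# ps -eaf | grep '[d]b2sysc'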
db2_1 22599 22597 28 Aug18 ?        17:51:07 db2sysc 0
db2_2 22887 22754 1 Aug18 ?        05:31:44 db2sysc 0
db2_3 23183 23181 1 Aug18 ?        05:00:14 db2sysc 0
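
To see why the bracket trick works, here is a small sketch that replaces ps with two made-up lines. The function fake_ps is hypothetical; its second line imitates what the grep process itself would look like in the process list.

```shell
# Made-up lines imitating "ps -eaf" output; the last one is the grep process
# itself, whose command line contains the pattern exactly as typed.
fake_ps() {
    cat <<'EOF'
db2_1 22599 22597 28 Aug18 ?        17:51:07 db2sysc 0
root   5767  5490  0 13:27 pts/5    00:00:00 grep [d]b2sysc
EOF
}

# [d] is a one-character bracket expression, so the pattern still matches
# the text "db2sysc". The grep line contains "[d]b2sysc" instead, which
# does not include the substring "db2sysc", so it is not printed.
fake_ps | grep '[d]b2sysc'
```

Only the db2_1 line comes back, with no need for a second grep -v.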

6. Miscellaneous options

-i tells grep to ignore the difference between upper and lower case.
--color (also spelled --colour) highlights the matched string in a distinctive color.
-l (lower case L) tells grep to list only the names of the files that contain a match, not the matching lines themselves.
-R tells grep to search recursively through subdirectories for a specific string.
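
As a quick illustration of -i, here is a sketch that counts matches with and without it. The function sample_cpuinfo prints made-up lines in the style of /proc/cpuinfo; they are not copied from a real host.

```shell
# Made-up lines in the style of /proc/cpuinfo.
sample_cpuinfo() {
    cat <<'EOF'
processor  : 0
model name : Example(tm) Processor 1234
processor  : 1
model name : Example(tm) Processor 1234
EOF
}

# Case-sensitive: only the two "model name" lines contain "Processor".
sample_cpuinfo | grep -c Processor

# With -i, "processor" and "Processor" both match, so all four lines count.
sample_cpuinfo | grep -ci processor
```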

Combine -l and -R:
# grep -Rl kevin *

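This command searches the current directory and ALL subdirectories for files containing the string "kevin". For each file that matches, it shows only the filename, not the actual lines. Note that by default * does not expand to hidden (dot) files in the current directory; use . in place of * to include them.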
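
If you want to try this safely, the sketch below builds a throwaway directory tree and searches it. All file and directory names are made up, and it starts the search from . rather than *, which is a small departure from the command above.

```shell
# Build a throwaway directory tree; every name here is made up.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/sub/deeper"
echo 'owner: kevin' > "$demo_dir/top.txt"
echo 'owner: alice' > "$demo_dir/sub/other.txt"
echo 'kevin was here' > "$demo_dir/sub/deeper/notes.txt"

# -R descends into subdirectories, -l prints only the names of matching files.
# The subshell keeps the "cd" from affecting the calling shell, and sort
# makes the order of the results predictable.
matches=$( (cd "$demo_dir" && grep -Rl kevin .) | sort )
printf '%s\n' "$matches"

# Clean up the temporary tree.
rm -rf "$demo_dir"
```

The file sub/other.txt is left out of the result because it never mentions "kevin".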

Keep in mind that this is Part 1 of a series of tutorials. At the end, all parts will be combined to help you turn just a couple of Linux commands into very powerful tools to navigate a data center, find the information you're looking for and present it in a human-readable format. And it can all be done in a single command line.

Articles in the Series

Part 1: The grep Command
Part 2: The awk Command
Part 3: The sed Command
Part 4: Miscellaneous Commands
