Configure Splunk to work with FreeIPA LDAP

I have successfully configured this using
- Splunk Server 7
- FreeIPA Server 4.4

Keep in mind that you can configure multiple LDAP strategies, which means you can have people from multiple LDAP servers logging into your Splunk server.

From the Splunk Management UI, log in with an administrative account

- Settings --> Access Controls
- Click Authentication Method
- Select LDAP
- Click "LDAP Settings" to continue to configurations
- Click New

Fill out the form as below:
If I don't mention a field, it can be left empty.


  • LDAP Strategy Name: Any name you want
  • Host: The hostname (server name) of the machine running LDAP
  • Port: If you run SSL on your LDAP server, this would be 636. Otherwise 389
  • SSL Enabled: Check this box if you do have SSL enabled


  • User base DN: cn=users,cn=accounts,dc=ops,dc=company,dc=com
  • User base filter: Optional; fill this out if you know how. Used to restrict/filter which users can log in
  • User name attribute: uid
  • Real name attribute: cn
  • Email attribute: mail


  • Group base DN: cn=groups,cn=accounts,dc=ops,dc=company,dc=com
  • Static group search filter: Optional; fill this out if you know how. Used to restrict/filter which groups can log in
  • Group name attribute: cn
  • Static member attribute: cn


Select "Nested group" if this is the case for you.

Automation That Makes Things Impossible

So I've been checking out a few open-source projects out there, and I have to say you guys have done a great job. But I often notice something you put time into automating where that kind of automation actually causes unnecessary pain in real computing environments.

Linux package installation is moving toward repository-based distribution. I know you guys are thinking that by moving your packages into repositories, you eliminate manual work for your audience, but in many environments this has the opposite effect.

You wanted your audience to add a repository entry and run "yum install" or "apt-get" to get things done. Well, that assumption is very wrong!

Your audience consists of technical people. They know how to download packages and move them into their environment for installation. You don't have to help them with this part.

In many, if not most, computing environments, servers are prohibited from reaching the internet. There's no way for them to do what you want them to do. So what do they have to do? They have to get the network guy to poke a hole in the firewall and grant temporary internet access, which presents GREAT security risks to their environment.

Not only that, most of the time the sysadmin and the netadmin are two different people. They may even be in different time zones or on different continents. That dependency costs up to a day of delay just to get something as simple as downloading a file.

If this was an initial setup for a brand new environment, that's OK. After the initial exposure, we can close it up and go on with our lives. But what about deploying a new tool into an existing environment? What if in the future when production is running and we need to upgrade the tool?

Let me show you a few installation instructions that require internet access

Zabbix Server (for mysql)
# yum install http://repo.zabbix.com/zabbix/3.2/rhel/7/x86_64/zabbix-release-3.2-1.el7.noarch.rpm
# yum install zabbix-server-mysql zabbix-web-mysql

Graylog
# rpm -Uvh https://packages.graylog2.org/repo/packages/graylog-2.2-repository_latest.rpm
# yum install graylog-server
 
Guys, why does it take so many people in your organizations to arrive at such impractical decisions? Not only did you choose the method that's impossible in restricted environments, you didn't offer your audience the other option, which they could handle with ease.

HowTo Handle Mass Output - Conclusion

This is the conclusion article of the tutorial series...

In this article, I will show you guys how to combine these commands (grep, awk, sed, sort, uniq, seq) into one single command line that digests tons of output into a concise, informative, human-readable report.

Assumptions

- We have a job server farm consisting of server names job7201 to job7955 (consecutive, no gaps)
- These job servers run identical processes in production and serve one common purpose
- There is an administrative server called "adm4125" from which we run all commands
- Passwordless SSH has been set up so that adm4125 can SSH to ALL job servers without a password

Objectives

# Current Objective
- Generate an inventory listing of all hardware, Serial Numbers, OS, CPU/RAM

# Future updates to this article will include
# - Perform remote health checks on the servers and their applications
# - Find out which server(s) participated in a certain job using a jobID

Preparations

Prepare the server names list 
[blogger@adm4125:~] $ seq 7201 7955
7201
7202
7203
....
7954
7955

Prepare Commands to Extract Data
We use server job7201 as a test box to extract sample data; from what we observe, we can decide how to formulate the commands that build the report.

[blogger@adm4125:~] $ ssh job7201 "sudo /usr/sbin/dmidecode | grep -A6 'System Information'" # Get the hardware information
    System Information
        Manufacturer: HP
        Product Name: ProLiant DL385 G2
        Version: Not Specified
        Serial Number: ************     
        UUID: 34313431-3039-5553-4537-31364E324248
        Wake-up Type: Power Switch


[blogger@adm4125:~] $ ssh job7201 "uname -a" # Get the kernel version
Linux job7201.sentiblue.com 2.6.18-128.el5 #1 SMP Wed Dec 17 11:42:39 EST 2008 i686 i686 i386 GNU/Linux

[blogger@adm4125:~] $ ssh job7201 "cat /etc/redhat-release" # Get the OS version
Red Hat Enterprise Linux Server release 5.3 (Tikanga)


[blogger@adm4125:~] $ MEM=`ssh job7201 "head -1 /proc/meminfo" | awk '{ print $2 }'` # Get the memory amount (in KB)

[blogger@adm4125:~] $ MEM=$(($MEM/1024/1024)) # Calculate out to GB
[blogger@adm4125:~] $ echo ${MEM}GB

125GB
[blogger@adm4125:~] $ ssh job7201 "grep -c Processor /proc/cpuinfo" # Count CPU Cores
48
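The two MEM steps above are plain shell integer arithmetic; here is the same conversion run locally on a sample kB value (the value itself is made up for illustration):

```shell
# /proc/meminfo reports memory in kB; two integer divisions by 1024 yield GB.
MEMKB=131870432                  # sample value, as "head -1 /proc/meminfo" might report
MEMGB=$((MEMKB / 1024 / 1024))   # kB -> MB -> GB (integer division truncates)
echo "${MEMGB}GB"                # prints 125GB
```

Integer division truncates rather than rounds, which is fine for an inventory report.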

Generate Hardware/OS/Resource Inventory Report

$ for SEQ in `seq 7201 7955 | xargs`; do
> SERVER="job$SEQ"
> echo -n "$SERVER: "
> ssh $SERVER "sudo /usr/sbin/dmidecode | grep -A6 'System Information' |
> grep -e Manufacturer -e 'Product Name' -e Serial"
> done

job7201:         Manufacturer: HP
        Product Name: ProLiant DL385 G2
        Serial Number:
SF349827     
job7202:         Manufacturer: HP
        Product Name: ProLiant DL360 G5
        Serial Number:
F39ERIU2     
...
...

job7955:         Manufacturer: HP
        Product Name: ProLiant DL585 G7
        Serial Number:
2DF20D8R     

That doesn't look pretty! Let's fine-tune the command some more;

- We want to get rid of the headers like "Manufacturer: ", "Product Name: ", "Serial Number: "
- We also want all data pertaining to each server to show in one single line


$ for SEQ in `seq 7201 7955 | xargs`; do
> SERVER="job$SEQ"
> echo -n "$SERVER: "
> ssh $SERVER "sudo /usr/sbin/dmidecode | grep -A6 'System Information' |
> grep -e Manufacturer -e 'Product Name' -e Serial" |
> sed -e 's/ *Manufacturer: //' -e 's/ *Product Name: //' -e 's/ *Serial Number: //' |
> xargs
> done

job7201: HP ProLiant DL385 G2 SF349827
job7202: HP ProLiant DL360 G5 F39ERIU2

...
...
job7955: HP ProLiant DL585 G7 2DF20D8R


MUCH MUCH MUCH BETTER!

But so far we've only gathered Manufacturer, Model and Serial Numbers. Let's get some additional information; like the OS version and kernel architecture.

$ for SEQ in `seq 7201 7955 | xargs`; do
> SERVER="job$SEQ"
> echo -n "$SERVER: "
> ssh $SERVER "sudo /usr/sbin/dmidecode | grep -A6 'System Information' |
> grep -e Manufacturer -e 'Product Name' -e Serial;
> awk '{ print \$NF }' /etc/redhat-release; uname -i" |
> sed -e 's/ *Manufacturer: //' -e 's/ *Product Name: //' -e 's/ *Serial Number: //' |
> xargs
> done

job7201: HP ProLiant DL385 G2 SF349827 (Tikanga) x86_64
job7202: HP ProLiant DL360 G5 F39ERIU2 (Tikanga) x86_64

...
...
job7955: HP ProLiant DL585 G7 2DF20D8R (Tikanga) x86_64


There was more to the original objectives. For now I will end the article here, leaving Objective #1 hanging a little loose.

I'd like those who read this article to fine-tune the above command to include the CPU/RAM information in the report so that it looks like this

job7201: HP ProLiant DL385 G2 SF349827 (Tikanga) x86_64  24 Cores  196GB
job7202: HP ProLiant DL360 G5 F39ERIU2 (Tikanga) x86_64  48 Cores  64GB

...
...
job7955: HP ProLiant DL585 G7 2DF20D8R (Tikanga) x86_64  96 Cores  128GB


Don't let me down!!

This article will be revised in the future to meet the two remaining objectives:
- Perform remote health checks on the servers and their applications
- Find out which server(s) participated in a certain job using a jobID

HowTo Handle Mass Output - Miscellaneous Commands

This is the final article in the tutorial series "How To Handle Massive Output"

Part 4: sort, uniq, seq, xargs

The 4 commands above, when joined with grep/awk/sed, become very powerful, allowing us to turn massive, mostly irrelevant output into actionable data.

 

The "sort" Command

It does exactly what it says: it sorts.

Let's say we have a file that looks like this

$ cat random.txt
November
Delta
Foxtrot
Tango
Charlie
Romeo

The sort command will reorganize the list above as below

$ cat random.txt | sort
Charlie
Delta
Foxtrot
November
Romeo
Tango


The sort command can also sort on a specific column, alphabetically or numerically. The next example uses a multi-column file such as below

$ cat names_ages_emails.txt
Kevin    27    kevin@sentiblue.net
Kelly    19    kelly@sentiblue.net
Robert   14    bob@sentiblue.net
Randall  32    randall@sentiblue.net
Michael  37    mike@blogspot.com


To sort ascending by age we do this

$ cat names_ages_emails.txt | sort -k 2n
Robert   14    bob@sentiblue.net
Kelly    19    kelly@sentiblue.net
Kevin    27    kevin@sentiblue.net
Randall  32    randall@sentiblue.net
Michael  37    mike@blogspot.com


The "-k 2" tells sort to act on  column 2 and the "n" tells it to sort numerically.

 

The "uniq" Command

The uniq command normally works with sort. In fact, it only removes *adjacent* duplicates, so the data must be sorted first for it to work as you'd expect.

Let's say we have a list that has multiple duplicate items like this

$ cat names.txt
Kevin
Kelly
Robert
Kevin
Kevin
Robert
Kelly
Tom
Amber
Kelly
Amber
Robert
Tom

The uniq command is used to remove the duplicates, but it will only work correctly if the list is sorted. Notice above that there are two "Kevin"s right next to each other. The uniq command collapses those two into one, but reprints everything else because the remaining duplicates are not adjacent;

$ cat names.txt | uniq
Kevin
Kelly
Robert
Kevin
Robert
Kelly
Tom
Amber
Kelly
Amber
Robert
Tom


Now, if we REALLY want to remove all duplicates, apply the uniq command *AFTER* sorting the list:

$ cat names.txt | sort | uniq
Amber
Kelly
Kevin
Robert
Tom


See how that works?

Note that this is especially useful for analyzing IP addresses in a web server access log. Using grep/awk/sed together with sort/uniq, you can extract each visitor's IP address, the time, and the page they pulled up. This is very handy when doing forensic analysis on a web application.

Real Life Example:

Apache log file looks like this
192.168.247.33 - - [01/Apr/2014:11:20:49 -0700] "GET /vars/images/dt_images/topnav/header/flash.jpg HTTP/1.1" 302 268 "https://blog.sentiblue.com/blogs/technical/promodetailtechnical.dt?arg_promoid=80878&parentpage=technical_technicalmanager&pagename=technical_technicalmanager&idPageIndex=1&advfiltervalues=" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)"
192.168.247.33 - - [01/Apr/2014:11:20:49 -0700] "GET /vars/images/dt_images/topnav/header/menuselectorBK2.jpg HTTP/1.1" 302 270 "https://blog.sentiblue.com/blogs/technical/promodetailtechnical.dt?arg_promoid=80878&parentpage=technical_technicalmanager&pagename=technical_technicalmanager&idPageIndex=1&advfiltervalues=" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)"
192.168.247.33 - - [01/Apr/2014:11:20:49 -0700] "GET /vars/images/dt_images/topnav/header/banner_bg.jpg HTTP/1.1" 302 272 "https://blog.sentiblue.com/blogs/technical/promodetailtechnical.dt?arg_promoid=80878&parentpage=technical_technicalmanager&pagename=technical_technicalmanager&idPageIndex=1&advfiltervalues=" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)"
192.168.247.33 - - [01/Apr/2014:11:20:49 -0700] "GET /vars/images/dt_images/topnav/header/toolbarBkSelected.jpg HTTP/1.1" 302 272 "https://blog.sentiblue.com/blogs/technical/promodetailtechnical.dt?arg_promoid=80878&parentpage=technical_technicalmanager&pagename=technical_technicalmanager&idPageIndex=1&advfiltervalues=" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)"
192.168.247.33 - - [01/Apr/2014:11:20:49 -0700] "GET /vars/images/dt_images/k.png HTTP/1.1" 302 242 "https://blog.sentiblue.com/blogs/technical/promodetailtechnical.dt?arg_promoid=80878&parentpage=technical_technicalmanager&pagename=technical_technicalmanager&idPageIndex=1&advfiltervalues=" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)"
192.168.247.118 - - [01/Apr/2014:11:20:49 -0700] "GET / HTTP/1.1" 302 226 "-" "Echoping/6.0.2"
192.168.247.24 - - [01/Apr/2014:11:20:49 -0700] "GET / HTTP/1.1" 302 225 "-" "-"
192.168.247.142 - - [01/Apr/2014:11:20:49 -0700] "POST /container27/queues/amfpollingsecure HTTP/1.1" 200 65 "https://blog.sentiblue.com/visitors/listing/demo.swf/[[DYNAMIC]]/5" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; InfoPath.2; MS-RTC LM 8; .NET4.0E)"
192.168.247.231 - - [01/Apr/2014:11:20:50 -0700] "POST /container29/queues/amfpollingsecure HTTP/1.1" 200 66 "https://blog.sentiblue.com/visitors/listing/demo.swf/[[DYNAMIC]]/5" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)"
192.168.247.142 - - [01/Apr/2014:11:20:50 -0700] "POST /container27/queues/amfpollingsecure HTTP/1.1" 200 65 "https://blog.sentiblue.com/visitors/listing/demo.swf/[[DYNAMIC]]/5" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; InfoPath.2; MS-RTC LM 8; .NET4.0E)"
192.168.247.118 - - [01/Apr/2014:11:20:51 -0700] "GET / HTTP/1.1" 302 226 "-" "Echoping/6.0.2"
192.168.247.142 - - [01/Apr/2014:11:20:52 -0700] "POST /container27/queues/amfpollingsecure HTTP/1.1" 200 65 "https://blog.sentiblue.com/visitors/listing/demo.swf/[[DYNAMIC]]/5" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; InfoPath.2; MS-RTC LM 8; .NET4.0E)"
192.168.247.231 - - [01/Apr/2014:11:20:53 -0700] "POST /container29/queues/amfpollingsecure HTTP/1.1" 200 66 "https://blog.sentiblue.com/visitors/listing/demo.swf/[[DYNAMIC]]/5" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)"
192.168.247.142 - - [01/Apr/2014:11:20:53 -0700] "POST /container27/queues/amfpollingsecure HTTP/1.1" 200 65 "https://blog.sentiblue.com/visitors/listing/demo.swf/[[DYNAMIC]]/5" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; InfoPath.2; MS-RTC LM 8; .NET4.0E)"
192.168.247.142 - - [01/Apr/2014:11:20:52 -0700] "POST /container27/queues/amfsecure HTTP/1.1" 200 10411 "https://blog.sentiblue.com/visitors/listing/demo.swf/[[DYNAMIC]]/5" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; InfoPath.2; MS-RTC LM 8; .NET4.0E)"

Urghhh!!! That output looks tough!!! I just want a count of each IP address in this log... here's how

$ cat apache.log | awk '{ print $1 }' | sort | uniq -c
   2 192.168.247.118
   5 192.168.247.142
   2 192.168.247.231
   1 192.168.247.24
   5 192.168.247.33


See how easy that is? In this command example, we tell the shell to:

Read the apache.log file, print only column 1 (the IP address), sort it, then collapse the duplicates and count them. The uniq command, when used with "-c", prints an additional column in front indicating the count of each unique item.
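To rank the IP addresses by hit count ("top talkers"), pipe the counts through a second, reverse-numeric sort. A self-contained sketch with a tiny made-up log (the IPs and file name are placeholders):

```shell
# Build a minimal sample log: three hits, two distinct IPs.
printf '%s\n' \
  '10.0.0.5 - - [01/Apr/2014:11:20:49 -0700] "GET / HTTP/1.1" 200 100' \
  '10.0.0.9 - - [01/Apr/2014:11:20:50 -0700] "GET / HTTP/1.1" 200 100' \
  '10.0.0.5 - - [01/Apr/2014:11:20:51 -0700] "GET / HTTP/1.1" 200 100' > sample.log

# Count hits per IP, then sort the counts descending: busiest IP first.
awk '{ print $1 }' sample.log | sort | uniq -c | sort -rn
```

Because "uniq -c" puts the count in column 1, "sort -rn" on that output orders the list from most to least hits.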

 

The "xargs" Command

This is a simple command: it joins input lines into a single line of arguments, separated by a space by default.

From this example above
$ cat names.txt | sort | uniq
Amber
Kelly
Kevin
Robert
Tom


We can revise it with the xargs command to turn the output into a single string of names;

$ cat names.txt | sort | uniq | xargs
Amber Kelly Kevin Robert Tom


This usage is particularly handy when the output is to be fed to another command that expects its arguments on a single line, as above.

For example; the command "ps -eaf | grep [a]pache | awk '{ print $2 }'" will list the PIDs of all processes run by the "apache" account (the [a]pache trick keeps grep from matching its own process).

It shows like this
$ ps -eaf | grep [a]pache
apache 21962  6852  0 11:41 ?        00:00:00 /usr/sbin/httpd -d /opt/sentiblue/blog -D SSL
apache 21963  6852  0 11:41 ?        00:00:00 /usr/sbin/httpd -d /opt/sentiblue/blog -D SSL
apache 21976  6852  0 11:41 ?        00:00:00 /usr/sbin/httpd -d /opt/sentiblue/blog -D SSL
apache 21980  6852  0 11:41 ?        00:00:00 /usr/sbin/httpd -d /opt/sentiblue/blog -D SSL
apache 22021  6852  0 11:41 ?        00:00:00 /usr/sbin/httpd -d /opt/sentiblue/blog -D SSL
apache 22242  6852  0 11:42 ?        00:00:00 /usr/sbin/httpd -d /opt/sentiblue/blog -D SSL
apache 22247  6852  0 11:42 ?        00:00:00 /usr/sbin/httpd -d /opt/sentiblue/blog -D SSL
apache 22280  6852  0 11:42 ?        00:00:00 /usr/sbin/httpd -d /opt/sentiblue/blog -D SSL
apache 22281  6852  0 11:42 ?        00:00:00 /usr/sbin/httpd -d /opt/sentiblue/blog -D SSL
apache 22309  6852  0 11:42 ?        00:00:00 /usr/sbin/httpd -d /opt/sentiblue/blog -D SSL
apache 22395  6852  0 11:42 ?        00:00:00 /usr/sbin/httpd -d /opt/sentiblue/blog -D SSL
apache 22404  6852  0 11:42 ?        00:00:00 /usr/sbin/httpd -d /opt/sentiblue/blog -D SSL
apache 22415  6852  0 11:42 ?        00:00:00 /usr/sbin/httpd -d /opt/sentiblue/blog -D SSL
apache 22416  6852  0 11:42 ?        00:00:00 /usr/sbin/httpd -d /opt/sentiblue/blog -D SSL
apache 22420  6852  0 11:42 ?        00:00:00 /usr/sbin/httpd -d /opt/sentiblue/blog -D SSL


We want to write a single command line to kill all these processes. We know that the kill command takes a list of PIDs;

$ kill PID [PID ...]

We also know that we can extract the second column to get the PID list like this
$ ps -eaf | grep [a]pache | awk '{ print $2 }' | xargs
21962 21963 21976 21980 22021 22242 22247 22280 22281 22309 22395 22404 22415 22416 22420

If we pipe this output to the kill command, they will all die;

$ ps -eaf | grep [a]pache | awk '{ print $2 }' | xargs kill
$ ps -eaf | grep [a]pache
< Nothing shows here because all processes died. >

Where xargs earns the most credit is inside a loop when writing a script;

The "for" loop uses syntax like this;

for NAME in Kevin Robert Kelly
do
   echo $NAME
done

Notice that the for loop takes its items as words in a single text string. xargs, by converting a column of data into exactly that format, lets us feed such data straight into a for loop.
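xargs can also splice each input item into the *middle* of a command with "-I", running the command once per item. The hostname pattern below is just an example:

```shell
# -I{} runs the command once per input line, substituting {} with that line.
seq 1 3 | xargs -I{} echo "www{}.example.com"
# www1.example.com
# www2.example.com
# www3.example.com
```

This is useful when the argument can't simply be appended at the end of the command, e.g. building file paths or URLs around each item.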

 

The "seq" Command

The command "seq" is used to generate a sequential type of data list. The simplest example is to generate a lis of numbers from 1 to 5 as below

$ seq 1 5
1
2
3
4
5


Going back to the for loop example in the section above, say we want to search a file for each number from 1 to 5, with the list 1-5 as a string of text instead of a column; we do this (somefile.txt here is just a placeholder file)

$ for NUMBER in `seq 1 5 | xargs`
do
   grep $NUMBER somefile.txt
done

The loop above generates the numbers in a column, converts them to a row, pipes that to the loop, and iterates through each number, searching for it in somefile.txt.

Let's make the real-life example more practical;

I want to check the date on my web server farm (5 machines). The server names are www2001 through www2005. I can do something like this

$ for WS in www2001 www2002 www2003 www2004 www2005; do echo -n "$WS: "; ssh $WS date; done
www2001: Tue Apr  1 12:00:53 PDT 2014
www2002: Tue Apr  1 12:00:53 PDT 2014
www2003: Tue Apr  1 12:00:53 PDT 2014
www2004: Tue Apr  1 12:00:53 PDT 2014
www2005: Tue Apr  1 12:00:53 PDT 2014

But if my farm has a few thousand servers, am I willing to type out the whole list in the for loop? Definitely NOT!!! Here's how to avoid that with a list of 500 servers:

$ for WS in `seq 2001 2500 | xargs`; do echo -n "www$WS: "; ssh www$WS date; done
www2001: Tue Apr  1 12:00:53 PDT 2014
www2002: Tue Apr  1 12:00:53 PDT 2014
www2003: Tue Apr  1 12:00:53 PDT 2014
....
....
www2499: Tue Apr  1 12:00:53 PDT 2014
www2500: Tue Apr  1 12:00:53 PDT 2014

Say we only want the odd numbers out of the server list; we can do this
$ for WS in `seq 2001 2 2500 | xargs`; do echo $WS; done
2001
2003
....
2497
2499

Note that seq doesn't know odd from even. The command above only tells it to increment by 2. You can increment by any number to achieve purposes other than odd/even.

If we are dealing with leading zeroes, we may need all numbers formatted to the same width. Just throw in the "-w" switch and seq pads every number to the same number of digits, derived from the largest number in the output sequence.

If we have a server list like "s01 s02 s03 ... s10", we can do this

$ seq -w 1 10
01
02
03
04
05
06
07
08
09
10


For 100-999 servers, seq -w will automatically pad to 3 digits. Again, the number of digits comes from the highest number.

$ seq -w 1 250
001
002
...
250
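Putting -w together with the for loop from earlier gives zero-padded hostnames for free (the "s" prefix is just an example name):

```shell
# seq -w pads every number to the width of the largest (here, 2 digits).
for N in $(seq -w 8 12); do
  echo "s$N"
done
# s08
# s09
# s10
# s11
# s12
```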

Articles in the Series

Part 1: The grep Command
Part 2: The awk Command 
Part 3: The sed Command
Part 4: Miscellaneous Commands

HowTo Handle Mass Output - The sed Command

Thanks for reading thus far guys....

This is the 3rd article in a tutorial series showing how to take a massive amount of output and extract/filter/format it into a human-readable report.

Let's get straight to work;

The command "sed" allows users to find a certain string and replace that with a different string. The command will work directly on standard output (what you see on the screen) or on an existing file.

Simple Syntax (Standard Output):
$ ./some_command_to_generate_output.sh | sed 's/abc/xyz/g'

Whatever the command outputs, the sed command above will replace every "abc" with "xyz".

Tearing it down:
- Start with the command sed and an opening single quote: sed '
- The s immediately after the opening quote tells sed to perform a find/replace operation
- The slashes separate sed's arguments.
- The abc is "what to find" and sits between the 1st and 2nd slashes
- The xyz is "the new string" and sits between the 2nd and 3rd slashes
- The g after the last slash tells sed to replace every occurrence on each line. Without the "g", sed only replaces the first occurrence on each line.
- Finish with the final closing single quote.
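One more detail worth knowing: the delimiter doesn't have to be "/". Whatever character follows the s becomes the delimiter, which saves a lot of escaping when the strings themselves contain slashes (the paths below are just an example):

```shell
# Using "|" as the delimiter, so the slashes in the paths need no escaping.
echo "interp=/usr/bin/perl" | sed 's|/usr/bin|/usr/local/bin|'
# interp=/usr/local/bin/perl
```

The same substitution with "/" delimiters would require escaping every slash: 's/\/usr\/bin/\/usr\/local\/bin/'.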

Let's put this to work: Say there's a file servers.txt with the contents like below

Server Name    MFG               Product Description   Serial Number
===========    ===============   ===================   =============
wxv5238        Hewlett-Packard   ProLiant DL385G7      243897234987
erw3489        IBM               xSeries 3850          239472384798
dfw2342        Dell              PowerEdge 5900        234987234879
fxw5210        Hewlett-Packard   ProLiant DL385G2      298472938740
ter4320        Hewlett-Packard   ProLiant DL385G6      345098594803

Let's say we don't like the brand name "Hewlett-Packard" showing in full. Instead we only want to show "HP". We can rewrite the output with a single command

$ cat servers.txt | sed 's/Hewlett-Packard/HP/g'

Server Name    MFG               Product Description   Serial Number
===========    ===============   ===================   =============
wxv5238        HP   ProLiant DL385G7      243897234987
erw3489        IBM               xSeries 3850          239472384798
dfw2342        Dell              PowerEdge 5900        234987234879
fxw5210        HP   ProLiant DL385G2      298472938740
ter4320        HP   ProLiant DL385G6      345098594803

(Note that sed only replaced the text; it does not re-pad the columns, so the rewritten rows come out shorter than the header suggests.)

How about if we want to make this change directly into the servers.txt file? Just one single command and the file gets changed and saved?

$ sed -i 's/Hewlett-Packard/HP/g' servers.txt
$ cat servers.txt

Server Name    MFG               Product Description   Serial Number
===========    ===============   ===================   =============
wxv5238        HP   ProLiant DL385G7      243897234987
erw3489        IBM               xSeries 3850          239472384798
dfw2342        Dell              PowerEdge 5900        234987234879
fxw5210        HP   ProLiant DL385G2      298472938740
ter4320        HP   ProLiant DL385G6      345098594803
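One safety note on -i: GNU sed accepts an optional backup suffix, so you can keep the original file around while editing in place. A quick sketch on a throwaway file (the file name is just for the demo):

```shell
# -i.bak edits the file in place but saves the original as demo.txt.bak
printf 'wxv5238 Hewlett-Packard\n' > demo.txt
sed -i.bak 's/Hewlett-Packard/HP/g' demo.txt
cat demo.txt       # the edited file:    wxv5238 HP
cat demo.txt.bak   # the original copy:  wxv5238 Hewlett-Packard
```

Cheap insurance before rewriting a file you can't easily regenerate.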

Articles in the Series

Part 1: The grep Command
Part 2: The awk Command 
Part 3: The sed Command
Part 4: Miscellaneous Commands