Shell Pipes by Example

July 22, 2007

Pipes, piping, pipelines… whatever you call them, they are very powerful – in fact, they are one of the core tenets of the philosophy behind UNIX (and therefore Linux). They are also really very simple once you understand them. The way to understand them is by playing with them, but if you don’t know what they do, you don’t know where to start… Catch-22!

So, here are some simple examples of how the pipe works.

Let’s see the code

$ grep steve /etc/passwd | cut -d: -f 6
/home/steve
$

What did this do? There are two UNIX commands there: grep and cut. The command “grep steve /etc/passwd” finds all lines in the file /etc/passwd which contain the text “steve” anywhere in the line. In my case, this has one result:
steve:x:1000:1000:Steve Parker,,,:/home/steve:/bin/bash
The second command, “cut -d: -f6”, splits the line at the delimiter (-d), here a colon (“:”), and prints field (-f) number 6. In the /etc/passwd file, this is the home directory of the user.
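If you don’t have a “steve” on your system, you can try the same thing on a throwaway file. This is just a sketch: the sample file and the “alice” entry are made up for the experiment, and only the “steve” line comes from my real /etc/passwd.

```shell
# Build a small passwd-style file to experiment on (the "alice" line is invented).
sample=$(mktemp)
cat > "$sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
steve:x:1000:1000:Steve Parker,,,:/home/steve:/bin/bash
alice:x:1001:1001:Alice Example,,,:/home/alice:/bin/bash
EOF

# The first command on its own: only the line containing "steve" survives.
grep steve "$sample"

# Both commands joined by a pipe: field 6 of that line.
homedir=$(grep steve "$sample" | cut -d: -f6)
echo "$homedir"

rm -f "$sample"
```

Running the first command alone, and then adding the pipe, is a handy way to see what each stage contributes.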

So what? Show me some more

This is the main point of this article; once you’ve seen a few examples, it normally all becomes clear.

EG2

$ find . -type f -ls | cut -c14- | sort -n -k 5
rw-r--r--   1 steve    steve       28 Jul 22 01:41 ./hello.txt
rwxr-xr-x   1 steve    steve     6500 Jul 22 01:41 ./a/filefrag
rwxr-xr-x   1 steve    steve     8828 Jul 22 01:42 ./c/hostname
rwxr-xr-x   1 steve    steve    30848 Jul 22 01:42 ./c/ping
rwxr-xr-x   1 steve    steve    77652 Jul 22 01:42 ./b/find
rwxr-xr-x   1 steve    steve    77844 Jul 22 01:41 ./large
rwxr-xr-x   1 steve    steve    93944 Jul 22 01:41 ./a/cpio
rwxr-xr-x   1 steve    steve    96228 Jul 22 01:42 ./b/grep
$

What I did here was run three commands: “find . -type f -ls” finds regular files, and lists them in an “ls”-style format: permissions, owner, size, etc.
“cut -c14-” keeps everything from character 14 onwards, discarding the leading columns, which mess up the formatting on this website (!), and aren’t very interesting.
“sort -n -k 5” does a numeric (-n) sort on field 5 (-k 5), which is the size of the file.
So this gives me a list of the files in this directory (and subdirectories), ordered by file size. That’s much more useful than “ls -lS”, which restricts itself to the current directory and ignores subdirectories.
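One thing to watch: “cut -c14-” depends on how wide the first columns of the “find -ls” output happen to be, and that can vary from system to system. Here is a rough variation that leaves out the cut and sorts on the size column directly. It assumes GNU-style “find -ls” output, where the size is the seventh whitespace-separated field; the directory and file names are invented for the test.

```shell
# Scratch directory with three files of known sizes (names are made up).
dir=$(mktemp -d)
mkdir "$dir/sub"
printf '%s' 'aaaaaaaaaa' > "$dir/ten.txt"       # 10 bytes
printf '%s' 'aa'         > "$dir/sub/two.txt"   # 2 bytes
printf '%s' 'aaaaa'      > "$dir/five.txt"      # 5 bytes

# Sort the uncut "find -ls" output on field 7 (assumed to be the size).
find "$dir" -type f -ls | sort -n -k 7

# The same pipeline with one more stage: keep only the file name, smallest first.
names=$(find "$dir" -type f -ls | sort -n -k 7 | awk -F/ '{ print $NF }')
echo "$names"

rm -rf "$dir"
```

If the sizes don’t come out in order on your system, look at the raw “find . -type f -ls” output and count which field the size is really in.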

(As an aside, I have to admit that I only concocted this by trying to think of an example; it actually seems really useful, and worth making into an alias… I must do a post about “alias” some time!)

So how does it work?

EG1 seems pretty straightforward: get lines containing “steve” from the input file (“grep steve /etc/passwd”), and get the sixth field, where fields are marked by colons (“cut -d: -f6”). You can read the full command from left to right, and see what happens, in that order.

How does it really work?

EG1 Explained

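There are some gotchas when you start to look at the plumbing. Because we’re using the analogy of a pipe (think of water flowing through a pipe), it is tempting to imagine grep running to completion before cut even starts. That is not what happens: the shell sets up the pipe and both commands before any data flows, and the two commands then run at the same time. Which of them the shell starts first is an implementation detail, and not something to rely on. If you give cut (for example) an option it doesn’t understand, grep is still started; cut just exits with an error, and grep’s output has nowhere to go.
What actually happens is this:

  1. A “pipe” is set up – a special entity which takes input at one end and passes it, unchanged and in order, to its output at the other end. (It deals in a stream of bytes; it knows nothing about lines.)
  2. cut is started, and its input is set to be the “pipe”.
  3. grep is started, and its output is set to be the “pipe”. (Steps 2 and 3 may happen in either order.)
  4. As grep generates output, it is passed through the pipe to the waiting cut command, which does its own simple task of splitting the fields by colons and selecting the 6th field as output.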
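The steps above can be imitated by hand with a named pipe, which makes the plumbing visible. This is only a sketch of the idea, not what the shell literally does for “|” (it uses an anonymous pipe, with no file name). The file names and the “alice” entry are made up for the example.

```shell
# Work in a scratch directory with a small passwd-style sample file.
dir=$(mktemp -d)
printf '%s\n' \
    'steve:x:1000:1000:Steve Parker,,,:/home/steve:/bin/bash' \
    'alice:x:1001:1001:Alice Example,,,:/home/alice:/bin/bash' > "$dir/passwd.sample"

# Step 1: set up the pipe (a named one, so that we can see it).
mkfifo "$dir/mypipe"

# Step 2: start cut in the background, with its input set to the pipe.
cut -d: -f6 < "$dir/mypipe" > "$dir/result.txt" &

# Step 3: start grep, with its output set to the pipe.
grep steve "$dir/passwd.sample" > "$dir/mypipe"

# Step 4: wait for cut to finish with what grep sent it, then show the result.
wait
result=$(cat "$dir/result.txt")
echo "$result"

rm -rf "$dir"
```

Here cut sits waiting on the pipe until grep has something to say, and finishes once grep closes its end.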

EG2 Explained

For EG2 there are two pipes. “sort” ties to the second (rightmost) pipe for its input. “cut” ties to the second pipe for its output, and the first (leftmost) pipe for its input. “find” ties to the first pipe for its output.
So, the output of “find” is piped into “cut”, which strips off the leading columns of each line of “find” output. This is then passed to “sort”, which sorts on field 5 (of what it receives as input), so the output of the entire pipeline is a numerically sorted list of files, ordered by size.
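The “of what it receives as input” part is easy to trip over, so here is a tiny sketch with canned data instead of real “find” output. The listing function and its three lines are invented purely for illustration: each line has a junk prefix, then a name, then a size.

```shell
# Canned lines standing in for "find -ls" output: junk prefix, name, size.
listing() {
    printf '%s\n' 'XXXX big.bin 300' 'XXXX tiny.txt 7' 'XXXX mid.dat 42'
}

# Before cut, the size is field 3. cut -c6- drops the 5-character prefix,
# so by the time sort sees the line, the size has become field 2.
sorted=$(listing | cut -c6- | sort -n -k 2)
echo "$sorted"
```

Each command in a pipeline only ever sees what the previous one wrote, so field and character positions have to be counted at that point in the pipe, not in the original data.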

