Tool Tip: “ls”

February 26, 2007

Yeah yeah, we know ls already.

But how much of ls’s functionality do you actually use? ls has so many switches that when Sun added extended attributes (does anyone use that?) they found that there were no letters left, so they had to use “-@”!

So, here are a couple of handy ls options, in no particular order; either for interactive or scripting use. I’m assuming GNU ls; Solaris ls supports most GNU-style features, but the “nice-to-have” features, like ls -h aren’t in historical UNIX ls implementations. I’ll split these into two categories: Sort ‘em and Show ‘em. What are your favourites?

Sort ‘em

When sorting, I tend to use the “-l (long listing)” and “-r (reverse order)” switches:

Sort ‘em by Size:

ls -lSr

Sort ‘em by Date:

ls -ltr
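
If you want to convince yourself of which way round the ordering goes, here is a rough sketch you could paste into a shell. The scratch directory and the three file names are made up for the example; the sizes and dates are set by hand so the result is predictable.

```shell
# Scratch directory with three files of known sizes (names are hypothetical).
demo_dir=$(mktemp -d)
printf 'a' > "$demo_dir/small.txt"             # 1 byte
printf 'aaaaaaaaaa' > "$demo_dir/medium.txt"   # 10 bytes
printf '%0100d' 0 > "$demo_dir/large.txt"      # 100 bytes

# Give them known modification times: large.txt is the oldest.
touch -t 200701010000 "$demo_dir/large.txt"
touch -t 200701020000 "$demo_dir/medium.txt"
touch -t 200701030000 "$demo_dir/small.txt"

# -S puts the biggest first and -t the newest first; -r reverses either,
# so the biggest (or newest) file ends up at the bottom of the listing.
by_size=$(cd "$demo_dir" && ls -Sr)
by_date=$(cd "$demo_dir" && ls -tr)

echo "By size:"; echo "$by_size"
echo "By date:"; echo "$by_date"

rm -rf "$demo_dir"
```

The appeal of the reversed order is that the entry you probably care about (the biggest, or the most recent) lands right above your next prompt, even in a directory with hundreds of files.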

Show ‘em

There are a number of ways to show different attributes of the files you are listing; “-l” is probably the obvious example. However, there are a few more:

Show ‘em in columns

ls -C

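Forces multi-column output. Useful if ls is giving you one name per line (as it does when its output goes into a pipe rather than to your terminal) and you’re not seeing as many files on the screen as you’d expect.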

Show ‘em one by one

ls -1

That’s the number 1 (one) there, not the letter l (ell). Forces one-file-per-line. Particularly useful for dealing with strange filenames with whitespace in them.
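
As a small illustration of why one-per-line helps with whitespace, here is a sketch that counts the files in a scratch directory two ways. The directory and file names are invented for the example, and it assumes no filename contains a newline (those will still trip it up).

```shell
# Two files, one of which has a space in its name (hypothetical names).
names_dir=$(mktemp -d)
touch "$names_dir/my file.txt" "$names_dir/plain.txt"

listing=$(cd "$names_dir" && ls -1)

# Counting lines gives one per file...
line_count=$(printf '%s\n' "$listing" | wc -l | tr -d ' ')
# ...whereas counting words is fooled by the space in "my file.txt".
word_count=$(printf '%s\n' "$listing" | wc -w | tr -d ' ')

echo "lines: $line_count, words: $word_count"

rm -rf "$names_dir"
```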

Show ‘em as they are

ls -F

To append symbols (“*” for executables, “/” for directories, etc) to the filename to show further information about them.

Show ‘em so I can read it

ls -lh

Human-readable filesizes, so “12567166” is shown as “12M”, and “21418” is “21K”. This is handy for people, but of course, if you’re writing a script which wants to know file sizes, you’re better off without this (21Mb is bigger than 22Kb, after all!)
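
For the scripting case, one approach (a sketch, not the only way) is to skip ls altogether and ask wc for the byte count, which gives a plain number you can compare. The scratch file and the 20000-byte threshold are made up for the example.

```shell
# Create a scratch file of exactly 21418 bytes.
size_file=$(mktemp)
dd if=/dev/zero of="$size_file" bs=21418 count=1 2>/dev/null

# wc -c reads the file on stdin and prints its size in bytes;
# tr strips the padding that some wc implementations add.
size_bytes=$(wc -c < "$size_file" | tr -d ' ')

if [ "$size_bytes" -gt 20000 ]; then
    size_note="bigger than 20000 bytes"
else
    size_note="20000 bytes or smaller"
fi
echo "$size_bytes bytes: $size_note"

rm -f "$size_file"
```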

Show ‘em with numbers

ls -n

This is equivalent to ls -l, except that UID and GID are not looked up, so:

$ ls -l foo.txt
-rw-r--r-- 1 steve steve 46210 2006-11-25 00:33 foo.txt
$ ls -n foo.txt
-rw-r--r-- 1 1000 1000 46210 2006-11-25 00:33 foo.txt

This can be useful in a number of ways; particularly if your NIS (or other) naming service is down, or if you’ve imported a filesystem from another system.
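
A quick way to see the numbers for yourself: the third field of ls -n should match what “id -u” reports for a file you have just created. This is only a sketch; the scratch file is made up, and it assumes the usual long-listing field order shown above.

```shell
# A scratch file owned by whoever runs this.
id_file=$(mktemp)

# Field 3 of the long listing is the owner; with -n it is the numeric UID.
owner_uid=$(ls -n "$id_file" | awk '{print $3}')
my_uid=$(id -u)

echo "file owner UID: $owner_uid, my UID: $my_uid"

rm -f "$id_file"
```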

What’s your favourite?

What are your most-used switches for the trusty old ls tool?


Backgrounding

February 14, 2007

The great thing about Unix, and the Bourne shell, when it was introduced back in the day, was multitasking. It’s such an overused buzzword these days, but at the time, it was really a new thing. Even if you’ve only got one connection in to a machine, you can get it doing as much as you want.

The shell command to “do this in the background, then give me a new prompt to provide the next command” is the ampersand (“&”):

$  # I need to trawl the filesystem for files called "*dodgy*"
    #  (should have installed slocate
    #  (http://packages.debian.org/stable/source/slocate), 
    # but it's too late for that)
$ find / -name "*dodgy*" -print
    (wait for a very very very long time)
/foo/bar/thisfileisnotdodgy.txt
/bar/foo/thisisadodgyfile.txt
$

Well, that’s a good hour of my life wasted.

Chuck it into a script, and run it in the background. If you want the outcome, direct it to a file:

$ cd /tmp
$ cat myfindscript.sh
find / -name "*dodgy*"
$ chmod u+rx myfindscript.sh
$ ./myfindscript.sh > /tmp/mysuspectfiles.txt &
[4402]
$
$ # wow, didja see that? It'll take ages, but I've got 
   # control back. "4402" is the Process ID (PID), 
   # so I can run "ps -fp 4402" to check on its
   # progress, but it's happening, in the background.

You don’t get a lot of job control here; the “ps” mentioned above is about your lot, but you can spawn a child process and let it run, whilst you get on with the stuff you need.

This is known as “backgrounding” a task; if you know it will take a long time, just background it. Of course, if the next thing you need to do is to read the entire file, then you won’t get away with it, you’ll have to wait for it to finish. However, you could background it and then “tail -f /tmp/mysuspectfiles.txt” to check on the status.
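
In a script you can’t read the PID off the screen, but the shell keeps it in “$!” for you, and “wait” will block until that process has finished. Here is a minimal sketch of the idea; the “sleep 1” is just a stand-in for a slow command like the find above, and the scratch file name comes from mktemp.

```shell
out_file=$(mktemp)

# Stand-in for a long-running job; its output goes to a file.
( sleep 1; echo "finished" ) > "$out_file" &
bg_pid=$!    # PID of the most recently backgrounded command

echo "started $bg_pid in the background; free to do other things"

# ... get on with other work here ...

wait "$bg_pid"    # pause until the job ends; wait returns its exit status
bg_status=$?
result=$(cat "$out_file")
echo "job exited with status $bg_status and wrote: $result"

rm -f "$out_file"
```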


Search and Replace

February 6, 2007

Two great tools for search-and-replace are tr and sed.

tr – Translate (or Delete)

tr can translate a single character into another. For example, “tr 'a' 'b'” will convert all instances of “a” into “b”. Although it works on single characters, it also understands blocks, so “tr '[A-Z]' '[a-z]'” will convert uppercase to lowercase: ‘A’ becomes ‘a’, ‘B’ becomes ‘b’, and so on.

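tr has some handy keywords, too: “tr '[:lower:]' '[:upper:]'” is a more readable opposite of the above [A-Z] convention, converting lowercase to uppercase. These character classes are part of POSIX, so they aren’t limited to the GNU version of tr which comes with most Linux distributions. Do put them in quotes, so that the shell doesn’t try to treat the square brackets as a filename pattern.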

You can also specify your own set of characters, so if you want to convert ‘l’ to ‘1’, ‘o’ to ‘0’, and ‘e’ to ‘3’, then this will do the job:

$  echo welcome | tr 'loe' '103'
w31c0m3

You could extend it to a more complete “l33t sp33k”:

$ echo abcdefghijklmnopqrstuvwxyz | tr 'aeilost' '@3!1057'
@bcd3fgh!jk1mn0pqr57uvwxyz

tr can also just delete – this deletes the letter ‘l’ :

$ echo hello and welcome | tr -d 'l'
heo and wecome
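
In a script you would normally capture the result in a variable rather than just print it. A small sketch of the three uses above (the variable names are my own):

```shell
# Translate using character classes: lowercase to uppercase.
upper=$(echo "hello world" | tr '[:lower:]' '[:upper:]')

# Translate using your own character sets: l->1, o->0, e->3.
leet=$(echo welcome | tr 'loe' '103')

# Delete every occurrence of a character.
deleted=$(echo "hello and welcome" | tr -d 'l')

echo "$upper"     # HELLO WORLD
echo "$leet"      # w31c0m3
echo "$deleted"   # heo and wecome
```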

sed – Stream Editor

UNIX uses a few metaphors, one being a water metaphor, which we use with pipes (|), redirects (<, >), and a few other places. Sed gets its name from what it does… much like tr, you stream data into it, and slightly modified data comes out the other end.

sed isn’t limited to single-character operations; it can cope with whole phrases, as well as regular expressions. I’ll keep it simple(ish) for now, I plan to do a more complete post on sed and another on regular expressions soon, though. For today, I’ll stick to the sed s/from/to/count syntax.

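With the s/from/to/count syntax, sed will convert “from” to “to” on each line of text. With no count, only the first instance on the line is converted; a number picks out that one instance (so “/2” converts only the second one); the special “/g” converts every instance.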

I like to get stuck in with a few examples, so here goes:

$ cat text.txt
Fedora Core is my favourite distribution.
It's got just the right level of ease-of-use
along with regular updates, whilst remaining
a stable, supportable Operating System. In fact,
I'd go so far as to say that Fedora Core is definitely
the best Linux distribution for home users.
Fedora Core is certainly my favourite distribution.
$ sed 's/Fedora Core/Ubuntu/g' text.txt
Ubuntu is my favourite distribution.
It's got just the right level of ease-of-use
along with regular updates, whilst remaining
a stable, supportable Operating System. In fact,
I'd go so far as to say that Ubuntu is definitely
the best Linux distribution for home users.
Ubuntu is certainly my favourite distribution.
$

The syntax there was sed s/from/to/count, so it replaces “Fedora Core” with “Ubuntu” in this example. If we specified “/1” at the end, it would only convert the first instance on each line (which is also what you get if you leave the count off altogether). Similarly, “/2” would convert only the second instance on each line, leaving the first one alone. “/g” is probably the most-used; it converts everything (the “g” stands for “global”).
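
The numeric flag is easiest to understand by trying it on a short string with several matches on one line. A minimal sketch (the sample string is made up):

```shell
sample="a-a-a-a"

first=$(echo "$sample" | sed 's/a/X/')     # no flag: first match only
second=$(echo "$sample" | sed 's/a/X/2')   # 2: the second match only
all=$(echo "$sample" | sed 's/a/X/g')      # g: every match

echo "$first"    # X-a-a-a
echo "$second"   # a-X-a-a
echo "$all"      # X-X-X-X
```

Note that the number selects a single occurrence rather than “the first N”.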

sed can emulate tr’s “tr -d” functionality by having the “to” part be an empty string; here we refer to “Fedora Core” simply as “Fedora” (note the leading space: it’s “ Core”, not “Core”):

$ sed 's/ Core//g' text.txt
Fedora is my favourite distribution.
It's got just the right level of ease-of-use
along with regular updates, whilst remaining
a stable, supportable Operating System. In fact,
I'd go so far as to say that Fedora is definitely
the best Linux distribution for home users.
Fedora is certainly my favourite distribution.

Notice also that we can cat stuff into sed, as “cat text.txt | sed s/src/dest/g”, or we can pass the file directly to sed, like this: “sed s/src/dest/g text.txt”. The same applies to most *nix commands.

To get in to the rest, we’ll need to get into regexp (Regular Expressions, the stuff like “the * brown * jumped over the * dog” which result in $1 = “quick”, $2 = “fox”, $3 = “lazy”). That’s for another day, though.


Redirection

January 29, 2007

A lot of the power of the *nix shell is in redirection. The model is called “streams”, and we even have “pipes” for these “streams”. It’s not the best metaphor ever, but it’s good enough, I suppose. There are 3 streams associated with a process: the standard input (stdin), standard output (stdout) and standard error output (stderr). As you might expect, you get user input on stdin, send output to stdout, and send errors to stderr.

stdout : Redirecting Output

Let’s start with stdout, the output stream, since that is the most common:

$ ls > /tmp/foo.txt
$

This will run the “ls” (list files) command, but instead of showing you the results, it will write to the file you specify (/tmp/foo.txt, in this example). If there is already a /tmp/foo.txt file, then it gets replaced.

$ ls >> /tmp/foo.txt
$

By using >> instead of the single >, we can append to the file instead of overwriting it.
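
To see the difference between the two side by side, here is a short sketch using a scratch file (the name comes from mktemp, and the three words are arbitrary):

```shell
log_file=$(mktemp)

echo "first"  >  "$log_file"   # creates (or empties) the file, then writes
echo "second" >  "$log_file"   # replaces the contents: "first" is gone
echo "third"  >> "$log_file"   # appends after "second"

contents=$(cat "$log_file")
echo "$contents"
# second
# third

rm -f "$log_file"
```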

If we want to see the output like we normally would, but log it to a file as well, we can use the “tee” utility:

$ ls | tee /tmp/foo.txt
[ the usual ls output shown here ]
$ ls | tee -a /tmp/foo.txt
[ the usual ls output shown here ]
$

The first command will write to /tmp/foo.txt; the second (with “-a“) will append to /tmp/foo.txt.

We have used a different redirector here; “|” (pipe) instead of “>” and “>>” . This is because we’re not funnelling the output to a different place (/tmp/foo.txt), but passing it to a program (tee) which does some more funky redirection.
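
A small sketch of tee doing both jobs at once: the text still arrives on stdout (captured into a variable here, so we can look at it) and it also lands in the file. The scratch file is made up for the example.

```shell
tee_file=$(mktemp)

shown=$(echo "one" | tee "$tee_file")            # overwrite the file
shown_again=$(echo "two" | tee -a "$tee_file")   # append to the file

saved=$(cat "$tee_file")

echo "passed through: $shown, then $shown_again"
echo "saved in the file:"
echo "$saved"

rm -f "$tee_file"
```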

Another common use of the pipe (“|“) is to go to the more (or less) utility. If a command would produce loads of output, faster than you can read it, then more will pause after the screen is full, and prompt you with a “— more” prompt (hence the name). SPACE will show the next screen; RETURN will show the next line. less is just like more, but it can also go backwards (PgUp and PgDn):

$ ls | more
file1.txt
file2.txt
file3.txt
--- more  (PRESS SPACE)
file4.txt
file5.txt
file6.txt
--- more (PRESS SPACE)
file7.txt
$ 

stderr : Redirecting Errors

Well-written programs will send their errors to a different “device” than they send their normal output to. Both stdout and stderr are usually the same place (your terminal), but they can be treated separately:

$ ls
foo.txt
$ ls fo.txt
ls: fo.txt: No such file or directory
$ ls fo.txt > output.txt 2>errors.txt
$ ls
errors.txt   foo.txt    output.txt
$ cat output.txt
$ cat errors.txt
ls: fo.txt: No such file or directory
$

What happened there? Well, we asked for “ls fo.txt“, but fo.txt doesn’t exist (foo.txt does). So we see an error from ls. If we direct stdout to “output.txt” and stderr to “errors.txt”, then we can see the difference. What ls actually did, was that it wrote *nothing* to stdout and the error message was sent to stderr. (stderr has file descriptor #2, so we say “2>” to direct stderr.) So when we “cat output.txt“, we get nothing (there was no output), but when we “cat errors.txt“, we see the error.
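
The same experiment can be done in a script, checking the sizes of the two files rather than reading them by eye. This is a sketch in a scratch directory; “no-such-file” is a name chosen because it won’t exist there.

```shell
work_dir=$(mktemp -d)

# ls will fail; remember its exit status without stopping the script.
ls_status=0
ls "$work_dir/no-such-file" > "$work_dir/output.txt" 2> "$work_dir/errors.txt" \
    || ls_status=$?

out_size=$(wc -c < "$work_dir/output.txt" | tr -d ' ')
err_size=$(wc -c < "$work_dir/errors.txt" | tr -d ' ')

echo "exit status: $ls_status"
echo "bytes on stdout: $out_size"   # nothing was written to stdout
echo "bytes on stderr: $err_size"   # the error message went here

rm -rf "$work_dir"
```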

stdin: Redirecting Input

This is most commonly done by system utilities, but many shell scripts use the functionality.
The simple way is to use the “<” redirector:

$ mycommand < myfile.txt

This will take input from “myfile.txt” instead of from the keyboard.
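
Since “mycommand” is only a placeholder, here is a sketch with two real commands that read stdin: the shell’s own “read”, and wc. The scratch file and its three lines are made up for the example.

```shell
input_file=$(mktemp)
printf 'alpha\nbeta\ngamma\n' > "$input_file"

# read normally waits for the keyboard; here it takes the file's first line.
read -r first_line < "$input_file"

# wc counts the lines it is given on stdin.
line_count=$(wc -l < "$input_file" | tr -d ' ')

echo "first line: $first_line"   # alpha
echo "line count: $line_count"   # 3

rm -f "$input_file"
```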

The second way is probably more common. The more example above had more redirecting its input, via the pipe (“|“). We can create entire pipelines this way:

$ find . -print | grep foo | more

This will attach the stdout of the find command to the stdin of the grep command, and attach the stdout of grep to the stdin of more. Got that?! What actually happens is that the shell asks the kernel for two pipes, and then starts find, grep and more with their stdin and stdout already connected to the right ends of those pipes. All three run at the same time; each one simply waits whenever there is nothing for it to read yet, or when the next command along isn’t keeping up.
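
Here is a pipeline along the same lines that can be run unattended, with “wc -l” on the end in place of more so that the result is a number. It is only a sketch: the scratch directory and the three file names are invented.

```shell
tree_dir=$(mktemp -d)
touch "$tree_dir/foo.txt" "$tree_dir/bar.txt" "$tree_dir/food.log"

# find lists every path, grep keeps those containing "foo",
# and wc -l counts what is left.
match_count=$(cd "$tree_dir" && find . -print | grep foo | wc -l | tr -d ' ')

echo "paths containing foo: $match_count"   # 2

rm -rf "$tree_dir"
```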

There’s actually more to it than that, but that’s got the basics of stdin, stdout and stderr covered briefly.


File Permissions

January 25, 2007

The Unix file permissions model doesn’t seem to get explained very clearly, very often. It’s really quite simple, though some of the more advanced stuff isn’t so widely known. The key commands are ls -l and chmod. chmod has two ways of working; we’ll deal with the easy one first.

When you look at a file, there are lots of fields. (I’m using Linux with an ext3 filesystem for these examples, but it’s the same across the board for Unix and Linux, and just about any filesystem.)

$ ls -l myfile.txt
-rw-r--r-- 1 steve users 4 2007-01-25 20:37 myfile.txt
$

So what does it all mean? Going through the fields in order, it’s:

-rw-r--r--    1    steve   users  4    2007-01-25 20:37 myfile.txt
permission  links  owner   group  size  last-modified   filename

We’re dealing with the permission stuff here, but I’ll quickly run through the others. “Links” tells you how many “hard links” there are to the file. That’s probably for another post, but if you type “ln myfile.txt yourfile.txt“, then the link count will go up from 1 to 2. “owner” tells you what user owns the file, and “group” tells you what group is associated with the file. “size” is pretty obvious; it’s in bytes, (this file’s 4 bytes are “f”, “o”, “o” and a newline character). “last-modified” tells you when the file was last changed (not necessarily when it was created), and finally, the filename.

For our purposes, the important stuff is the permission, owner and group. That’s “-rw-r--r--”, “steve” and “users” in this example.

Looking at the “-rw-r--r--”, it seems almost random. Once you know the structure, it’s very informative. There are 10 characters, or fields, grouped with the first character by itself, then three sets of three, like so:

File Type   Owner   Group   Other
-           rw-     r--     r--

The initial “-” for File Type, tells you what kind of file it is. In this case, “-” means it’s a regular file. “d” indicates a directory, “c” means a character-special device, and “b” means a block-special device. Run “ls -l /dev” to see some “c” and “b” files. They’re device drivers; a character-device (eg, /dev/lp0, the printer) is accessed with characters; you tend to chuck text at it. A block-device (eg, /dev/hda1, the hard disk) is accessed in blocks, not single characters. We’re not kernel developers, so we don’t need to worry about that too much.

The Meat Of It

The main part of the -rw-r--r-- information is the three sets of three characters: “rw-”, “r--” and “r--” in this example. Within each set, the first character is either “r” to indicate that you can Read the file, or “-” to indicate that you can’t read it. The second is “w” if you can Write to the file, and “-” if you can’t. The third is usually “x” if you can eXecute (run) the file, or “-” if you can’t. (The third can also be “t” or “s”; we’ll come to that in a minute.)

So in this case, the file’s owner (“steve”) can read and write, but not execute (rw-). Members of the group (“users”) can read the file, but can’t change it or run it (r--). Anybody who’s not “steve” and not in the “users” group can do the same in this example (r--).

Common Uses

Common sets of permissions are:

600  -rw-------  I can read and write it, but nobody else can. (Private files)
640  -rw-r-----  I can read and write it, my group can only read it. Others can do nothing. (Semi-shared files)
755  -rwxr-xr-x  I can read, write, execute; everyone else can read and execute it, but not change it. (Shared programs)
644  -rw-r--r--  I can read and write it, the rest of the world can read it. (Shared files)

What’s that left-hand column I threw in there? That’s the other way of thinking about permissions. If “r” is 4, “w” is 2, and “x” is 1, then “rwx” is 4+2+1=7, “r--” is 4, “rw-” is 4+2=6, and so on. It’s a kind of shorthand.
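
If you’d like to see that arithmetic spelled out, here is a sketch of a little shell function that adds up one set of three. The function name perm_to_digit is my own invention, not a standard tool.

```shell
# perm_to_digit (hypothetical helper): turn one set such as "r-x"
# into its digit by adding 4 for r, 2 for w and 1 for x.
perm_to_digit() {
    digit=0
    case $1 in r??) digit=$((digit + 4)) ;; esac
    case $1 in ?w?) digit=$((digit + 2)) ;; esac
    case $1 in ??x) digit=$((digit + 1)) ;; esac
    echo "$digit"
}

# -rwxr-xr-x is the three sets rwx, r-x and r-x:
mode="$(perm_to_digit rwx)$(perm_to_digit r-x)$(perm_to_digit r-x)"
echo "$mode"                  # 755

echo "$(perm_to_digit rw-)"   # 6
echo "$(perm_to_digit r--)"   # 4
```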

chmod

We set permissions with the chmod command. The first set of three is “u” for “User”, the second is “g” for “Group”, and the last is “o” for “Other”. There’s also “a” for “All”. So “chmod g+rwx” means “add rwx to the second block”, while “chmod a-x” means “take off the x flags for everybody”.

This is easiest to show with examples:

$ ls -l myfile.txt
-rw-r--r-- 1 steve steve 4 2007-01-25 20:39 myfile.txt

#                                           Allow me to eXecute the file: User + eXecute = u+x:
$ chmod u+x myfile.txt
$ ls -l myfile.txt
-rwxr--r-- 1 steve steve 4 2007-01-25 20:39 myfile.txt

#                                           Don't let Others Read the file: Others - Read = o-r:
$ chmod o-r myfile.txt
$ ls -l myfile.txt
-rwxr----- 1 steve steve 4 2007-01-25 20:39 myfile.txt

#                                           Don't let the Group Read the file: Group - Read = g-r:
$ chmod g-r myfile.txt
$ ls -l myfile.txt
-rwx------ 1 steve steve 4 2007-01-25 20:39 myfile.txt

#                                           Be specific with numbers: 600 = -rw-------
$ chmod 600 myfile.txt
$ ls -l myfile.txt
-rw------- 1 steve steve 4 2007-01-25 20:39 myfile.txt

#                                           Be specific with numbers: 755 = rwxr-xr-x
$ chmod 755 myfile.txt
$ ls -l myfile.txt
-rwxr-xr-x 1 steve steve 4 2007-01-25 20:39 myfile.txt
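
The same steps can be checked from a script by picking the first ten characters out of the ls -l output. This is a sketch on a scratch file; it also shows that several symbolic changes can be given at once, separated by commas.

```shell
perm_file=$(mktemp)

chmod 644 "$perm_file"
start=$(ls -l "$perm_file" | cut -c1-10)

# Two symbolic changes in one go: User + eXecute, Others - Read.
chmod u+x,o-r "$perm_file"
symbolic=$(ls -l "$perm_file" | cut -c1-10)

# Be specific with numbers.
chmod 755 "$perm_file"
numeric=$(ls -l "$perm_file" | cut -c1-10)

echo "$start"      # -rw-r--r--
echo "$symbolic"   # -rwxr-----
echo "$numeric"    # -rwxr-xr-x

rm -f "$perm_file"
```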
