Tool Tip: “ls”

February 26, 2007

Yeah yeah, we know ls already.

But how much of ls’s functionality do you actually use? There are so many switches to ls that when Sun added extended attributes (does anyone use them?), they found that there were no letters left, so they had to use “-@”!

So, here are a couple of handy ls options, in no particular order, for either interactive or scripting use. I’m assuming GNU ls; Solaris ls supports most GNU-style features, but the “nice-to-have” features, like ls -h, aren’t in historical UNIX ls implementations. I’ll split these into two categories: Sort ’em and Show ’em. What are your favourites?

Sort ’em

When sorting, I tend to use the “-l (long listing)” and “-r (reverse order)” switches:

Sort ’em by Size:

ls -lSr

Sort ’em by Date:

ls -ltr
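To see both sort orders side by side, here is a small sketch that builds a scratch directory and lists it each way. The file names, sizes and dates are made up for illustration, and it assumes an ls that supports -S, as GNU ls does.

```shell
# Scratch directory with three files of known sizes (names are hypothetical).
demo=$(mktemp -d)
printf 'x' > "$demo/small"
printf 'xxxxxxxxxx' > "$demo/medium"
printf '%0100d' 0 > "$demo/large"

# Give each file a distinct modification time (format is CCYYMMDDhhmm).
touch -t 200701010000 "$demo/large"    # oldest
touch -t 200702010000 "$demo/small"
touch -t 200702260000 "$demo/medium"   # newest

# -S sorts biggest first and -t newest first; -r reverses either order.
by_size=$(ls -Sr "$demo" | paste -s -d ' ' -)
by_date=$(ls -tr "$demo" | paste -s -d ' ' -)

echo "smallest to biggest: $by_size"
echo "oldest to newest:    $by_date"
rm -rf "$demo"
```

With -r, the biggest or newest file lands at the bottom of the listing, right above your prompt, which is the main reason to reverse the order.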

Show ’em

There are a number of ways to show different attributes of the files you are listing; “-l” is probably the obvious example. However, there are a few more:

Show ’em in columns

ls -C

Useful if you’re not seeing as many files per screen as you’d expect, for example when the output is piped and ls falls back to one name per line.

Show ’em one by one

ls -1

That’s the number 1 (one) there, not the letter l (ell). Forces one-file-per-line. Particularly useful for dealing with strange filenames with whitespace in them.
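As a rough illustration of why one-file-per-line matters for odd filenames, this sketch counts the same listing two ways. The file names are invented for the example.

```shell
# Scratch directory with one ordinary name and one containing spaces.
demo=$(mktemp -d)
touch "$demo/plain.txt" "$demo/my holiday photo.jpg"

# Splitting the listing on any whitespace sees four "names"...
words=$(( $(ls "$demo" | wc -w) ))
# ...whereas one-file-per-line output has exactly one line per file.
lines=$(( $(ls -1 "$demo" | wc -l) ))

echo "words: $words, lines: $lines"

# Reading line by line keeps each name intact.
ls -1 "$demo" | while IFS= read -r name; do
    echo "found: [$name]"
done
rm -rf "$demo"
```

When its output goes to a pipe, ls already prints one name per line, so the -1 here mostly makes the intent explicit. Names containing newlines would still trip this up.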

Show ’em as they are

ls -F

To append symbols (“*” for executables, “/” for directories, etc) to the filename to show further information about them.
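A quick sketch of what those symbols look like in practice, using a throwaway directory (the names bin, notes.txt and run.sh are hypothetical):

```shell
# One directory, one plain file, one executable file.
demo=$(mktemp -d)
mkdir "$demo/bin"
touch "$demo/notes.txt" "$demo/run.sh"
chmod u+x "$demo/run.sh"

# -F marks the directory with "/" and the executable with "*".
flagged=$(ls -F "$demo" | paste -s -d ' ' -)
echo "$flagged"
rm -rf "$demo"
```

The plain file gets no suffix at all, so anything undecorated is just an ordinary file.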

Show ’em so I can read it

ls -lh

Human-readable filesizes, so “12567166” is shown as “12M”, and “21418” is “21K”. This is handy for people, but of course, if you’re writing a script which wants to know file sizes, you’re better off without this (21Mb is bigger than 22Kb, after all!)
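For the scripting case, one possible approach (a sketch, not the only way) is to take a plain byte count from wc and keep ls -lh for display only. The file name data.bin is made up; the size matches the 21418-byte example above.

```shell
demo=$(mktemp -d)
# Make a file of exactly 21418 bytes.
dd if=/dev/zero of="$demo/data.bin" bs=21418 count=1 2>/dev/null

# For scripts: a plain byte count compares correctly with -gt, -lt and friends.
bytes=$(( $(wc -c < "$demo/data.bin") ))
# For people: the fifth column of a long listing, in human-readable form.
human=$(ls -lh "$demo/data.bin" | awk '{ print $5 }')

echo "$bytes bytes, shown by ls -lh as $human"
if [ "$bytes" -gt 20000 ]; then
    echo "bigger than 20000 bytes"
fi
rm -rf "$demo"
```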

Show ’em with numbers

ls -n

This is equivalent to ls -l, except that UID and GID are not looked up, so:

$ ls -l foo.txt
-rw-r--r-- 1 steve steve 46210 2006-11-25 00:33 foo.txt
$ ls -n foo.txt
-rw-r--r-- 1 1000 1000 46210 2006-11-25 00:33 foo.txt

This can be useful in a number of ways; particularly if your NIS (or other) naming service is down, or if you’ve imported a filesystem from another system.
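As a small sanity check, the numeric owner that ls -n reports for a file you have just created should be your own UID. This sketch compares the two; the file name is arbitrary.

```shell
demo=$(mktemp -d)
touch "$demo/foo.txt"

# The third column of "ls -n" is the numeric owner (the fourth is the group).
owner_uid=$(ls -n "$demo/foo.txt" | awk '{ print $3 }')
my_uid=$(id -u)

echo "file owner UID: $owner_uid, my UID: $my_uid"
rm -rf "$demo"
```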

What’s your favourite?

What are your most-used switches for the trusty old ls tool?

Harnessing the flexibility of Regular Expressions with Grep

February 14, 2007

There are two sides to grep – like any command, there’s the learning of syntax, the beginning
of which I covered in the grep tool tip. I’ll
come back to the syntax later, because there is a lot of it.

However, the more powerful side is grep’s use of regular expressions. Again, there’s not room
here to provide a complete rundown, but it should be enough to cover 90% of usage. Once I’ve got a library
of grep-related stuff, I’ll post an entry with links to them all, with some covering text.

This or That

Without being totally case-insensitive (which -i does),
we can search for “Hello” or “hello” by specifying the optional
characters in square brackets:

$ grep '[Hh]ello' *.txt
test1.txt:Hello. This is  test file.
test3.txt:Why, hello there!

If we’re not bothered what the third letter is, then we can say “grep '[Hh]e.lo' *.txt”, because the dot (“.”) will match any single character.

If we don’t care what the third and fourth letters are, so long as it’s “he..o”, then we say exactly that: “grep 'he..o'” will match “hello”, “hecko”, “heolo”, or anything else, so long as it is “he” + 2 characters + “o”.

If we want to find anything like that, other than “hello”, we can do that, too:

$ grep 'he[^l]lo' *.txt

Notice how it doesn’t pick up any of the “Hello” variations which have a “llo” in them?
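To try these patterns without hunting for suitable files, here is a sketch that builds its own sample file (the contents are invented for the example) and counts what each pattern matches. The patterns are quoted so the shell leaves the brackets alone.

```shell
demo=$(mktemp -d)
printf '%s\n' 'Hello world' 'hello there' 'help me' 'heXlo' > "$demo/sample.txt"

# Either case of the first letter: matches "Hello world" and "hello there".
either_case=$(grep -c '[Hh]ello' "$demo/sample.txt")
# Any single character in third place: matches "hello there" and "heXlo".
any_third=$(grep -c 'he.lo' "$demo/sample.txt")
# Anything except "l" in third place: only "heXlo" is left.
not_l=$(grep 'he[^l]lo' "$demo/sample.txt")

echo "either_case=$either_case any_third=$any_third not_l=$not_l"
rm -rf "$demo"
```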

How many?!

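We can specify how many times a character can repeat, too. The repeat symbol goes straight after the thing it applies to, which can be a single character or a set in [square brackets]. With plain grep, “?” and “+” only work this way if you add the -E switch: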

  • “?” means “it might be there”
  • “+” means “it’s there, but there might be loads of them”
  • “*” means “lots (or none) might be there”

So, we can match “he”, followed by as many “l”s as you like (even none), followed by an “o” with “grep 'he[l]*o' *.txt”:

$ grep 'he[l]*o' *.txt
test3.txt:Why, hello there!
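The three repeat symbols are easiest to compare on the same input. This sketch uses a made-up word list and assumes a grep that accepts -E for extended regular expressions, which “?” and “+” need.

```shell
demo=$(mktemp -d)
printf '%s\n' 'heo' 'helo' 'hello' 'helllo' > "$demo/words.txt"

# "*": zero or more "l"s, so all four lines match.
star=$(grep -c 'hel*o' "$demo/words.txt")
# "+": at least one "l", so "heo" drops out (needs -E).
plus=$(grep -E -c 'hel+o' "$demo/words.txt")
# "?": at most one "l", so only "heo" and "helo" match (needs -E).
optional=$(grep -E -c 'hel?o' "$demo/words.txt")

echo "star=$star plus=$plus optional=$optional"
rm -rf "$demo"
```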


February 14, 2007

The great thing about Unix, and the Bourne shell, when it was introduced back in the day, was multitasking. It’s such an overused buzzword these days, but at the time, it was really a new thing. If you’ve only got one connection in to a machine, you can get it doing as much as you want.

The shell command to “do this in the background, then give me a new prompt to provide the next command” is the ampersand (“&”):

$  # I need to trawl the filesystem for files called "*dodgy*"
    #  (should have installed slocate,
    # but it's too late for that)
$ find / -name "*dodgy*" -print
    (wait for a very very very long time)

Well, that’s a good hour of my life wasted.

Chuck it into a script, and run it in the background. If you want the outcome, direct it to a file:

$ cd /tmp
$ cat
find / -name "*dodgy*"
$ chmod u+rx
$ ./ > /tmp/mysuspectfiles.txt &
[1] 4402
$ # wow, didja see that? It'll take ages, but I've got 
   # control back. "4402" is the Process ID (PID), 
   # so I can run "ps -fp 4402" to check on its
   # progress, but it's happening, in the background.

You don’t get a lot of job control here; the “ps” mentioned above is about your lot, but you can spawn a child process and let it run, whilst you get on with the stuff you need.

This is known as “backgrounding” a task; if you know it will take a long time, just background it. Of course, if the next thing you need to do is to read the entire file, then you won’t get away with it, you’ll have to wait for it to finish. However, you could background it and then “tail -f /tmp/mysuspectfiles.txt” to check on the status.
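Here is a minimal sketch of the same idea in a form you can paste and run. A one-second sleep stands in for the slow find, and it uses two standard shell features not covered above, “$!” (the PID of the last background job) and “wait”, as an aside rather than as the method described in this post.

```shell
out=$(mktemp)

# Stand-in for a slow job (hypothetical): sleep briefly, then write a result.
( sleep 1; echo "finished" ) > "$out" &
bg_pid=$!                      # $! holds the PID of the last background job

# Control is back straight away; the job is still running at this point.
if kill -0 "$bg_pid" 2>/dev/null; then
    still_running=yes
else
    still_running=no
fi

wait "$bg_pid"                 # block until the background job finishes
result=$(cat "$out")
echo "still_running=$still_running result=$result"
rm -f "$out"
```

Capturing “$!” saves reading the PID off the screen, and “wait” covers the case where the next step really does need the whole file.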

Arguments and a bit about Functions… two nonintuitive words

February 10, 2007

Those of us who are used to command-line utilities are used to passing arguments (aka parameters) to them: we don’t just run “ls”, we run “ls -ltSr”. So, how do we write a shell script which can take arguments?

There are two methods, and I will just deal with the simple approach today. I’ll leave getopts for another day, when I have a little more time. The simple answer is that each argument turns up in a numbered variable: $1, $2, and so on. So if we get called as “uppercase h e l l o”, and we need to reply with H, e, L, l, O (uppercasing alternate arguments), then the script would look like this:

# uppercase ... convert alternate words to uppercase
# note: there are many ways to do this (see previous posts) 
echo $1 | tr '[a-z]' '[A-Z]'
echo $2
echo $3 | tr '[a-z]' '[A-Z]'
echo $4
echo $5 | tr '[a-z]' '[A-Z]'
echo $6

If we say uppercase a b c d e f, then it will reply with A, b, C, d, E and f, each on its own line. Still, not much fun.

We can rewrite it like this, using the shift command, which discards $1 and promotes $2 into its place (and $3 to $2, and so on):

echo $1 | tr '[a-z]' '[A-Z]'
shift
echo $1
shift
echo $1 | tr '[a-z]' '[A-Z]'
shift
echo $1
shift
echo $1 | tr '[a-z]' '[A-Z]'
shift
echo $1

That way, we’re always dealing with $1. Otherwise, it’s just the same.
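To watch shift at work without writing a script file, you can fake the arguments with “set --”, which replaces the positional parameters of the current shell. This is only an illustration; the letters are arbitrary.

```shell
# "set --" replaces the positional parameters, standing in for real arguments.
set -- a b c d e f

before="$1 (of $#)"
shift                      # discard $1; everything else moves down one place
after="$1 (of $#)"

echo "before shift: $before"
echo "after shift:  $after"
```

“$#” is the number of arguments left, so it drops by one with each shift.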

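We can then put the tr call into a function, upper(), and loop for as long as there are arguments left:

upper()
{
        echo "$1" | tr '[a-z]' '[A-Z]'
}

while [ $# -gt 0 ]
do
        upper "$1"; shift
        if [ $# -gt 0 ]; then echo "$1"; shift; fi
done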

This way, “$1” is always the current argument. So passing the arguments a b c d e f will output A, b, C, d, E and f, one per line.
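If you would rather have the result on a single line, one possible variation (a sketch, with the hypothetical helper name alternate) is to print with printf, which adds no newline, and to flip a toggle on each argument instead of shifting:

```shell
# Uppercase one word, without a trailing newline.
upper() {
    printf '%s' "$1" | tr '[:lower:]' '[:upper:]'
}

# Print the arguments back to back, uppercasing every other one.
alternate() {
    toggle=1
    for arg in "$@"; do
        if [ "$toggle" -eq 1 ]; then
            upper "$arg"
            toggle=0
        else
            printf '%s' "$arg"
            toggle=1
        fi
    done
    echo                   # finish the line
}

result=$(alternate a b c d e f)
echo "$result"
```

Looping over "$@" also copes with an odd number of arguments, and with arguments that contain spaces.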

Search and Replace

February 6, 2007

Two great tools for search-and-replace are tr and sed.

tr – Translate (or Delete)

tr can translate a single character into another. For example, “tr 'a' 'b'” will convert all instances of “a” into “b”. Although it works on single characters, it also understands blocks, so “tr '[A-Z]' '[a-z]'” will convert uppercase to lowercase: ‘A’ becomes ‘a’, ‘B’ becomes ‘b’, and so on.

The GNU version of tr (which comes with most Linux distributions) has some handy keywords, too: “tr '[:lower:]' '[:upper:]'” is a more readable opposite of the above [A-Z] convention. (The quotes stop the shell from treating the brackets as a filename pattern.)

You can also specify your own set of characters, so if you want to convert ‘l’ to ‘1’, ‘o’ to ‘0’, and ‘e’ to ‘3’, then this will do the job:

$  echo welcome | tr 'loe' '103'

You could extend it to a more complete “l33t sp33k”:

$ echo abcdefghijklmnopqrstuvwxyz | tr 'aeilost' '@3!1057'

tr can also just delete – this deletes the letter ‘l’ :

$ echo hello and welcome | tr -d 'l'
heo and wecome
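For reference, here are the translations above gathered into one runnable sketch, with each result captured in a variable so it can be checked:

```shell
# Character-for-character mapping: l->1, o->0, e->3.
leet=$(echo welcome | tr 'loe' '103')
# A longer mapping over the whole alphabet.
alphabet=$(echo abcdefghijklmnopqrstuvwxyz | tr 'aeilost' '@3!1057')
# Named classes, quoted so the shell leaves the brackets alone.
shout=$(echo hello | tr '[:lower:]' '[:upper:]')
# Deleting instead of translating.
trimmed=$(echo hello and welcome | tr -d 'l')

echo "$leet"
echo "$alphabet"
echo "$shout"
echo "$trimmed"
```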

sed – Stream Editor

UNIX uses a few metaphors, one being a water metaphor, which we use with pipes (|), redirects (<, >), and a few other places. Sed gets its name from what it does… much like tr, you stream data into it, and slightly modified data comes out the other end.

sed isn’t limited to single-character operations; it can cope with whole phrases, as well as regular expressions. I’ll keep it simple(ish) for now; I plan to do a more complete post on sed and another on regular expressions soon, though. For today, I’ll stick to the sed s/from/to/count syntax.

With the s/from/to/count syntax, sed will convert “from” to “to”. Left alone, it converts only the first match on each line of text; a numeric count picks which match on the line to convert instead. The special “/g” converts every instance.

I like to get stuck in with a few examples, so here goes:

$ cat text.txt
Fedora Core is my favourite distribution.
It's got just the right level of ease-of-use
along with regular updates, whilst remaining
a stable, supportable Operating System. In fact,
I'd go so far as to say that Fedora Core is definitely
the best Linux distribution for home users.
Fedora Core is certainly my favourite distribution.
$ sed s/"Fedora Core"/"Ubuntu"/g text.txt
Ubuntu is my favourite distribution.
It's got just the right level of ease-of-use
along with regular updates, whilst remaining
a stable, supportable Operating System. In fact,
I'd go so far as to say that Ubuntu is definitely
the best Linux distribution for home users.
Ubuntu is certainly my favourite distribution.

The syntax there was sed s/from/to/count, so it replaces “Fedora Core” with “Ubuntu” in this example. If we specified “/1” at the end, it would only convert the first instance on each line. Similarly, “/2” would convert only the second instance on each line, leaving the first alone. “/g” is probably the most-used; it converts everything (the “g” stands for “global”).
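A quick way to check what the trailing flag does is to run each variant on a line with several matches. In this sketch (the sample text is made up), a number picks out that one match on the line, and “g” takes them all:

```shell
line='one one one'

# No flag: only the first match on the line is replaced.
first=$(echo "$line" | sed 's/one/two/')
# Numeric flag: only that numbered match is replaced.
second=$(echo "$line" | sed 's/one/two/2')
# g flag: every match on the line is replaced.
every=$(echo "$line" | sed 's/one/two/g')

echo "$first"
echo "$second"
echo "$every"
```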

sed can emulate tr’s -d functionality by having the “to” part be an empty string; here we refer to “Fedora Core” simply as “Fedora” (note the leading space: it’s “ Core”, not “Core”):

$ sed s/" Core"//g text.txt
Fedora is my favourite distribution.
It's got just the right level of ease-of-use
along with regular updates, whilst remaining
a stable, supportable Operating System. In fact,
I'd go so far as to say that Fedora is definitely
the best Linux distribution for home users.
Fedora is certainly my favourite distribution.

Notice also that we can cat stuff into sed, as “cat text.txt | sed s/src/dest/g”, or we can pass the file directly to sed, like this: sed s/src/dest/g text.txt. The same applies to most *nix commands.
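As a small check that the two forms really are interchangeable, this sketch runs the same substitution both ways on a one-line scratch copy of the text and compares the results:

```shell
demo=$(mktemp -d)
printf '%s\n' 'Fedora Core is my favourite distribution.' > "$demo/text.txt"

# Streamed in through a pipe...
from_pipe=$(cat "$demo/text.txt" | sed 's/ Core//g')
# ...or read directly from the named file.
from_file=$(sed 's/ Core//g' "$demo/text.txt")

echo "$from_pipe"
echo "$from_file"
rm -rf "$demo"
```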

To get in to the rest, we’ll need to get into regexp (Regular Expressions, the stuff like “the * brown * jumped over the * dog” which result in $1 = “quick”, $2 = “fox”, $3 = “lazy”). That’s for another day, though.