Ordering items

November 7, 2007

There are lots of little quirks to the *nix shells; this is just one of them.

If you want to list the files in a directory, then ls will list them all for you, in alphabetical order.

If you want to list them by size, you can use ls -S; by timestamp: ls -t, and so on.

But ls is just one particular utility. What happens when we do this:


for myfile in *
do
  echo "My file is called $myfile"
done

We get an alphabetically sorted list (see man ascii for the actual detail; they’re sorted by ASCII value, so numbers first, then uppercase letters, then lowercase letters).
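
A quick demonstration, using echo to show the order in which the shell expands the glob (this assumes the traditional C/POSIX locale; other locales may sort differently):

$ touch 10.txt 2.txt B.txt a.txt
$ echo *
10.txt 2.txt B.txt a.txt

Note that "10.txt" comes before "2.txt": the comparison is character by character, and "1" is less than "2".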

This can be a pain, but it can also be quite useful. If you’ve got a bunch of files:

1.install.txt
2.setup.txt
3.use.txt
4.uninstall.txt

Then you can play with them in order, just by using the asterisk:

for i in *
do
  echo "File $i" >> all.txt
  cat "$i" >> all.txt
done

And it will sort them into order for you ("1" comes before "2" in ASCII, and so on…)

Or you could just do this:

more * > all.txt

Because more will prefix each file with its name in a header, if there is more than one file to process.
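
For example, the header looks something like this (the exact format varies between more implementations, but this is the typical style):

::::::::::::::
1.install.txt
::::::::::::::
(contents of 1.install.txt follow here)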


IFS – Internal Field Separator

September 26, 2007

It seems like an esoteric concept, but it’s actually very useful.

If your input file is “1 apple steve@example.com”, then your script could say:

while read qty product customer
do
  echo "${customer} wants ${qty} ${product}(s)"
done

The read command will read the line into the three variables, because the fields are separated by spaces.
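
You can try this straight from the command line; fed the sample line, the loop produces:

$ echo "1 apple steve@example.com" | while read qty product customer
> do
>   echo "${customer} wants ${qty} ${product}(s)"
> done
steve@example.com wants 1 apple(s)
$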

However, critical data is often produced in spreadsheets. If you save these as CSV files, each line will come out like this:

1,apple,steve@example.com

This contains no spaces, so the above code will not be able to understand it. It will read the whole thing into the first variable, $qty (the quantity), and set the other two fields as blank.

The way around this is to tell the shell that "," (the comma itself) separates fields; it's the "internal field separator", or IFS.

By default, the IFS variable is set to space/tab/newline, which isn't easy to type back in at the shell, so it's best to save the original IFS to another variable before changing it, so that you can put it back again after you've messed around with it. I tend to use "oIFS=$IFS" to save the current value into "oIFS".

Also, when the IFS variable is set to something other than the default, it can really mess with other code.

Here’s a script I wrote today to parse a CSV file:

#!/bin/sh
oIFS=$IFS     # Always keep the original IFS!
IFS=","          # Now set it to what we want the "read" loop to use
while read qty product customer
do
  IFS=$oIFS
  # process the information
  IFS=","       # Put it back to the comma, for the loop to go around again
done < myfile.txt

It really is that easy, and it’s very versatile. You do have to be careful to keep a copy of the original (I always use the name oIFS, but whatever suits you), and to put it back as soon as possible, because so many things invisibly use the IFS – grep, cut, you name it. It’s surprising how many things within the “while read” loop actually did depend on the IFS being the default value.
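
To make the "# process the information" part concrete, here's a fuller sketch; the filename orders.csv and the echo are made up for illustration, but the IFS juggling is exactly as described above:

#!/bin/sh
oIFS=$IFS           # keep the original IFS
IFS=","             # split on commas for the read
while read qty product customer
do
  IFS=$oIFS         # restore the default while we process the fields
  echo "${customer} wants ${qty} ${product}(s)"
  IFS=","           # back to the comma, for the next read
done < orders.csv
IFS=$oIFS           # and put it back when the loop is finished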


Understanding init scripts

July 25, 2007

UNIX and Linux systems use “init scripts” – scripts typically placed in /etc/init.d/ which are run when the system starts up and shuts down (or changes runlevels, but we won’t go into that level of detail here, being more of a sysadmin topic than a shell scripting topic). In a typical setup, /etc/init.d/myservice is linked to /etc/rc2.d/S70myservice. That is to say, /etc/init.d/myservice is the real file, but the rc2.d file is a symbolic link to it, called "S70myservice". The “S” means “Start”, and “70” says when it should be run – lower-numbered scripts are run first. The range is usually 1-99, but there are no rules. /etc/rc0.d/K30myservice (for shutdown), or /etc/rc6.d/K30myservice (for reboot; possibly a different scenario for some services), will be the corresponding “Kill” scripts. Again, you can control the order in which your services are shut down; K01* first, to K99* last.

All of these rc scripts are just symbolic links to /etc/init.d/myservice, so there is just one actual shell script, which takes care of starting or stopping the service. The Samba init script from Solaris is a nice and simple script to use as an example:

case "$1" in
start)
	[ -f /etc/sfw/smb.conf ] || exit 0

	/usr/sfw/sbin/smbd -D
	/usr/sfw/sbin/nmbd -D
	;;
stop)
	pkill smbd
	pkill nmbd
	;;
*)
	echo "Usage: $0 { start | stop }"
	exit 1
	;;
esac
exit 0

The init daemon, which controls init scripts, calls a startup script as "/etc/rc2.d/S70myservice start", and a shutdown script as "/etc/rc0.d/K30myservice stop". So we have to check the variable $1 to see what action we need to take. (See http://steve-parker.org/sh/variables2.shtml to read about what $1 means – in this case, it’s either “start” or “stop”).

So we use case to see what we are required to do.

In this example, if it’s “start”, then it will run the three commands:

	[ -f /etc/sfw/smb.conf ] || exit 0
	/usr/sfw/sbin/smbd -D
	/usr/sfw/sbin/nmbd -D

Where line 1 checks that smb.conf exists; there is no point continuing if it doesn't exist, just "exit 0" (success) so the system continues booting as normal. Lines 2 and 3 start the two daemons required for Samba.

If it’s “stop”, then it will run these two commands:

	pkill smbd
	pkill nmbd

pkill means “Process Kill”, and it simply kills off the two processes started by the “start” option.

The "*)" construct catches any other uses, and simply replies that the correct syntax is to call it with either “start” or “stop” – nothing else will do. Some services allow for status reports, restarting, and so on. The one thing we do need to provide is “start”. Most services also have a “stop” function. All others are optional.

The simplest possible init script would be this, to control an Apache webserver:

#!/bin/sh
/usr/sbin/apachectl $1

Apache comes with a program called “apachectl” (or “apache2ctl”), which will take “stop” and “start” as arguments, and act accordingly. It will also take “restart”, “status”, “configtest”, and a few more options, but that one-line script would be enough to act as /etc/init.d/apache, with /etc/rc2.d/S90apache and /etc/rc0.d/K10apache linking to it. To be frank, even that is not necessary; you could just link /usr/sbin/apachectl into /etc/init.d/apache. In reality, it’s normally good to provide a few sanity-checks in addition to the basic stop/start functionality.

The vast majority of init scripts use the case command; around that, you can wrap all sorts of other things – most GNU/Linux distributions include a generic reporting script (typically /lib/lsb/init-functions – to report “OK” or “FAILED”), read in a config file (like the Samba example above), define functions for the more involved aspects of starting, stopping, or reporting on the status of the service, and so on.
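
As a sketch of that typical shape (the daemon name mydaemon and its path are made up here, and a real distribution script would source its init-functions library for the reporting):

#!/bin/sh
# Hypothetical skeleton for /etc/init.d/myservice
DAEMON=/usr/sbin/mydaemon        # assumption: where your daemon lives

start() {
	[ -x "$DAEMON" ] || exit 0   # sanity check, like the smb.conf test above
	"$DAEMON" -D
}

stop() {
	pkill -x mydaemon
}

case "$1" in
start)   start ;;
stop)    stop ;;
restart) stop
         start ;;
status)  pgrep -x mydaemon > /dev/null && echo "myservice is running" \
                                       || echo "myservice is stopped" ;;
*)       echo "Usage: $0 { start | stop | restart | status }"
         exit 1 ;;
esac
exit 0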

Some (eg, SuSE) have an "INIT INFO" block, which may allow the init daemon a bit more control over the order in which services are started. Ubuntu's Upstart is another alternative; Solaris 10 uses SMF (the Service Management Facility), which starts and stops services, and also monitors them to check that they are running as expected.

After a good decade of stability, in 2007 the world of init scripts appears to be changing, potentially quite significantly. However, I'm not here to speculate on future developments; this post is just to document the stable interface that is init scripts. Even if other things change, the basic "start|stop" syntax is going to be with us for a long time to come. It is easy, but often important, to understand what is going on.

In closing, I will list the run-levels, and what each run-level provides:

0: Shut down the OS (without powering off the machine)
1, s, S: Single-User mode. Networking is not enabled.
2: Networking enabled (not NFS, Printers)
3: Normal operating mode (including NFS, Printers)
4: Not normally used
5: Shut down the OS and power off the machine
6: Reboot the OS.

Some GNU/Linux distributions change these definitions – in particular, Debian provides all network services at runlevel 2, not 3. Run-level 5 is also sometimes used to start the graphical (X) interface.


Shell Pipes by Example

July 22, 2007

Pipes, piping, pipelines… whatever you call them, they are very powerful – in fact, they are one of the core tenets of the philosophy behind UNIX (and therefore Linux). They are also really very simple, once you understand them. The way to understand them is by playing with them, but if you don't know what they do, you don't know where to start… Catch-22!

So, here are some simple examples of how the pipe works.

Let’s see the code

$ grep steve /etc/passwd | cut -d: -f 6
/home/steve
$

What did this do? There are two UNIX commands there: grep and cut. The command “grep steve /etc/passwd” finds all lines in the file /etc/passwd which contain the text “steve” anywhere in the line. In my case, this has one result:
steve:x:1000:1000:Steve Parker,,,:/home/steve:/bin/bash
The second command, "cut -d: -f6", cuts the line by the delimiter (-d) of a colon (":"), and gets field (-f) number 6. This is, in the /etc/passwd file, the home directory of the user.

So what? Show me some more

This is the main point of this article; once you’ve seen a few examples, it normally all becomes clear.

EG2

$ find . -type f -ls | cut -c14- | sort -n -k 5
rw-r--r--   1 steve    steve       28 Jul 22 01:41 ./hello.txt
rwxr-xr-x   1 steve    steve     6500 Jul 22 01:41 ./a/filefrag
rwxr-xr-x   1 steve    steve     8828 Jul 22 01:42 ./c/hostname
rwxr-xr-x   1 steve    steve    30848 Jul 22 01:42 ./c/ping
rwxr-xr-x   1 steve    steve    77652 Jul 22 01:42 ./b/find
rwxr-xr-x   1 steve    steve    77844 Jul 22 01:41 ./large
rwxr-xr-x   1 steve    steve    93944 Jul 22 01:41 ./a/cpio
rwxr-xr-x   1 steve    steve    96228 Jul 22 01:42 ./b/grep
$

What I did here was three commands: "find . -type f -ls" finds regular files, and lists them in an "ls"-style format: permissions, owner, size, etc.
"cut -c14-" keeps everything from the 14th character onwards, stripping the first 13 characters, which mess up the formatting on this website (!), and aren't very interesting.
"sort -n -k 5" does a numeric (-n) sort, on field 5 (-k5), which is the size of the file.
So this gives me a list of the files in this directory (and subdirectories), ordered by file size. That's much more useful than "ls -lS", which restricts itself to the current directory, ignoring subdirectories.

(As an aside, I have to admit that I only concocted this by trying to think of an example; it actually seems really useful, and worth making into an alias… I must do a post about “alias” some time!)
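
If you do fancy it as an alias, something like this in your shell startup file would do the trick (the name lssize is made up, and the -c14- offset may need adjusting for your system's find output):

alias lssize='find . -type f -ls | cut -c14- | sort -n -k 5'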

So how does it work?

This seems pretty straightforward: get lines containing "steve" from the input file ("grep steve /etc/passwd"), and get the sixth field, where fields are marked by colons ("cut -d: -f6"). You can read the full command from left to right, and see what happens, in that order.

How does it really work?

EG1 Explained

There are some gotchas when you start to look at the plumbing. Because we're using the analogy of a pipe (think of water flowing through a pipe), the OS actually sets up the commands in the reverse order. It calls cut first, then it calls grep. If you have (for example) a syntax error in your cut command, then grep will never be called.
What actually happens is this:

  1. A “pipe” is set up – a special entity which can take input, which it passes, line by line, to its output.
  2. cut is called, and its input is set to be the “pipe”.
  3. grep is called, and its output is set to be the “pipe”.
  4. As grep generates output, it is passed through the pipe, to the waiting cut command, which does its own simple task, of splitting the fields by colons, and selecting the 6th field as output.

EG2 Explained

For EG2, “sort” is called first, which ties to the second (rightmost) pipe for its input. Then “cut” is called, which ties to the second pipe for its output, and the first (leftmost) pipe for its input. Then, “find” is called, which ties to the first pipe for its output.
So, the output of "find" is piped into "cut", which strips off the first 13 characters of the "find" output. This is then passed to "sort", which sorts on field 5 (of what it receives as input), so the output of the entire pipeline is a numerically sorted list of files, ordered by size.


Redirection – Simple Stuff

May 30, 2007

Nobody deals with the really low-level stuff any more; I learned it from UNIX Gurus in the 90s. I was really lucky to have met some real experts, and was stupid not to have better understood the opportunity to pick their brains.

Write to a file

$ echo foo > file

Append to a file

$ echo foo >> file

Read from a file (1)

$ cat < file

Read from a file (2)

$ cat file

Read lines from a file

$ while read f
> do
>   echo LINE: $f
> done < file
$
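
Write errors to a file, too

Standard error can be redirected as well; "2>&1" sends it to wherever standard output is currently pointing, so both normal and error output end up in the file (somecommand is just a placeholder):

$ somecommand > file 2>&1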

Pipes Primer

May 8, 2007

The previous post dealt with pipes, though the example may not have been the best for those who are not accustomed to the concept.

There are a few concepts to be understood – mainly, that of two (or more) processes operating together, how they put their data out, and how they get their data in. UNIX deals with multiple processes, all running (conceptually, at least) at the same time, on different CPUs, each with a standard input (stdin), and standard output (stdout). Pipes connect one process's stdout to another's stdin.

What do we want to pipe? Let's say we've got a small 80×25 terminal screen, and lots of files. The ls command will spew out tons of data, faster than we can read it. There's a handy utility called "more", which will show a screenful of text, then prompt "--More--". When you hit the space bar, it will scroll down a screen. You can hit ENTER to scroll one line.

I’m sure that you’ve worked this out already, but here is how we combine these two commands:


$ ls | more
<the first screenful of files is shown>
--More--

What happens here, is that the “more” command is started up first, then the “ls” command. The output of “ls” is piped to the input of “more”, so it can read the data.

Most such tools can work the other way, too:

$ more myfile.txt
<the first screenful of "myfile.txt" is shown>
--More--

That is to say, more can also take "myfile.txt" as an argument and read the file itself, instead of reading from standard input (stdin).


Pipelines in the Shell

May 3, 2007

One of the most powerful features of the *nix shell, and one which is currently not even covered in the tutorial, is the pipeline. I try to keep this blog and the tutorial from overlapping, but I really must rectify this gap in the main site some time.

In the meantime, this is what it is all about.

UNIX (and therefore GNU/Linux) is full of small text-based utilities: wc to count words (and lines, and characters) in a text file; sort to sort a text file; uniq to get only the unique lines from a text file; grep to get certain lines (but not others) from a text file, and so on.

Did you see the common trait there? Yes, it's not just that "Everything is a file"; nearly everything is also text. It's largely from this tradition that HTML, XML, RSS, Email (SMTP, POP, IMAP) and the like are all text-based. Contrast with MS Office, for example, where all data is in binary files which can only (really) be manipulated by the application which created them.

So what? It’s crude, simple, works on an old-fashioned green-and-black screen. How could that be relevant in the 21st Century?

It's relevant not because of what each tool itself provides, but because of what they can do when combined.

Ugly Apache Access Log

Here’s a line from my access.log (Apache2) – Yes, honest, it’s one line. It just looks very ugly:
12.106.111.10 - - [02/May/2007:12:09:58 -0700] "GET /sh/sh.shtml HTTP/1.1" 200 33080 "http://www.google.ca/search?hl=en&q=bourne+shell&btnG=Google+Search&meta=" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.3) Gecko/20060426 Firefox/1.5.0.3"
That's interesting (well, kind of), and it does tell us a certain amount about this visitor: they came from Google Canada (google.ca), searched for "bourne shell", and clicked on the link to http://steve-parker.org/sh/sh.shtml. They're using Firefox 1.5.0.3 on Windows.

Great. What happened to them? Did they like the site? Did they stay? Did they actually read anything?

Filtering it

We can use grep to pick out just this visitor. However, this gives us lots of stuff we're not interested in – all the CSS, PNG, GIF files which support the pages themselves:
$ grep "^12.106.111.10 " access_log
12.106.111.10 - - [02/May/2007:12:09:58 -0700] "GET /sh/sh.shtml HTTP/1.1" 200 33080 "http://www.google.ca/search?hl=en&q=bourne+shell&btnG=Google+Search&meta=" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.3) Gecko/20060426 Firefox/1.5.0.3"
12.106.111.10 - - [02/May/2007:12:09:58 -0700] "GET /steve-parker.css HTTP/1.1" 200 8757 "http://steve-parker.org/sh/sh.shtml" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.3) Gecko/20060426 Firefox/1.5.0.3"
12.106.111.10 - - [02/May/2007:12:09:59 -0700] "GET /images/1.png HTTP/1.1" 200 471 "http://steve-parker.org/sh/sh.shtml" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.3) Gecko/20060426 Firefox/1.5.0.3"
12.106.111.10 - - [02/May/2007:12:09:59 -0700] "GET /images/prevd.png HTTP/1.1" 200 1397 "http://steve-parker.org/sh/sh.shtml" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.3) Gecko/20060426 Firefox/1.5.0.3"
12.106.111.10 - - [02/May/2007:12:10:00 -0700] "GET /images/2.png HTTP/1.1" 200 648 "http://steve-parker.org/sh/sh.shtml" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.3) Gecko/20060426 Firefox/1.5.0.3"
... etc ...

This is looking ugly already, and not just because of the small width of the webpage – even at 80 characters wide, this is hard to understand.

Filtering Some More

At first glance, I should be able to pipe this through a grep for html files:

$ grep "^12.106.111.10 " access_log | grep shtml

However, Apache also logs the referrer, so even the "GET /images/2.png" request above includes a .shtml request. So I can use "grep -v" to exclude the supporting files instead. I'll add the "-w" option to grep to say that the search term must match a whole word – so "grep -v gif" would (wrongly) throw away a line mentioning "gifford.html", whereas "grep -vw gif" would let it through. I'll add "\" to the code, so that you can cut and paste it… the "\" means that although the line breaks, it's still really part of the same line of code:
$ grep "^12.106.111.10 " access_log | grep -vw css \
| grep -vw gif | grep -vw jpg \
| grep -vw png | grep -vw ico
12.106.111.10 - - [02/May/2007:12:09:58 -0700] "GET /sh/sh.shtml HTTP/1.1" 200 33080 "http://www.google.ca/search?hl=en&q=bourne+shell&btnG=Google+Search&meta=" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.3) Gecko/20060426 Firefox/1.5.0.3"
12.106.111.10 - - [02/May/2007:12:10:32 -0700] "GET /sh/variables1.shtml HTTP/1.1" 200 48431 "http://steve-parker.org/sh/sh.shtml" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.3) Gecko/20060426 Firefox/1.5.0.3"
12.106.111.10 - - [02/May/2007:12:13:23 -0700] "GET /sh/external.shtml HTTP/1.1" 200 41322 "http://steve-parker.org/sh/sh.shtml" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.3) Gecko/20060426 Firefox/1.5.0.3"
12.106.111.10 - - [02/May/2007:12:13:45 -0700] "GET /sh/quickref.shtml HTTP/1.1" 200 42454 "http://steve-parker.org/sh/sh.shtml" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.3) Gecko/20060426 Firefox/1.5.0.3"
12.106.111.10 - - [02/May/2007:12:14:27 -0700] "GET /sh/test.shtml HTTP/1.1" 200 48844 "http://steve-parker.org/sh/sh.shtml" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.3) Gecko/20060426 Firefox/1.5.0.3"

This pumps the output through a grep which gets rid of CSS files (“| grep -vw css“), then gif, then jpg, then png, then ico. That should just leave us with the HTML files, which is what we’re really interested in.

It’s really hard to see what’s going on here. The narrow web page doesn’t help, but we just want to get the key information out, which should be nice and simple. It should look good however we view it.

If you look carefully, you can see that this visitor accessed /sh/sh.shtml, then /sh/variables1.shtml (after a minute), /sh/external.shtml (after 3 minutes), then /sh/quickref.shtml (about 20 seconds later, presumably in a new tab, given that the referrer is the same). Less than a minute later, they opened /sh/test.shtml (which also suggests that they've loaded the two pages in tabs, to read at their leisure).

Getting the Data we need

However, none of this is really very easy to read. If we just want to know what pages they visited, and when, we need to do some more filtering. awk is a very powerful tool, and we will only scratch the surface of its abilities here. We will get "fields" 4 and 7 – the timestamp and the URL accessed.
$ grep "^12.106.111.10 " access_log \
| grep -vw css | grep -vw gif \
| grep -vw jpg | grep -vw png \
| grep -vw ico | awk '{ print $4,$7 }'
[02/May/2007:12:09:58 /sh/sh.shtml
[02/May/2007:12:10:32 /sh/variables1.shtml
[02/May/2007:12:13:23 /sh/external.shtml
[02/May/2007:12:13:45 /sh/quickref.shtml
[02/May/2007:12:14:27 /sh/test.shtml

Okay, it's the info we wanted, but it's still not great. That "[" looks out of place now. We can use cut to tidy things up. In this case, we'll use its positional selection, because we want to get rid of the first character. Cut's "-c" parameter tells it which characters to keep. We want the 2nd character onwards, so we just add it to the end of the pipeline:
$ grep "^12.106.111.10 " access_log | grep -vw css | grep -vw gif | grep -vw jpg | grep -vw png | grep -vw ico | awk '{ print $4,$7 }'|cut -c2-
02/May/2007:12:09:58 /sh/sh.shtml
02/May/2007:12:10:32 /sh/variables1.shtml
02/May/2007:12:13:23 /sh/external.shtml
02/May/2007:12:13:45 /sh/quickref.shtml
02/May/2007:12:14:27 /sh/test.shtml

And that’s the kind of thing that we can do with a pipe. We can get exactly what we want from a file.

Moving on

At the start of this post, we mentioned sort and uniq. Actually, "sort -u" will do the same as "sort | uniq". So if we want to get the unique visitors, we can just get the first field (cut -d" " -f1) and sort it uniquely:
$ cut -d" " -f1 access_log | sort -u
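
A natural next step is to count how many requests each visitor made; "uniq -c" prefixes every line with the number of times it appeared, and the final sort shows the busiest visitors first:

$ cut -d" " -f1 access_log | sort | uniq -c | sort -rn | head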


Tool Tip: “Read” – it does what it says!

April 14, 2007

read is a very useful tool; it might seem too simple to bother mentioning, but there are at least three different ways to use it. (Okay, two, and the third isn’t really anything special about read, just a nifty thing that the shell itself provides)…

1. Read the whole line

Let’s start with an interactive script:

$ cat readme.sh
#!/bin/sh
echo "I'm a parrot!"
while read a
do
    echo "A is $a"
done
$ ./readme.sh
I'm a parrot!
hello
A is hello
one two three
A is one two three
piglet eeyore pooh owl
A is piglet eeyore pooh owl
^D
$

Yes, you'll need to hit CTRL-D to exit this loop; it's just a simple example.

So far, so stupid. But wait; what if I wanted to get that “one” “two” “three” and use them differently?

2. Read the words

$ cat readme.sh
#!/bin/sh
echo "I'm a parrot!"
while read a b c
do
        echo "A is $a"
        echo "B is $b"
        echo "C is $c"
done
$ ./readme.sh
I'm a parrot!
hello
A is hello
B is
C is
one two three
A is one
B is two
C is three
piglet eeyore pooh owl
A is piglet
B is eeyore
C is pooh owl
^D
$

So, just by naming some variables, we can pick what we get. And – did you see that last one? Just because we asked for three variables (a, b, c) and got four values (piglet eeyore pooh owl), we didn't lose anything; all the leftover words went into the last variable.

This is actually pretty handy stuff; you’d have to do a bit of messing about with pointers to get the same effect in C, for example.

3. Read from a file

We can do all this from a file, too. This isn’t special to read, but it’s often used in this way. See that “while – do – done” loop? It’s a sub-shell, and we can direct whatever we want to its input (everything is a file, remember, so the keyboard, a text file, a device driver, whatever you want, it’s all just a file)

We do this with the “<” operator. Just add “< filename.txt” after the “done” end of the loop:

$ cat readme.sh
#!/bin/sh
echo "I'm a parrot!"
while read a b c
do
        echo "A is $a"
        echo "B is $b"
        echo "C is $c"
done  < myfile.txt
$ cat myfile.txt
1 2 3
4
5 6
7
8 9 10 11 12 13
14
15 16 17
$  ./readme.sh
I'm a parrot!
A is 1
B is 2
C is 3
A is 4
B is
C is
A is 5
B is 6
C is
A is 7
B is
C is
A is 8
B is 9
C is 10 11 12 13
A is 14
B is
C is
A is 15
B is 16
C is 17

So we can process tons of data, wherever it comes from.
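
And since the loop is just reading stdin, any command can feed it through a pipe, too; for example, sorting the file numerically first:

$ sort -n myfile.txt | while read a b c
> do
>     echo "A is $a"
> done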

4. I only mentioned 3 uses

We could make the script a bit more useful by allowing the user to specify the file, instead of hard-coding it to "myfile.txt":

$ cat readme.sh
#!/bin/sh
echo "I'm a parrot!"
while read a b c
do
        echo "A is $a"
        echo "B is $b"
        echo "C is $c"
done < "$1"
$ cat someotherfile.txt
123
1 2 3
one two three four
$ ./readme.sh someotherfile.txt
I'm a parrot!
A is 123
B is
C is
A is 1
B is 2
C is 3
A is one
B is two
C is three four
$

Update 14 April

Updated to fix the “done < filename.txt” from the example code of the last two examples.


Timestamps for Log Files

March 11, 2007

There are two common occasions when you might want to get a timestamp:

  • If you want to create a logfile called "myapp_log.11.Mar.2007"
  • If you want to write to a logfile with "myapp: 11 Mar 2007 22:14:44: Something Happened"

Either way, you want to get the current date, in the format you prefer – for example, it’s easier if a filename doesn’t include spaces.

For the purposes of this article, though for no particular reason, I am assuming that the current time is 10:14:44 PM on Sunday the 11th March 2007.

The tool to use is, naturally enough, called “date“. It has a bucket-load of switches, but first, we’ll deal with how to use them. For the full list, see the man page (“man date“), though I’ll cover some of the more generally useful ones below.

Setting the Date/Time

The first thing to note is that date has two aspects: it can set the system clock:

# date 031122142007.44

will set the clock to 03 11 22 14 2007 44 – that is, 03=March, 11=11th day, 22 = 10pm, 14 = 14 minutes past the hour, 2007 = year 2007, 44 = 44 seconds past the minute.

Heck, I don’t even know why I bothered to spell it out, it’s obvious. Of course the year should come between the minutes and the seconds (ahem).

Getting the Date/Time

The more often used feature of the date command is to find the current system date/time, and that is what we shall focus on here. It doesn't follow tradition, in that it uses the "+" and "%" symbols, instead of the "-" symbol, for its switches.

%H = Hours, %M = Minutes, %S = Seconds, so:

$ date +%H:%M:%S
22:14:44

Which means that you can name a logfile like this:

#!/bin/sh
LOGFILE=/tmp/log_`date +%H%M%S`.log
echo Starting work > $LOGFILE
do_stuff >> $LOGFILE
do_more_stuff >> $LOGFILE
echo Finished >> $LOGFILE

This will create a logfile called /tmp/log_221444.log
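
And to get the "myapp_log.11.Mar.2007" style of name from the start of this post, you can combine %d (day), %b (abbreviated month name; this depends on the locale) and %Y (year):

LOGFILE=/tmp/myapp_log.`date +%d.%b.%Y`    # /tmp/myapp_log.11.Mar.2007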

You can also put useful information to the logfile:

#!/bin/sh
LOGFILE=/tmp/log_`date +%H%M%S`.log
echo "`date +%H:%M:%S`: Starting work" > $LOGFILE
do_stuff >> $LOGFILE
echo "`date +%H:%M:%S : Done do_stuff" >> $LOGFILE
do_more_stuff >> $LOGFILE
echo "`date +%H:%M:%S : Done do_more_stuff" >> $LOGFILE
echo Finished >> $LOGFILE

This will produce a logfile along the lines of:

$ cat /tmp/log_221444.log
22:14:44: Starting work
do_stuff : Doing stuff, takes a short while
22:14:53: Done do_stuff
do_more_stuff : Doing more stuff, this is quite time consuming.
22:18:35: Done do_more_stuff
$

Counting the Seconds

UNIX has 1st Jan 1970 as a “special” date, the start of the system clock; GNU date will tell you how many seconds have elapsed since midnight on 1st Jan 1970:

$ date +%s
1173651284

Whilst this information is not very useful in itself, it may be useful to know how many seconds have elapsed between two events:

$ cat list.sh
#!/bin/sh
start=`date +%s`
ls -R $1 > /dev/null 2>&1
end=`date +%s`

diff=`expr $end - $start`
echo "Started at $start : Ended at $end"
echo "Elapsed time = $diff seconds"
$ ./list.sh /usr/share
Started at 1173651284 : Ended at 1173651290
Elapsed time = 6 seconds
$

For more useful switches, see the man page, but here are a few handy ones:

$ date "+%a %b %d" # (in the local language)
Sun Mar 11
$ date +%D         # (show the full date)
03/11/07
$ date +%F         # (In another format)
2007-03-11
$ date +%j         # (how many days into the year)
070
$ date +%u         # (day of the week)
7
$

Tool Tip: “ls”

February 26, 2007

Yeah yeah, we know ls already.

But how much of ls's functionality do you actually use? There are so many switches to ls that when Sun added extended attributes (does anyone use that?) they found that there were no letters left, so they had to use "-@"!

So, here are a couple of handy ls options, in no particular order, for either interactive or scripting use. I'm assuming GNU ls; Solaris ls supports most GNU-style features, but "nice-to-have" features like ls -h aren't in historical UNIX ls implementations. I'll split these into two categories: Sort 'em and Show 'em. What are your favourites?

Sort ‘em

When sorting, I tend to use the “-l (long listing)” and “-r (reverse order)” switches:

Sort ‘em by Size:

ls -lSr

Sort ‘em by Date:

ls -ltr

Show ‘em

There are a number of ways to show different attributes of the files you are listing; “-l” is probably the obvious example. However, there are a few more:

Show ‘em in columns

ls -C

Forces multi-column output; useful if ls is showing one file per line when you expected columns (which it does by default when its output isn't a terminal).

Show ‘em one by one

ls -1

That’s the number 1 (one) there, not the letter l (ell). Forces one-file-per-line. Particularly useful for dealing with strange filenames with whitespace in them.
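
For example, with a made-up directory containing a filename with spaces in it:

$ ls
a file with spaces  b.txt
$ ls -1
a file with spaces
b.txt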

Show ‘em as they are

ls -F

To append symbols (“*” for executables, “/” for directories, etc) to the filename to show further information about them.

Show ‘em so I can read it

ls -lh

Human-readable filesizes, so "12567166" is shown as "12M", and "21418" as "21K". This is handy for people, but of course, if you're writing a script which wants to compare file sizes, you're better off without this (21M is bigger than 22K, after all, but a plain text sort won't see it that way!)

Show ‘em with numbers

ls -n

This is equivalent to ls -l, except that UID and GID are not looked up, so:

$ ls -l foo.txt
-rw-r--r-- 1 steve steve 46210 2006-11-25 00:33 foo.txt
$ ls -n foo.txt
-rw-r--r-- 1 1000 1000 46210 2006-11-25 00:33 foo.txt

This can be useful in a number of ways; particularly if your NIS (or other) naming service is down, or if you’ve imported a filesystem from another system.

What’s your favourite?

What are your most-used switches for the trusty old ls tool?

