Calculating Averages

March 26, 2007

The Simple Maths post seems to be the most popular article in the so-far short life of this blog.

It’s also something that I have received a few emails about recently, so I feel like posting a bit more on the subject.

I think that the code can speak for itself… We implement a loop, which calls the builtin read function (I’m not sure the “-p” flag, to provide a prompt, is universal. It does work with the Bash builtin. If it doesn’t work on your *nix, it’s really only for show, so you can live without it.

Because read works on standard input (aka “stdin”), it will work interactively from the keyboard, or direct from a file (one number per line).

We use two methods of doing maths in the shell:

  • expr, because it’s a simple and easily-read way to do simple maths: n=`expr $n + 1`

  • bc, because it is more powerful. Do have a play with bc interactively, it can do a lot... see below.

So, we can write a fairly simple script (read down, it's only actually 11 lines of code without the comments), which is actually quite versatile - it can do running averages, it can be interactive or run from cron, called from another script, even used as a function.

So, here's the code. It should be fairly self-explanatory, but do have a look at the interactive bc sample session below, to see what we are doing with bc. Also, do play with bc (some Linux distros have dropped it from the default install recently, so you'll have to yast -i bc, or equivalent)

The Script - Calculate Averages

#!/bin/sh
# Calculate mean (average) of integer data

# Initialise the variables
n=0     # n being the number of (valid) data provided
sum=0   # sum being the running total of all data

# Note that by using ^D (aka "EOF") to quit, this
# script will work just as well interactively, as
# when provided with a file containing the data.
while read -p "Enter a number (^D to quit): " x
do
        # expr is useful for simple maths
  sum=`expr $sum + $x`
  # If this fails, it was non-numeric input
  if [ "$?" -eq "0" ]; then
    # Okay, it was valid input.
    n=`expr $n + 1`
    # We can provide a "running average" here;
    # I'll comment it out for now.
    # echo "Running Average:"
    # echo "scale=2;$sum/$n" | bc
    # echo
  fi
done

# Okay, we've done the loop.
# Present the data.
echo "Overall Average:"
        # bc is more useful than expr for
        # more involved maths, though its
        # syntax, particularly in a script,
        # is possibly less obvious.
        # Using bc interactively is easier
        # than using it in a shell script
echo "scale=2;$sum/$n" | bc

Interactive bc

The bold text is user input. The rest is from bc:

$ bc
bc 1.06
Copyright 1991-1994, 1997, 1998, 2000 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'.
ibase=2 I'll be entering base2 (binary)
01001001 So, I enter 1001001 (73)
73 And it replies with the answer in base 10
ibase=10 Does this set the input base back to 10?
10 Let's input "10", it should reply "10"
2 No, we entered "10" in base 2, which is 2!
ibase=1010 So, 10 in binary is 1010 (8+2)
10 We say 10
10 And bc says 10. Good, we're back to normal
11 And the same for 11
11 Good, it works. Now for some maths..
1 + 2 (tricky stuff!)
3 Yes, that's good, 1+2=3
23 + 34 + 45 + 56 We're not limited to x+y
158 So we can build up our sums
10/3 10/3 = 3 and a third, right?
3 Not to 0 decimal places.
scale=2 Okay, let's have 2 decimal places
10/3 Now ask again
3.33 That's better
scale=5 Or to 5 points?
10/3 Ask again...
3.33333 And it works!
scale=1 One point:
10/3 And ask again
3.3 As we expected.
scale=0 So, scale=0 means 0 places
10/3 Should say 3
3 Yes, we're back to where we started.

Back to the Script

That made a nice break. Now we'll go back to the script... it's only actually 11 lines long:

#!/bin/sh
n=0
sum=0
while read x
do
  sum=`expr $sum + $x`
  if [ "$?" -eq "0" ]; then
    n=`expr $n + 1`
  fi
done
echo "scale=2;$sum/$n" | bc

And as I said, we can use it interactively, or with a file of data:

$ cat data.txt
4
5
6
$ average.sh 
5.00

Because, under *nix, EVERYTHING IS A FILE, even the keyboard!


Links to bash and sed FAQs

March 19, 2007

A few more interesting / useful links …

17 Bash Pitfalls and (from the same site) BashFaq

Also the SED FAQ at SourceForge, which I hadn’t come across before today (the FAQ that is, not SourceForge!): http://sed.sourceforge.net/sedfaq.html (to go with http://sed.sourceforge.net/sed1line.txt which I plugged previously)


Variables – When to use a ‘dollar’ symbol

March 19, 2007

If you’ve used any other language, then you are probably quite familiar with how variables work, and how they are referred to. If not, then the following examples should suffice to cover the two most common options:

PHP (amongst others)

$foo="hello, world";
echo $foo;

This sets the “foo” variable to be “hello, world!”, and then echoes it out. Notice how “foo” is always preceded by a dollar ($) symbol. That denotes it as a variable. Whether we’re setting or reading its contents, it’s always “$foo“.

C (and others again)

int main() {
  int foo;
  int bar;

  foo=2;
  bar = foo * 5;
  printf("The answer is %d\n", bar);
}

This will provide the useful fact that 2*5 = 10. However, no “$” dollar symbols are used at all. That’s the “other” option… the compiler knows the rules, depending on the chosen language.

However, the shell falls part way in between these two examples.

If you want to quote the value of a variable, then you need the dollar. If you want to set it, then drop the dollar:

Shell Script

#!/bin/sh
foo=Hello
echo $foo World

This will say “Hello World”. But did you see what happened with the dollars? To set a variable, no dollar. To read it, use the dollar.

This is particularly confusing to many users of the shell. I can’t even provide a good reason as to why it should work this way, whether historically or pragmatically.

Similarly, when reading variable contents, do not use the dollar:

#!/bin/sh
echo "What do you want to tell the world?"
read msg
echo $msg World

This will tell your message to the world… if you say “Hello”, then it will say “Hello World”. If you say “Goodbye Cruel”, then it will say “Goodbye Cruel World”.

Notice the dollars… whilst strange, it is at least consistent. You use the dollar symbol to quote the content of the variable; otherwise, leave it out.

For more in-depth stuff about variables, check out


Bashish

March 13, 2007

Just a quick post, to plug a rather thorough article about changing the appearance of the bash prompt:

http://systhread.net/texts/200703bashish.php


Timestamps for Log Files

March 11, 2007

There are two common occasions when you might want to get a timestamp

  • If you want to create a logfile called “myapp_log.11.Mar.2007″
  • If you want to write to a logfile with “myapp: 11 Mar 2007 22:14:44: Something Happened”

Either way, you want to get the current date, in the format you prefer – for example, it’s easier if a filename doesn’t include spaces.

For the purposes of this article, though for no particular reason, I am assuming that the current time is 10:14:44 PM on Sunday the 11th March 2007.

The tool to use is, naturally enough, called “date“. It has a bucket-load of switches, but first, we’ll deal with how to use them. For the full list, see the man page (“man date“), though I’ll cover some of the more generally useful ones below.

Setting the Date/Time

The first thing to note, is that date has two aspects: It can set the system clock:

# date 031122142007.44

will set the clock to 03 11 22 14 2007 44 – that is, 03=March, 11=11th day, 22 = 10pm, 14 = 14 minutes past the hour, 2007 = year 2007, 44 = 44 seconds past the minute.

Heck, I don’t even know why I bothered to spell it out, it’s obvious. Of course the year should come between the minutes and the seconds (ahem).

Getting the Date/Time

The more often used feature of the date command, is to find the current system date / time, and that is what we shall focus on here. It doesn’t follow tradition, in that it uses the “+” and “%” symbols, instead of the “-” symbol, for its switches.

H = Hours, M = Minutes, S = Seconds, so:

$ date +%H:%M:%S
22:14:44

Which means that you can name a logfile like this:

#!/bin/sh
LOGFILE=/tmp/log_`date +%H%M%S`.log
echo Starting work > $LOGFILE
do_stuff >> $LOGFILE
do_more_stuff >> $LOGFILE
echo Finished >> $LOGFILE

This will create a logfile called /tmp/log_221444.log

You can also put useful information to the logfile:

#!/bin/sh
LOGFILE=/tmp/log_`date +%H%M%S`.log
echo `date +%H:%M:%S : Starting work > $LOGFILE
do_stuff >> $LOGFILE
echo "`date +%H:%M:%S : Done do_stuff" >> $LOGFILE
do_more_stuff >> $LOGFILE
echo "`date +%H:%M:%S : Done do_more_stuff" >> $LOGFILE
echo Finished >> $LOGFILE

This will produce a logfile along the lines of:

$ cat /tmp/log_221444.log
22:14:44: Starting work
do_stuff : Doing stuff, takes a short while
22:14:53: Done do_stuff
do_more_stuff : Doing more stuff, this is quite time consuming.
22:18:35: Done do_more_stuff
$

Counting the Seconds

UNIX has 1st Jan 1970 as a “special” date, the start of the system clock; GNU date will tell you how many seconds have elapsed since midnight on 1st Jan 1970:

$ date +%s
1173651284

Whilst this information is not very useful in itself, it may be useful to know how many seconds have elapsed between two events:

$ cat list.sh
#!/bin/sh
start=`date +%s`
ls -R $1 > /dev/null 2>&1
end=`date +%s`

diff=`expr $end - $start`
echo "Started at $start : Ended at $end"
echo "Elapsed time = $diff seconds"
$ ./list.sh /usr/share
Started at 1173651284 : Ended at 1173651290
Elapsed time = 6 seconds
$

For more useful switches, see the man page, but here are a few handy ones:

$ date "+%a %b %d" # (in the local language)
Sun Mar 11
$ date +%D         # (show the full date)
03/11/07
$ date +%F         # (In another format)
2007-03-11
$ date +%j         # (how many days into the year)
070
$ date +%u         # (day of the week)
7
$

Looping in the shell (for and while)

March 7, 2007

In programming, the two most common types of loop are “for” and “while” loops. We can do both (and “repeat” loops, too, because that’s just a special case of the “while” loop) with the *nix shell. I’ve got some more detail in the tutorial.

for loops

Some languages have a “foreach” command; if you are used to such a language, then treat the shell’s for command as equivalent to foreach. If not, then don’t worry about it, just watch the examples, it’s about as simple as it can be.

$ for artist in Queen "Elvis Costello" Metallica
> do
> echo "I like $artist"
> done
I like Queen
I like Elvis Costello
I like Metallica
$

The for loop will simply go through whatever text it gets passed, and do the same stuff with each item.

Note that for “Elvis Costello” to be marked as one artist, not “Elvis” and “Costello”, I had to put quotes around the words. However, if these were files, then the following would suffice:

$ ls -l
total 0
-rw-r--r-- 1 steve steve 0 2007-03-07 00:45 Elvis Costello
-rw-r--r-- 1 steve steve 0 2007-03-07 00:45 Metallica
-rw-r--r-- 1 steve steve 0 2007-03-07 00:45 Queen
$ for artist in *
> do
> echo "I like $artist"
> done
I like Elvis Costello
I like Metallica
I like Queen
$

This time around, because we passed them via the shell’s interpreter, they are parsed in alphabetical order.

Many languages can do better than “foreach”, though: “for i=1 to 99 step 3“, for example, to step through 1,4,7,10 .. 91, 94, 97.

We can do this with a while loop.

while loops

While loops are not quite as simple as for loops; they have some kind of condition to match; when the condition does not match, the loop will exit.

The examples above, stepping through from 1 to 100 in increments of 3 (1, 4, 7, 10, … 91, 94, 97), can easily enough be done with a while loop:

$ i=1
$ while [ "$i" -lt "100" ]
> do
>   echo $i
>   i=`expr $i + 3`
> done
1
4
7
...
91
94
97

The “i=`expr $i + 3`” means “increment ‘i’ by 3″ (“i = i + 3” in most other languages).

The “-lt” means “is less than” (“-le” means “is less than or equal too; see “man test”, or http://steve-parker.org/sh/test.shtml and http://steve-parker.org/sh/quickref.shtml)


Follow

Get every new post delivered to your Inbox.