Calculating Averages

The Simple Maths post seems to be the most popular article in the so-far short life of this blog.

It’s also something that I have received a few emails about recently, so I feel like posting a bit more on the subject.

I think that the code can speak for itself… We implement a loop, which calls the builtin read function. (I’m not sure that the “-p” flag, which provides a prompt, is universal. It does work with the Bash builtin. If it doesn’t work on your *nix, it’s really only for show, so you can live without it.)

Because read works on standard input (aka “stdin”), it will work interactively from the keyboard, or direct from a file (one number per line).

We use two methods of doing maths in the shell:

  • expr, because it’s a simple and easily-read way to do simple maths: n=`expr $n + 1`

  • bc, because it is more powerful. Do have a play with bc interactively, it can do a lot... see below.
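To make the difference concrete, here is a minimal sketch of the same sums done both ways. It assumes that bc is installed, and the numbers are only illustrative.

```shell
#!/bin/sh
# expr does integer arithmetic only, so the fraction is dropped
expr 10 / 3                  # prints 3

# bc also truncates by default (scale=0) ...
echo "10/3" | bc             # prints 3

# ... but it will keep as many decimal places as you ask for
echo "scale=2; 10/3" | bc    # prints 3.33

# expr needs the * escaped from the shell; bc, reading a quoted string, does not
expr 6 \* 7                  # prints 42
echo "6*7" | bc              # prints 42
```

So expr is fine for counting, while bc is the one to reach for as soon as decimal places matter.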

So, we can write a fairly simple script (read down, it's only 11 lines of code without the comments) which is quite versatile: it can do running averages, it can be interactive or run from cron, be called from another script, or even be used as a function.

So, here's the code. It should be fairly self-explanatory, but do have a look at the interactive bc sample session below, to see what we are doing with bc. Also, do play with bc (some Linux distros have dropped it from the default install recently, so you may have to install it with yast -i bc, or your distro's equivalent).

The Script - Calculate Averages

#!/bin/sh
# Calculate mean (average) of integer data

# Initialise the variables
n=0     # n being the number of (valid) data provided
sum=0   # sum being the running total of all data

# Note that by using ^D (aka "EOF") to quit, this
# script will work just as well interactively, as
# when provided with a file containing the data.
while read -p "Enter a number (^D to quit): " x
do
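  # expr is useful for simple maths. It exits with status 0, or
  # with 1 if the result is zero; 2 or more means that it failed,
  # which here means non-numeric input.
  if new=`expr "$sum" + "$x"` || [ "$?" -eq 1 ]; then
    # Okay, it was valid input.
    sum=$new
    n=`expr $n + 1`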
    # We can provide a "running average" here;
    # I'll comment it out for now.
    # echo "Running Average:"
    # echo "scale=2;$sum/$n" | bc
    # echo
  fi
done

# Okay, we've done the loop.
# Present the data.
echo "Overall Average:"
# bc is more useful than expr for
# more involved maths, though its
# syntax, particularly in a script,
# is possibly less obvious.
# Using bc interactively is easier
# than using it in a shell script
echo "scale=2;$sum/$n" | bc
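
One case that the script does not cover is input containing no valid numbers at all: n stays at 0, and bc is asked to divide by zero. Below is a hedged sketch of one way to guard against that, wrapped up as a function. The name safe_average, and the case pattern used to recognise whole numbers, are my own illustration and not part of the script above.

```shell
#!/bin/sh
# Hypothetical variant: safe_average is an illustrative name.
# It reads one integer per line from stdin and prints the mean.
safe_average() {
  n=0
  sum=0
  while read x
  do
    # Accept only optionally-signed whole numbers; skip anything else
    # (empty lines, a lone "-", other characters, a "-" after the start).
    case "$x" in
      ''|-|*[!0-9-]*|?*-*) continue ;;
    esac
    sum=`expr $sum + $x`
    n=`expr $n + 1`
  done
  # Refuse to divide by zero when nothing valid was read
  if [ "$n" -eq 0 ]; then
    echo "No valid data" >&2
    return 1
  fi
  echo "scale=2;$sum/$n" | bc
}

# The line "five" is skipped, so this averages 4 and 6
printf '4\nfive\n6\n' | safe_average    # prints 5.00
```

Because the input is checked before expr ever sees it, this version does not need to look at expr's exit status at all.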

Interactive bc

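In the session below, each line is followed by my commentary. The assignments (ibase=..., scale=...) and the expressions are typed by the user; the bare results that follow them are bc's replies: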

$ bc
bc 1.06
Copyright 1991-1994, 1997, 1998, 2000 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'.
ibase=2                 # I'll be entering base 2 (binary)
01001001                # So, I enter 1001001 (73)
73                      # And it replies with the answer in base 10
ibase=10                # Does this set the input base back to 10?
10                      # Let's input "10", it should reply "10"
2                       # No, we entered "10" in base 2, which is 2!
ibase=1010              # So, 10 in binary is 1010 (8+2)
10                      # We say 10
10                      # And bc says 10. Good, we're back to normal
11                      # And the same for 11
11                      # Good, it works. Now for some maths...
1 + 2                   # (tricky stuff!)
3                       # Yes, that's good, 1+2=3
23 + 34 + 45 + 56       # We're not limited to x+y
158                     # So we can build up our sums
10/3                    # 10/3 = 3 and a third, right?
3                       # Not to 0 decimal places.
scale=2                 # Okay, let's have 2 decimal places
10/3                    # Now ask again
3.33                    # That's better
scale=5                 # Or to 5 points?
10/3                    # Ask again...
3.33333                 # And it works!
scale=1                 # One point:
10/3                    # And ask again
3.3                     # As we expected.
scale=0                 # So, scale=0 means 0 places
10/3                    # Should say 3
3                       # Yes, we're back to where we started.
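
The same statements can also be sent to bc from a script, either by piping them in or by using a here-document. This is a small sketch rather than anything definitive. Note that obase (the output base) does not appear in the session above; I include it here only as the natural counterpart of ibase.

```shell
#!/bin/sh
# Convert binary 1001001 to decimal
echo "ibase=2; 1001001" | bc     # prints 73

# And decimal 73 back to binary, using obase
echo "obase=2; 73" | bc          # prints 1001001

# scale controls the number of decimal places in a division
echo "scale=5; 10/3" | bc        # prints 3.33333

# A here-document is handy for several statements at once
bc <<EOF
scale=2
10/3
scale=0
10/3
EOF
# prints 3.33, then 3
```

Each invocation of bc starts afresh, so there is no need to reset ibase or scale afterwards as we had to in the interactive session.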

Back to the Script

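That made a nice break. Now we'll go back to the script... it's only actually 11 lines long. The test on $? allows for the fact that expr exits with status 1 when its result is zero; only a status of 2 or more means that the input was bad:

#!/bin/sh
n=0
sum=0
while read x
do
  if new=`expr "$sum" + "$x"` || [ "$?" -eq 1 ]; then
    sum=$new
    n=`expr $n + 1`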
  fi
done
echo "scale=2;$sum/$n" | bc

And as I said, we can use it interactively, or with a file of data:

$ cat data.txt
4
5
6
$ average.sh < data.txt
5.00

Because, under *nix, EVERYTHING IS A FILE, even the keyboard!
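
As a rough illustration of that point, the sketch below feeds the same read loop from a pipe, from a here-document and from a file. sum_lines is a hypothetical helper, not part of the script above, and the temporary file (made with mktemp, which I am assuming your system has) is created only so that the example is self-contained.

```shell
#!/bin/sh
# Hypothetical helper: adds up the integers arriving on standard input
sum_lines() {
  total=0
  while read x
  do
    total=`expr $total + $x`
  done
  echo "$total"
}

# 1. From a pipe
printf '4\n5\n6\n' | sum_lines

# 2. From a here-document
sum_lines <<EOF
4
5
6
EOF

# 3. From a file
tmp=`mktemp`
printf '4\n5\n6\n' > "$tmp"
sum_lines < "$tmp"
rm -f "$tmp"
```

All three calls print 15; the loop neither knows nor cares where its standard input comes from.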

11 Responses to Calculating Averages

  1. nir says:

    for some reason on my system, expr did not work with floating point numbers. I had to change the script to use bc for the calculation:

    #!/bin/sh
    n=0
    sum=0
    while read x
    do
      sum=`echo $sum + $x|bc`
      if [ "$?" -eq "0" ]; then
        n=`expr $n + 1`
      fi
    done
    echo "scale=2;$sum/$n" | bc

  2. unixshell says:

    Unfortunately, expr can only deal with integers

  3. Bruce says:

    I ran this script on tens of thousands of numbers and it takes a *long* time. That’s because you spawn an expr not once but twice for every number you average. Here is the script with your expr statements changed to native bash arithmetic expansion.

    #!/bin/sh
    n=0
    sum=0
    while read x
    do
      sum=$(( $sum + $x ))
      if [ "$?" -eq "0" ]; then
        (( n += 1 ))
      fi
    done
    echo "$(echo "scale=2;$sum/$n" | bc) $n"

    The last command, bc, is changed a bit to also show the number of elements in the average.

    Time to average 10000 items:
    OLD: 25.76s
    NEW: 0.44s (440ms)

  4. unixshell says:

    Thanks Bruce. There are lots of things that Bash can do that Bourne can’t do.

    These days, most *nix boxes have Bash available, but /bin/sh still points to the Bourne shell. Indeed, Debian is going back from /bin/sh being bash, to dash. Ubuntu has already replaced /bin/sh with dash.

    http://release.debian.org/lenny/goals.txt

    Steve

    • On most Linux distros, /bin/sh is a symlink to bash (or dash). When bash is run under that name it actually runs as “Bash in POSIX compatibility mode”, which should be like the Bourne Shell, but neither Bash nor Dash is good for actually testing whether your script runs on a real Bourne Shell.

      For those who want to make sure that their script is Bourne Shell compatible, but who don’t have an actual Bourne Shell available, the Heirloom Bourne Shell is something you may want to look into. It comes as a source package with no configure script; you just have to make a few very simple modifications to a very simple Makefile, compile and install, and you have a Bourne Shell with NO extensions whatsoever. With that I’ve learned a lot about how much imagination, and how many crazy hacks, you may need in order to do things that in Bash are quite normal :) It may be fun. I for one like to do some hobby projects “just to see if I can”, where I try to do something in Bourne Shell, like transforming a script that relies on recursive function calls into Bourne Shell, which has no function-local variables, only script-global ones (hint: a subprocess inside the function can help. Of course the things you can do are limited: a subprocess can have its own variables, but you can’t change variables outside it, he-he).

  5. slevin says:

    hi,

    what about

    $ awk '{ sum += $1} END {print sum/NR}' /path/to/data.txt

  6. unixshell says:

    Thanks for that, slevin. I’m sure that there are lots of ways of doing it with awk, perl, and many other languages.

  8. delt says:

    #!/bin/sh
    # Copied/pasted directly from: http://nixshell.wordpress.com/2007/03/26/calculating-averages/
    # Then modified a little – mainly for taking numbers directly from the command line. – delt.

    n=0 # n being the number of (valid) data provided
    sum=0 # sum being the running total of all data

    # "Note that by using ^D (aka "EOF") to quit, this script will work just as well blah blah"
    # -- Thanks, so i can make a function out of this and pipe data directly into it =)

    function calc_avg() {
      while read -p "" x; do
        sum=`expr $sum + $x` && n=$[n+1] # "expr" return status indicates if valid integer or not
      done

      # ok, finished adding, now calculate the average.
      echo "scale=2;$sum/$n" | bc
    }

    # like said above, just pipe $* args through calc_avg with a newline between each one.
    echo $* | tr ' ' '\n' | calc_avg

    ### TODO: write a version that accepts floating point numbers as arguments ###

  9. unixshell says:

    Nicely done, delt. Functions and scripts can easily replace each other too, which is a really nice feature: $1 $2 $#, etc all work just as well for a function as for a script.

    PS. There’s no need for '-p ""', it should default to a blank prompt
