Efficient Shell Scripting

January 22, 2014

In an article on ITworld.com, Sandra Henry-Stocker gives advice about writing efficient shell scripts.

Whilst a lot of the principles provided would appear to make sense, most of them do not actually make any significant difference, and some are based on flawed measurements.

First, Henry-Stocker suggests replacing this script:

for day in Mon Tue Wed Thu Fri
do
    echo $day
    touch $day.log
done

… with this one:

for day in Mon Tue Wed Thu Fri
do
    if [ $verbose ]; then echo $day; fi
    touch $day.log
done

I ran both scripts 5,000 times, like this:

for x in `seq 1 5000`
do
  for day in Mon Tue Wed Thu Fri
  do
      echo $day
      touch $day.log
  done
done

… and similarly for the second script.

The “slow” script ran in 21.425 seconds on my PC. The “fast” script does not echo anything, but it has to parse and execute the test instead, so it took longer: 25.178 seconds, or 17% slower than simply running “echo” every time.

I would also note that the syntax if [ $verbose ] is asking for trouble; in real scripts I’m sure she would agree that you should use something like “if [ "$verbose" = "y" ]”.
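To illustrate why the bare test is risky, here is a minimal sketch. The value “no” and the helper name is_verbose are my own choices for the example, not anything from the article.

```shell
# The bare test only asks "is the string non-empty?", so "no" counts as true.
verbose=no
if [ $verbose ]; then bare_result=true; else bare_result=false; fi

# Comparing against an explicit value says what is meant. Note that = compares
# strings; -eq is for integers only.
is_verbose() {
    [ "$verbose" = "y" ]
}
if is_verbose; then explicit_result=true; else explicit_result=false; fi

echo "bare test: $bare_result, explicit test: $explicit_result"
# prints: bare test: true, explicit test: false
```

Quoting the variable also keeps the test well-formed when the value is empty or contains spaces.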

If the code is running on an old Sun framebuffer console, which will update the screen at around one second per line, all this needless echoing would make a difference, but in any real-world situation in 2014, the overhead of the test costs far more than writing the output.

Over on page two (because it’s all about selling advertising space :-) ), order of comparison is taken on. Whilst in principle, it could make a significant difference, the example given involves a single if statement, no fork()ing, and some simple variable comparisons:

echo -n "enter foo> "; read foo;
echo -n "enter bar> "; read bar;
echo -n "enter tmp> "; read tmp;


if [[ $foo -eq 1 && $bar -eq 2 && $tmp -eq 3 ]]; then
    echo ok
fi

Taking out the read from the tests, we find that it takes 0.083 seconds to do 5,000 runs of the full test, with all variables matching (so all three conditions have to be tested each time), and 0.033 seconds when the first condition does not match, so it takes just over twice as long to run three tests as it does to run one test.

This is a significant difference, but it’s not the 1.195 seconds per iteration suggested by the article, it’s 0.00001 second per iteration. Taking Sandra Henry-Stocker’s results at face value, my tests which each took well under 1 second, would have taken 4 hours 5 minutes, or 2 hours 26 minutes respectively.
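The arithmetic behind those per-iteration figures can be checked with a quick awk calculation. This is just a sketch using my two timings quoted above; the variable names are made up for the example.

```shell
# My measurements: 5,000 runs took 0.083 s (all three tests) and 0.033 s (one test).
summary=$(awk 'BEGIN {
    runs = 5000; all_three = 0.083; first_only = 0.033
    printf "ratio: %.2f\n", all_three / first_only
    printf "extra cost per iteration: %.5f seconds\n", (all_three - first_only) / runs
}')
echo "$summary"
# prints:
# ratio: 2.52
# extra cost per iteration: 0.00001 seconds
```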

If one comparison were particularly time-consuming, it would make a more effective example. Here, if the find command takes 10 seconds to run, but foo is usually 1, then this will take 11 seconds:

if find /var -name foo.txt && [ "$foo" -eq "93" ]; then ...

whilst this will take 1 second, roughly ten times faster:

if [ "$foo" -eq "93" ] && find /var -name foo.txt; then ...

The example provided just doesn’t match the claimed results.
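The short-circuit behaviour is easy to see with a stand-in for the expensive command. In this sketch, slow_check is a hypothetical function that sleeps for one second in place of find, and counts how often it is called.

```shell
# slow_check is a hypothetical stand-in for an expensive command such as find.
calls=0
slow_check() {
    calls=$((calls + 1))
    sleep 1
}

foo=1

# Expensive command first: it always runs, even though the test then fails.
if slow_check && [ "$foo" -eq 93 ]; then echo "match"; fi
expensive_first=$calls

# Cheap test first: && short-circuits, so slow_check never runs.
calls=0
if [ "$foo" -eq 93 ] && slow_check; then echo "match"; fi
cheap_first=$calls

echo "expensive first: $expensive_first call(s); cheap first: $cheap_first call(s)"
# prints: expensive first: 1 call(s); cheap first: 0 call(s)
```

The saving only appears when the cheap test usually fails; if foo were usually 93, both orderings would run the slow command.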

Avoiding unnecessary cat, echo and similar statements is good advice; not as significant as it was 10 years ago, and much less significant on Linux, where fork()ing is much faster than on Unix.
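As a small illustration of the “unnecessary cat” point, both commands below give the same answer, but the first starts an extra process and a pipe. The sample file and its contents are invented for the example.

```shell
sample=$(mktemp)
printf 'alpha\nbeta\nalpha beta\n' > "$sample"

# One extra process and a pipe, just to feed grep its input:
with_cat=$(cat "$sample" | grep -c alpha)

# grep can open the file itself:
without_cat=$(grep -c alpha "$sample")

echo "with cat: $with_cat, without cat: $without_cat"
# prints: with cat: 2, without cat: 2
rm -f "$sample"
```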


Shell Scripting Tutorial on Kindle

March 29, 2013

Unix & Linux Shell Scripting Tutorial on Kindle

The Shell Scripting tutorial at http://steve-parker.org is now available natively on the Kindle!

USA (amazon.com)

UK (amazon.co.uk)

Similarly, you can search for “B00C2EGNSA” on any Amazon site, or just go to http://www.amazon.COUNTRY/dp/B00C2EGNSA (where “COUNTRY” is fr, de, etc.) for your local equivalent.


Shell Scripting page on Facebook

July 11, 2011

Shell Scripting

My Shell Scripting book, due out on August 12th by Wrox, now has a page on Facebook: http://www.facebook.com/pages/Shell-Scripting/175263275869249. Feel free to “Like” it, and get the latest updates on the project.

I have the final pages to proofread this week, ready to go to the printers. It’s looking like 576 pages, a little bit over the target of 504 pages, but close enough.

I will update the Table of Contents at http://sgpit.com/book/ once the page count is finalised.


Update on Shell Scripting Recipes book

April 23, 2011

Wow, it’s been nearly two months since I last made a post about the upcoming book on shell scripting. I’m really sorry, I had intended to give much more real-time updates here. The book focusses on GNU/Linux and the Bash shell in particular, but it does cover the other environments too – Solaris, Bourne Shell, as well as mentions for ksh, zsh, *BSD and the rest of the Unix family.

In terms of page count, it is currently 89% finished. There is still the proof-reading to be done, and whatever delivery details the publishers need to deal with, so the availability date of some time in August is still on schedule. I notice that http://amzn.com/1118024486 is already offering a massive discount on the cover price; I have no idea what that is about, I’m trying not to take offence – they can’t have dismissed the book already as I have not quite finished writing it yet! So hopefully you can get a bargain while it’s cheap.

The subject matter has the potential to be quite boring if presented as a list of tedious system administration tasks, so I have tried to make it light and fun whenever I can; it’s still with Legal at the moment, but I hope to have a Space Invaders clone written entirely in the shell published in the book. People don’t tend to see the Shell as being capable of doing anything interactive at all, so it is nice to write a playable interactive game in the shell. The main problem in terms of playability is in working out how much to slow it down, and at what stage! Of course, being a shell script, you can tweak the starting value, the level at which it speeds up, and anything else about the gameplay. If the game doesn’t make it in to the book, I’ll post it here anyway, and will welcome your contributions on gameplay.

Other than games, I’ve got recipes for init scripts, conditional execution, translating scripts into other (human) languages, even writing CGI scripts in the shell. There is coverage of arrays, functions, libraries, process control, wildcards and filename expansion, pipes and pipelines, exec and redirection of input and output; this book aims to cover pretty much all that you need to know about shell scripting without being a tedious list of what the bash shell can do.

There is a status page at http://sgpit.com/book which also has order information; you can pre-order your copy from there.


Ten Good Unix Habits

June 22, 2010

IBM’s DeveloperWorks has 10 Good Unix Habits, which apply to GNU/Linux at least as much as to Unix.

I would expect that most experienced admins can guess the content of 5-7 of these 10 points, just from the title (for example, item 1 is a reference to “mkdir -p”, plus another related syntax available to Bash users). I would be surprised if you knew all ten:

1. Make directory trees in a single swipe.
2. Change the path; do not move the archive.
3. Combine your commands with control operators.
4. Quote variables with caution.
5. Use escape sequences to manage long input.
6. Group your commands together in a list.
7. Use xargs outside of find.
8. Know when grep should do the counting — and when it should step aside.
9. Match certain fields in output, not just lines.
10. Stop piping cats.

How many did you get?
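For item 1, a minimal sketch of building a tree in one go. The directory names are made up, and I am assuming the Bash-only syntax referred to is brace expansion, which is shown as a comment so that the sketch still runs in a plain POSIX shell.

```shell
base=$(mktemp -d)   # scratch location, used here only for illustration

# -p creates any missing parent directories, and accepts several paths at once.
mkdir -p "$base/project/lib" "$base/project/bin" "$base/project/doc/html"

# In Bash, brace expansion generates the same three paths from one word:
#   mkdir -p "$base"/project/{lib,bin,doc/html}

tree=$(cd "$base" && find project -type d | sort)
echo "$tree"
rm -r "$base"
```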


Use of pipes, and other nifty tricks

December 18, 2009

http://www.tuxradar.com/content/command-line-tricks-smart-geeks has some useful tricks. A lot of it is presented as being bash-specific, but isn’t. Also, a lot seems Linux-specific, but isn’t. Lots of useful info for all Unix/Linux admins here. These hints go on and on; hardly any of them are the generic stuff you often see on Ubuntu forums, stumbleupon, and so on.


find, locate, whereis, which, type

September 16, 2009

I suspect that most Linux admins know 3 or 4 of these five commands, and regularly use 2 or 3 of them.

linuxhaxor has a useful introduction to all five, with the most common uses for each of them.

Note that locate requires a regular run of updatedb – the article says that “The database is automatically created and updated daily” which is true for most distributions, but it depends on your cron setup – you can update the locate db as frequently as you wish. Another thing to note about locate is that it will not use the (normally root-generated) database to tell you (as a non-privileged user) about files which you would not otherwise know about.
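A rough sketch of how some of these differ in practice. I have left out locate, whereis and which because they are not installed everywhere; the command name ls and the file name foo.txt are arbitrary examples.

```shell
cmd=ls   # example command name, chosen only for illustration

# type and command -v ask the shell itself, so they also know about
# builtins and functions, not just files on the PATH.
type "$cmd"
cmd_path=$(command -v "$cmd")
echo "command -v: $cmd_path"

# find walks the filesystem live; there is no database to go stale.
scratch=$(mktemp -d)
touch "$scratch/foo.txt"
found=$(find "$scratch" -name 'foo.txt')
echo "find: $found"
rm -r "$scratch"
```

The exact paths printed will vary from system to system.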


get the width of the terminal

September 8, 2009

A quick and easy way to get the width of your terminal is the command stty size. I have used it with diff like this:

diff -y -W `stty size | cut -d" " -f2` --suppress-common-lines oldfile newfile

Note: This stty option is not available on Solaris. However, if you have it installed, the /usr/openwin/bin/resize command can be used to set the COLUMNS variable.

update: This post originally said “width of your Linux terminal” but as noted in the comments, this feature of stty is also available in *BSD implementations, even though it is not part of the POSIX standard. So you should expect this to work on GNU and BSD systems (eg, most GNU/Linux distros, most *BSDs, including OSX) but not on all POSIX-compliant systems (eg, Solaris). I would assume that AIX, HPUX, SCO, the other “traditional” UNIX systems would also not support this, though I have not (yet) tested any of them. YMMV.
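Given that portability caveat, a script might want a fallback. Here is a minimal sketch; the function name term_width and the default of 80 columns are my own assumptions.

```shell
# term_width: print the terminal width, falling back when stty cannot help.
term_width() {
    # "stty size" prints "rows columns"; it fails when stdin is not a terminal
    # or when this stty does not support the option.
    width=$(stty size 2>/dev/null | cut -d" " -f2) || width=""
    case $width in
        ''|0|*[!0-9]*) width=${COLUMNS:-80} ;;   # 80 is an assumed default
    esac
    echo "$width"
}

diff_width=$(term_width)
echo "would run: diff -y -W $diff_width --suppress-common-lines oldfile newfile"
```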


Linux Command Directory

May 16, 2009

I just found this page on the O’Reilly website – a Linux Command Directory

Click on any of the 687 commands below to get a description and list of available options. All links in the command summaries point to the online version of the book on Safari Bookshelf.

It doesn’t cover everything (what could?) but it could be a useful page to bookmark.


awk one-liners

April 1, 2009

I have previously plugged the great list of sed 1-liners at http://sed.sourceforge.net/sed1line.txt.

Here is a similar (if shorter) list of handy awk 1-liners:

http://www.sap-basis-abap.com/unix/awk-one-liner-tips.htm:

Print column1, column5 and column7 of a data file or output of any columns list

awk '{print $1, $5, $7}' data_file

cat file_name |awk '{print $1, $5, $7}'

ls -al |awk '{print $1, $5, $7}' -- Prints file_permissions,size and date

List all file names whose file size is greater than zero.

ls -al |awk '$5 > 0 {print $9}'

List all files whose file size is equal to 512 bytes.

ls -al |awk '$5 == 512 {print $9}'

print all lines

awk '{print }' file_name

awk '{print $0}' file_name

Number of lines in a file


awk ' END {print NR}' file_name

Number of columns in each row of a file

awk '{print NF}' file_name

Sort the output of file and eliminate duplicate rows

awk '{print $1, $5, $7}' data_file |sort -u

List all file names whose file size is greater than 512 bytes and owner is “oracle”

ls -al |awk '$3 == "oracle" && $5 > 512 {print $9}'

List all file names whose owner could be either “oracle” or “root”

ls -al |awk '$3 == "oracle" || $3 == "root" {print $9}'

List all the files whose owner is not “oracle”

ls -al |awk '$3 != "oracle" {print $9}'

List all lines which have at least one non-blank character

awk 'NF > 0 {print }' file_name

List all lines longer than 50 characters

awk 'length($0) > 50 {print }' file_name

List first two columns

awk '{print $1, $2}' file_name

Swap first two columns of a file and print

awk '{temp = $1; $1 = $2; $2 = temp; print }' file_name

Replace first column as “ORACLE” in a data file

awk '{$1 = "ORACLE"; print }' data_file

Remove first column values in a data file

awk '{$1 =""; print }' data_file

Calculate total size of a directory in Mb

ls -al |awk '{total +=$5};END {print "Total size: " total/1024/1024 " Mb"}'

Calculate total size of a directory including sub directories in Mb

ls -lR |awk '{total +=$5};END {print "Total size: " total/1024/1024 " Mb"}'


Find largest file in a directory including sub directories

ls -lR |awk '{print $5 "\t" $9}' |sort -n |tail -1
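To try a few of these without depending on ls output, here is a small sketch run against a throwaway data file. The file contents and variable names are invented for the example.

```shell
data=$(mktemp)
printf 'one two three\nfour five six\n\nseven eight nine\n' > "$data"

# Number of lines in the file (the blank line counts too)
line_count=$(awk 'END {print NR}' "$data")

# Number of lines with at least one field
nonblank=$(awk 'NF > 0 {n++} END {print n}' "$data")

# Swap the first two columns, showing only the first line
swapped=$(awk '{temp = $1; $1 = $2; $2 = temp; print}' "$data" | head -n 1)

echo "lines: $line_count, non-blank: $nonblank, swapped: $swapped"
# prints: lines: 4, non-blank: 3, swapped: two one three
rm -f "$data"
```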

