In an article on ITworld.com, Sandra Henry-Stocker gives advice about writing efficient shell scripts.
Whilst many of the principles given appear to make sense, most of them make no significant difference in practice, and some of the measurements are simply wrong.
First, Henry-Stocker suggests replacing this script:
for day in Mon Tue Wed Thu Fri
do
    echo $day
    touch $day.log
done
… with this one:
for day in Mon Tue Wed Thu Fri
do
    if [ $verbose ]; then echo $day; fi
    touch $day.log
done
I ran both scripts 5,000 times, like this:
for x in `seq 1 5000`
do
    for day in Mon Tue Wed Thu Fri
    do
        echo $day
        touch $day.log
    done
done
… and similarly for the second script.
The “slow” script ran in 21.425 seconds on my PC. The “fast” script does not echo anything, but it has to parse and execute the test instead, and so it took longer: 25.178 seconds, or 17% slower than simply running “echo” every time.
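A comparison like the one above could be set up along the following lines. This is only a sketch: the function names with_echo and with_test, the runs variable and the scratch directory are illustrative choices of mine, not anything from the original article, and the timings will differ from machine to machine.

```shell
#!/bin/sh
# Sketch: wrap both loop bodies in functions so they can be timed separately.
# with_echo, with_test, runs and workdir are illustrative names (assumptions).

runs=${runs:-5000}                 # number of repetitions; override via the environment
workdir=$(mktemp -d) || exit 1     # scratch directory so the .log files do not litter $PWD
cd "$workdir" || exit 1

# The "slow" version: echo unconditionally.
with_echo() {
    i=0
    while [ "$i" -lt "$runs" ]; do
        for day in Mon Tue Wed Thu Fri; do
            echo $day
            touch $day.log
        done
        i=$((i + 1))
    done
}

# The "fast" version: test $verbose first (test written as in the article).
with_test() {
    i=0
    while [ "$i" -lt "$runs" ]; do
        for day in Mon Tue Wed Thu Fri; do
            if [ $verbose ]; then echo $day; fi
            touch $day.log
        done
        i=$((i + 1))
    done
}
```

In bash, where time is a shell keyword, the two can then be compared with “time with_echo” and “time with_test”. The scratch directory in $workdir can be removed afterwards.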
I would also note that the syntax if [ $verbose ] is asking for trouble; in real scripts I’m sure she would agree that you should use something like “if [ "$verbose" = "y" ]”.
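To make the problem concrete, here is a small sketch of how the unquoted test behaves. The value “no” and the result variable names are made up for illustration. Note that the string comparison uses =, because -eq compares integers.

```shell
#!/bin/sh
# Sketch: why an unquoted [ $verbose ] is fragile.

# Any non-empty value counts as true, including "no".
verbose=no
if [ $verbose ]; then unquoted_result=true; else unquoted_result=false; fi

# A quoted, explicit string comparison only matches the intended value.
if [ "$verbose" = "y" ]; then quoted_result=true; else quoted_result=false; fi

echo "unquoted: $unquoted_result, quoted: $quoted_result"
# prints: unquoted: true, quoted: false
```

A user who sets verbose=no to silence the script would still get the output with the unquoted form.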
If the code were running on an old Sun framebuffer console, which updates the screen at around one second per line, all this needless echoing would make a difference; but in any real-world situation in 2014, running the test costs more than writing the output.
Over on page two (because it’s all about selling advertising space 🙂 ), the order of comparisons is taken on. Whilst in principle it could make a significant difference, the example given involves a single if statement, no fork()ing, and some simple variable comparisons:
echo -n "enter foo> "; read foo;
echo -n "enter bar> "; read bar;
echo -n "enter tmp> "; read tmp;
if [[ $foo -eq 1 && $bar -eq 2 && $tmp -eq 3 ]]; then
    echo ok
fi
Taking the read out of the tests, we find that it takes 0.083 seconds to do 5,000 runs of the full test with all variables matching (so all three conditions have to be tested each time), and 0.033 seconds when the first condition does not match, so it takes just over twice as long to run three tests as it does to run one test.
This is a significant difference, but it’s not the 1.195 seconds per iteration suggested by the article; it’s 0.00001 seconds per iteration. Taking Sandra Henry-Stocker’s results at face value, my tests, which each took well under 1 second, would have taken 4 hours 5 minutes and 2 hours 26 minutes respectively.
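The per-iteration figure follows directly from the two timings quoted above. As a quick check, assuming awk is available:

```shell
#!/bin/sh
# Per-iteration cost of the two extra comparisons:
# (time with all three tests - time with one test) / number of runs.
per_iteration=$(LC_ALL=C awk 'BEGIN { printf "%.5f", (0.083 - 0.033) / 5000 }')
echo "$per_iteration seconds per iteration"
# prints: 0.00001 seconds per iteration
```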
If one comparison were particularly time-consuming, it would make a more effective example. Here, if the find command takes 10 seconds to run, but foo is usually 1, then this will take 11 seconds:
if find /var -name foo.txt && [ "$foo" -eq "93" ]; then ...
whilst this will take 1 second, 1000% faster:
if [ "$foo" -eq "93" ] && find /var -name foo.txt; then ...
The example provided just doesn’t match the claimed results.
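The effect of ordering can be seen without waiting for a slow find. In this sketch, expensive_check is a hypothetical stand-in for the slow command; it only counts how often it is called, so the output shows whether the slow step was reached at all.

```shell
#!/bin/sh
# Sketch: && short-circuits, so a cheap test placed first can skip an expensive one.
# expensive_check is a hypothetical stand-in for a slow command such as find.

calls=0
expensive_check() {
    calls=$((calls + 1))    # count how often the slow step actually runs
    return 0
}

foo=1

# Expensive command first: it always runs, even though the cheap test then fails.
if expensive_check && [ "$foo" -eq 93 ]; then echo "match"; fi
calls_expensive_first=$calls

# Cheap test first: it fails, so expensive_check is never reached.
calls=0
if [ "$foo" -eq 93 ] && expensive_check; then echo "match"; fi
calls_cheap_first=$calls

echo "expensive first: $calls_expensive_first call(s); cheap first: $calls_cheap_first call(s)"
# prints: expensive first: 1 call(s); cheap first: 0 call(s)
```

The saving only matters when the skipped command is genuinely slow, which is the point: with three integer comparisons there is almost nothing to skip.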
Avoiding unnecessary cat, echo and similar statements is good advice, though it is not as significant as it was 10 years ago, and much less significant on Linux, where fork()ing is much faster than on Unix.
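As an illustration of the kind of unnecessary cat in question, here is a minimal sketch. The sample file and its contents are made up, and the two forms give the same answer; the second simply starts one process fewer.

```shell
#!/bin/sh
# Sketch: the same result with and without an extra cat process.
# The sample file and its contents are made up for illustration.

sample=$(mktemp) || exit 1
printf 'alpha\nbeta\ngamma\n' > "$sample"

# Extra process: cat is started only to feed grep through a pipe.
with_cat=$(cat "$sample" | grep -c '^b')

# grep can open the file itself, so cat is unnecessary.
without_cat=$(grep -c '^b' "$sample")

echo "with cat: $with_cat, without cat: $without_cat"
rm -f "$sample"
```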