Similarly, you can search for “B00C2EGNSA” on any Amazon site, or just go to http://www.amazon.COUNTRY/dp/B00C2EGNSA (where “COUNTRY” is .fr, .de, etc) for your local equivalent.
Shell Scripting Tutorial on Kindle
March 29, 2013
Shell Scripting page on Facebook
July 11, 2011
I have the final pages to proofread this week, ready to go to the printers. It’s looking like 576 pages, a little bit over the target of 504 pages, but close enough.
I will update the Table of Contents at http://sgpit.com/book/ once the page count is finalised.
Update on Shell Scripting Recipes book
April 23, 2011
Wow, it’s been nearly two months since I last made a post about the upcoming book on shell scripting. I’m really sorry, I had intended to give much more real-time updates here. The book focusses on GNU/Linux and the Bash shell in particular, but it does cover the other environments too – Solaris, Bourne Shell, as well as mentions for ksh, zsh, *BSD and the rest of the Unix family.
In terms of page count, it is currently 89% finished. There is still the proof-reading to be done, and whatever delivery details the publishers need to deal with, so the availability date of some time in August is still on schedule. I notice that http://amzn.com/1118024486 is already offering a massive discount on the cover price; I have no idea what that is about, I’m trying not to take offence – they can’t have dismissed the book already as I have not quite finished writing it yet! So hopefully you can get a bargain while it’s cheap.
The subject matter has the potential to be quite boring if presented as a list of tedious system administration tasks, so I have tried to make it light and fun whenever I can; it’s still with Legal at the moment, but I hope to have a Space Invaders clone written entirely in the shell published in the book. People don’t tend to see the Shell as being capable of doing anything interactive at all, so it is nice to write a playable interactive game in the shell. The main problem in terms of playability is in working out how much to slow it down, and at what stage! Of course, being a shell script, you can tweak the starting value, the level at which it speeds up, and anything else about the gameplay. If the game doesn’t make it in to the book, I’ll post it here anyway, and will welcome your contributions on gameplay.
Other than games, I’ve got recipes for init scripts, conditional execution, translating scripts into other (human) languages, even writing CGI scripts in the shell. There is coverage of arrays, functions, libraries, process control, wildcards and filename expansion, pipes and pipelines, exec and redirection of input and output; this book aims to cover pretty much all that you need to know about shell scripting without being a tedious list of what the bash shell can do.
There is a status page at http://sgpit.com/book which also has order information; you can pre-order your copy from there.
Ten Good Unix Habits
June 22, 2010
IBM’s DeveloperWorks has 10 Good Unix Habits, which apply to GNU/Linux at least as much as to Unix.
I would expect that most experienced admins can guess the content of 5-7 of these 10 points, just from the title (for example, item 1 is a reference to “mkdir -p”, plus another related syntax available to Bash users). I would be surprised if you knew all ten:
1. Make directory trees in a single swipe.
2. Change the path; do not move the archive.
3. Combine your commands with control operators.
4. Quote variables with caution.
5. Use escape sequences to manage long input.
6. Group your commands together in a list.
7. Use xargs outside of find.
8. Know when grep should do the counting — and when it should step aside.
9. Match certain fields in output, not just lines.
10. Stop piping cats.
How many did you get?
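For anyone who wants to try a few of these straight away, here is a minimal sketch of habits 1, 3, 8 and 10. The directory and file names are made up for the example, and the brace expansion shown in the comment is only my guess at the “related syntax” the article means for Bash users.

```shell
#!/bin/sh
# Sketch only: "project", "notes.txt" and their contents are invented names.
work=$(mktemp -d)

# Habit 1: build a whole tree in one go; -p creates any missing parents.
# Habit 3: "&&" only runs the next command if the previous one succeeded.
# (In Bash, brace expansion can shorten this: mkdir -p project/{lib/ext,doc/html})
mkdir -p "$work/project/lib/ext" "$work/project/doc/html" \
  && printf 'alpha\nbeta\nalpha beta\n' > "$work/project/doc/notes.txt"

# Habit 8: let grep count the matching lines itself, rather than "grep | wc -l".
alpha_lines=$(grep -c alpha "$work/project/doc/notes.txt")

# Habit 10: hand the file to the command directly, rather than "cat file | grep".
beta_lines=$(grep -c beta < "$work/project/doc/notes.txt")

echo "alpha: $alpha_lines, beta: $beta_lines"
```

The temporary directory is left in place so that you can look at the result; remove it with rm -rf when you are done.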
find, locate, whereis, which, type
September 16, 2009
I suspect that most Linux admins know 3 or 4 of these five commands, and regularly use 2 or 3 of them.
linuxhaxor has a useful introduction to all five, with the most common uses for each of them.
Note that locate requires a regular run of updatedb – the article says that “The database is automatically created and updated daily” which is true for most distributions, but it depends on your cron setup – you can update the locate db as frequently as you wish. Another thing to note about locate is that it will not use the (normally root-generated) database to tell you (as a non-privileged user) about files which you would not otherwise know about.
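As a rough sketch of how the commands differ (this is my own example, not one from the linuxhaxor article), the script below asks about “ls” and a throwaway file. Only type, command -v and find are assumed to exist; which, whereis and locate are merely checked for, since not every system installs them.

```shell
#!/bin/sh
# Sketch only: "needle.txt" is an invented file name.

# type: how would this shell interpret the word? (builtin, function or file)
type ls
type cd

# command -v: the POSIX way to get just the path of a command.
ls_path=$(command -v ls)
echo "ls lives at: $ls_path"

# find: walks the filesystem right now, so it is always current (but slower
# than locate, which only consults the database built by updatedb).
tmp=$(mktemp -d)
touch "$tmp/needle.txt"
found=$(find "$tmp" -name 'needle*' -type f)
echo "find says: $found"

# which, whereis and locate are optional extras; just report if they exist.
for tool in which whereis locate
do
  if command -v "$tool" >/dev/null 2>&1
  then
    echo "$tool is available"
  fi
done
rm -rf "$tmp"
```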
Book
April 23, 2008
A serious publisher has contacted me about writing a serious book about Linux shell programming.
It is all really very serious. I’m not used to being serious, as you can probably tell from the fact that I have now used the word “serious” four times in this three-sentence post.
I am rather keen to write a book on the subject, not because I’m vain, or desperate for money, but because the stuff I have seen out there in dead-tree format has been of rather low quality. Also because of all the emails I’ve received over the years, they have all been positive, and none has said anything along the lines of “I didn’t need any of that because I bought Book[X]“, or indeed any book. People have emailed me, asking for advice as to what book to buy, and I have been unable to recommend any book that I have seen.
So:
What would you like to see in your ideal book about UNIX / Linux shell scripting, be it Bourne, Bash, ksh, tcsh, zsh, whatever?
Please don’t be timid; if you want to know how to work out how many nose-flutes can be fitted into the area of a Boeing 757, you won’t be anything like as strange as some of the correspondents I’ve had over the years, so please, tell me what is bugging you, what has bugged you, or even what you think might be likely to bug you in days / months / years to come.
I’m likely to answer any specific questions here and now, whether or not they end up in the book, but anything you’d like to see in a book, too… post that here, and I’ll have a stab at it.
Also, I would of course be interested to know if you have found any useful books on or around the subject, and what they did particularly well.
Steve
Happy First Birthday!
January 6, 2008
This blog has now been running for a year; the first post was Hello World on 17th Jan 2007.
I hadn’t realised it had been going for so long; in that time, I’ve made 41 posts, so I haven’t quite managed to make one post per week.
I have been a bit slack lately, for which I do apologise. New Year’s Resolution: I must make more posts here!
In the meantime, my main site, steve-parker.org, has celebrated its seventh birthday, having been born in June 2000 – looking forward to making the 8th birthday celebrations this June!
IFS – Internal Field Separator
September 26, 2007
It seems like an esoteric concept, but it’s actually very useful.
If your input file is “1 apple steve@example.com”, then your script could say:
while read qty product customer
do
echo "${customer} wants ${qty} ${product}(s)"
done
The read command will read in the three variables, because they’re spaced out from each other.
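To try this without creating an input file, here is a small self-contained sketch. The function name report_orders and the second order line are invented for the example; the loop itself is the one above, with “-r” added so that read leaves backslashes alone.

```shell
#!/bin/sh
# Sketch only: report_orders and the sample orders are made-up names and data.
report_orders() {
  # read splits each line on whitespace (the default IFS) into three variables.
  while read -r qty product customer
  do
    echo "${customer} wants ${qty} ${product}(s)"
  done
}

orders=$(printf '%s\n' '1 apple steve@example.com' '3 pear alice@example.com' \
  | report_orders)
echo "$orders"
```

If a line has more than three words, the last variable (customer) receives all of the remaining text.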
However, critical data is often presented in spreadsheet format. If you save these as CSV files, it will come out like this:
1,apple,steve@example.com
This contains no spaces, and the above code will not be able to understand it. It will take the whole thing as one item – the first thing, quantity, $qty – and set the other two fields as blank.
The way around this is to tell the shell that “,” (the comma itself) separates fields; it’s the “internal field separator”, or IFS.
By default, the IFS variable is set to space/tab/newline, which isn’t easy to type in the shell, so it’s best to save the original IFS to another variable, so you can put it back again after you’ve messed around with it. I tend to use “oIFS=$IFS” to save the current value into “oIFS”.
Also, when the IFS variable is set to something other than the default, it can really mess with other code.
Here’s a script I wrote today to parse a CSV file:
#!/bin/sh
oIFS=$IFS   # Always keep the original IFS!
IFS=","     # Now set it to what we want the "read" loop to use
while read qty product customer
do
  IFS=$oIFS
  # process the information
  IFS=","   # Put it back to the comma, for the loop to go around again
done < myfile.txt
It really is that easy, and it’s very versatile. You do have to be careful to keep a copy of the original (I always use the name oIFS, but whatever suits you), and to put it back as soon as possible, because so many things invisibly depend on the IFS: the shell uses it to split every unquoted variable and command substitution, so the arguments handed to grep, cut, you name it, can change without warning. It’s surprising how many things within the “while read” loop actually did depend on the IFS being the default value.
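A common alternative, offered here as a sketch and not as the script I actually used, is to set IFS on the same line as read. The shell then applies the comma to that one read command only, so there is nothing to save and restore. The function name parse_csv and the second sample line are invented, and this simple approach assumes that no field contains a quoted comma.

```shell
#!/bin/sh
# Sketch only: parse_csv and the sample data are made-up for illustration.
parse_csv() {
  # "IFS=, read" changes IFS for this single read command, not for the script.
  while IFS=, read -r qty product customer
  do
    echo "${customer} wants ${qty} ${product}(s)"
  done
}

csv_report=$(printf '%s\n' '1,apple,steve@example.com' '2,banana,bob@example.com' \
  | parse_csv)
echo "$csv_report"
```

Inside the loop body the IFS is still the default, so the commands that process each record behave as usual.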
Pipes Primer
May 8, 2007
The previous post dealt with pipes, though the example may not have been the best for those who are not accustomed to the concept.
There are a few concepts to be understood – mainly, that of two (or more) processes operating together, how they put their data out, and how they get their data in. UNIX deals with multiple processes, all running (conceptually, at least) at the same time, on different CPUs, each with a standard input (stdin), and standard output (stdout). Pipes connect one process’s stdout to another’s stdin.
What do we want to pipe? Let’s say we’ve got a small 80×25 terminal screen, and lots of files. The ls command will spew out tons of data, faster than we can read it. There’s a handy utility called “more“, which will show a screen-worth of text, then prompt “more”. When you hit the space bar, it will scroll down a screen. You can hit ENTER to scroll one line.
I’m sure that you’ve worked this out already, but here is how we combine these two commands:
$ ls | more
<the first screenful of files is shown>
--More--
What happens here is that the shell starts both the “ls” and the “more” commands, running (conceptually, at least) at the same time. The output of “ls” is piped to the input of “more”, so it can read the data.
Most such tools can also work another way, too:
$ more myfile.txt
<the first screenful of "myfile.txt" is shown>
--More--
That is to say, “more” opens “myfile.txt” itself, instead of reading from standard input (stdin); “more < myfile.txt” would feed it the same file as stdin.
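Because “more” is interactive, it is awkward to experiment with in a script. The sketch below uses sort, head and wc instead to show the same three ways of getting data into a command: a pipe, a redirect and a filename argument. The file contents are invented for the example.

```shell
#!/bin/sh
# Sketch only: the three fruit names are arbitrary sample data.
tmpfile=$(mktemp)
printf 'pear\napple\nfig\n' > "$tmpfile"

# 1. Pipe: the stdout of sort becomes the stdin of head.
first=$(sort "$tmpfile" | head -n 1)

# 2. Redirect: the shell opens the file and hands it to wc as stdin.
via_stdin=$(wc -l < "$tmpfile")

# 3. Argument: wc opens the file itself, so it prints the file name too.
via_arg=$(wc -l "$tmpfile")

echo "first line after sorting: $first"
echo "line count via stdin:     $via_stdin"
echo "line count via argument:  $via_arg"
rm -f "$tmpfile"
```

The difference between the last two is visible in the output: when wc reads stdin it has no file name to report.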
Pipelines in the Shell
May 3, 2007
One of the most powerful features of the *nix shell, and one which is currently not even covered in the tutorial, is the pipeline. I try to keep this blog and the tutorial from overlapping, but I really must rectify this gap in the main site some time.
In the meantime, this is what it is all about.
UNIX (and therefore GNU/Linux) is full of small text-based utilities: wc to count words (and lines, and characters) in a text file; sort to sort a text file; uniq to get only the unique lines from a text file; grep to get certain lines (but not others) from a text file, and so on.
Did you see the common trait there? Yes, it’s not just that “Everything is a file”, nearly everything is also text. It’s largely from this tradition that HTML, XML, RSS, Email (SMTP, POP, IMAP) and the like are all text-based. Contrast with MS Office, for example, where all data is in binary files which can only (really) be manipulated by the application which created them.
So what? It’s crude, simple, works on an old-fashioned green-and-black screen. How could that be relevant in the 21st Century?
It’s relevant, not because of what each tool itself provides, but what they can do when combined.
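As a tiny illustration of that point before we get to a real log file, here is a sketch that combines sort, uniq, head and awk to find the most common word in a list. The word list is invented; none of the tools knows anything about “most common words”, but the pipeline as a whole does.

```shell
#!/bin/sh
# Sketch only: the word list is arbitrary sample data.
words='shell
pipe
shell
grep
pipe
shell'

# sort groups identical lines, uniq -c counts each group, sort -rn puts the
# biggest count first, head keeps that line, and awk prints the word itself.
top_word=$(printf '%s\n' "$words" | sort | uniq -c | sort -rn | head -n 1 \
  | awk '{ print $2 }')

# sort -u keeps one copy of each line; wc -l counts what is left.
distinct=$(printf '%s\n' "$words" | sort -u | wc -l)

echo "most common word: $top_word"
echo "distinct words:   $distinct"
```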
Ugly Apache Access Log
Here’s a line from my access.log (Apache2) – Yes, honest, it’s one line. It just looks very ugly:
12.106.111.10 - - [02/May/2007:12:09:58 -0700] "GET /sh/sh.shtml HTTP/1.1" 200 33080 "http://www.google.ca/search?hl=en&q=bourne+shell&btnG=Google+Search&meta=" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.3) Gecko/20060426 Firefox/1.5.0.3"
That’s interesting (well, kind of). But it doesn’t tell us much about this visitor. They came from Google Canada (google.ca), searched for “bourne shell”, and clicked on the link to give them http://steve-parker.org/sh/sh.shtml. They’re using Firefox 1.5.0.3 on Windows.
Great. What happened to them? Did they like the site? Did they stay? Did they actually read anything?
Filtering it
We can use grep to pick out just this visitor. However, this gives us lots of stuff we’re not interested in – all the CSS, PNG, GIF files which support the pages themselves:
$ grep "^12.106.111.10 " access_log
12.106.111.10 - - [02/May/2007:12:09:58 -0700] "GET /sh/sh.shtml HTTP/1.1" 200 33080 "http://www.google.ca/search?hl=en&q=bourne+shell&btnG=Google+Search&meta=" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.3) Gecko/20060426 Firefox/1.5.0.3"
12.106.111.10 - - [02/May/2007:12:09:58 -0700] "GET /steve-parker.css HTTP/1.1" 200 8757 "http://steve-parker.org/sh/sh.shtml" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.3) Gecko/20060426 Firefox/1.5.0.3"
12.106.111.10 - - [02/May/2007:12:09:59 -0700] "GET /images/1.png HTTP/1.1" 200 471 "http://steve-parker.org/sh/sh.shtml" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.3) Gecko/20060426 Firefox/1.5.0.3"
12.106.111.10 - - [02/May/2007:12:09:59 -0700] "GET /images/prevd.png HTTP/1.1" 200 1397 "http://steve-parker.org/sh/sh.shtml" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.3) Gecko/20060426 Firefox/1.5.0.3"
12.106.111.10 - - [02/May/2007:12:10:00 -0700] "GET /images/2.png HTTP/1.1" 200 648 "http://steve-parker.org/sh/sh.shtml" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.3) Gecko/20060426 Firefox/1.5.0.3"
... etc ...
This is looking ugly already, and not just because of the small width of the webpage – even at 80 characters wide, this is hard to understand.
Filtering Some More
At first glance, I should be able to pipe this through a grep for html files:
$ grep "^12.106.111.10 " access_log | grep shtml
However, Apache also logs the referrer, so even the “GET /images/2.png” request above includes “.shtml” (in its referrer field). So I can use “grep -v” instead, to throw away the lines I don’t want. I’ll add the “-w” option to grep to say that the search term must match a whole word. So – “grep -v gif” would also throw away “gifford.html”, whereas “grep -vw gif” lets it through. I’ll add “\” to the code, so that you can cut and paste it… the “\” means that although the line breaks, it’s still really part of the same line of code:
$ grep "^12.106.111.10 " access_log | grep -vw css \
| grep -vw gif | grep -vw jpg \
| grep -vw png | grep -vw ico
12.106.111.10 - - [02/May/2007:12:09:58 -0700] "GET /sh/sh.shtml HTTP/1.1" 200 33080 "http://www.google.ca/search?hl=en&q=bourne+shell&btnG=Google+Search&meta=" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.3) Gecko/20060426 Firefox/1.5.0.3"
12.106.111.10 - - [02/May/2007:12:10:32 -0700] "GET /sh/variables1.shtml HTTP/1.1" 200 48431 "http://steve-parker.org/sh/sh.shtml" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.3) Gecko/20060426 Firefox/1.5.0.3"
12.106.111.10 - - [02/May/2007:12:13:23 -0700] "GET /sh/external.shtml HTTP/1.1" 200 41322 "http://steve-parker.org/sh/sh.shtml" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.3) Gecko/20060426 Firefox/1.5.0.3"
12.106.111.10 - - [02/May/2007:12:13:45 -0700] "GET /sh/quickref.shtml HTTP/1.1" 200 42454 "http://steve-parker.org/sh/sh.shtml" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.3) Gecko/20060426 Firefox/1.5.0.3"
12.106.111.10 - - [02/May/2007:12:14:27 -0700] "GET /sh/test.shtml HTTP/1.1" 200 48844 "http://steve-parker.org/sh/sh.shtml" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.3) Gecko/20060426 Firefox/1.5.0.3"
This pumps the output through a grep which gets rid of CSS files (“| grep -vw css“), then gif, then jpg, then png, then ico. That should just leave us with the HTML files, which is what we’re really interested in.
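To see what “-w” changes, here is a small sketch run against four invented request lines (they are not from my real log). It also shows one possible shortening, which is an alternative to the chain above and not what I ran: a single grep with an extended regular expression (“-E”) listing all the unwanted extensions.

```shell
#!/bin/sh
# Sketch only: these request lines are made up to show the whole-word rule.
requests='GET /images/logo.gif
GET /gifford.html
GET /steve-parker.css
GET /sh/sh.shtml'

# -v alone drops every line containing "gif", so gifford.html is lost too.
loose=$(printf '%s\n' "$requests" | grep -v gif | grep -v css)

# -w insists on a whole word, so gifford.html survives.
strict=$(printf '%s\n' "$requests" | grep -vw gif | grep -vw css)

# One grep instead of five: -E allows "this|or|that" alternatives.
single=$(printf '%s\n' "$requests" | grep -vwE 'css|gif|jpg|png|ico')

echo "without -w:"; echo "$loose"
echo "with -w:";    echo "$strict"
```

The chain of separate greps is easier to build up one step at a time; the single grep is easier to maintain once you know what you want to exclude.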
It’s really hard to see what’s going on here. The narrow web page doesn’t help, but we just want to get the key information out, which should be nice and simple. It should look good however we view it.
If you look carefully, you can see that this visitor accessed /sh/sh.shtml, then /sh/variables1.shtml (about half a minute later), /sh/external.shtml (after 3 minutes), then /sh/quickref.shtml (about 20 seconds later, presumably – given the referrer is the same – in a new tab). About 40 seconds after that, they opened /sh/test.shtml (which also suggests that they’ve loaded the two pages in tabs, to read at their leisure).
Getting the Data we need
However, none of this is really very easy to read. If we just want to know what pages they visited, and when, we need to do some more filtering. awk is a very powerful tool, of whose abilities we will only scratch the surface here. We will get “fields” 4 and 7 – the timestamp and the URL accessed.
$ grep "^12.106.111.10 " access_log \
| grep -vw css | grep -vw gif \
| grep -vw jpg | grep -vw png \
| grep -vw ico | awk '{ print $4,$7 }'
[02/May/2007:12:09:58 /sh/sh.shtml
[02/May/2007:12:10:32 /sh/variables1.shtml
[02/May/2007:12:13:23 /sh/external.shtml
[02/May/2007:12:13:45 /sh/quickref.shtml
[02/May/2007:12:14:27 /sh/test.shtml
Okay, it’s the info we wanted, but it’s still not great. That “[” looks out of place now. We can use cut to tidy things up. In this case, we’ll use its character positions, because we want to get rid of the first character. Cut’s “-c” parameter tells it which characters to keep. We want the 2nd character onwards (“-c2-”), so we just add it to the end of the pipeline:
$ grep "^12.106.111.10 " access_log | grep -vw css | grep -vw gif | grep -vw jpg | grep -vw png | grep -vw ico | awk '{ print $4,$7 }'|cut -c2-
02/May/2007:12:09:58 /sh/sh.shtml
02/May/2007:12:10:32 /sh/variables1.shtml
02/May/2007:12:13:23 /sh/external.shtml
02/May/2007:12:13:45 /sh/quickref.shtml
02/May/2007:12:14:27 /sh/test.shtml
And that’s the kind of thing that we can do with a pipe. We can get exactly what we want from a file.
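If you want to check the field numbers without a log file of your own, this sketch feeds one line (taken from the output above, with the browser string shortened to save space) through the same awk and cut steps. The second variant, using awk’s substr() to drop the “[” itself, is just an alternative I am suggesting, not part of the pipeline above.

```shell
#!/bin/sh
# Sketch only: one log line from the post, with the user-agent shortened.
line='12.106.111.10 - - [02/May/2007:12:10:32 -0700] "GET /sh/variables1.shtml HTTP/1.1" 200 48431 "http://steve-parker.org/sh/sh.shtml" "Mozilla/5.0"'

# awk splits on whitespace: $4 is "[date:time" and $7 is the URL requested.
# cut -c2- then keeps everything from the 2nd character, removing the "[".
with_cut=$(printf '%s\n' "$line" | awk '{ print $4,$7 }' | cut -c2-)

# The same result from awk alone: substr($4, 2) is field 4 minus its first character.
awk_only=$(printf '%s\n' "$line" | awk '{ print substr($4, 2), $7 }')

echo "$with_cut"
echo "$awk_only"
```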
Moving on
At the start of this post, we mentioned sort and uniq. Actually, “sort -u” will do the same as “sort | uniq“. So if we want to get the unique visitors, we can just get the first field (“cut -d" " -f1“) and sort it uniquely:
$ cut -d" " -f1 access_log | sort -u
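Here is a sketch of the same idea against a three-line sample log, so that the result can be checked by eye. The second address (192.0.2.7) and its request are invented, and the lines are trimmed to the first few fields. It also adds one step beyond the command above: “uniq -c” to count the requests per visitor.

```shell
#!/bin/sh
# Sketch only: a tiny made-up log; 192.0.2.7 is a documentation address.
log=$(mktemp)
cat > "$log" <<'EOF'
12.106.111.10 - - [02/May/2007:12:09:58 -0700] "GET /sh/sh.shtml HTTP/1.1" 200 33080
192.0.2.7 - - [02/May/2007:12:10:01 -0700] "GET /index.shtml HTTP/1.1" 200 1234
12.106.111.10 - - [02/May/2007:12:10:32 -0700] "GET /sh/variables1.shtml HTTP/1.1" 200 48431
EOF

# How many different visitors? First field only, sorted uniquely, then counted.
visitors=$(cut -d" " -f1 "$log" | sort -u | wc -l)

# Who made the most requests? uniq -c needs sorted input to count properly.
busiest=$(cut -d" " -f1 "$log" | sort | uniq -c | sort -rn | head -n 1 \
  | awk '{ print $2 }')

echo "unique visitors: $visitors"
echo "busiest visitor: $busiest"
rm -f "$log"
```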

Posted by unixshell 