find, locate, whereis, which, type

September 16, 2009

I suspect that most Linux admins know three or four of these five commands, and regularly use two or three of them.

linuxhaxor has a useful introduction to all five, with the most common uses for each of them.

Note that locate requires a regular run of updatedb. The article says that “The database is automatically created and updated daily”, which is true for most distributions, but it depends on your cron setup – you can update the locate database as frequently as you wish. Another thing to note about locate is that it will not use the (normally root-generated) database to tell you, as a non-privileged user, about files which you would not otherwise be able to see.
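For reference, here is the kind of root crontab entry I mean – a sketch only, as the path to updatedb and the schedule are assumptions that vary by distribution (many ship their own job in /etc/cron.daily instead):

```shell
# Rebuild the locate database at 02:15 every night,
# rather than relying on the distribution's default daily job:
15 2 * * *  /usr/bin/updatedb
```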


Tool Tip: “find”

January 19, 2007

find is a very powerful command. After the last post about grep, in which I mentioned that DOS has a command called “find” which is a simplistic version of grep, I now feel obliged to tell all about the real (that is, the *nix) find command.

Find works on the basis of find (from somewhere) (something); the “from somewhere” is often “.” (here, the current directory) or “/” (root, to search the entire filesystem). That part is not terribly interesting. What is interesting is the “something” bit. You can specify a file name, for example:

$ find / -name "foo.txt"
/home/steve/misc/foo.txt
$ 

That wasn’t very exciting, and it will take a long time to complete, too. Systems with “slocate” installed could just say locate foo.txt and get the answer back in a fraction of a second (by looking it up in a database), without trawling through the whole hard disk (or, indeed, all attached disks). So that’s not what’s exciting about find. What is exciting about find is what else it can do, instead of just “-name foo.txt”.

Don’t get me wrong; the “-name” switch is useful. More useful with wildcards: find . -name "*.txt" will find all text files.

You can restrict the search to one filesystem with the “-mount” (aka “-xdev”) flag.

If you want to find files newer than /var/log/messages, you can use find . -newer /var/log/messages

If you want to find files over 10Kb, then find . -size +10k will do the job. To get a full listing, find . -size +10k -ls.

Want to know what files I own? find . -user steve

How about listing all files over 10Kb with their owner and permissions?

$ find . -size +10k -printf "%M %u %f\n"
-rwxr-xr-x steve foo.txt
-rw------- steve bar.doc
-rwxr-xr-x steve fubar.iso
-rwxr-xr-x steve fee.txt
-rw------- steve jg.tar
$

Here, the “%M” shows the permissions (-rwxr-xr-x), “%u” shows the username (“steve”), and “%f” shows the filename. The “\n” puts a “newline” character after each matching file, so that we get one file per line.
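A couple more -printf directives are worth knowing (these are GNU find extensions, so they may not exist on other systems): “%s” gives the size in bytes, and “%TY-%Tm-%Td” the modification date:

```shell
# Permissions, owner, size in bytes, modification date, then name:
find . -size +10k -printf "%M %u %8s %TY-%Tm-%Td %f\n"
```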

There is much more to find than this; I’ve not really covered the actions (other than -printf) in this article, just given a quick glimpse of how find can search for files based on just about any criterion you can think of. Search terms can be combined, so find . -size +10k -name "*.txt" will only find text files over 10Kb, and so on.
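As a sketch of how the combining works: “-a” (and) is implied between adjacent tests, “-o” gives or, and \( \) (escaped from the shell) groups terms:

```shell
# Text files over 10Kb, modified in the last week, long-listed:
find . -name "*.txt" -size +10k -mtime -7 -ls

# Group the two -name tests so that -size +1M applies to both:
find . \( -name "*.log" -o -name "*.tmp" \) -size +1M
```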


Efficient Shell Scripting

January 22, 2014

In an article on ITworld.com, Sandra Henry-Stocker gives advice about writing efficient shell scripts.

Whilst a lot of the principles provided would appear to make sense, most of them actually do not make any significant difference, and some are measured entirely wrongly.

First, Henry-Stocker suggests replacing this script:

for day in Mon Tue Wed Thu Fri
do
    echo $day
    touch $day.log
done

… with this one:

for day in Mon Tue Wed Thu Fri
do
    if [ $verbose ]; then echo $day; fi
    touch $day.log
done

I ran both scripts 5,000 times, like this:

for x in `seq 1 5000`
do
  for day in Mon Tue Wed Thu Fri
  do
      echo $day
      touch $day.log
  done
done

… and similarly for the second script.

The “slow” script ran in 21.425 seconds on my PC. The “fast” script does not echo anything, but it does parse and execute the test on every iteration, which means that it took longer – 25.178 seconds, or 17% slower than simply running “echo” every time.
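For anyone who wants to reproduce this, here is a cut-down version of the harness (iteration count reduced, output discarded so that terminal speed doesn’t dominate – both changes are mine; it uses the bash time keyword, which plain sh lacks):

```shell
#!/bin/bash
# Timing harness: plain echo vs. the guarded echo, 500 x 5 iterations.
verbose=""   # unset/empty, as in the "fast" version

time for x in $(seq 1 500); do
    for day in Mon Tue Wed Thu Fri; do
        echo "$day"
    done
done > /dev/null

time for x in $(seq 1 500); do
    for day in Mon Tue Wed Thu Fri; do
        if [ $verbose ]; then echo "$day"; fi
    done
done > /dev/null
```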

I would also note that the syntax if [ $verbose ] is asking for trouble; in real scripts I’m sure she would agree that you should use something like if [ "$verbose" = "y" ].

If the code is running on an old Sun framebuffer console, which will update the screen at around one second per line, all this needless echoing would make a difference, but in any real-world situation in 2014, the overhead of the test is far slower than writing the output.

Over on page two (because it’s all about selling advertising space :-) ), the order of comparisons is addressed. Whilst in principle it could make a significant difference, the example given involves a single if statement, no fork()ing, and some simple variable comparisons:

echo -n "enter foo> "; read foo;
echo -n "enter bar> "; read bar;
echo -n "enter tmp> "; read tmp;


if [[ $foo -eq 1 && $bar -eq 2 && $tmp -eq 3 ]]; then
    echo ok
fi

Taking out the read from the tests, we find that it takes 0.083 seconds to do 5,000 runs of the full test, with all variables matching (so all three conditions have to be tested each time), and 0.033 seconds when the first condition does not match, so it takes just over twice as long to run three tests as it does to run one test.

This is a significant difference, but it’s not the 1.195 seconds per iteration suggested by the article; it’s around 0.00001 seconds per iteration. Taking Sandra Henry-Stocker’s results at face value, my tests, each of which took well under a second, would have taken 4 hours 5 minutes and 2 hours 26 minutes respectively.

If one comparison was particularly time-consuming, it would be a more effective example. Here, if the find command takes 10 seconds to run, but foo is usually 1, then this will take 11 seconds:

if find /var -name foo.txt && [ "$foo" -eq "93" ]; then ...

whilst this will take 1 second, 1000% faster:

if [ "$foo" -eq "93" ] && find /var -name foo.txt; then ...

The example provided just doesn’t match the claimed results.

Avoiding unnecessary cat, echo and similar statements is good advice; it is not as significant as it was 10 years ago, and it is much less significant on Linux, where fork()ing is much faster than on traditional Unix.
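The classic example is the “useless use of cat”, which costs an extra process and a pipe for no benefit:

```shell
# Extra cat process plus a pipe:
cat /etc/hosts | grep localhost

# Same result, one process fewer:
grep localhost /etc/hosts
```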


VMWare Balloon Size

October 29, 2013

This is just a quick note as it took me a while to find. I assumed that vmware_balloon.c would write stats into /proc, like most Linux kernel modules do.

That’s not how VMware rolls, however. To find out how much RAM has been claimed by VMware’s “Balloon” driver from within the guest OS itself, use the VMware Tools command:

# vmware-toolbox-cmd stat balloon
4460 MB
#

What shell am I running?

March 18, 2012

Whether you’ve got an interactive shell session or are writing a shell script, it is very difficult to be certain exactly which shell you are running. It’s better to check for the capabilities that you require.

If you find yourself in a shell session, but don’t know what type of shell it is, there are a few ways to find out whether it’s Bourne shell, Bash, ksh, csh, zsh, tcsh, or whatever.

The simple answer is that you type in the command

echo $SHELL

which should tell you the path to your current shell; if it’s /bin/ksh or /usr/bin/ksh then it’s the KornShell; if it’s /bin/csh then it’s the C shell, and so on.

However, it’s not always that simple; Bash will set $SHELL only if it was not already set, so if you were running csh and used that to call bash, then Bash’s $SHELL will still say csh, not bash. Bash will, however, set the $BASH_VERSION variable, but that’s not really a guarantee that you have that version; it only tells you that there exists a variable which specifies that version. You may not even be running bash at all.
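One common approach – still only a strong hint, for exactly the reasons above – is to test the shell-specific version variables in turn:

```shell
# Each branch only proves that the variable exists, not that this
# shell set it; treat the answer as a hint, not a guarantee.
if   [ -n "$BASH_VERSION" ]; then echo "probably bash $BASH_VERSION"
elif [ -n "$ZSH_VERSION" ];  then echo "probably zsh $ZSH_VERSION"
elif [ -n "$KSH_VERSION" ];  then echo "probably a ksh variant"
else echo "some other Bourne-style shell"
fi
```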

Similarly, some shells (particularly /bin/sh) are symbolic links to others – whether to bash (GNU/Linux and newer Solaris), dash (some Linux distros more recently), or ksh (AIX).

So $SHELL is a useful start, but not at all definitive. You can search for your UserID in the /etc/passwd file:

$ grep steve /etc/passwd
steve:x:1000:1000:Steve Parker,,,:/home/steve:/bin/bash
$ 

But there may be many entries containing the text “steve”, and you could be using NIS, LDAP, AD or some other authentication mechanism. $UID should be a read-only variable containing your UserID, and you can search your system-specific passwd via the getent command:

$ getent passwd $UID
steve:x:1000:1000:Steve Parker,,,:/home/steve:/bin/bash
$ 

However, as my UID is 1000, this would match any line containing “1000”, such as an account called “HAL1000”. So this should be one way to get your shell:

$ getent passwd $UID | cut -d: -f3,7|grep "^${UID}:"
1000:/bin/bash
$ 

Still, you could have run another shell after picking up your default shell. You can always check $0 – that should tell you how your current shell was called, though you don’t know what the $PATH variable was at the time. If $0 says “/usr/bin/zsh” then that’s what was called (of course, /usr/bin/zsh itself could have changed since your shell was invoked!). If it just says “sh” then it’s whatever “sh” was found first in the $PATH of the calling shell. And of course, you can’t find out for sure what state that shell was in at that time.

On a Linux system, “cat /proc/$$/cmdline” should also give a good clue; “ls -l /proc/$$/exe” is better (but still not definitive: it may be marked “(deleted)” if the binary has been deleted, so you should check whether it has subsequently been replaced by some other shell).
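Putting those two together (the tr is needed because cmdline is NUL-separated):

```shell
# The kernel's record of how this shell was started:
tr '\0' ' ' < /proc/$$/cmdline; echo

# The binary actually being executed (a symlink to the real file):
ls -l /proc/$$/exe
```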

So – it depends why you need to know. If you need to know for sure, on an unknown system, exactly what shell you are in, then I’m sorry, it’s not possible to be absolutely certain. To be reasonably confident, check $SHELL or $0. If you need to be more certain than that, then check for the features you require.

If you’re writing a shell script which requires arrays, then define and access an array, check that the results are as expected, and if not, bail out with a message along the lines of “an array-capable shell is expected – we suggest bash or ksh”.
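A sketch of such a check (the eval stops a non-array shell from rejecting the syntax at parse time, and the subshell contains any damage):

```shell
#!/bin/sh
# Probe for array support before relying on it.
if ( eval 'arr=(one two three); [ "${arr[1]}" = "two" ]' ) 2>/dev/null
then
    : # arrays work; carry on
else
    echo "an array-capable shell is expected - we suggest bash or ksh" >&2
    exit 1
fi
```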


inodes – ctime, mtime, atime

October 7, 2010

http://www.unix.com/tips-tutorials/20526-mtime-ctime-atime.html has a really good explanation of the different timestamps in a Unix/Linux inode. GNU/Linux has a useful utility called “stat” which displays most of the inode contents:

$ stat .bashrc
  File: `.bashrc'
  Size: 3219            Blocks: 8          IO Block: 4096   regular file
Device: fe00h/65024d    Inode: 33          Links: 1
Access: (0644/-rw-r--r--)  Uid: ( 1000/   steve)   Gid: ( 1000/   steve)
Access: 2010-10-07 01:11:21.000000000 +0100
Modify: 2010-08-19 21:22:20.000000000 +0100
Change: 2010-08-19 21:22:21.000000000 +0100
$

As Perderabo explains in the above-linked post:

Unix keeps 3 timestamps for each file: mtime, ctime, and atime. Most people seem to understand atime (access time), it is when the file was last read. There does seem to be some confusion between mtime and ctime though. ctime is the inode change time while mtime is the file modification time. “Change” and “modification” are pretty much synonymous. There is no clue to be had by pondering those words. Instead you need to focus on what is being changed. mtime changes when you write to the file. It is the age of the data in the file. Whenever mtime changes, so does ctime. But ctime changes a few extra times. For example, it will change if you change the owner or the permissions on the file.

Let’s look at a concrete example. We run a package called Samba that lets PCs access files. To change the Samba configuration, I just edit a file called smb.conf. (This changes mtime and ctime.) I don’t need to take any other action to tell Samba that I changed that file. Every now and then Samba looks at the mtime on the file. If the mtime has changed, Samba rereads the file. Later that night our backup system runs. It uses ctime, which also changed, so it backs up the file. But let’s say that a couple of days later I notice that the permissions on smb.conf are 666. That’s not good: anyone can edit the file. So I do a “chmod 644 smb.conf”. This changes only ctime. Samba will not reread the file. But later that night, our backup program notices that ctime has changed, so it backs up the file. That way, if we lose the system and need to reload our backups, we get the new improved permission setting.

Here is a second example. Let’s say that you have a data file called employees.txt which is a list of employees. And you have a program to print it out. The program not only prints the data, but it obtains the mtime and prints that too. Now someone has requested an employee list from the end of the year 2000 and you found a backup tape that has that file. Many restore programs will restore the mtime as well. When you run that program it will print an mtime from the end of the year 2000. But the ctime is today. So again, our backup program will see the file as needing to be backed up.

Suppose your restore program did not restore the mtime. You don’t want your program to print today’s date. Well no problem. mtime is under your control. You can set it to whatever you want. So just do:
$ touch -t 200012311800 employees.txt
This will set mtime back to the date you want and it sets ctime to now. You have complete control over mtime, but the system stays in control of ctime. So mtime is a little bit like the date on a letter while ctime is like the postmark on the envelope.
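The touch example above is easy to verify with stat (the -c format option here is a GNU coreutils extension):

```shell
touch employees.txt
touch -t 200012311800 employees.txt     # rewind mtime to 31 Dec 2000
stat -c 'mtime: %y  ctime: %z' employees.txt
# mtime now shows 2000-12-31, while ctime shows the moment of the touch
```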

This is a really clear, thorough explanation of ctime and mtime. Unfortunately, it is not possible to find the original creation time of a file, though that is somewhat meaningless anyway, as files are copied, moved, linked and changed; what is the creation time of a file which was created, removed, then created afresh, for example?


Useful GNU/Linux Commands

June 23, 2010

Pádraig Brady has some useful, if somewhat basic, hints at http://www.pixelbeat.org/cmdline.html. He has updated them to include more powerful commands at http://www.pixelbeat.org/docs/linux_commands.html.

Here are a few of my favourites (I have taken the liberty of slightly altering some of the code and/or descriptions):
From the original:
Search recursively for “expr” in all *.c and *.h files:
find -name '*.[ch]' | xargs grep -E 'expr'

Concatenate lines with trailing backslashes:
sed ':a; /\\$/N; s/\\\n//; ta'

Delete line 42 from .known_hosts:
sed -i 42d ~/.ssh/known_hosts

From the new post:
Echo the path one item per line (assumes GNU tr):
echo $PATH | tr : '\n'

Top for Network:
iftop
Top for Input/Output (I/O):
iotop

Get SSL website Certificate:
openssl s_client -connect www.google.com:443 < /dev/null

List processes with Port 80 open:
lsof -i tcp:80

Edit a remote file directly in vim:
vim scp://user@remote//path/to/file

Add 20ms latency to loopback device (for testing):
tc qdisc add dev lo root handle 1:0 netem delay 20msec
Remove the latency:
tc qdisc del dev lo root


Ten Good Unix Habits

June 22, 2010

IBM’s DeveloperWorks has 10 Good Unix Habits, which apply to GNU/Linux at least as much as to Unix.

I would expect that most experienced admins can second-guess the content of 5-7 of these 10 points just from the title (for example, item 1 is a reference to “mkdir -p”, plus another related syntax available to Bash users). I would be surprised if you knew all ten:

1. Make directory trees in a single swipe.
2. Change the path; do not move the archive.
3. Combine your commands with control operators.
4. Quote variables with caution.
5. Use escape sequences to manage long input.
6. Group your commands together in a list.
7. Use xargs outside of find .
8. Know when grep should do the counting — and when it should step aside.
9. Match certain fields in output, not just lines.
10. Stop piping cats.

How many did you get?
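For the curious, item 1 boils down to something like this (mkdir -p alone is POSIX; the related Bash syntax is, presumably, brace expansion):

```shell
# One call creates the whole path, parents included:
mkdir -p tmp/a/b/c

# With bash brace expansion, several subtrees at once:
mkdir -p project/{src,doc/{html,pdf},tools}
```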


Unix / Linux Training Courses in the UK

May 11, 2010

After a few customers requested it, my consultancy firm, SGP IT, is planning to run some technical training courses this summer; in the Manchester area initially, though any location is possible.

Now would be a very good time to get in touch (training@sgpit.com) as things are at a very early stage and very fluid – if you can bring a few people along, we can even run a bespoke course for you, and tailor everything to your need.

Depending on subject, duration, location and so on, it should be possible to run the first few courses for as little as £250 – £300 per person per day – much less than the £400 – £500 or so you’d pay for a corporate course where all you get is a trainer who has no experience of the actual situation you face at work, and who delivers PowerPoint slides to you, then doles out the free mousepads and t-shirts at the end of the course.

None of us have been overly impressed by many of the available training courses – we are hoping to redefine how personal IT training can be delivered. Here’s how:

The kind of training session I would envisage us providing would involve a fairly small class size (certainly fewer than six people), allowing us to focus on your current issues and tailor the course around the needs, interests and skills of the attendees. The courses are likely to be between 2 and 5 days, with most being 2-3 day courses.

Of course, there will be no corners cut – we will insist on great location and facilities, free internet access, PCs for all candidates (preinstalled with Linux, Solaris, *BSD, you name it – contact us before the course and we’ll build the PC to suit you), tons of good quality course notes, including certificates and the obligatory full VAT receipts, of course. I’m sure that we can find a few freebies to throw in, too!

If you have specific queries or concerns that you would like to be addressed in the course, let us know up-front, and we can find a way to work it in to the course.

If any of this sounds vaguely interesting, please do get in touch (training@sgpit.com) and we can mold things around your requirements.


awk one-liners

April 1, 2009

I have previously plugged the great list of sed 1-liners at http://sed.sourceforge.net/sed1line.txt.

Here is a similar (if shorter) list of handy awk 1-liners:

http://www.sap-basis-abap.com/unix/awk-one-liner-tips.htm:

Print columns 1, 5 and 7 of a data file, or of any command’s output:

awk '{print $1, $5, $7}' data_file

cat file_name | awk '{print $1, $5, $7}'

ls -al | awk '{print $1, $5, $7}'   # prints file permissions, size and date

List the names of all files whose size is greater than zero:

ls -al | awk '$5 > 0 {print $9}'

List all files whose size is exactly 512 bytes:

ls -al | awk '$5 == 512 {print $9}'

Print all lines:

awk '{print}' file_name

awk '{print $0}' file_name

Count the number of lines in a file:

awk 'END {print NR}' file_name

Print the number of columns (fields) in each row of a file:

awk '{print NF}' file_name

Print selected columns, sort, and eliminate duplicate rows:

awk '{print $1, $5, $7}' file_name | sort -u

List all file names whose size is greater than 512 bytes and whose owner is “oracle”:

ls -al | awk '$3 == "oracle" && $5 > 512 {print $9}'

List all file names whose owner is either “oracle” or “root”:

ls -al | awk '$3 == "oracle" || $3 == "root" {print $9}'

List all files whose owner is not “oracle”:

ls -al | awk '$3 != "oracle" {print $9}'

List all lines with at least one character:

awk 'NF > 0 {print}' file_name

List all lines longer than 50 characters:

awk 'length($0) > 50 {print}' file_name

List first two columns

awk '{print $1, $2}' file_name

Swap first two columns of a file and print

awk '{temp = $1; $1 = $2; $2 = temp; print }' file_name

Replace the first column with “ORACLE” in a data file:

awk '{$1 = "ORACLE"; print }' data_file

Blank out the first column’s values in a data file:

awk '{$1 =""; print }' data_file

Calculate the total size of the files in a directory, in MB:

ls -al | awk '{total += $5}; END {print "Total size: " total/1024/1024 " MB"}'

Calculate the total size including subdirectories, in MB:

ls -lR | awk '{total += $5}; END {print "Total size: " total/1024/1024 " MB"}'


Find the largest file in a directory, including subdirectories:

ls -lR | awk '{print $5 "\t" $9}' | sort -n | tail -1

