Matilda Systems Corporation | High Availability Resources
  

 

Stupid find Tricks


This page contains various examples of things that can be done with the find command. We call it Stupid find Tricks since some of them are pretty strange (I still have a ways to go so don't get too worried if none of the current ones strike you as pretty strange). Some of these can be quite useful to have in your mental toolkit when you're doing system administration on an HACMP cluster (or any Unix system, for that matter).

Feel free to help expand and improve this list of find tricks.

This page is part of the Matilda Team's HACMP Resources Collection. The home page of the collection is located here.

IMPORTANT: read the disclaimer BEFORE you use any information provided in this collection.


General Techniques

Avoiding NFS filesystems

Here's how to do a search from / while avoiding NFS filesystems (doing a find across an NFS mount is REALLY slow):

find / -fstype nfs -prune -o <<insert real search request here>>
        
The -prune option always returns true from find's perspective but tells find to stop searching deeper. It will be invoked if a directory in an NFS filesystem is encountered. Otherwise, the -fstype nfs will be false in which case the right hand side of the -o (or) option will be processed.

The end result is that find doesn't look inside NFS filesystems and the right hand side of the -o is applied to everything that isn't on an NFS filesystem.

For example, this would print the names of all files larger than about 5 MB which reside on the local machine (-size counts 512-byte blocks, so +10000 means more than 10000 blocks):

find / -fstype nfs -prune -o -size +10000 -print

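In case it isn't obvious, I'm one of the old-timers who still tends to specify the -print option even though it is rarely necessary anymore. This happens to be one of the cases where it is necessary: if no action appears anywhere in the expression, find applies an implied -print to the whole expression and would list the pruned NFS directories as well.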
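The same -prune -o shape works with any test that identifies what to skip. Here is a minimal sketch that can be tried without an NFS mount. It uses -name as a stand-in for -fstype nfs, and the directory names (skipme, keep) and the function name are invented for the demonstration.

```shell
# Build a throwaway tree; "skipme" plays the role of the NFS mount point.
demo=$(mktemp -d)
mkdir -p "$demo/keep" "$demo/skipme/deep"
touch "$demo/keep/a.txt" "$demo/skipme/deep/b.txt"

# list_unpruned DIR: print every regular file under DIR, but never
# descend into a directory called "skipme".
list_unpruned() {
    find "$1" -name skipme -prune -o -type f -print
}

list_unpruned "$demo"      # lists keep/a.txt only
rm -rf "$demo"
```

Nothing under skipme is visited, so b.txt never shows up in the output.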

Finding files which don't match a search request

It is sometimes easier to express what you don't want to find than it is to express what you do want to find. The conventional way to do it is to use the find command's ! option. For example, the following would find everything in the current directory hierarchy which isn't a directory:

find . ! -type d -print
        
Note that ! has higher precedence than the implied -a between the -type d and the -print in the above example.

Another way to achieve the same effect which some people prefer because some shells treat ! as a meta-character is as follows:

find . -type d -o -print
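If you want to convince yourself that the two forms are interchangeable, a quick check along the following lines should do. It is only a sketch run against a scratch directory; the file and variable names are made up.

```shell
# Throwaway tree with one subdirectory and two files.
demo=$(mktemp -d)
mkdir "$demo/sub"
touch "$demo/one" "$demo/sub/two"

# Run both forms from inside the tree and sort the output so that the
# two lists can be compared as plain strings.
with_bang=$(cd "$demo" && find . ! -type d -print | sort)
with_or=$(cd "$demo" && find . -type d -o -print | sort)

if [ "$with_bang" = "$with_or" ]; then
    echo "same result"
else
    echo "results differ"
fi
rm -rf "$demo"
```

In the second form, -type d is true for directories, so find never reaches the right hand side of the -o for them; everything else falls through to -print.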

Using search criteria that find doesn't support directly

Sometimes, you want to find files which match complex criteria which are difficult or impossible to express directly with the find command. In this case, write a shell script or program that performs the desired tests and use the find command's -exec option. For example, assuming that our criteria testing shell script is called doit, the following would do the trick:

find . -exec ./doit {} \; -print
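To make this concrete, here is one guess at what a doit script might look like. The criterion (more than 100 lines in the file) is invented purely for illustration. The only thing find cares about is the exit status: zero counts as a match, so the -print that follows is evaluated.

```shell
# Create a hypothetical ./doit in a scratch directory.
demo=$(mktemp -d)
cat > "$demo/doit" <<'EOF'
#!/bin/sh
# Exit 0 (match) only when the file named by $1 has more than 100 lines.
lines=$(wc -l < "$1") || exit 1
[ "$lines" -gt 100 ]
EOF
chmod +x "$demo/doit"

# Two sample files: one short, one long.
echo "just one line" > "$demo/short.txt"
awk 'BEGIN { for (i = 1; i <= 200; i++) print i }' > "$demo/long.txt"

# -name narrows the candidates before the script is run on each of them.
matches=$(cd "$demo" && find . -type f -name '*.txt' -exec ./doit {} \; -print)
echo "$matches"
rm -rf "$demo"
```

Only long.txt passes the test, so it is the only name printed.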
      
A simple (and relatively common) variant of this is to use the grep command to perform the test. For example, the following will print the names of all .c files in the current directory hierarchy which contain the word hello:

find . -type f -name '*.c' -exec grep -q hello {} \; -print

There's a better way to just get a list of the files that match the grep request in the above example. For example, the following will list the files with names ending in .c and containing the word hello:

find . -type f -name '*.c' -exec grep -l hello {} /dev/null \;

The /dev/null is provided because some versions of grep don't list the file name unless at least two file names are specified. It is not required with AIX's grep (since at least AIX 4.3.2), but we still provide it because we prefer to use techniques which are portable: the script may never be ported, but we use lots of different kinds of Unix and can't keep track of these sorts of subtle differences between them.

Note that the structure of the first example is still valid for many contexts. For example, the following will compile each of the .c files containing the word hello (yes, rather contrived but . . .):

find . -type f -name '*.c' -exec grep -q hello {} \; -exec cc -c {} \;

Also, keep in mind that invoking a command or especially a shell script against each file in a directory hierarchy can be rather expensive. Use conventional find options (like in the hello example above) to eliminate as many candidates as possible before invoking the command or shell script with the -exec option.

An arguably better way of invoking grep on a whole pile of files is:

find . -type f -print | xargs grep hello

This results in far fewer invocations of grep (i.e. it runs faster).
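Another way to get the same batching is to let find build the argument list itself by ending -exec with + instead of \;. This form is in POSIX, but older systems may not have it, so treat its availability as an assumption. Unlike the plain xargs pipeline, it isn't confused by file names containing spaces. The function name and the sample files below are made up.

```shell
demo=$(mktemp -d)
echo "hello there" > "$demo/with space.c"
echo "nothing here" > "$demo/other.c"

# grep_hello DIR: list the .c files under DIR that contain "hello".
# The trailing "+" makes find hand many file names to each grep,
# much as xargs would. The {} has to come immediately before the +,
# which is why /dev/null is placed ahead of it.
grep_hello() {
    find "$1" -type f -name '*.c' -exec grep -l hello /dev/null {} +
}

grep_hello "$demo"
rm -rf "$demo"
```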

Doing complicated things with the files that you find

find is great at finding files matching particular criteria. What may not be obvious is that it is easy to then perform complex operations on the files that find finds for you. Write yourself a shell script that does the complex operations on a file which is passed as the first parameter and then invoke the script against each file that find finds like this:

find . -type f -size +10000 -exec ./doit {} \;

Using grep to print certain lines in the files you find

One common operation performed on found files is to use grep to extract certain lines from the files. At first glance, the following appears to print the lines containing the word hello in each of the found files:

find . -type f -size +10000 -exec grep hello {} \;

The problem is that since you've only specified a single file to the grep command, grep doesn't prefix each line with the name of the file that the line was found in. Since this is often quite important in this context, trick grep into showing the file name by giving it two files to look in but being sure that it won't find the pattern in the second file. Obviously, an empty file works best for this so we use everyone's favourite empty file /dev/null in our example:

find . -type f -size +10000 -exec grep hello {} /dev/null \;

grep is being given two file names so it will prefix lines found in the first file with the name of the file (it will never find anything in /dev/null so you won't get lines prefixed by /dev/null).
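The difference is easy to see on a small scratch file. This sketch drops the -size test so that a one-line file qualifies; the file name is invented.

```shell
demo=$(mktemp -d)
echo "hello world" > "$demo/greeting.txt"

# One file name: grep prints only the matching line.
bare=$(cd "$demo" && find . -type f -exec grep hello {} \;)

# Two file names: grep prefixes the line with the file it came from.
named=$(cd "$demo" && find . -type f -exec grep hello {} /dev/null \;)

echo "$bare"       # hello world
echo "$named"      # ./greeting.txt:hello world
rm -rf "$demo"
```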

Applying a single invocation of a command to the list of files that are found

Here's one of our favourites:
vi ` find . -type f -name '*.c' -print `
  
This invokes vi on all of the .c files in the current hierarchy. Once invoked, you move on to the next file using the :n command from within vi.

Be careful - you don't want to do this at the top of a directory hierarchy containing a few thousand .c files since you'll get really tired of typing the :n commands after the first hundred files!

This trick is really handy when you're planning on deleting the files which are found. First use the find command to print a list of the files which you want to delete. Once you're sure that the find command is listing the correct files, then use the shell's command line recall and editing capabilities to wrap an rm command around the find like this:

rm -f ` find . -type f -name '*.o' -print `

The idea here is to make sure that a typing mistake doesn't result in the loss of a whole bunch of files which you'd rather keep. Another way of getting to the same place is to use the find command's -exec option to remove the files. Again, construct and test a find command which lists the files and then use the shell's command line recall and editing capabilities to append the -exec rm -f {} \; onto the command to get the following:

find . -type f -name '*.o' -exec rm -f {} \;
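If you do this sort of cleanup often, the list-first, delete-second habit can be wrapped in a small shell function. This is only a sketch: the function name clean_objects and the magic word go are invented, and the search criteria are the ones from the example above.

```shell
# clean_objects DIR [go]: list the .o files under DIR. They are removed
# only when the second argument is the word "go".
clean_objects() {
    if [ "${2:-}" = "go" ]; then
        find "$1" -type f -name '*.o' -exec rm -f {} \;
    else
        find "$1" -type f -name '*.o' -print
    fi
}

# Demonstration in a scratch directory.
demo=$(mktemp -d)
touch "$demo/main.o" "$demo/main.c"
clean_objects "$demo"        # preview: prints the path of main.o
clean_objects "$demo" go     # now main.o is actually removed
ls "$demo"                   # main.c is all that is left
rm -rf "$demo"
```

Because both branches share the same search criteria, what you see in the preview is what gets deleted.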

Getting cron and find to be friends

One sometimes wants to put a find command into a crontab file. The problem is that find, the shell and cron parse their input lines in ways which aren't quite compatible. For example, the following crontab entry:

0 2 * * * find /tmp -type f -mtime +30 -exec rm -f {} \;

is supposed to delete any file in /tmp which hasn't been modified for over 30 days (this may or may not be a good idea depending on your context so be careful). The problem is that cron strips off the \ before the line is handed to the shell which then treats the ; as a command separator and executes the find command as

find /tmp -type f -mtime +30 -exec rm -f {}

This is, of course, illegal as the arguments to the -exec option must be terminated by a semicolon. The solution is to quote the \; in the crontab file as follows:

0 2 * * * find /tmp -type f -mtime +30 -exec rm -f {} "\;"
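If you'd rather not depend on how a particular cron treats the backslash, one option is to write the terminator as a quoted semicolon with no backslash in it at all, i.e. ';'. We haven't tried this against every cron, so consider it a suggestion rather than a guarantee. The sketch below only checks the shell side of the story: it hands a command string to sh -c, roughly as cron hands a crontab command to the shell, and confirms that find accepts the quoted semicolon. The file name is invented and echo stands in for rm -f so that nothing is deleted.

```shell
demo=$(mktemp -d)
touch "$demo/old.tmp"

# The command as it might appear after the five time fields of a
# crontab entry, with echo in place of rm -f.
job="find $demo -type f -exec echo {} ';'"

# cron passes the command to a shell; sh -c imitates that step.
result=$(sh -c "$job")
echo "$result"
rm -rf "$demo"
```

The shell removes the quotes and find receives a bare ; as the terminator of -exec.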

Solving Specific Problems

Why is this filesystem full?

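Here's a simple invocation which presents you with a list of all the files in the root filesystem with the largest files appearing first. The size is the seventh field of the -ls output, so that is the sort key; older versions of sort also accept the key written as +6n -r.

find / -xdev -type f -ls | sort -k 7,7nr | more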
  
Replacing the / with the name of another filesystem will get you a list for that filesystem.
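When you only care about the worst offenders, the same pipeline can be cut off after the first few lines. A minimal sketch follows; the function name largest_files and the default of 10 are our own choices, and it assumes that your find has -ls with the size in the seventh column. The sort key is written in the POSIX -k form (field 7, numeric, reversed).

```shell
# largest_files DIR [N]: show the N largest regular files under DIR
# (default 10), staying on DIR's own filesystem.
largest_files() {
    find "$1" -xdev -type f -ls | sort -k 7,7nr | head -n "${2:-10}"
}

# Demonstration on a scratch directory with one big and one small file.
demo=$(mktemp -d)
awk 'BEGIN { for (i = 0; i < 5000; i++) printf "x" }' > "$demo/big.dat"
echo "tiny" > "$demo/small.dat"
largest_files "$demo" 1      # the line for big.dat
rm -rf "$demo"
```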

Dropping the -xdev will get you a system-wide list. If the box is an NFS client then use the form

find / -fstype nfs -prune -o -type f -ls | sort -k 7,7nr | more

if you want to avoid traversing the NFS-mounted filesystems.

Be prepared for the possibility that the find command is unable to find the large file(s) that are consuming all the space. This can happen if an application has deleted the name of an already open file. The file remains in existence until the application terminates but the find command can't find it because the file has no name.
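You can reproduce this situation harmlessly from the shell, which may help when explaining it to someone else. In this sketch the shell itself plays the part of the application: it holds a file open on descriptor 3 and then deletes the file's name. The file name is invented.

```shell
demo=$(mktemp -d)

# Open the file on descriptor 3, write to it, then delete its name.
exec 3> "$demo/big.log"
echo "some data" >&3
rm "$demo/big.log"

# find sees nothing, although the data still occupies disk space
# until descriptor 3 is closed.
visible=$(find "$demo" -type f -print)
echo "files visible to find: ${visible:-none}"

exec 3>&-        # closing the descriptor finally releases the space
rm -rf "$demo"
```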

One alternative to consider (carefully) in this situation is to use the lsof utility which is available "on the 'net".

Note that the lsof utility is NOT SUPPORTED by IBM and, as far as we know, not recommended by IBM either. The lsof utility program also requires root privileges to run. Before you decide to use lsof on your system, carefully consider the possible consequences of running a program that you found "on the 'net".

Matilda Systems Corporation doesn't support or even recommend lsof either. It is simply a utility program that we're aware of which you might want to (carefully) consider using.

IMPORTANT: If you lack the appropriate skills, experience and/or competency, are unwilling to take responsibility for your actions, or if you don't like these disclaimers then don't use this information.