...making Linux just a little more fun!


The Answer Gang

By Jim Dennis, Karl-Heinz Herrmann, Breen, Chris, and... (meet the Gang) ... the Editors of Linux Gazette... and You!



(?) finding then catting

From Greg Messer

Answered By: Neil Youngman, Jason Creighton, Thomas Adam, Benjamin Okopnik

Answer Gang,

I run a smallish mail server and am using SquirrelMail for web-based email. I use spamassassin/procmail to move emails that are borderline spam to /home/username/Trash.

My users have been instructed to occasionally log into SquirrelMail, look through their Trash folder, and empty it, even if they're popping in with the dreaded Outlook Express. They of course don't, and it's becoming a problem. I need to run a command that will find all the files called Trash in the users' home directories and empty them. I can't simply delete the file, as that causes SquirrelMail to generate an error, and I would get many, many phone calls even though SquirrelMail will fix the problem on their next login.

This is my third attempt at automating this procedure and my third failure.

I can do this:

find /home -name Trash -exec ls -al {} \;

and this:

find /home -name Trash -exec rm {} \;

but not this:

find /home -name Trash -exec cat /dev/null > {} \;
(!) [Neil]
It's the redirection that's the problem. If you quote the '>' thus:
find /hometest -name Trash -exec cat /dev/null '>' {} \;
it will work, with the caveat that you may still hit some "trash" files in subdirectories.
Check where you ran the command. You will find an empty file called "{}", created by your redirection. The command you ran was equivalent to
find /hometest -name Trash -exec cat /dev/null \; > {}
That will empty anything called Trash in subdirectories as well as in the login directories. To only hit files in the login directories you should use a for loop, e.g.
for file in /home/*/Trash
do
  echo -n > "$file"
done
Before trying this, change the loop body to echo "echo -n > $file" (with the whole command in quotes, so that the redirection is printed rather than performed), so you can see the commands it will run and sanity check them before running it for real.
What errors are you getting? Do you have permissions to write to these files?
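Neil's loop can be tried without touching real mailboxes. What follows is a minimal sketch, assuming a scratch tree made with mktemp; the users alice and bob and the nested mail/Trash file are invented for illustration. It suggests that the glob /home/*/Trash only reaches files directly inside each home directory.

```shell
#!/bin/sh
# Scratch tree standing in for /home; alice and bob are made-up users.
home=$(mktemp -d)
mkdir -p "$home/alice/mail" "$home/bob"
echo "spam" > "$home/alice/Trash"
echo "spam" > "$home/bob/Trash"
echo "keep" > "$home/alice/mail/Trash"   # nested copy the glob should skip

# Dry run: quoting the whole command prints the redirection instead of doing it.
for file in "$home"/*/Trash
do
  echo "echo -n > \"$file\""
done

# Real run: ':' does nothing, so the redirection alone empties each file.
for file in "$home"/*/Trash
do
  : > "$file"
done

# Record the sizes in bytes, then remove the scratch tree.
top_size=$(cat "$home/alice/Trash" "$home/bob/Trash" | wc -c)
nested_size=$(wc -c < "$home/alice/mail/Trash")
echo "top-level bytes: $((top_size)), nested bytes: $((nested_size))"
rm -rf "$home"
```

Running the dry-run loop first is cheap insurance: nothing is written until the second loop.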

(?) or this:

find /home -name Trash | xargs cat >  /dev/null
(!) [Neil] That wouldn't work. You're just printing the contents of the files and directing that output to /dev/null, which won't achieve what you want.
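A small sketch of why the pipeline leaves the files alone, assuming a scratch directory from mktemp and a made-up Trash file: cat only reads the file, and it is cat's output, not the file, that goes to /dev/null.

```shell
#!/bin/sh
# Scratch directory with one non-empty Trash file (made-up content).
dir=$(mktemp -d)
echo "borderline spam" > "$dir/Trash"
before=$(wc -c < "$dir/Trash")

# cat reads the file and its output is discarded;
# the file itself is never opened for writing.
find "$dir" -name Trash | xargs cat > /dev/null

after=$(wc -c < "$dir/Trash")
echo "bytes before: $((before)), bytes after: $((after))"
rm -rf "$dir"
```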

(?) While root, when I do this:

find /hometest -name -Trash -exec cat /dev/null > {} \;

it runs and exits after a second, giving me a new prompt (a carriage return) and no error messages.

When I run this:

find /hometest -name Trash -exec ls -s {} \;

I get this:

  60 /hometest/accounting.test/Trash
 264 /hometest/adam.test/Trash
3120 /hometest/agency.test/Trash
 164 /hometest/joh.doe/Trash
4976 /hometest/alice.test/Trash

so obviously it didn't work but I didn't get any errors.

Your "for" script worked great and is short and sweet. I'm very grateful; however, for my own information, I'd still like to understand what's wrong with my find syntax/structure. If you guys post this solution on the website, you should put in the keywords "empty files". I've googled for all kinds of crazy things and never found a solution.

(!) [Jason] Look carefully at your command.
find /hometest -name -Trash -exec cat /dev/null > {} \;
This runs "find /hometest -name -Trash -exec cat /dev/null" and redirects the output to a file named "{}".
Quoting the '>' doesn't help since find doesn't use the shell to expand commands given with -exec. (That is, if you quoted the ">", cat would be run with three arguments. The first would be a file named "/dev/null". The second would be a file named ">", which cat would probably complain doesn't exist. It is possible you might actually have a file named ">", but it's such a weird and confusing name that you probably don't. And the third would be the name of the file you're trying to truncate.)
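Both behaviours can be observed in a scratch directory. This is only a sketch, assuming a temporary directory from mktemp and a made-up tree/Trash file: the unquoted form leaves a stray file named {} behind, and the quoted form makes cat print the Trash contents while the file stays intact.

```shell
#!/bin/sh
# Work in a scratch directory so the stray "{}" file is easy to spot.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir tree
echo "spam" > tree/Trash

# Unquoted '>': the shell opens a file literally named {} before find starts.
find tree -name Trash -exec cat /dev/null > {} \;
if [ -e "{}" ]; then stray="yes"; else stray="no"; fi

# Quoted '>': find hands cat three file names: /dev/null, '>' and tree/Trash.
# cat complains about '>' on stderr and prints the Trash contents.
quoted_out=$(find tree -name Trash -exec cat /dev/null '>' {} \; 2>/dev/null)

trash_bytes=$(wc -c < tree/Trash)
echo "stray {} file created: $stray"
echo "cat printed: $quoted_out"
echo "Trash still holds $((trash_bytes)) bytes"
cd / && rm -rf "$work"
```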
If, for some reason, you needed to use "find" (perhaps to only truncate files with a certain mtime, or whatever), you could use a script like this:
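#!/bin/sh
# Truncate every regular file named on the command line.

for file in "$@"; do
    # ':' is a no-op, so the redirection alone empties the file;
    # unlike 'echo -n', it behaves the same in every POSIX shell.
    [ -f "$file" ] && : > "$file"
done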
Name it truncate.sh or something, make it executable, and save it somewhere in your PATH. Then you could do:
find /path/to/files -exec truncate.sh {} \;
...or use xargs, or whatever.
(!) [Thomas] There's nothing wrong with your implementation, but it is worth noting that the [ -f "$file" ] test is simply going to add another "thing" for the script to do. If the number of files is vast, this is just going to slow it down. You could remove the test entirely and let find match the files beforehand:
find . -type f -exec ./truncate.sh {} \;
(!) [Jason] Oh! I didn't think of that. That is better than silently dropping non-existent and non-regular files.
(!) [Thomas] I could hash this argument out in any number of combinations involving xargs, -exec, etc, with arguments as to whether you should use a shell script, etc., etc.
(!) [Jason] Yes, and you probably would be wanting to use xargs if the number of files is vast.
(!) [Thomas] Maybe. But that will still fail where a filename has spaces in it. Example:
[n6tadam@station fi]$ ls -lFb
total 8
-rw-r--r--  1 n6tadam n6tadam    0 Jan 11 11:18 foo
drwxr-xr-x  2 n6tadam n6tadam 4096 Jan 11 11:11 ignore/
-rw-r--r--  1 n6tadam n6tadam  120 Jan 11 11:08 this\ has\ spaces
Ignoring the "ignore/" directory, I've got a file with spaces in the filename [1], as well as a 'normal' file. If I wanted to truncate the files in the CWD above, I might use:
find . -maxdepth 1 -type f -exec sh -c 'cat /dev/null > {}' \;
... which is fine, for the file with no spaces. Of course, the truncate.sh script you wrote is fine for handling that (you actually quoted the variable -- thousands do not). But just what is wrong with that command above? Well, for each file that find finds, it has to spawn a separate non-interactive shell to process it. That's slow.
xargs might improve things (I'll leave this as an exercise to the reader to use 'time'):
find . -maxdepth 1 -type f -print0 | xargs -0 -I {} sh -c "cat /dev/null > \"{}\""
Note the quoting. It's paramount that this is done, because even though the '-print0' option to find terminates each file name with '\0' (and xargs splits on that character at the other end), we still have to quote the filename (this will still fail if the filename contains a '"' character, though). Why? Because by the time it gets passed through to the shell to handle it, we're back to our old tricks of: '"\"use\" more quo\"t\"es'.
So is using find(1) any better than using a plain shell script that globs a given directory for files to truncate? No. Because find blindly exec()'s whatever we pass to it (and we're having to use shell redirection) we must invoke the shell for it to work. The only advantage to using find is that it would handle some strange files, nothing more (in this particular application of it, anyway).
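One way around the quoting trap, not used in the thread and offered here only as a sketch: keep the script text constant and pass the file names to sh as positional parameters, so they are never parsed as shell source. It assumes a find that supports the "-exec ... {} +" form; the file names are invented for the demonstration.

```shell
#!/bin/sh
# Scratch directory with awkward names (all invented for the demo).
dir=$(mktemp -d)
echo "data" > "$dir/foo"
echo "data" > "$dir/this has spaces"
echo "data" > "$dir/has \"quotes\""

# The script text is constant; file names arrive as "$@", never as shell source.
# The lone 'sh' after the script fills $0, and '+' batches many names per shell.
find "$dir" -type f -exec sh -c 'for f in "$@"; do : > "$f"; done' sh {} +

# Count files that still have content (-size +0 matches at least one byte).
nonempty=$(find "$dir" -type f -size +0 | wc -l)
echo "non-empty files left: $((nonempty))"
rm -rf "$dir"
```

Because '+' hands many names to a single shell, this also avoids starting one shell per file.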
I suppose you could make that find command more efficient:
find . -maxdepth 1 -type f -not -empty -print0 | xargs -0 -I {} sh -c "cat /dev/null > \"{}\""
... which just ensures that the files we pass to it have a filesize greater than zero. The "best" solution that I personally can see, is using the following:
find . -maxdepth 1 -type f -not -empty -print0 | xargs -0 -I {} cp /dev/null {}
This obliterates the need to fork a subshell to perform any redirection -- and as with any "find .. | xargs" combination, it'll be quite fast, too. But the main reason for using it is that by avoiding any shell-redirection-mangle-filename techniques, we don't have to worry about quoting. The delimiter of '\0' via find and xargs should be enough to protect it.
Also note that cat'ting /dev/null is nonsensical in this instance.
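The cp /dev/null approach can be checked against awkward names in a scratch directory. This is a sketch under stated assumptions: a GNU or BSD find and xargs (since -print0, -0, -maxdepth and -not -empty are extensions), and file names made up for the test.

```shell
#!/bin/sh
# Assumes GNU or BSD find/xargs; the options used here are extensions.
dir=$(mktemp -d)
mkdir "$dir/ignore"
echo "data" > "$dir/this has spaces"
echo "data" > "$dir/has \"quotes\""
echo "data" > "$dir/ignore/nested"      # below -maxdepth 1, must survive
: > "$dir/foo"                          # already empty, skipped by -not -empty

# No shell is started per file, so no quoting is needed around {}.
find "$dir" -maxdepth 1 -type f -not -empty -print0 |
  xargs -0 -I {} cp /dev/null {}

left=$(find "$dir" -maxdepth 1 -type f -not -empty | wc -l)
nested=$(wc -c < "$dir/ignore/nested")
echo "non-empty top-level files: $((left)), nested bytes: $((nested))"
rm -rf "$dir"
```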
[1] Remember that there is nothing "illegal" about using such characters. Any character is a valid one for filenames at the filesystem level. What defines them as being a pain is the shell. Nothing more.
(!) [Ben] Not quite; '/' can't be used in a filename. Although "\n" can, which (along with any high-bit characters) can create lots of pain for anyone trying to work with them...
(!) [Jason] But ASCII NUL is an illegal character, right? So this will always work?
find -print0 | xargs -0 command
Jason Creighton
(!) [Ben] Right; you can't use a NUL or a '/'. Other than those two, anything is fair game... well, not really.  :) Mostly, it's a REALLY good way to screw yourself up; in general, it's not a good idea to use anything outside of [a-zA-Z0-9_] as part of a filename.
But then, we're talking about us.  :) "What do you mean, I can't jump off this cliff? It doesn't look all that high!"
ben@Fenrir:/tmp/foo$ for n in `seq 0 255`; do a=$(printf "\x`printf "%x" $n`"); >"a${a}"; done
bash: a/: Is a directory
ben@Fenrir:/tmp/foo$ ls -b
a    a{     a\214  a\243  a\272  a\321  a\350  a\377  a\031  ad  aO
a`   a}     a\215  a\244  a\273  a\322  a\351  a\002  a\032  aD  ap
a^   a@     a\216  a\245  a\274  a\323  a\352  a\003  a\033  ae  aP
a~   a$     a\217  a\246  a\275  a\324  a\353  a\004  a\034  aE  aq
a<   a*     a\220  a\247  a\276  a\325  a\354  a\005  a\035  af  aQ
a=   a\\    a\221  a\250  a\277  a\326  a\355  a\006  a\036  aF  ar
a>   a&     a\222  a\251  a\300  a\327  a\356  a\a    a\037  ag  aR
a|   a#     a\223  a\252  a\301  a\330  a\357  a\b    a0     aG  as
a\   a%     a\224  a\253  a\302  a\331  a\360  a\t    a1     ah  aS
a_   a+     a\225  a\254  a\303  a\332  a\361  a\v    a2     aH  at
a-   a\001  a\226  a\255  a\304  a\333  a\362  a\f    a3     ai  aT
a,   a\200  a\227  a\256  a\305  a\334  a\363  a\r    a4     aI  au
a;   a\201  a\230  a\257  a\306  a\335  a\364  a\016  a5     aj  aU
a:   a\202  a\231  a\260  a\307  a\336  a\365  a\017  a6     aJ  av
a!   a\203  a\232  a\261  a\310  a\337  a\366  a\020  a7     ak  aV
a?   a\204  a\233  a\262  a\311  a\340  a\367  a\021  a8     aK  aw
a.   a\205  a\234  a\263  a\312  a\341  a\370  a\022  a9     al  aW
a'   a\206  a\235  a\264  a\313  a\342  a\371  a\023  aa     aL  ax
a"   a\207  a\236  a\265  a\314  a\343  a\372  a\024  aA     am  aX
a(   a\210  a\237  a\266  a\315  a\344  a\373  a\025  ab     aM  ay
a)   a\211  a\240  a\267  a\316  a\345  a\374  a\026  aB     an  aY
a[   a\212  a\241  a\270  a\317  a\346  a\375  a\027  ac     aN  az
a]   a\213  a\242  a\271  a\320  a\347  a\376  a\030  aC     ao  aZ
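
The point about "\n" in names, and why NUL makes a safe delimiter, can be illustrated with a short sketch. It assumes a find that supports -print0; the two file names are invented.

```shell
#!/bin/sh
# Assumes a find with -print0 (GNU and BSD both provide it).
dir=$(mktemp -d)
nl='
'
: > "$dir/plain"
: > "$dir/two${nl}lines"     # one file whose name contains a newline

# Newline-delimited output over-counts: the odd name spans two lines.
by_line=$(find "$dir" -type f | wc -l)

# NUL-delimited output yields exactly one terminator per file.
by_nul=$(find "$dir" -type f -print0 | tr -cd '\0' | wc -c)

echo "lines: $((by_line)), NUL terminators: $((by_nul))"
rm -rf "$dir"
```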

This page edited and maintained by the Editors of Linux Gazette
HTML script maintained by Heather Stern of Starshine Technical Services, http://www.starshine.org/


Each TAG thread Copyright © its authors, 2005

Published in issue 111 of Linux Gazette February 2005
