Tuesday, 4 December 2007

Remembering that foreach uses whitespace-separated arguments

I've forgotten this three times, so it's time to blog it. I have a client whose directory names contain whitespace, such as /Users/joe/foo/foo bar/bart/. Obviously I need to handle these carefully when performing operations such as chmod, chgrp, etc.

I'm already using a find command to locate the troublesome directories and perform operations on them. Let's say I have a list of files such as:

$ cat foo.txt
/Users/joe/foo/foo bar/bart
/Users/joe/foo/foo bar/foo

The problem is that in shell script land the easiest thing to do is loop through these with a foreach-style for loop:


for FILE in `cat foo.txt`; do
    ls -l "${FILE}"
done

tng@monty:~/Desktop$ ./foo.sh
ls: /Users/joe/foo/foo: No such file or directory
ls: bar/bart: No such file or directory
ls: /Users/joe/foo/foo: No such file or directory
ls: bar/foo: No such file or directory

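This doesn't work because for splits the output of cat on any whitespace (spaces and tabs as well as newlines), not just on the newline character. Quoting "${FILE}" inside the loop doesn't help, because the splitting has already happened by then. The correct way of doing this is the following: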


#Avoid nasty problems with whitespace in foo.txt

cat foo.txt | while read ANYOLD_FILE_LIST; do
    ls -l "${ANYOLD_FILE_LIST}"
done

This reads each line into ANYOLD_FILE_LIST and works.
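
One refinement worth knowing about: a plain read still strips leading and trailing whitespace and treats backslashes as escape characters. The sketch below is only an illustration; the scratch file and the made-up path in it are assumptions, not part of my client's setup. It compares a plain read with IFS= read -r on a line that starts with spaces and contains a backslash.

```shell
#!/bin/sh
# Sketch: a scratch list file holding one awkward (hypothetical) path.
list=$(mktemp)
printf '%s\n' '  /tmp/foo bar/back\slash' > "$list"

# Plain read: leading spaces are stripped and the backslash is consumed.
plain=$(cat "$list" | while read line; do printf '%s' "$line"; done)

# IFS= keeps leading/trailing whitespace; -r keeps backslashes literal.
safe=$(cat "$list" | while IFS= read -r line; do printf '%s' "$line"; done)

printf 'plain: [%s]\n' "$plain"
printf 'safe:  [%s]\n' "$safe"
rm -f "$list"
```

For ordinary names with spaces in the middle both forms behave the same, so the extra IFS= and -r only matter for the odd cases.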


Anonymous said...

The "some-cmd | while read line ; do something with $line ; done" idiom is well worth publicising more.

Of course it still doesn't cope with names that have a linefeed in the middle. That would be insanely stupid, but legal according to Unix filesystem conventions. Almost no utility copes well in that case (this is what find's non-standard -print0 option is for).
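
A rough sketch of the -print0 approach, assuming a find and xargs that support the non-standard -print0 and -0 options (the GNU and BSD versions do). The scratch directory and the names in it are made up for the demonstration.

```shell
#!/bin/sh
# Sketch: scratch directory containing one name with an embedded linefeed.
tmpdir=$(mktemp -d)
mkdir "$tmpdir/foo
bar" "$tmpdir/plain"

# Line-based counting sees 4 "names" for 3 directories
# (the scratch directory itself, "foo<LF>bar" and "plain").
line_count=$(find "$tmpdir" -type d | wc -l | tr -d ' ')

# NUL-terminated output keeps each name intact: count the terminators.
nul_count=$(find "$tmpdir" -type d -print0 | tr -cd '\000' | wc -c | tr -d ' ')

# xargs -0 splits on NUL, so chmod receives each directory as one argument.
find "$tmpdir" -type d -print0 | xargs -0 chmod 755

echo "line-based: $line_count, NUL-based: $nul_count"
rm -rf "$tmpdir"
```

Since NUL cannot appear in a pathname, it is the one separator that can never be confused with part of a name.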

Steve Bourne used to have a directory with 254 files in it; the names were all one character long and there was one for each legal 8-bit character (NUL and / cannot appear in a pathname, everything else is legal). He used it to annoy people writing backup utilities and other directory grovelling programs.

Nick said...

while read ANYOLD_FILE_LIST; do
    ls -l "${ANYOLD_FILE_LIST}"
done < foo.txt

is good practice if you were really reading from a file.

The Pipe Police
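
Beyond saving a cat process, the redirect form has a practical side effect. In bash and most sh implementations the right-hand side of a pipe runs in a subshell, so variables set inside the loop are lost when it ends (ksh and zsh behave differently). A small sketch, using a scratch copy of the list rather than the real foo.txt:

```shell
#!/bin/sh
# Sketch: scratch list file with the two example paths from the post.
list=$(mktemp)
printf '%s\n' "/Users/joe/foo/foo bar/bart" "/Users/joe/foo/foo bar/foo" > "$list"

# Piped: in bash and dash the loop runs in a subshell,
# so this counter is typically still 0 afterwards.
piped=0
cat "$list" | while IFS= read -r line; do piped=$((piped + 1)); done

# Redirected: the loop runs in the current shell and the counter survives.
redirected=0
while IFS= read -r line; do redirected=$((redirected + 1)); done < "$list"

echo "piped=$piped redirected=$redirected"
rm -f "$list"
```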