Nifty Hat Mitch wrote:
> Interesting... I sort of went nuts because my "music" dir
> is also full of files with funny char. I even have some Greek
> file names.

Note that for the purposes of shell scripting, Greek letters aren't "funny". All the characters that could cause scripting problems are in low ASCII. UTF-8 characters just look like extra normal letters.

Mind you, I'm not sure how cut copes with combining characters... I assume it counts a letter followed by combining characters as one letter, but I haven't been able to find documentation on this. Actually, I ought to try that out...

But UTF-8 was designed so that 8 bit clean [1] programs that don't need to do text processing can just handle UTF-8 strings as a weird 8 bit encoding, and so that tests for "funny characters" would still work and not raise false positives. This was (correctly) seen as a security issue: most Western programmers and sysadmins would not appreciate the full intricacies of what Unicode needs to do to handle all the alphabets it is expected to handle, so it is important that intricate UTF-8 not break ASCII security logic.

James.

[1] I.e. they can handle character sets like ISO 8859-1.

-- 
E-mail address: james | Never ask, "Oh, why were things so much better
@westexe.demon.co.uk  | in the old days?"
                      | It's not an intelligent question.
                      | -- Ecclesiastes 7 v. 10
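
As a rough illustration of the point about ASCII checks, here is a minimal Python sketch. The set of "shell-special" characters, the function names, and the sample file names are all made up for this example and are not a complete or authoritative list. It checks a file name byte by byte for low-ASCII troublemakers, and shows that the bytes of a non-ASCII UTF-8 character all have the high bit set, so a Greek name cannot trip the check. The last lines only count code points and bytes for a letter plus a combining accent; they say nothing about what cut itself does.

```python
# Hypothetical set of ASCII bytes that tend to cause trouble in shell scripts.
# This list is an assumption for illustration, not an exhaustive one.
SHELL_SPECIAL = set(b" \t\n\"'`$\\*?[]{}()<>|&;!~#")


def has_shell_special(name: bytes) -> bool:
    """Return True if any byte of name is one of the ASCII bytes above."""
    return any(b in SHELL_SPECIAL for b in name)


def non_ascii_bytes_are_high(text: str) -> bool:
    """Return True if every byte of every non-ASCII character is >= 0x80."""
    return all(
        byte >= 0x80
        for ch in text
        if ord(ch) > 0x7F
        for byte in ch.encode("utf-8")
    )


# Sample names, invented for this sketch.
greek = "Μουσική.mp3".encode("utf-8")
risky = "my song (live).mp3".encode("utf-8")

greek_flagged = has_shell_special(greek)   # Greek letters are not flagged
risky_flagged = has_shell_special(risky)   # the space and parentheses are

# A letter followed by a combining acute accent (U+0301):
combining = "e\u0301"
code_points = len(combining)                # counts Unicode code points
utf8_bytes = len(combining.encode("utf-8")) # counts encoded bytes
```

Because the check works on bytes, it treats the UTF-8 name as just another 8 bit string, which is the behaviour the design was meant to preserve.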