Re: Scripting names from a file

Nifty Hat Mitch wrote:
> Interesting... I sort of went nuts because my "music" dir
> is also full of files with funny char.   I even have some Greek
> file names.

Note that for the purposes of shell scripting, Greek letters aren't
"funny". All the characters that could cause scripting problems are in
low ASCII, and UTF-8 encodes everything outside ASCII using only bytes
with the high bit set. To the shell, Greek letters just look like extra
normal letters.

Mind you, I'm not sure how cut copes with combining characters... I
assume it counts a letter followed by combining characters as one
letter, but I haven't been able to find documentation on this.

Actually, I ought to try that out...
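
A quick way to see what there is to count, sketched in Python rather
than with cut itself: the same visible letter can be one code point or
two, and takes a different number of bytes in each case. Which of these
units a given cut implementation uses for -c is not something this
sketch settles; the helper name counts() is just for illustration.

```python
import unicodedata

# "é" written two ways: precomposed, and as "e" plus a combining acute accent.
composed = "\u00e9"
decomposed = unicodedata.normalize("NFD", composed)  # "e" + U+0301

def counts(s):
    """Return (code points, UTF-8 bytes) for a string."""
    return len(s), len(s.encode("utf-8"))

print(counts(composed))    # (1, 2)
print(counts(decomposed))  # (2, 3)
```

So a tool that counts bytes, one that counts code points, and one that
counts what the reader sees as letters would each give a different
answer for the decomposed form.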

But UTF-8 was designed so that 8-bit clean [1] programs that don't need
to do text processing can just handle UTF-8 strings as a weird 8-bit
encoding, and so that tests for "funny characters" would still work and
not raise false positives. This was (correctly) seen as a security
issue: most Western programmers and sysadmins cannot be expected to
appreciate the full intricacies of what Unicode needs to do to handle
all the alphabets it is expected to handle, so it is important that
intricate UTF-8 not break ASCII security logic.
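
A minimal sketch of that property, in Python for convenience. The list
of "funny" characters and the name funny_bytes() are illustrative
assumptions, not a complete or authoritative set. The point is only
that a byte-level test for ASCII specials never fires on the bytes of a
non-ASCII character.

```python
# Characters a shell script typically has to worry about (an illustrative,
# hypothetical list, not an exhaustive one).
SHELL_SPECIAL = set(b" \t\n\"'\\$`*?[]|&;<>(){}!~#")

def funny_bytes(name: str) -> set:
    """Return the shell-special ASCII bytes present in the UTF-8 form of name."""
    return set(name.encode("utf-8")) & SHELL_SPECIAL

greek = "Μουσική"
# Every byte of a non-ASCII character is 0x80 or above in UTF-8.
assert all(b >= 0x80 for b in greek.encode("utf-8"))

print(funny_bytes(greek))           # set()
print(funny_bytes("my music.mp3"))  # {32}
```

The Greek name passes cleanly, while the plain ASCII name is flagged
for its space (byte 32).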

James.

[1] I.e. they pass all 8 bits of every byte through unchanged, so they
can handle character sets like ISO 8859-1.
-- 
E-mail address: james | Never ask, "Oh, why were things so much better
@westexe.demon.co.uk  | in the old days?"
                      | It's not an intelligent question.
                      |     -- Ecclesiastes 7 v. 10

