On Fri, 9 Jul 2004, Björn Persson wrote:

> Ian Pilcher wrote:
>
> > Björn Persson wrote:
> >
> >> I use Latin 1 (ISO 8859-1). Well, so far I'm running Fedora Core 1, but
> >> when I upgrade I'll still use Latin 1. I've got lots of text files and
> >> filenames with non-English letters in them. No matter which operating
> >> system I use I'll be stuck to Latin 1 for the foreseeable future.
> >> For filenames, a tool could be written to transcode them automatically,
> >> but for the files' content there's *no* way for the system to know which
> >> files would have to be transcoded and which would get destroyed if it
> >> tried to transcode them.
> >
> > Based on your response to Jeff, however, you seem to be OK with using
> > UTF-8 for /etc/passwd, even though non-ASCII characters would have to
> > be converted to ISO-8859-1.
>
> Yes. Where's the contradiction in that? Programs and library routines
> that handle the passwd file would know that it's in UTF-8, and convert
> to and from other encodings when necessary. The worst thing that could
> happen is that administrators upgrading their systems could have to run
> a tool to transcode their passwd files. Once.
>
> Couldn't I run the same tool on all my other files? Yes, *if* I were
> willing to go through them one by one to decide which of them had to be
> converted.

Having a tool to do the conversion would be a good thing. Non-UTF-8
characters do some odd things to CD burning and Nautilus, among others.

The problem is determining which character set a file is really in. If
you download files from Usenet or some other worldwide source, the files
could be in all sorts of strange encodings. We almost need something that
lets the user say "this directory contains files in German, this one in
Chinese, this one in Japanese," and so on.
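
To make the detection problem concrete, here is a rough Python sketch, not a proposal for the actual tool. The names classify and to_utf8 and the assumed_legacy parameter are made up for illustration. It can tell that a name or a chunk of text is not valid UTF-8, but it cannot tell which 8-bit encoding it is in, so the caller has to supply a guess. That guess is where a per-directory hint from the user would come in.

```python
def classify(data: bytes) -> str:
    """Return 'ascii', 'utf-8', or 'unknown-8bit' for a byte string."""
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Not valid UTF-8: some legacy 8-bit encoding, or binary data.
        return "unknown-8bit"


def to_utf8(data: bytes, assumed_legacy: str = "iso-8859-1") -> bytes:
    """Transcode to UTF-8, trusting the caller's guess for legacy data.

    assumed_legacy is a hypothetical per-directory hint; nothing here
    can verify that the guess is right.
    """
    if classify(data) in ("ascii", "utf-8"):
        return data  # already fine, leave the bytes alone
    return data.decode(assumed_legacy).encode("utf-8")


if __name__ == "__main__":
    latin1_name = "Björn".encode("iso-8859-1")
    print(classify(latin1_name))
    print(to_utf8(latin1_name).decode("utf-8"))
```

This is safe enough for filenames, but run blindly over file contents it would damage binary files, which is the objection Björn raised above.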