On Wed, 2007-05-02 at 20:58 -0500, Steven P. Ulrick wrote:
> I have used the Sword Project's "diatheke" command to output the Bible
> into plain text files, divided by chapter and book.

That may be the problem. Is it creating real UTF-8 text, as Fedora
usually defaults to? What do you get if you run "file" on those text
files? e.g.

[tim@serge ~]$ file example.text
example.text: UTF-8 Unicode text

Also, what's your locale set to? e.g.

[tim@serge ~]$ locale
LANG=en_AU.UTF-8
LC_CTYPE="en_AU.UTF-8"
LC_NUMERIC="en_AU.UTF-8"
LC_TIME="en_AU.UTF-8"
LC_COLLATE="en_AU.UTF-8"
LC_MONETARY="en_AU.UTF-8"
LC_MESSAGES="en_AU.UTF-8"
LC_PAPER="en_AU.UTF-8"
LC_NAME="en_AU.UTF-8"
LC_ADDRESS="en_AU.UTF-8"
LC_TELEPHONE="en_AU.UTF-8"
LC_MEASUREMENT="en_AU.UTF-8"
LC_IDENTIFICATION="en_AU.UTF-8"
LC_ALL=

If the files and your locale are using different encodings, you're in
for some troubles, as you've discovered. It's generally best if
everything uses UTF-8, so that you have one encoding for everything,
instead of this being in ISO-8859-1 and that in ISO-8859-9 because each
had characters that couldn't be done in the other, and so on. UTF-8
covers almost everything, in one scheme.

For Fedora, I've found the easiest way to set this is when logging into
an X session. The logon screen has a "language" option that sets just
about all the parameters in one go.

> If (as an example) I open up "01-Genesis.txt" in KDE's KWrite, Genesis
> 4:22 looks like this (in the screenshots that follow, keep
> an eye on the name "Tubal–cain"):
> http://www.afolkey2.net/Projects/Genesis422-ss-001.jpg

That looks fine. You've got an em or en dash between those words, not a
hyphen. Being unfamiliar with the terms in the quote, I don't know if
that is a hyphenated word, or two words that should be joined by a
*proper* dash.

> If I do "Insert | File" from within OpenOffice.org, I get the following:
> http://www.afolkey2.net/Projects/Genesis422-ss-002.jpg

That looks like a character encoding issue (e.g.
you see that sort of thing when importing a UTF-8 file while the
application thinks that the encoding is something like ISO-8859-1). How
are you importing the file? There's a selection list of different file
types you can import files as, in the import requester. Importing some
UTF-8 text worked fine for me without picking anything (the default
worked fine).

> If I open OpenOffice.org and just open the same file referred to in all
> of these examples, it looks like this:
> http://www.afolkey2.net/Projects/Genesis422-ss-003.jpg

Same issue (character encoding), different font.

> Then, if I copy and paste the sample verse, book, entire Bible,
> whatever from KWrite (which displays all occurrences of this
> "hyphen-like" character correctly) into a new OpenOffice.org text
> document, it looks like this:
> http://www.afolkey2.net/Projects/Genesis422-ss-004.jpg

As it should... KWrite managed to determine the encoding, and cut and
paste between programs used the default/locale encoding scheme
(probably UTF-8), and things "just work".

> The verse also correctly displays in gedit. BUT, if I display the same
> file using abiword, vi, emacs or less, it does not display correctly.

It sounds like your CLI encoding is not UTF-8. That would be why
text-based tools aren't managing it, and many GUI tools are (they often
get their settings in another way). OpenOffice.org, by default, reads
the default locale, and tries to work using the same scheme. It can be
custom configured, like you've done with KWrite, but I think that's
working around the problem, rather than fixing it.

> In "Tools | Encoding" in KWrite, I have it set to "utf8"

That'd be why it could handle the file: it was presuming UTF-8 because
you specifically told it to. Others presume a different default.

> So I guess the question is, "What SHOULD my default encoding be set to,
> and how and where do I set it so that it is respected by all
> applications?"

Usually UTF-8.
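If you want to see the mismatch for yourself, here is a minimal Python
sketch of what is probably happening in those screenshots. It assumes
the file really is UTF-8 and that the misbehaving applications are
reading it as ISO-8859-1 or Windows-1252. The sample string and the
helper name looks_like_utf8 are made up for illustration; nothing here
comes from diatheke or OpenOffice.org.

```python
import locale

# Illustrative sample only: U+2013 is an en dash, like the one in the verse.
text = "Tubal\u2013cain"

# The bytes as they would sit in a UTF-8 text file (the dash is 3 bytes).
raw = text.encode("utf-8")

# Right: decode with the encoding the file was written in.
good = raw.decode("utf-8")

# Wrong: a program assuming a single-byte encoding turns the one dash
# into three unrelated characters.
bad_latin1 = raw.decode("iso-8859-1")
bad_cp1252 = raw.decode("cp1252")


def looks_like_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# ascii() escapes non-ASCII characters, so this prints safely on any terminal.
print("correct decode :", ascii(good))
print("as ISO-8859-1  :", ascii(bad_latin1))
print("as Windows-1252:", ascii(bad_cp1252))
print("valid UTF-8?   :", looks_like_utf8(raw))

# What this process thinks the locale's encoding is (depends on your setup).
print("locale encoding:", locale.getpreferredencoding(False))
```

A clean decode doesn't prove a file is UTF-8 (plain ASCII passes too),
but a failed one tells you it definitely isn't.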
And, in general, you usually want it to be the same as everything else
(whether that's UTF-8 or something else).

--
(This box runs FC6, my others run FC4 & FC5, in case that's important to
the thread.)

Don't send private replies to my address, the mailbox is ignored.
I read messages from the public lists.