Re: OT: character encodings

On Mon, Jan 08, 2007 at 02:32:42AM +0100, Tilman Schmidt wrote:
> Am 08.01.2007 01:38 schrieb Willy Tarreau:
>...
> > And I'm not even
> > discussing the stupidity which requires that you read a whole text to get
> > its number of characters !
> 
> Personally I find the requirement to know the number of characters in a text
> rather unusual, so I wouldn't base a decision for an encoding on that. In
> fact, I cannot remember ever really wanting to know the actual number of
> characters in a text. The number of bytes occupied on storage, ok. The
> number of letters, of words, of lines, perhaps even the number of printable
> characters, all potentially interesting, depending on the application.
> But the raw number of characters? I don't know what that might serve for.

Also note that the UTF-32 Unicode encoding would offer this property
(the number of characters is simply the number of bytes divided by 4),
but with the following disadvantages compared to the UTF-8 Unicode
encoding:
- 7-bit ASCII is not a subset of UTF-32, which loses a lot of compatibility
  (7-bit ASCII code with some UTF-8 in the comments is no problem
   for systems that are not Unicode-aware, apart from the comments
   being displayed slightly wrong)
- UTF-32 text takes up to 4 times the space of the same text in UTF-8
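
To make the two points above concrete, here is a small Python sketch. The
function names and the sample strings are made up for illustration; only
the standard str.encode() call is used.

```python
def encoded_sizes(text):
    """Return (UTF-8 size, UTF-32 size) in bytes for the given text."""
    utf8 = text.encode("utf-8")
    # Explicit byte order, so that no BOM is prepended to the output.
    utf32 = text.encode("utf-32-le")
    return len(utf8), len(utf32)


def same_bytes_as_ascii(ascii_text, encoding):
    """True if encoding pure-ASCII text yields exactly the ASCII bytes."""
    return ascii_text.encode(encoding) == ascii_text.encode("ascii")


code = "int n = 0;"                    # pure 7-bit ASCII
commented = "int n = 0; /* Zähler */"  # one non-ASCII character in a comment

# ASCII text is byte-for-byte identical in UTF-8, but not in UTF-32.
print(same_bytes_as_ascii(code, "utf-8"))
print(same_bytes_as_ascii(code, "utf-32-le"))

# UTF-32 always needs 4 bytes per character; UTF-8 needs 1 for ASCII.
for sample in (code, commented):
    size8, size32 = encoded_sizes(sample)
    print(len(sample), "characters:", size8, "bytes UTF-8,", size32, "bytes UTF-32")
```

For mostly-ASCII text such as source code the ratio stays close to 4; it
shrinks for text with many characters that need several bytes in UTF-8.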

There's also the point that you can simply use e.g. "wc -m" or your
editor to count the characters.
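
Counting characters in UTF-8 does need one pass over the text, but the
pass is cheap: every character starts with exactly one byte that is not
a continuation byte (continuation bytes have the bit pattern 10xxxxxx).
A minimal Python sketch, assuming the input is valid UTF-8 and that
"character" means Unicode code point; the function name is made up for
illustration:

```python
def count_utf8_code_points(data):
    """Count code points in valid UTF-8 bytes without decoding them.

    Every byte whose top two bits are not '10' starts a new code point.
    """
    return sum(1 for byte in data if (byte & 0xC0) != 0x80)


sample = "Zähler: 42 €"
encoded = sample.encode("utf-8")

# The byte count and the character count differ once non-ASCII appears.
print(len(encoded), "bytes,", count_utf8_code_points(encoded), "characters")
```

This counts code points, not what a user perceives as letters: a base
letter followed by a combining accent is counted as two.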

> HTH
> Tilman

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed
