Re: confusion and case problems: utf8 <-> iocharset

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Eduard Bloch wrote:

> Hello (to whom it may concern),
> 
> I try to understand how the charset mapping with VFAT/Joliet and I found
> some inconsistencies between the user expectations, the docs, and the
> actuall behaviour.
> 
> Users view:
> 
> VFAT, NTFS and Joliet use a Unicode charset for storing the names
> internaly. 

Nope. VFAT is using short name as long as it complies with MSDOS; this name
is stored in codepage character set.

[...]
> 
> Second:
> there is the "utf8" option. How does that exactly differ from
> iocharset=utf8? There is not clear explanation in vfat.txt. What happens
> if you use both options, especially if iocharset!=utf8? Which one is
> prefered?
>

You actually need both; utf8 just says it is OK not to try to mangle names;
it does not substitute iocharset option.
 
> Third:
> how can I disable all that funny letter case conversions? They are not
> described anywhere properly,

man mount

> nor the way to disable them. 

man mount

> IMO there are  
> two problems:
> 
>  - what you write to the FS is not the same what "ls" shows you later.
>    Eg. ABW becomes "abw" but "ABWÖ" becomes "ABWÖ". Abcd becomes "Abcd"
>    but "ABC" becomes "abc".  Does it make sense? NO.

tell this to Microsoft. It is how VFAT works. Although umlaut should
probably be considered as MSDOS-safe, at least with proper codepage option.


>    I would like to stop the kernel playing such games, I had enough of
>    such trouble back in my Windows 98 times.
> 
>  - this case conversion can actually break things. When iocharset=utf-8
>    and utf8 are used, then you cannot access the data with the same
>    name after storing it.
> 

mount shortname=mixed is probably what you want.

> zombie:/tmp# mkdir test/TEST
> zombie:/tmp# ls test
> test
> zombie:/tmp# ls test/test
> zombie:/tmp# ls test/TEST
> ls: test/TEST: No such file or directory

{pts/1}% sudo mount -t vfat -o
loop,shortname=mixed,utf8,uid=bor /var/tmp/test.dos /tmp/x
{pts/1}% LC_ALL=C ll /tmp/x/test
total 2
drwxr-xr-x  2 bor root 2048 Jul 13 20:06 TEST/
{pts/1}% LC_ALL=C ll /tmp/x/test/TEST
total 0
-rwxr-xr-x  1 bor root 0 Jul 13 20:06 ?*
-rwxr-xr-x  1 bor root 0 Jul 13 20:06 ?*
-rwxr-xr-x  1 bor root 0 Jul 13 20:06 ?*
-rwxr-xr-x  1 bor root 0 Jul 13 20:06 ?*

The filesystem is foobared already because it contains both capital and
lowercase versions of the same names. This is what happens when you try to
use wrong codepage.

-andrey

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux