Re: what is our answer to ZFS?

Anton Altaparmakov wrote:

On Tue, 22 Nov 2005, Chris Adams wrote:
Once upon a time, Jan Harkes <jaharkes@cs.cmu.edu> said:
The only thing that tends to break are userspace archiving tools like
tar, which assume that 2 objects with the same 32-bit st_ino value are
identical.
That assumption is probably made because that's what POSIX and Single
Unix Specification define: "The st_ino and st_dev fields taken together
uniquely identify the file within the system."  Don't blame code that
follows standards for breaking.
The standards are insufficient, however. For example, named streams or
extended attributes, if exposed as "normal files", would naturally have
the same st_ino (given they are the same inode as the normal file data)
and st_dev fields.
I think that by now several actually double check that the inode
link count is larger than 1.
That is not a good check.  I could have two separate files that have
multiple links; if st_ino is the same, how can tar make sense of it?
Now that is true. In addition to checking that the link count is larger
than 1, they should check the file size and, if that matches, compute
the SHA-1 digest of the data (or the MD5 sum or whatever). They should
probably also check the various stat fields for equality before
bothering with the checksum of the file contents.
Or Linux just needs a backup API that programs like this can use to
save/restore files. (Analogous to the MS Backup API but hopefully
less horrid...)
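
The layered check described above (link count first, then size and the
other cheap stat fields, then a content digest) could be sketched
roughly as follows. This is only an illustration, not what tar or any
other archiver actually does: the names sha1_of and probably_same_file
are made up here, the choice of stat fields to compare is an assumption,
and it only makes sense for regular files.

```python
import hashlib
import os


def sha1_of(path, chunk_size=1 << 16):
    """Return the SHA-1 hex digest of a file's contents, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def probably_same_file(path_a, path_b):
    """Hypothetical helper: decide whether two names are hard links to
    one file without trusting (st_dev, st_ino) on its own."""
    a, b = os.stat(path_a), os.stat(path_b)

    # The identifiers must match at all before anything else matters.
    if (a.st_dev, a.st_ino) != (b.st_dev, b.st_ino):
        return False

    # A file with a single link cannot have a second name.
    if a.st_nlink < 2 or b.st_nlink < 2:
        return False

    # Cheap checks first: size and other stat fields (an assumed
    # selection of fields, chosen for illustration).
    cheap = ("st_size", "st_mode", "st_uid", "st_gid",
             "st_nlink", "st_mtime_ns")
    if any(getattr(a, name) != getattr(b, name) for name in cheap):
        return False

    # Only now pay for reading the data.
    return sha1_of(path_a) == sha1_of(path_b)
```

Even the digest only shows that the contents are equal at the moment of
the check, so this narrows the problem rather than solving it.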

In order to prevent the problems mentioned AND satisfy SUS, I would
think that the st_dev field would be the value which should be unique,
which is not always the case currently. The st_ino is a file ID on
st_dev, and it would be less confusing if the inode on each st_dev were
unique. Not to mention that some backup programs do look at st_dev
and could be mightily confused if its meaning is not well defined.
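
For reference, the usual way an archiver relies on the pair is to key a
table on (st_dev, st_ino) and treat any second name with the same key as
a hard link. A minimal sketch, with find_hard_links being a name made up
for this example rather than anything from tar:

```python
import os


def find_hard_links(paths):
    """Group names that share a (st_dev, st_ino) pair.

    This is exactly the assumption under discussion: it is only
    correct if the pair really is unique within the system.
    """
    seen = {}
    for path in paths:
        st = os.lstat(path)
        # Files with one link cannot have another name, so skip them.
        if st.st_nlink < 2:
            continue
        seen.setdefault((st.st_dev, st.st_ino), []).append(path)
    # Keep only the keys that were actually seen under several names.
    return [names for names in seen.values() if len(names) > 1]
```

If two unrelated files on different filesystems report the same st_dev
and the same st_ino, this table silently merges them, which is the
breakage described earlier in the thread.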

Historical application usage assumes that it is invariant; many
applications were written before pluggable devices and network mounts.
In a perfect world where nothing broke when things were changed, if
there were some UUID on a filesystem, so that it looks the same mounted
over the network, by direct mount, by loopback mount, etc., then there
would be no confusion.

A backup API would really be nice if it could somehow provide some
unique ID, such that a network or direct backup of the same data would
have the same IDs.

--
   -bill davidsen (davidsen@tmr.com)
"The secret to procrastination is to put things off until the
 last possible moment - but no longer"  -me
