Re: Finding hardlinks

Hi!

> > > the use of a good hash function.  The chance of an accidental
> > > collision is infinitesimally small.  For a set of 
> > > 
> > >          100 files: 0.00000000000003%
> > >    1,000,000 files: 0.000003%
> > 
> > I do not think we want to play with probability like this. I mean...
> > imagine 4G files, 1KB each. That's 4TB disk space, not _completely_
> > unreasonable, and collision probability is going to be ~100% due to
> > birthday paradox.
> > 
> > You'll still want to back up your 4TB server...
> 
> Certainly, but tar isn't going to remember all the inode numbers.
> Even if you solve the storage requirements (not impossible) it would
> have to do (4e9^2)/2=8e18 comparisons, which computers don't have
> enough CPU power just yet.

Storage requirements would be 16GB of RAM (4 bytes for each of the 4G
files)... that's small enough. If you sort, you'll only need about
n*log2(n) = 32*2^32 comparisons, and that's doable.
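
A minimal sketch of the sort-based approach, in Python for brevity. This
is not how tar does it; hardlink_groups() and scan() are hypothetical
names made up for illustration. The idea is to collect one (st_dev,
st_ino, path) record per file, sort by (st_dev, st_ino), and read off
runs of equal keys, so the work is O(n log n) instead of n^2/2:

```python
import os
from itertools import groupby


def hardlink_groups(entries):
    """Group paths that refer to the same inode.

    entries: iterable of (st_dev, st_ino, path) tuples.
    Returns a list of path lists, one per inode seen more than once.
    """
    key = lambda e: (e[0], e[1])
    # Sorting brings records for the same (device, inode) next to each
    # other; this costs O(n log n) comparisons rather than n^2/2.
    ordered = sorted(entries, key=key)
    groups = []
    for _, run in groupby(ordered, key=key):
        paths = [path for _, _, path in run]
        if len(paths) > 1:
            groups.append(paths)
    return groups


def scan(root):
    """Yield (st_dev, st_ino, path) for files under root with nlink > 1."""
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            # Only files with more than one link can be hardlinked,
            # which keeps the table much smaller in practice.
            if st.st_nlink > 1:
                yield (st.st_dev, st.st_ino, path)
```

Usage would be something like hardlink_groups(scan("/srv/backup")),
where the path is only an example. Note that the inode number alone is
not enough as a key; it has to be paired with the device number.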

I do not claim it is _likely_. You'd need hardlinks, as you
noticed. But the system should work, not "work with high probability",
and I believe we should solve this in the long term.
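
A rough check of the numbers quoted above, using the usual birthday
approximation p = 1 - exp(-n(n-1)/(2N)). The 64-bit hash width is an
assumption; it is not stated in this mail, but it is what the quoted
percentages for 100 and 1,000,000 files imply. collision_probability()
is a hypothetical helper written for this sketch:

```python
import math


def collision_probability(n_files, hash_bits):
    """Approximate chance that at least two of n_files share a hash value.

    Birthday approximation: p = 1 - exp(-n(n-1) / (2 * 2^bits)).
    """
    pairs = n_files * (n_files - 1) / 2
    space = 2 ** hash_bits
    # expm1 keeps precision when the exponent is extremely small.
    return -math.expm1(-pairs / space)


if __name__ == "__main__":
    for n in (100, 1_000_000, 2 ** 32):
        p = collision_probability(n, 64)  # 64 bits is an assumption
        print(f"{n:>13,} files: {100 * p:.2g}%")
```

Under that assumption the first two figures match the quoted ones, and
2^32 files come out at roughly a 39% chance of at least one collision:
not quite certain, but far too high to ignore.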
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html