On Sat, 9 Apr 2005 [email protected] wrote:
>
> With 60,000 changesets in the current tree, we will start out our git
> repository with about 600,000 files. Assuming the first byte of the
> SHA1 hash is random, that means an average of 2343 files in each of the
> objects/xx directories. Give it a few more years at the current pace,
> and we'll have over 10,000 files per directory. This sounds like a lot
> to me ... but perhaps filesystems now handle large directories enough
> better than they used to for this to not be a problem?
The good news is that git itself doesn't really care. I think it's
literally _one_ function ("get_sha1_filename()") that you need to change,
and then you need to write a small script that moves files around, and
you're really much done.
Also, I did actually debate that issue with myself, and decided that even
if we do have tons of files per directory, git doesn't much care. The
reason? Git never _searches_ for them. Assuming you have enough memory to
cache the tree, you just end up doing a "lookup", and inside the kernel
that's done using an efficient hash, which doesn't actually care _at_all_
about how many files there are per directory.
So I was for a while debating having a totally flat directory space, but
since there are _some_ downsides (linear lookup for cold-cache, and just
that "ls -l" ends up being O(n**2) and things), I decided that a single
fan-out is probably a good idea.
> Or maybe the files should be named objects/xx/yy/zzzzzzzzzzzzzzzz?
Hey, I may end up being wrong, and yes, maybe I should have done a
two-level one. The good news is that we can trivially fix it later (even
dynamically - we can make the "sha1 object tree layout" be a per-tree
config option, and there would be no real issue, so you could make small
projects use a flat version and big projects use a very deep structure
etc). You'd just have to script some renames to move the files around..
Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]