Re: Kernel SCM saga..

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On Fri, 8 Apr 2005, Andrea Arcangeli wrote:
> 
> We'd need a regenerated coherent copy of BKCVS to pipe into those SCM to
> evaluate how well they scale.

Yes, that makes most sense, I believe. Especially as BKCVS does the 
linearization that makes other SCM's _able_ to take the data in the first 
place. Few enough SCM's really understand the BK merge model, although the 
distributed ones obviously have to do something similar.

> OTOH if your git project already allows storing the data in there,
> that looks nice ;).

I can express the data, and I did a sparse .git archive to prove the 
concept. It doesn't even try to save BK-specific details, but as far as I 
can tell, my git-conversion did capture all the basic things (ie not just 
the actual source tree, but hopefully all the "who did what" parts too).

Of course, my git visualization tools are so horribly crappy that it is 
hard to make sure ;)

Also, I suspect that BKCVS actually bothers to get more details out of a
BK tree than I cared about. People have pestered Larry about it, so BKCVS
exports a lot of the nitty-gritty (per-file comments etc) that just
doesn't actually _matter_, but people whine about. Me, I don't care. My
sparse-conversion just took the important parts.

> I don't yet fully understand how the algorithms of the trees are meant
> to work

Well, things like actually merging two git trees is not even something git
tries to do. It leaves that to somebody else - you can see what the
relationship is, and you can see all the data, but as far as I'm
concerned, git is really a "filesystem". It's a way of expression
revisions, but it's not a way of creating them.

> It looks similar to a diff -ur of two hardlinked trees

Yes. You could really think of it that way. It's not really about
hardlinking, but the fact that objects are named by their content does
mean that two objects (regardless of their type) can be seen as
"hardlinked" whenever their contents match.

But the more interesting part is the hierarchical virtual format it has,
ie it is not only hardlinked, but it also has the three different levels
of "views" into those hardlinked objects ("blob", "tree", "revision").

So even though the hash tree looks flat in the _physcal_ filesystem, it 
detinitely isn't flat in its own virtual world. It's just flattened to fit 
in a normal filesystem ;)

[ There's also a fourth level view in "trust", but that one hasn't been
  implemented yet since I think it might as well be done at a higher
  level. ]

Btw, the sha1 file format isn't actually designed for "rsync", since rsync 
is really a hell of a lot more capable than my format needs. The format is 
really designed for something like a offline http grabber, in that you can 
just grab files purely by filename (and verify that you got them right by 
running sha1sum on the resulting local copy). So think "wget".

				Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux