Re: [PATCH 1b/7] dlm: core locking

On Thursday 28 April 2005 08:37, Lars Marowsky-Bree wrote:
> On 2005-04-27T22:52:55, Daniel Phillips <[email protected]> wrote:
> > > So we can't deliver it raw membership events. Noted.
> >
> > Just to pick a nit: there is no way to be sure a membership event might
> > not still be on the way to the dead node. However, the rest of the cluster
> > knows the node is dead and can ignore it, in theory.  (In practice, only
> > (g)dlm and gfs are well glued into the cman membership protocol, and
> > other components, e.g., cluster block devices and applications, need to
> > be looked at with squinty eyes.)
>
> I'm sorry, I don't get what you are saying here. Could you please
> clarify?
>
> "Membership event on the way to the dead node"? I.e., you mean that the
> (now dead) node hasn't acknowledged a previous membership which still
> included it, because it died in between? Well, sure, membership is never
> certain at all; it's always in transition, essentially, because we can
> only detect faults some time after the fact.

Exactly, and that is what the barriers are for.  I like the concept of 
barriers a whole lot.  We should put this interface on a pedestal and really 
dig into how to use it, or even better, how to optimize it.
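
To make the barrier idea a bit more concrete, here is a toy sketch in
Python.  It is not the cman interface: the class, its methods and the
generation counter are all made up for illustration, and a real
implementation would have to deal with timeouts, fencing and nodes dying
during the barrier itself.  The only point is that a membership change opens
a barrier, every surviving node has to acknowledge the new view before
normal traffic resumes, and anything from a node outside the current view
(or from a stale view) is ignored.

```python
class ToyCluster:
    """Toy model of a membership barrier.  Hypothetical names, not cman."""

    def __init__(self, nodes):
        self.members = set(nodes)   # nodes in the current membership view
        self.generation = 0         # bumped on every membership change
        self.pending_acks = set()   # members that have not yet acked the view

    def node_failed(self, node):
        # Drop the dead node and open a barrier: every survivor must ack
        # the new generation before normal operation resumes.
        self.members.discard(node)
        self.generation += 1
        self.pending_acks = set(self.members)

    def ack(self, node, generation):
        # Ignore acks from non-members (e.g. the dead node) and acks for
        # a stale membership generation.
        if node not in self.members or generation != self.generation:
            return False
        self.pending_acks.discard(node)
        return True

    def barrier_complete(self):
        return not self.pending_acks

    def accept_message(self, sender):
        # Normal traffic is accepted only from current members, and only
        # once the barrier for the current view has completed.
        return sender in self.members and self.barrier_complete()
```

A late event from the dead node is simply dropped by the survivors, and
nobody proceeds until the last survivor has acknowledged the new view.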

But for now, as I understand it, a cluster client's view of the cluster world
comes entirely via cman events, which include things like other nodes joining
and leaving service groups.  (Service groups are another interface we need to
put on a pedestal, and start working on, because right now it's a clean idea
that has not been thought all the way through.)

> (It'd be cool if we could mandate nodes to pre-announce failures by a
> couple of seconds, alas I think that's a feature you'll only find in an
> OSDL requirement document, rated as "prio 1" ;-)

Heh, I generally think about failing over in less than a second, preferably 
much, much less.  Maybe you just have to scale your heuristic a little?

> I also don't understand what you're saying in the second part. How are
> gdlm/gfs "well glued" into the CMAN membership protocol, and what are we
> looking for when we turn our squinty eyes to applications...?

Gdlm and gfs are well-glued because Dave and Patrick have been working on it 
for years.  Other components barely know about the interfaces, let alone use 
them correctly.  In the end _every component_ of the cluster stack has to do 
the dance correctly on every node.  We've really only just started on that 
path.  Hopefully we'll be able to move down it much more quickly now that the 
code is coming out of the cathedral.

Regards,

Daniel
