Re: Designing for maximum Artemis performance


Justin,

That approach will work, to a point, but it has (at least) two failure
cases that would be problematic.

First, spinning up a replacement host is not instantaneous, so for at least
a minute, and possibly several, the messages on that broker and its storage
volume will simply be unavailable to consumers.

Second, it means that there is only one copy of a given message within the
broker cluster, so if that storage volume gets corrupted or fails, you've
lost data, which would be unacceptable in some use cases.

There'd also be a failure case if the number of hosts were not an even
multiple of the number of AZs: the new host could come up in a different
AZ than the storage volume and therefore be unable to use it. So you'd need
to design the setup carefully to avoid that potential problem.
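
To make the AZ case concrete, here's a rough sketch. It assumes a
hypothetical scheduler that always places the replacement host in one of
the AZs with the fewest surviving hosts, and a storage volume that can only
be attached from the AZ it was created in. Both are assumptions for
illustration (the function names are made up too), so check how your cloud
actually behaves.

```python
from collections import Counter


def candidate_azs(surviving_host_azs, all_azs):
    """AZs a balancing scheduler might pick: those with the fewest hosts left.

    Assumption: the scheduler balances purely on host count per AZ.
    """
    counts = Counter({az: 0 for az in all_azs})
    counts.update(surviving_host_azs)
    fewest = min(counts.values())
    return sorted(az for az, n in counts.items() if n == fewest)


def replacement_may_miss_volume(host_azs, failed_index, all_azs):
    """True if the replacement could land in an AZ other than the volume's."""
    # Assumption: the orphaned volume stays in the failed host's AZ.
    volume_az = host_azs[failed_index]
    survivors = host_azs[:failed_index] + host_azs[failed_index + 1:]
    return any(az != volume_az for az in candidate_azs(survivors, all_azs))


azs = ["a", "b", "c"]
# 3 hosts over 3 AZs: the only under-filled AZ is the one that lost a host.
print(replacement_may_miss_volume(["a", "b", "c"], 0, azs))
# 4 hosts over 3 AZs: after losing one of the two hosts in "a", all AZs tie.
print(replacement_may_miss_volume(["a", "a", "b", "c"], 0, azs))
```

Under those assumptions the first call prints False and the second prints
True: with the uneven layout the replacement may be placed in "b" or "c"
while the orphaned volume sits in "a".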

Overall I think it's better to have a slave host, which addresses both the
availability and data durability concerns, than to try to manage reusing
storage volumes, though which approach is best may depend on the exact
requirements.

Tim

On Wed, Oct 3, 2018, 2:56 PM Justin Bertram <jbertram@xxxxxxxxxx> wrote:

> > Would it be desirable for Artemis to support this functionality in the
> > future though, i.e. if we raised it as a feature request?
>
> All things being equal I'd say probably so, but I suspect the effort to
> implement the feature might outweigh the benefits.
>
> > The cloud can manage spinning up another node, but the problem is
> > telling/getting the Artemis cluster to make that server the master now.
>
> The way I imagine it would work best is without any slave at all.  The
> whole point of the slave is to take over quickly from a live broker that
> has failed in such a way that all the data from the failed broker is still
> available to clients.  Maybe I'm wrong about clouds, but I believe the
> cloud itself can provide this functionality by quickly spinning up a new
> broker when one fails.  So, you would have 3 live brokers in a cluster each
> with a separate storage node.  There wouldn't be any slaves at all.  When
> one of those brokers fails the cloud will spin up another to replace it and
> re-attach to the storage node so that any reconnecting client has access to
> all the data as before just like it would on a slave.  Or is that not how
> clouds work?
>
>
> Justin
>
> On Tue, Oct 2, 2018 at 10:50 PM schalmers <
> simon.chalmers@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> > jbertram wrote
> > > The master/slave/slave triplet architecture complicates fail-back
> > > quite a bit and it's not something the broker handles gracefully at
> > > this point.  I'd recommend against using it for that reason.
> >
> > Would it be desirable for Artemis to support this functionality in the
> > future though, i.e. if we raised it as a feature request?
> >
> >
> > jbertram wrote
> > > To Clebert's point...I also don't understand why you wouldn't let the
> > > cloud infrastructure deal with spinning up another live node when one
> > > fails.  I was under the impression that's kind of what clouds are for.
> >
> > The cloud can manage spinning up another node, but the problem is
> > telling/getting the Artemis cluster to make that server the master now.
> > From what I've read and been told, there's no way to failback to the
> > master when there is already a backup for the (new) master.
> >
> > That's what I'm looking for help on and were my original questions.
> >
> > If the position from Artemis is that there's no desire for Artemis to
> > ever work that way, even if we ask/raise a feature request, then we
> > just need to understand that so we can make design decisions in our
> > application stack to cater for that.
> >
> >
> >
> > --
> > Sent from:
> > http://activemq.2283324.n4.nabble.com/ActiveMQ-User-f2341805.html
> >
>