RE: [ARTEMIS] Three nodes symmetric static discovery cluster with HA replication colocated and automatic client failover


I tried to recreate my case using the provided tooling (ServerUtil) for orchestrated server start and stop, but I got worrying results on a Windows machine.

I start the three nodes of the cluster plus background threads that produce and consume for one minute. After 10 seconds the first node is killed, 10 seconds later the second node is killed, 10 seconds after that the third node is killed and... the background verification is still working! It fails immediately when I use artemis.cmd stop from the command line.

It works on Linux.

Kind regards
Marcin


-----Original Message-----
From: Justin Bertram [mailto:jbertram@xxxxxxxxxx] 
Sent: 05 July 2018 16:07
To: users@xxxxxxxxxxxxxxxxxxx
Subject: Re: [ARTEMIS] Three nodes symmetric static discovery cluster with HA replication colocated and automatic client failover

I haven't had any time to look into this in depth.  Would you be able to
work up a reproducer?  I think you could easily modify one of the HA
examples shipped with the broker to reproduce your use-case.  You might
even try simplifying it a bit to just 2 nodes.  Simpler is always better
for reproducers as it narrows down the investigation.  Once you get a
reproducer you can slap it into a GitHub repo somewhere.


Justin

On Wed, Jul 4, 2018 at 9:19 AM, Stefaniuk, Marcin <marcin.stefaniuk@xxxxxxxxxxxxxxxxx> wrote:

> I'm struggling to create the set-up described in the subject on ActiveMQ
> Artemis 2.5.0. My key configuration looks as follows (for the first node
> of three):
>
> <acceptors>
>     <acceptor name="node-1-universal-plain">tcp://0.0.0.0:61616?tcpSendBufferSize=1048576;tcpReceiveBufferSize=1048576;protocols=CORE,AMQP,STOMP,HORNETQ,MQTT,OPENWIRE;useEpoll=true;amqpCredits=1000;amqpLowCredits=300</acceptor>
> </acceptors>
>
> <connectors>
>     <connector name="node-1-connector">tcp://localhost:61616</connector>
>     <connector name="node-2-connector">tcp://localhost:62616</connector>
>     <connector name="node-3-connector">tcp://localhost:63616</connector>
> </connectors>
>
> <cluster-connections>
>     <cluster-connection name="showcase-cluster">
>         <connector-ref>node-1-connector</connector-ref>
>         <retry-interval>500</retry-interval>
>         <use-duplicate-detection>true</use-duplicate-detection>
>         <message-load-balancing>ON_DEMAND</message-load-balancing>
>         <max-hops>1</max-hops>
>         <static-connectors>
>             <connector-ref>node-2-connector</connector-ref>
>             <connector-ref>node-3-connector</connector-ref>
>         </static-connectors>
>     </cluster-connection>
> </cluster-connections>
>
> <ha-policy>
>     <replication>
>         <colocated>
>             <backup-port-offset>7</backup-port-offset>
>             <request-backup>true</request-backup>
>             <max-backups>2</max-backups>
>             <backup-request-retries>-1</backup-request-retries>
>             <backup-request-retry-interval>2000</backup-request-retry-interval>
>             <master />
>             <slave />
>         </colocated>
>     </replication>
> </ha-policy>
>
> The rest of the nodes have a similar configuration, with cluster
> connections and acceptors adjusted. I'm also deploying it on three
> separate hosts (each different from localhost). Importantly, I have no
> discovery groups (no possibility to use UDP).
>
> My test connects to the cluster using ActiveMQConnectionFactory and the
> URI "(tcp://node-1:61616,tcp://node-2:62616)?ha=true&reconnectAttempts=-1"
> (leaving the third to be obtained directly from the cluster); one thread
> produces and a second consumes messages (a separate connection is used).
> The test works fine (unsurprisingly), even when the producer is connected
> to different nodes of the cluster. But when one node is stopped, the
> producer / consumer connected to that node is affected: no send / receive
> is performed, but some messages are buffered on the client side and
> flushed when the node is available again. I would expect the connection
> to switch automagically to another node, but that is not happening here.
> I tried this previously without HA, with the same result.
>
> Could you help me determine what I'm doing wrong?
>
> Kind regards
> Marcin Stefaniuk
> CREDIT SUISSE (POLAND) SP. Z O.O
> Solution Architect | Messaging Engineering Warsaw, MITM 47
> Atrium 2 | 00-849 Warsaw | Poland
> marcin.stefaniuk@xxxxxxxxxxxxxxxxx | www.credit-suisse.com
>
> ===============================================================================
>
> Please access the attached hyperlink for an important electronic
> communications disclaimer:
> http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html
> ===============================================================================
>
>

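As an illustration of the "similar configuration" mentioned for the other nodes, the fragment below sketches what node 2's broker.xml might contain. It is derived only from the node 1 configuration quoted above and is not the poster's actual file: the acceptor name node-2-universal-plain is assumed by analogy, port 62616 is taken from node-2-connector, and the acceptor parameters are trimmed to keep the line short.

```xml
<!-- Sketch for node 2, assuming the same naming scheme as the quoted node 1 config -->
<acceptors>
    <!-- Assumed name; port matches node-2-connector. Other URL parameters omitted here. -->
    <acceptor name="node-2-universal-plain">tcp://0.0.0.0:62616?protocols=CORE,AMQP,STOMP,HORNETQ,MQTT,OPENWIRE</acceptor>
</acceptors>

<connectors>
    <!-- Hosts kept as in the quoted config; on separate hosts these would presumably be real host names -->
    <connector name="node-1-connector">tcp://localhost:61616</connector>
    <connector name="node-2-connector">tcp://localhost:62616</connector>
    <connector name="node-3-connector">tcp://localhost:63616</connector>
</connectors>

<cluster-connections>
    <cluster-connection name="showcase-cluster">
        <!-- This node's own connector -->
        <connector-ref>node-2-connector</connector-ref>
        <retry-interval>500</retry-interval>
        <use-duplicate-detection>true</use-duplicate-detection>
        <message-load-balancing>ON_DEMAND</message-load-balancing>
        <max-hops>1</max-hops>
        <!-- The two other cluster members -->
        <static-connectors>
            <connector-ref>node-1-connector</connector-ref>
            <connector-ref>node-3-connector</connector-ref>
        </static-connectors>
    </cluster-connection>
</cluster-connections>
```

The ha-policy section would presumably stay the same as on node 1, since nothing in it is node-specific.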

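For the two-node reproducer suggested in the reply, the static cluster connection would shrink to a single peer. A possible node 1 fragment, assuming the connector names and ports from the quoted configuration are reused and the third node is simply dropped:

```xml
<!-- Two-node reduction (assumption: node 3 removed, everything else as in the quoted config) -->
<connectors>
    <connector name="node-1-connector">tcp://localhost:61616</connector>
    <connector name="node-2-connector">tcp://localhost:62616</connector>
</connectors>

<cluster-connections>
    <cluster-connection name="showcase-cluster">
        <connector-ref>node-1-connector</connector-ref>
        <retry-interval>500</retry-interval>
        <use-duplicate-detection>true</use-duplicate-detection>
        <message-load-balancing>ON_DEMAND</message-load-balancing>
        <max-hops>1</max-hops>
        <static-connectors>
            <!-- Only one remaining peer -->
            <connector-ref>node-2-connector</connector-ref>
        </static-connectors>
    </cluster-connection>
</cluster-connections>
```

Node 2 would mirror this with node-2-connector as its own connector-ref and node-1-connector as the only static connector.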