Re: [Artemis 2.6.2] AMQP messages going to expiry queue, redelivery off that queue is failing


Thanks for looking into this. For testing, what is the proper way to force a
redelivery that goes back to the broker without using transactions? It's
probably something like killing the session or connection. That would let us
test whether non-transacted redeliveries are also getting corrupted.

We have a lot of people using spring-jms, so I looked at the Spring code:
https://github.com/spring-projects/spring-framework/blob/master/spring-jms/src/main/java/org/springframework/jms/listener/AbstractMessageListenerContainer.java
Looking at doExecuteListener and rollbackOnExceptionIfNecessary, it looks
like they try rollback() if the session is transacted and recover() if it is
client_ack. They probably don't handle auto_ack because the message was
acked immediately?
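
As a rough illustration, the branching described above can be modelled in a few lines of plain Java. This is a simplified model of my reading, not Spring's actual code: the names AckMode, RecoveryAction and RecoveryModel are hypothetical, and it ignores details such as externally managed transactions.

```java
// Simplified model of the behaviour described above; NOT Spring's actual code.
// AckMode, RecoveryAction and RecoveryModel are hypothetical names.
enum AckMode { AUTO_ACKNOWLEDGE, CLIENT_ACKNOWLEDGE, DUPS_OK_ACKNOWLEDGE }

enum RecoveryAction { ROLLBACK, RECOVER, NONE }

final class RecoveryModel {
    private RecoveryModel() {}

    // Decide what the container does after a listener throws.
    static RecoveryAction onListenerException(boolean transacted, AckMode mode) {
        if (transacted) {
            // Transacted session: the ack mode is irrelevant, roll back.
            return RecoveryAction.ROLLBACK;
        }
        if (mode == AckMode.CLIENT_ACKNOWLEDGE) {
            // Unacknowledged messages are redelivered via recover().
            return RecoveryAction.RECOVER;
        }
        // AUTO / DUPS_OK: nothing explicit is done in this model.
        return RecoveryAction.NONE;
    }

    public static void main(String[] args) {
        for (AckMode mode : AckMode.values()) {
            System.out.println("transacted=false, " + mode + " -> "
                    + onListenerException(false, mode));
        }
        System.out.println("transacted=true -> "
                + onListenerException(true, AckMode.AUTO_ACKNOWLEDGE));
    }
}
```

The point of the model is that only the transacted branch produces a rollback, so the non-transacted tests exercise a different redelivery path.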

Speaking of which, what is an appropriate way to NACK a message when using
client acknowledgement and a JMS message listener? Is it simply to reach the
end of the method without having called message.acknowledge(), or would it
be appropriate to throw a RuntimeException (since I cannot throw a checked
exception out of an implementation of javax.jms.MessageListener)?
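
To make the question concrete, here is a minimal sketch of a third option: acknowledge on success and call recover() on the session on failure, which is what the Spring code above appears to do for client_ack. To keep it compilable without a JMS provider, AckableMessage and RecoverableSession are hypothetical stand-ins for the acknowledge() and recover() calls on javax.jms.Message and javax.jms.Session. Whether this is the right way to NACK with a given broker is an assumption, not something confirmed in this thread.

```java
import java.util.function.Consumer;

// Hypothetical stand-ins for javax.jms.Message#acknowledge() and
// javax.jms.Session#recover(), so the sketch needs no JMS dependency.
interface AckableMessage {
    void acknowledge();
    String body();
}

interface RecoverableSession {
    void recover();
}

final class ClientAckListener {
    private final RecoverableSession session;
    private final Consumer<String> handler;

    ClientAckListener(RecoverableSession session, Consumer<String> handler) {
        this.session = session;
        this.handler = handler;
    }

    // Mirrors the shape of MessageListener#onMessage: no checked exceptions.
    void onMessage(AckableMessage message) {
        try {
            handler.accept(message.body());
            message.acknowledge();   // success: settle the message
        } catch (RuntimeException e) {
            session.recover();       // failure: request redelivery of unacked messages
        }
    }

    // Small demo helper: reports which call the listener made.
    static String outcome(boolean handlerFails) {
        StringBuilder log = new StringBuilder();
        RecoverableSession session = () -> log.append("recover");
        AckableMessage message = new AckableMessage() {
            public void acknowledge() { log.append("ack"); }
            public String body() { return "payload"; }
        };
        Consumer<String> handler = body -> {
            if (handlerFails) {
                throw new IllegalStateException("processing failed");
            }
        };
        new ClientAckListener(session, handler).onMessage(message);
        return log.toString();
    }
}
```

With this shape the listener never lets a RuntimeException escape, and it does not rely on what happens when the method simply returns without an acknowledge.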

On Thu, Aug 30, 2018 at 8:38 AM Robbie Gemmell <robbie.gemmell@xxxxxxxxx>
wrote:

> Tim and I had an initial look at this, and can see generally where the
> broker is internally corrupting things on send, though not yet the
> full picture of how it gets there or what to do about it. The expiration
> is likely to be key. The difference in the non-transacted case is
> probably because it's using recover(), which the client performs
> locally.
>
> Robbie
>
> On Wed, 29 Aug 2018 at 06:23, Dan Langford <danlangford@xxxxxxxxx> wrote:
> >
> > OK, I wrote 3 test files. I don't know the best way to get them to you
> > easily. Hopefully a Gist is OK.
> > https://gist.github.com/danlangford/071e738225ec0c68dd470816b977499b
> >
> > You can copy those 3 files straight to
> > ./tests/integration-tests/src/test/java/org/apache/activemq/artemis/tests/integration/amqp
> >
> > The test JMSTransactedRedeliveryTest::testAMQPProducerAMQPConsumer proves
> > that a transacted client can .rollback() a handful of times and still be
> > able to consume the redelivered message later on.
> >
> > The test JMSTransactedRedeliveryBugTest::testAMQPProducerAMQPConsumer
> > shows that if a message had been expired and a transacted client is now
> > attempting to consume it, the client only has 2 chances before the broker
> > starts sending the message in a way that will not parse correctly.
> >
> > The test JMSNonTransactedRedeliveryBugTest::testAMQPProducerAMQPConsumer
> > shows that if a message had been expired, a non-transacted client has no
> > trouble reliably accessing the redelivered message from the broker.
> >
> >
> > As you can tell, I am mostly concerned about AMQP->AMQP for my use case.
> > Some of the other combos are failing some of these tests in other ways.
> > Naturally you can address those as you see fit, but for my client
> > AMQP->AMQP is a blocker.
> >
> >
> > Let me know if you can determine why the broker is sending an extra null
> > character in the payload the third time the message is delivered.
> > Maybe we are doing something incorrectly.
> >
> >
> > This has been more of an issue than I thought because Spring defaults
> > to enabling transactions. In all of my initial tests I couldn't
> > reproduce it because I prefer the simplified JMS API from 2.0, which
> > defaults to non-transacted sessions. That said, nearly all of my
> > clients prefer the Spring Boot autoconfiguration and other Spring
> > pieces, which happen to default to transacted sessions. I can now have
> > some of them work around it, but others require the transaction.
> >
> >
> > Also, as a reminder and for context, here is a link to the initial
> > conversation I had with the Qpid JMS client devs, who pointed out to me
> > the erroneous null character in the message transfer from the broker:
> > https://lists.apache.org/thread.html/b1fd9c09a1f66f5529601a8651fbb96585c011b22bbd84e07c4f23b1@%3Cusers.qpid.apache.org%3E
> >
> >
> > thank you so much for your time
> >
> > On Tue, Aug 14, 2018 at 1:19 PM Timothy Bish <tabish121@xxxxxxxxx>
> wrote:
> >
> > > On 08/13/2018 07:12 PM, Dan Langford wrote:
> > > > Some of my users are attempting a pattern to deduplicate messages
> > > > based on a time window instead of a fixed amount of space (a
> > > > duplicate ID cache).
> > > >
> > > > So far the concept has been working very well. They send their AMQP
> > > > messages (qpid-jms-client) into a Last Value Queue with an
> > > > appropriate identifier in _AMQ_LVQ_NAME. They also set a TimeToLive
> > > > on the message that is essentially the lag they will allow while
> > > > they wait for possible duplicates. If any duplicates come in, the
> > > > Last Value Queue behavior replaces the older message with the newer
> > > > message until the expiration. Expired messages are delivered to the
> > > > preconfigured expiry queue, where their application is listening.
> > > > This is not perfect, but it's not intended to be. It's just intended
> > > > to reduce unnecessary processing, and they understand this is not a
> > > > guarantee. It really helps with a system that produces messages in a
> > > > way that has flurries of "notifications" about the same assetID over
> > > > and over again.
> > > >
> > > > But where we are seeing a problem is when we are consuming from the
> > > > queue used to hold expired messages, we throw some exception, and
> > > > the message needs to be redelivered. The first time or two the
> > > > message is redelivered it is delivered OK. But when the
> > > > JMSXDeliveryCount is about 3 or 4 (we use redelivery delay and
> > > > multipliers to spread these out) our qpid-jms-client stops being
> > > > able to read the messages.
> > > >
> > > > We were only able to reproduce this when an AMQP message expired
> > > > onto the queue (expired from an LVQ, in case that is relevant). If
> > > > we place the message directly on a queue and test different
> > > > exception and redelivery scenarios, we cannot reproduce this
> > > > behavior.
> > > >
> > > > I enabled the qpid-jms-client frame logging (via the env variable
> > > > PN_TRACE_FRM=true) and saw that when the client code cannot access
> > > > the payload, the broker WAS still sending the payload, so I thought
> > > > it was some odd issue with the client. The Apache Qpid team
> > > > responded that the issue seems to be that the broker starts to send
> > > > some ill-formed payloads in this scenario. I don't want to repeat
> > > > the stack traces and their response; you can read those here:
> > > >
> > > > https://lists.apache.org/thread.html/b1fd9c09a1f66f5529601a8651fbb96585c011b22bbd84e07c4f23b1@%3Cusers.qpid.apache.org%3E
> > > >
> > > > Would it be helpful if I tested whether this happens when there is
> > > > no LVQ involved? I could have a message in a non-LVQ expire to
> > > > another queue and see if redeliveries over there get messed up after
> > > > a few attempts. For the record, this is AMQP for producing and
> > > > consuming. I do notice the messages waiting in the expiry queue have
> > > > many more headers than messages sent directly to a queue from client
> > > > code. They seem to be headers full of information about the message
> > > > as it left the previous queue. I tried to send a message directly to
> > > > the expiry queue with all these headers to determine whether the
> > > > existence of one of them specifically triggers the malformed frame,
> > > > but I was not able to fully set all those headers. The
> > > > JMSXDeliveryCount (type Long) was the one that the client would not
> > > > let me set, and as a result I could not test. For clarity, though, I
> > > > don't know that the issue exists due to a header; that is just what
> > > > I saw as a difference between messages being delivered to the queue
> > > > by client code versus messages expiring from one queue to another.
> > > >
> > > > Please look over the linked thread on the Qpid list and let me know
> > > > if you know why a message transfer frame would become malformed
> > > > after a few failed deliveries only if the message expired onto the
> > > > current queue.
> > > >
> > > > thanks so much
> > > >
> > >
> > > A great place to start is to create a unit test that reproduces the
> > > issue.  You can look at the Artemis unit tests for AMQP to get some
> > > inspiration on how to set one up.  Then try and create the smallest
> > > possible test that can reproduce the issue to make it easier to narrow
> > > in on where the issue might be.
> > >
> > > The AMQP tests in Artemis are located here:
> > >
> > > https://github.com/apache/activemq-artemis/tree/master/tests/integration-tests/src/test/java/org/apache/activemq/artemis/tests/integration/amqp
> > >
> > >
> > >
> > > --
> > > Tim Bish
> > >
> > >
>