[jira] [Commented] (ARIES-1804) Timeout due to connection loss in RSA fastbin provider?


    [ https://issues.apache.org/jira/browse/ARIES-1804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496307#comment-16496307 ] 

Alex Weirig commented on ARIES-1804:
------------------------------------

Hi Johannes,

thanks for the answer.

I did look at the Karaf logs on the second Karaf server: there is neither a stack trace nor anything else, since the service doesn't get called at all.

The service does a basic LDAP authentication, so the response time is far less than a second; either the auth succeeds or it doesn't. There is no potentially long-running process.

But again, since the remote service is not called, that can't be the problem. As you can see in AuthenticationServiceImpl.txt, I write to the LogService as soon as the service is called, and that never happens.

It seems that the "proxy connection" between the 2 karaf servers using zookeeper and fastbin stops working after several hours.
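
One way to narrow this down would be to check, from the frontend host, whether a fresh TCP connection to the backend's fastbin port can still be opened once the timeouts start. The following is only a minimal sketch: the host name and the port 2543 are assumptions (use whatever the fastbin provider is actually configured with), and PortProbe is a hypothetical helper, not part of Aries RSA. Note that it only tests a new connection; it says nothing about the state of a connection the proxy may already be holding.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical diagnostic helper, not part of Aries RSA or Karaf.
class PortProbe {

    /** Returns true if a TCP connection to host:port can be opened within timeoutMillis. */
    public static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers connection refused, unknown host and connect timeout alike.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed defaults: replace with the backend host and the configured fastbin port.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 2543;
        boolean ok = isReachable(host, port, 3000);
        System.out.println(host + ":" + port + (ok ? " is reachable" : " is NOT reachable"));
    }
}
```

If the port is still reachable while the remote call times out, the problem is more likely in an existing (possibly stale) connection or in the discovery data than in basic network connectivity.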

I also looked at zookeeper logs but can't find anything that could help me figure out what's happening.

I have a similar setup with two older Karaf (4.1.1) servers that use the same ZooKeeper servers but run other services, and the problem does not occur there.

So really strange ...

Alex

> Timeout due to connection loss in RSA fastbin provider?
> -------------------------------------------------------
>
>                 Key: ARIES-1804
>                 URL: https://issues.apache.org/jira/browse/ARIES-1804
>             Project: Aries
>          Issue Type: Bug
>          Components: Remote Service Admin
>    Affects Versions: rsa-1.12.0
>         Environment: Karaf 4.2.0
> RSA 1.12.0
> zookeeper 3.4.12
> java 1.8.0_172-b11
> RHEL 7.5
>            Reporter: Alex Weirig
>            Priority: Critical
>         Attachments: AuthenticationServiceImpl.txt, LoginView.txt, stacktrace.txt, zoo.cfg.txt
>
>
> Hello,
> I'm running two karaf (4.2.0) servers, one is running the frontend of my application, the second one is running the backend.
> The backend services are published to 3 clustered zookeeper (3.4.12) servers. In karaf I have deployed the following RSA features:
> karaf@appsrvtlk()> feature:list | grep rsa
> aries-rsa-core                       │ 1.12.0 │   │ Started     │ aries-rsa-1.12.0 │
> aries-rsa-provider-tcp               │ 1.12.0 │   │ Uninstalled │ aries-rsa-1.12.0 │
> aries-rsa-provider-fastbin           │ 1.12.0 │ x │ Started     │ aries-rsa-1.12.0 │
> aries-rsa-discovery-local            │ 1.12.0 │   │ Uninstalled │ aries-rsa-1.12.0 │
> aries-rsa-discovery-config           │ 1.12.0 │   │ Uninstalled │ aries-rsa-1.12.0 │
> aries-rsa-discovery-zookeeper        │ 1.12.0 │ x │ Started     │ aries-rsa-1.12.0 │
> aries-rsa-discovery-zookeeper-server │ 1.12.0 │   │ Uninstalled │ aries-rsa-1.12.0 │
> When I start my karaf servers everything works fine: my frontend can call my backend service and gets the result. But after some time (I can't figure out when) it seems that the connection between karaf and zookeeper gets lost, and I get a timeout when I call my remote service even though all the servers (karaf and zookeepers) are still available and responding. Exhibitor shows no apparent issues with the zookeepers.
> I have attached the 
>  * relevant parts of my LoginView UI where I declared the @Reference to my service and where I call the remote service
>  * relevant parts of my AuthenticationService implementation that should be called on the remote karaf
>  * the stacktrace that I'm getting on the frontend karaf when the timeout occurs
>  * my zoo.cfg file
> From the stacktrace one can see that the LoginView has a non-null fastbin proxy handler for the authentication service but that after 5 minutes a timeout occurs and there is no line in the log that shows that the remote service was actually called.
> Many thanks in advance for your support.
> Kind regards,
> Alex



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)