[openstack-ansible] strange execution delays


Hi Joe,

Those timeouts are almost certainly (99%) the reason behind this
issue.  I'd suggest restarting systemd-logind and seeing how that fares:

systemctl restart systemd-logind

If the issue persists or happens again, I'm not sure what else to
suggest, but those timeouts are definitely a cause of the issue here.
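
To check whether the restart actually helps, one option is to measure
how long each su session takes in the container logs before and after.
Here is a rough Python sketch, not something that ships with OSA: the
name slow_sessions, the 5 second threshold and the assumed year are
all made up for illustration, and the regex only targets the classic
syslog format of the entries you pasted.

```python
import re
from datetime import datetime

# Matches classic syslog lines such as:
#   Dec  3 20:30:17 host su[4170]: pam_unix(su:session): session opened ...
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+"
    r"(?P<prog>[\w.-]+)\[(?P<pid>\d+)\]:\s+(?P<msg>.*)$"
)


def parse_ts(ts, year=2019):
    # Syslog timestamps carry no year, so one is assumed here.
    return datetime.strptime(f"{year} {' '.join(ts.split())}", "%Y %b %d %H:%M:%S")


def slow_sessions(lines, threshold_s=5):
    """Return (pid, seconds, hit_pam_systemd_timeout) for slow sessions."""
    opened, timed_out, results = {}, set(), []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not a "prog[pid]: message" style line
        key = (m["prog"], m["pid"])
        msg = m["msg"]
        if "session opened" in msg:
            opened[key] = parse_ts(m["ts"])
        elif "pam_systemd" in msg and "Failed to create session" in msg:
            timed_out.add(key)
        elif "session closed" in msg and key in opened:
            secs = (parse_ts(m["ts"]) - opened.pop(key)).total_seconds()
            if secs >= threshold_s:
                results.append((int(m["pid"]), secs, key in timed_out))
            timed_out.discard(key)
    return results


# The su[4170] entries from the affected container, copied from the thread.
SAMPLE = [
    "Dec  3 20:30:17 infra1-repo-container-a0f194b3 su[4170]: "
    "pam_unix(su:session): session opened for user root by (uid=0)",
    "Dec  3 20:30:42 infra1-repo-container-a0f194b3 su[4170]: "
    "pam_systemd(su:session): Failed to create session: Connection timed out",
    "Dec  3 20:30:43 infra1-repo-container-a0f194b3 su[4170]: "
    "pam_unix(su:session): session closed for user root",
]
```

Calling slow_sessions(SAMPLE) reports the 26 second gap between the
session opening and closing for su[4170], flagged as having hit the
pam_systemd timeout.  On a real container you could pass it an open
log file instead, e.g. slow_sessions(open("/var/log/auth.log")); that
path is an assumption and depends on how logging is set up.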

Thanks,
Mohammed

On Mon, Dec 30, 2019 at 2:51 PM Joe Topjian <joe at topjian.net> wrote:
>
> Hi Mohammad,
>
>> Do you have any PAM modules that might be hitting some sorts of
>> external API for auditing purposes that may be throttling you?
>
>
> Not unless OSA would have configured something. The deployment is *very* standard, heavily leveraging default values.
>
> DNS of each container is configured to use LXC host for resolution. The host is using the systemd-based resolver, but is pointing to a local, dedicated upstream resolver. I want to point the problem there, but we've run into this issue in two different locations, one of which has an upstream DNS resolver that I'm confident does not throttle requests. But, hey, it's DNS - maybe it's still the cause.
>
>>
>> How is systemd-logind feeling?  Anything odd in your system logs?
>
>
> Yes. We have a feeling it's *something* with systemd, but aren't exactly sure what. Affected containers' logs end up with a lot of the following entries:
>
> Dec  3 20:30:17 infra1-repo-container-a0f194b3 su[4170]: Successful su for root by root
> Dec  3 20:30:17 infra1-repo-container-a0f194b3 su[4170]: + ??? root:root
> Dec  3 20:30:17 infra1-repo-container-a0f194b3 su[4170]: pam_unix(su:session): session opened for user root by (uid=0)
> Dec  3 20:30:27 infra1-repo-container-a0f194b3 dbus-daemon[47]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
> Dec  3 20:30:42 infra1-repo-container-a0f194b3 su[4170]: pam_systemd(su:session): Failed to create session: Connection timed out
> Dec  3 20:30:43 infra1-repo-container-a0f194b3 su[4170]: pam_unix(su:session): session closed for user root
>
> But we aren't sure if those timeouts are a symptom or a cause.
>
> Thanks for your help!
>
> Joe



-- 
Mohammed Naser - vexxhost
-----------------------------------------------------
D. 514-316-8872
D. 800-910-1726 ext. 200
E. mnaser at vexxhost.com
W. https://vexxhost.com