[neutron] OVS inactivity probes


Hi,

I looked into this a bit more deeply, and it seems to me that the add_manager
method in [1] is not used at all.
In the past it was called by the "enable_connection_uri" function in the
neutron.agent.ovsdb.native.helpers module, but commit [2] switched that
function to use a helper from ovsdbapp instead.
So I think this is simply a bug in Neutron that we need to fix. I opened a bug
for it [3].
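
As a stopgap until the bug is fixed, the manual command quoted below could be
wrapped in a small script. This is only a sketch, not something Neutron or
ovsdbapp provides: the function names are hypothetical, and the 30000 ms
default simply mirrors the value used in the quoted command.

```python
import subprocess


def build_set_manager_probe_cmd(manager_uuid, probe_ms=30000):
    """Build the ovs-vsctl argv that sets inactivity_probe on a Manager row.

    Hypothetical helper: it only assembles the command from the quoted
    message and does not talk to OVSDB itself.
    """
    if probe_ms < 0:
        raise ValueError("probe_ms must be non-negative")
    return [
        "ovs-vsctl", "set", "manager", manager_uuid,
        "inactivity_probe=%d" % probe_ms,
    ]


def set_manager_probe(manager_uuid, probe_ms=30000):
    # Runs the command on the local host; this needs ovs-vsctl installed
    # and sufficient privileges, and raises CalledProcessError on failure.
    subprocess.run(build_set_manager_probe_cmd(manager_uuid, probe_ms),
                   check=True)
```

Calling set_manager_probe("d61519ba-93fc-4fe5-b05c-b630778a44b0") would run
the same command as the one James ran by hand. The setting would still have
to be reapplied whenever the Manager row is recreated.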

[1] https://opendev.org/openstack/neutron/src/branch/master/neutron/agent/common/ovs_lib.py#L122
[2] https://review.opendev.org/#/c/453014/
[3] https://bugs.launchpad.net/neutron/+bug/1868686

On Mon, Mar 23, 2020 at 05:11:42PM +0000, James Denton wrote:
> Hello all,
> 
> Like others, we have seen an increase in the number of messages like those below, followed by disconnects:
> 
> 2020-03-19T07:19:47.414Z|01614|rconn|ERR|br-int<->tcp:127.0.0.1:6633: no response to inactivity probe after 10 seconds, disconnecting
> 2020-03-19T07:19:47.414Z|01615|rconn|ERR|br-ex<->tcp:127.0.0.1:6633: no response to inactivity probe after 10 seconds, disconnecting
> 2020-03-19T07:19:47.414Z|01616|rconn|ERR|br-tun<->tcp:127.0.0.1:6633: no response to inactivity probe after 10 seconds, disconnecting
> 
> We have since increased the value of of_inactivity_probe (https://bugs.launchpad.net/neutron/+bug/1817022) and confirmed the Controller connection reflects this new value:
> 
> ---
> # ovs-vsctl list Controller
> ...
> _uuid               : b4814677-c6f3-4afc-9c9e-999d5a5ac78f
> connection_mode     : out-of-band
> controller_burst_limit: []
> controller_rate_limit: []
> enable_async_messages: []
> external_ids        : {}
> inactivity_probe    : 60000
> is_connected        : true
> local_gateway       : []
> local_ip            : []
> local_netmask       : []
> max_backoff         : []
> other_config        : {}
> role                : other
> status              : {last_error="Connection refused", sec_since_connect="1420", sec_since_disconnect="1423", state=ACTIVE}
> target              : "tcp:127.0.0.1:6633"
> ---
> 
> However, we also see disconnects on the manager side, which the config option does not address:
> 
> 2020-03-23T11:01:02.871Z|00443|reconnect|ERR|tcp:127.0.0.1:50098: no response to inactivity probe after 5 seconds, disconnecting
> 
> This bug (https://bugs.launchpad.net/neutron/+bug/1627106) and related commit (https://opendev.org/openstack/neutron/commit/1698bee770b84a2663ba940a6ded5d4b9733101a) appear to leverage the ovs_vsctl_timeout value (since renamed to ovsdb_timeout), but the inactivity_probe for the Manager connection does not appear to be implemented. Honestly, I'm not sure if that code path is used.
> 
> ---
> # ovs-vsctl list Manager
> _uuid               : d61519ba-93fc-4fe5-b05c-b630778a44b0
> connection_mode     : []
> external_ids        : {}
> inactivity_probe    : []
> is_connected        : true
> max_backoff         : []
> other_config        : {}
> status              : {bound_port="6640", n_connections="2", sec_since_connect="0", sec_since_disconnect="0"}
> target              : "ptcp:6640:127.0.0.1"
> ---
> 
> Running this by hand sets the inactivity_probe timeout on the manager connection, but we'd prefer to use a built-in method, if possible:
> 
> # ovs-vsctl set manager d61519ba-93fc-4fe5-b05c-b630778a44b0 inactivity_probe=30000
> 
> Any suggestions?
> 
> Thanks,
> James
> 

-- 
Slawek Kaplonski
Senior software engineer
Red Hat