[neutron] OVS inactivity probes


Thank you, Slawek. Appreciate the quick assist!

James 

On 3/24/20, 4:58 AM, "Slawek Kaplonski" <skaplons at redhat.com> wrote:

    Hi,
    
    I looked into this a bit more deeply, and it seems to me that the add_manager
    method in [1] is not used at all.
    In the past it was called by the "enable_connection_uri" function in the
    neutron.agent.ovsdb.native.helpers module, but commit [2] switched that
    function to use a helper from ovsdbapp.
    So I think this is simply a bug in Neutron that we need to fix. I opened a bug
    report for it [3].
    
    [1] https://opendev.org/openstack/neutron/src/branch/master/neutron/agent/common/ovs_lib.py#L122
    [2] https://review.opendev.org/#/c/453014/
    [3] https://bugs.launchpad.net/neutron/+bug/1868686
    
    On Mon, Mar 23, 2020 at 05:11:42PM +0000, James Denton wrote:
    > Hello all,
    >
    > Like others, we have seen an increase in the number of messages like those below, followed by disconnects:
    >
    > 2020-03-19T07:19:47.414Z|01614|rconn|ERR|br-int<->tcp:127.0.0.1:6633: no response to inactivity probe after 10 seconds, disconnecting
    > 2020-03-19T07:19:47.414Z|01615|rconn|ERR|br-ex<->tcp:127.0.0.1:6633: no response to inactivity probe after 10 seconds, disconnecting
    > 2020-03-19T07:19:47.414Z|01616|rconn|ERR|br-tun<->tcp:127.0.0.1:6633: no response to inactivity probe after 10 seconds, disconnecting
    >
    > We have since increased the value of of_inactivity_probe (https://bugs.launchpad.net/neutron/+bug/1817022) and confirmed the Controller connection reflects this new value:
    >
    > ---
    > # ovs-vsctl list Controller
    > ...
    > _uuid               : b4814677-c6f3-4afc-9c9e-999d5a5ac78f
    > connection_mode     : out-of-band
    > controller_burst_limit: []
    > controller_rate_limit: []
    > enable_async_messages: []
    > external_ids        : {}
    > inactivity_probe    : 60000
    > is_connected        : true
    > local_gateway       : []
    > local_ip            : []
    > local_netmask       : []
    > max_backoff         : []
    > other_config        : {}
    > role                : other
    > status              : {last_error="Connection refused", sec_since_connect="1420", sec_since_disconnect="1423", state=ACTIVE}
    > target              : "tcp:127.0.0.1:6633"
    > ---
    >
    > However, we also see disconnects on the manager side, which the config option does not address:
    >
    > 2020-03-23T11:01:02.871Z|00443|reconnect|ERR|tcp:127.0.0.1:50098: no response to inactivity probe after 5 seconds, disconnecting
    >
    > This bug (https://bugs.launchpad.net/neutron/+bug/1627106) and related commit (https://opendev.org/openstack/neutron/commit/1698bee770b84a2663ba940a6ded5d4b9733101a) appear to leverage the ovs_vsctl_timeout value (since renamed to ovsdb_timeout), but the inactivity_probe for the Manager connection does not appear to be implemented. Honestly, I'm not sure if that code path is used.
    >
    > ---
    > # ovs-vsctl list Manager
    > _uuid               : d61519ba-93fc-4fe5-b05c-b630778a44b0
    > connection_mode     : []
    > external_ids        : {}
    > inactivity_probe    : []
    > is_connected        : true
    > max_backoff         : []
    > other_config        : {}
    > status              : {bound_port="6640", n_connections="2", sec_since_connect="0", sec_since_disconnect="0"}
    > target              : "ptcp:6640:127.0.0.1"
    > ---
    >
    > Running this by hand sets the inactivity_probe timeout on the Manager connection, but we'd prefer to use a built-in method, if possible:
    >
    > # ovs-vsctl set manager d61519ba-93fc-4fe5-b05c-b630778a44b0 inactivity_probe=30000
    >
    > Any suggestions?
    >
    > Thanks,
    > James
    >
    
    --
    Slawek Kaplonski
    Senior software engineer
    Red Hat
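
Until the Neutron bug referenced in [3] is fixed, the manual workaround quoted
above (ovs-vsctl set manager <uuid> inactivity_probe=30000) could be scripted
so that it covers every Manager record. The following is only a minimal sketch
under stated assumptions, not something Neutron provides: the function names
(build_set_probe_cmd, parse_uuids, set_probe_on_all_managers) are hypothetical,
and it assumes ovs-vsctl is on the PATH and that the caller may modify the OVS
database. The inactivity_probe value is given in milliseconds, as in the
listings above.

```python
import subprocess


def build_set_probe_cmd(manager_uuid, probe_ms):
    """Return the ovs-vsctl argv that sets inactivity_probe on one Manager row."""
    if probe_ms < 0:
        raise ValueError("probe_ms must be non-negative")
    return ["ovs-vsctl", "set", "Manager", manager_uuid,
            f"inactivity_probe={probe_ms}"]


def parse_uuids(bare_output):
    """Parse the output of `ovs-vsctl --bare --columns=_uuid list Manager`.

    With --bare, each record's _uuid is printed on its own line and records
    are separated by blank lines, so blank lines are skipped.
    """
    return [line.strip() for line in bare_output.splitlines() if line.strip()]


def set_probe_on_all_managers(probe_ms, run=subprocess.check_output):
    """Set inactivity_probe (in ms) on every Manager record; return the UUIDs.

    `run` is injectable (hypothetical hook) so the logic can be exercised
    without a real Open vSwitch installation.
    """
    listing = run(
        ["ovs-vsctl", "--bare", "--columns=_uuid", "list", "Manager"],
        text=True,
    )
    uuids = parse_uuids(listing)
    for uuid in uuids:
        run(build_set_probe_cmd(uuid, probe_ms), text=True)
    return uuids
```

Calling set_probe_on_all_managers(30000) on a host would apply the same 30000 ms
value used in the manual command. Whether the agent later leaves that value
untouched depends on how the fix for [3] is implemented, so treat this as a
stopgap.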