[Openstack] Lose 30+ seconds of packets to instance during Live-Migration


After turning off L2 population on the compute and network nodes, the
packet loss during live migration dropped from 30+ seconds to about 3 seconds.

Does anyone have an explanation for this? I'd really like to be able to use
L2 pop and ARP responder if I can, but not at the cost of that large a
hit when I live migrate.

Thanks in advance!

Steve

On Wed, Aug 22, 2018 at 11:56 AM Sterdnot Shaken <sterdnotshaken at gmail.com>
wrote:

> Version: Pike
> OVS version: 2.9
>
> VM-A (On Compute A) ----- (On Compute B) VM-B
>
> What is it in Neutron that might delay vxlan tunnel construction on the
> destination compute node during live-migration? As the VM is live-migrated,
> I'm watching the flows and the vxlan tunnel interfaces on br-tun on the
> Compute node where the VM is moving to, and they don't appear until 30+
> seconds into the migration. I'm wondering if this is the cause of the
> roughly 35 seconds of packet loss during the migration.
>
> The strange thing is, if I start a continuous ping from VM B on compute B
> to VM A on compute A and then initiate a live-migration of VM A to move to
> Compute B, I only lose ~1 second of traffic, which leads me to suspect this
> issue is related to said tunnels or flows on br-tun...
>
> Any help would be greatly appreciated!
>
> Thanks!
>
> Steve
>
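
For anyone comparing numbers on this, one way to put a figure on the outage
is to run a continuous ping against the migrating VM and look for the largest
hole in the icmp_seq values. Below is a minimal sketch, not something taken
from the setup above: the function name longest_gap and the sample ping lines
(addresses, sequence numbers, timings) are made up for illustration, and it
assumes Linux-style ping output with the default 1-second interval, so that
missing replies map roughly to seconds of loss.

```python
import re


def longest_gap(ping_output: str) -> int:
    """Return the largest run of consecutive missing icmp_seq values."""
    # Collect the sequence numbers of the replies that actually arrived.
    seqs = [int(s) for s in re.findall(r"icmp_seq=(\d+)", ping_output)]
    worst = 0
    for prev, cur in zip(seqs, seqs[1:]):
        # Replies missing between two consecutive received replies.
        worst = max(worst, cur - prev - 1)
    return worst


# Hypothetical sample output, invented for this example.
sample = """\
64 bytes from 10.0.0.5: icmp_seq=1 ttl=64 time=0.41 ms
64 bytes from 10.0.0.5: icmp_seq=2 ttl=64 time=0.39 ms
64 bytes from 10.0.0.5: icmp_seq=36 ttl=64 time=0.52 ms
"""

print(longest_gap(sample))  # 33 missing replies between seq 2 and seq 36
```

Running the same measurement from a host on a different compute node and
from a VM on the destination node would separate loss caused by late tunnel
or flow setup on br-tun from loss caused by the migration itself.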