Re: SSVM's not starting, timeout for libvirt python script in agent.log


Hi Dag,

Yes, I altered the IP addresses, as I do not fancy throwing them out on the public net. If you think they would help the troubleshooting, I can send you the original log data directly. I configured the networking with OVS as follows:

cloudbr0 - MGMT0 (management interface, VLAN 1000); I guess this also hosts the system VMs. In the zone setup, I labelled the management network as MGMT0, which I assume is OK? It's an internal network where other servers are also connected. I can ping the VMs here, also over an S2S connection.
cloudbr1 - Public (VLAN 4000) and Guest network (VLAN 500-999). I use real public IP addresses from a scope, and I am able to ping the addresses of both the CPVM and SSVM from home.

I have to update the situation though...

Somehow, the last error from the agent was at 13:02. Since then I have had no further errors, and both VMs now show as "running" with the agent "up" in the UI. I had these errors since yesterday and did not change anything after 9:00 this morning. Maybe I'm impatient, but 4 hours seems a bit long for them to get to work. :)

Console VM works though at least:



My impression is that CloudStack needs a while to get hold of things... or am I just experiencing unusual things? I have a question though: when adding an ISO this morning, I got an error saying there was no space left (though the storage is 20 TB). Was this because the SSVM was not running at the time?

Thank you!
Chris

On Tue, Jun 12, 2018 at 4:39 PM, Dag Sonstebo <Dag.Sonstebo@xxxxxxxxxxxxx> wrote:
Chris,

Going off in a slightly different direction to previous answers. I suspect your problem is with networking - how have you configured this? When you say you can ping the SSVM on the private interface which IP address do you use and where do you successfully ping from?

/usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/patchviasocket.py -n v-1-VM -p %template=domP%type=consoleproxy%host=1.1.1.1%port=8250%name=v-1-VM%zone=1%pod=1%guid=Proxy.1%proxy_vm=1%disable_rp_filter=true%eth2ip=9.9.9.9%eth2mask=255.255.255.0%gateway=9.9.9.1%eth0ip=169.254.3.159%eth0mask=255.255.0.0%eth1ip=6.6.6.6%eth1mask=255.255.255.0%mgmtcidr=9.9.9.0/24%localgw=1.2.3.4%internaldns1=1.2.3.4%dns1=8.8.8.8%dns2=8.8.4.4

It could be you have edited the above IP addresses to mask your real addresses – if so ignore this.

If not then the above points to:
- Management host is on 1.1.1.1
- Eth2 which for a console proxy is public traffic is on 9.9.9.9/24
- Eth0 which is the link local management interface is on 169.254.3.159/16 (system generated)
- Eth1 is the main management interface on 6.6.6.6/24
- You have a local gateway (localgw) address of 1.2.3.4

So in this case – the CPVM can not check in to the management host on 1.1.1.1 -  It’s got no interface on that subnet and it also has a gateway it’s not able to reach.
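
The same reading of the boot arguments can be sketched in a few lines of Python. This is only an illustration of the check described above, not part of CloudStack or of patchviasocket.py: the helper names (parse_boot_args, interface_networks, on_link) are made up for this example, and the argument string is the masked one from the log, so the result only means something with the real addresses.

```python
import ipaddress

# Masked boot arguments copied from the agent.log entry for v-1-VM.
BOOT_ARGS = (
    "%template=domP%type=consoleproxy%host=1.1.1.1%port=8250%name=v-1-VM"
    "%zone=1%pod=1%guid=Proxy.1%proxy_vm=1%disable_rp_filter=true"
    "%eth2ip=9.9.9.9%eth2mask=255.255.255.0%gateway=9.9.9.1"
    "%eth0ip=169.254.3.159%eth0mask=255.255.0.0"
    "%eth1ip=6.6.6.6%eth1mask=255.255.255.0%mgmtcidr=9.9.9.0/24"
    "%localgw=1.2.3.4%internaldns1=1.2.3.4%dns1=8.8.8.8%dns2=8.8.4.4"
)


def parse_boot_args(raw):
    """Split a '%key=value%key=value' string into a dict (hypothetical helper)."""
    args = {}
    for field in raw.strip().split("%"):
        if not field:
            continue  # the string starts with '%', so the first field is empty
        key, _, value = field.partition("=")
        args[key] = value
    return args


def interface_networks(args):
    """Map each ethN name to the subnet given by its ethNip/ethNmask pair."""
    nets = {}
    for key, ip in args.items():
        if key.startswith("eth") and key.endswith("ip"):
            name = key[:-2]  # 'eth2ip' -> 'eth2'
            mask = args.get(name + "mask")
            if mask:
                nets[name] = ipaddress.ip_interface(f"{ip}/{mask}").network
    return nets


def on_link(address, nets):
    """Return the interfaces whose subnet contains the given address."""
    addr = ipaddress.ip_address(address)
    return [name for name, net in nets.items() if addr in net]


args = parse_boot_args(BOOT_ARGS)
nets = interface_networks(args)
for label in ("host", "gateway", "localgw"):
    matches = on_link(args[label], nets)
    print(label, args[label], "->", ", ".join(matches) or "no directly connected interface")
```

With the masked values, neither host (1.1.1.1) nor localgw (1.2.3.4) falls inside any of the three interface subnets, while gateway (9.9.9.1) sits on the eth2 subnet. An address that is not on-link is only reachable through a working route, which is the point made above.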

Regards,
Dag Sonstebo
Cloud Architect
ShapeBlue

On 12/06/2018, 13:12, "Nicolas Bouige" <n.bouige@xxxxxxxx> wrote:

    Hi Ivan,


    Are you talking about this global parameter:

    router.aggregation.command.each.timeout



    Best regards,

    Nicolas Bouige
    DIMSI
    cloud.dimsi.fr<http://www.cloud.dimsi.fr>
    4, avenue Laurent Cely
    Tour d’Asnière – 92600 Asnière sur Seine
    T/ +33 (0)6 28 98 53 40


    ________________________________
    From: Ivan Kudryavtsev <kudryavtsev_ia@xxxxxxxxx>
    Sent: Tuesday, June 12, 2018 13:59:39
    To: users
    Subject: Re: SSVM's not starting, timeout for libvirt python script in agent.log

    Increasing the command timeouts in the global parameters can help here. At
    least, I have seen similar behaviour with the VR.

    Tue, Jun 12, 2018, 14:39 Christoffer Pedersen <vrod@xxxxxxx>:

    > Hi Nicolas,
    >
    > I ran apt show qemu and it gave me this version:
    >
    > Version: 1:2.5+dfsg-5ubuntu10.29
    >
    > So I guess that would be version 2.5?
    >

    > On Tue, Jun 12, 2018 at 1:04 PM, Nicolas Bouige <n.bouige@xxxxxxxx> wrote:
    >
    > > Hello Christoffer,
    > >
    > >
    > > Could you tell us which qemu version you are using?
    > >
    > > Nicolas Bouige
    > > DIMSI
    > > cloud.dimsi.fr<http://www.cloud.dimsi.fr>
    > > 4, avenue Laurent Cely
    > > Tour d’Asnière – 92600 Asnière sur Seine
    > > T/ +33 (0)6 28 98 53 40
    > >
    > >
    > > ________________________________
    > > From: Christoffer Pedersen <vrod@xxxxxxx>
    > > Sent: Tuesday, June 12, 2018 12:30:48
    > > To: users@xxxxxxxxxxxxxxxxxxxxx
    > > Subject: SSVM's not starting, timeout for libvirt python script in agent.log
    > >
    > > Hi all,
    > >
    > > I have an issue regarding the system VMs. After deploying an advanced
    > > zone, the system VMs are created but get stuck in a "Starting" state,
    > > even though the Agent state is "Up". I have these logs in agent.log
    > > (sorry for the formatting):
    > >
    > > 2018-06-12 12:22:06,354 WARN  [kvm.resource.LibvirtComputingResource] (Script-8:null) (logid:) Interrupting script.
    > > 2018-06-12 12:22:06,355 WARN  [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-4:null) (logid:ea9cb55a) Timed out: /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/patchviasocket.py -n v-1-VM -p %template=domP%type=consoleproxy%host=1.1.1.1%port=8250%name=v-1-VM%zone=1%pod=1%guid=Proxy.1%proxy_vm=1%disable_rp_filter=true%eth2ip=9.9.9.9%eth2mask=255.255.255.0%gateway=9.9.9.1%eth0ip=169.254.3.159%eth0mask=255.255.0.0%eth1ip=6.6.6.6%eth1mask=255.255.255.0%mgmtcidr=9.9.9.0/24%localgw=1.2.3.4%internaldns1=1.2.3.4%dns1=8.8.8.8%dns2=8.8.4.4 .  Output is:
    > > 2018-06-12 12:22:06,355 ERROR [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-4:null) (logid:ea9cb55a) passcmd failed:timeout
    > > 2018-06-12 12:22:08,914 WARN  [kvm.resource.LibvirtComputingResource] (Script-4:null) (logid:) Interrupting script.
    > > 2018-06-12 12:22:08,915 WARN  [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-5:null) (logid:8e44093e) Timed out: /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/patchviasocket.py -n s-2-VM -p %template=domP%type=secstorage%host=1.1.1.1%port=8250%name=s-2-VM%zone=1%pod=1%guid=s-2-VM%workers=5%resource=org.apache.cloudstack.storage.resource.NfsSecondaryStorageResource%instance=SecStorage%sslcopy=false%role=templateProcessor%mtu=1500%eth2ip=7.7.7.7%eth2mask=255.255.255.0%gateway=9.9.9.1%public.network.device=eth2%eth0ip=169.254.2.193%eth0mask=255.255.0.0%eth1ip=10.120.0.61%eth1mask=255.255.255.0%mgmtcidr=9.9.9.0/24%localgw=1.2.3.4%private.network.device=eth1%internaldns1=1.2.3.4%dns1=8.8.8.8%dns2=8.8.4.4%nfsVersion=null .  Output is:
    > > 2018-06-12 12:22:08,915 ERROR [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-5:null) (logid:8e44093e) passcmd failed:timeout
    > >
    > > I have seen this error around but did not really find a solution to it. I
    > > am not exactly sure what is timing out. I can ping both SSVMs on their
    > > private and public interfaces.
    > >
    > > I hope someone can help me out here. :)
    > >
    > > --
    > > Thanks,
    > > Chris pedersen
    > >
    >
    >
    >
    > --
    > Thanks,
    > Chris pedersen
    >





--
Thanks,
Chris pedersen