Instance (vm guest) not getting PCI card


On Tue, Jul 2, 2019 at 4:36 PM Sean Mooney <smooney at redhat.com> wrote:
>
> On Tue, 2019-07-02 at 15:51 -0400, Mauricio Tavares wrote:
> > Newbie and easy questions: I have two cards, one in each stein
> > (centos) compute node set up for kvm, which I want to be able to hand
> > to a vm guest (instance). Following
> > https://docs.openstack.org/nova/latest/admin/pci-passthrough.html, I
> >
> > 1. Set up both compute nodes with vt-d and iommu enabled.
> > 2. On the controller
> > 2.1. Create a PCI alias based on the vendor and product ID
> > alias = { "vendor_id":"19fg", "product_id":"4000",
> > "device_type":"type-PF", "name":"testnic" }
> the alias looks correct, assuming you have it set in the pci section
> https://docs.openstack.org/nova/latest/configuration/config.html#pci.alias
> then it should generate the request for a pci device.
>
      In fact my initial sed script is 'find the line beginning with
"[pci]" and then append this underneath it'. I could probably do
something more clever, or use ansible, but I was in a hurry. :)
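
For anyone who would rather not do this with sed, here is a minimal Python
sketch of the same "find the [pci] line and append underneath it" idea. The
function name append_under_section and the sample config text are made up for
illustration; this is not part of nova or of my actual script.

```python
def append_under_section(conf_text, section, new_lines):
    """Return conf_text with new_lines inserted right after the first
    line that begins with the given section header, e.g. "[pci]"."""
    out = []
    inserted = False
    for line in conf_text.splitlines():
        out.append(line)
        if not inserted and line.startswith(section):
            out.extend(new_lines)
            inserted = True
    if not inserted:
        # Refuse to guess where the option belongs if the header is missing.
        raise ValueError("section %s not found" % section)
    return "\n".join(out) + "\n"


# The alias from step 2.1, written on a single line as it would be in nova.conf.
ALIAS = ('alias = { "vendor_id":"19fg", "product_id":"4000", '
         '"device_type":"type-PF", "name":"testnic" }')

# Hypothetical, heavily trimmed nova.conf content used only for the demo.
sample = "[DEFAULT]\ndebug = false\n[pci]\n[placement]\n"
result = append_under_section(sample, "[pci]", [ALIAS])
```

Like the sed version, it is not idempotent: running it twice adds the line twice.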

> the alias needs to be defined in the nova.conf used by the api node and the compute node
> for it to work correctly, but i assume that when you say it's set on the controller it is set in
> the nova.conf the nova api is using.

      Exactly; that would be step 3.1 further down.
> >
> > - The PCI address for the card is different on each compute node
> >
> > 2.2. Create a flavor, say, n1.large
> > openstack flavor create n1.large --id auto --ram 8192 --disk 80
> > --vcpus 4 --property "pci_passthrough:alias"="testnic:1"
> this is also correct
>
> >
> > 2.3. Restart openstack-nova-api
> >
> > 3. On each compute node
> > 3.1. Create a PCI alias based on the vendor and product ID
> > alias = { "vendor_id":"19fg", "product_id":"4000",
> > "device_type":"type-PF", "name":"testnic" }
> >
> > 3.2. Create passthrough_whitelist entry
> > passthrough_whitelist = { "vendor_id":"19fg", "product_id":"4000" }
> assuming this is set in the pci section it also looks correct
> https://docs.openstack.org/nova/latest/configuration/config.html#pci.passthrough_whitelist
> >
      I actually put the passthrough_whitelist entry just below the
alias one, which is below the [pci] label in the nova.conf file. That
makes it easier for me to find them later on.
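
A quick way to double-check that both options really sit under [pci] and
parse as JSON is sketched below. It uses Python's standard configparser as a
stand-in; nova itself reads the file through oslo.config, so this is only an
approximation (for one thing, configparser in its default strict mode rejects
repeated "alias" lines, which nova accepts). SAMPLE_CONF and read_pci_options
are names I made up for the example.

```python
import configparser
import json

# Hypothetical fragment mirroring the layout described above.
SAMPLE_CONF = """\
[pci]
alias = { "vendor_id":"19fg", "product_id":"4000", "device_type":"type-PF", "name":"testnic" }
passthrough_whitelist = { "vendor_id":"19fg", "product_id":"4000" }
"""


def read_pci_options(conf_text):
    """Return the alias and passthrough_whitelist values found under
    [pci], each decoded from JSON; missing options are simply left out."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    if not parser.has_section("pci"):
        return {}
    found = {}
    for name in ("alias", "passthrough_whitelist"):
        if parser.has_option("pci", name):
            # json.loads raises ValueError if the value is malformed.
            found[name] = json.loads(parser.get("pci", name))
    return found


pci_options = read_pci_options(SAMPLE_CONF)
```

An empty result means the options ended up under some other section, which is
exactly the mistake a hurried sed script could make.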

> > 3.3. Restart openstack-nova-compute
> >
> > 4. Create instance (vm guest) using the n1.large flavor.
> >
> > 5. Log in to the instance and discover that dmesg and lspci do not list the card
> >
> > 6. Do a "virsh dumpxml" for the instance on its compute node and
> > discover there is no entry for the card listed in the xml file. I take
> > it nova would automagically do what I would do if this were a plain kvm
> > install, namely ensure the card cannot be accessed/used by the host and
> > then edit the guest xml file so it can see said card.
> yes you should have seen it in the xml and the card should have been passed through to the guest.
> >
> > Questions:
> > Q1: If a device is sr-iov capable, do I have to use that or can I just
> > pass the entire card to the vm guest?
> you can pass through the entire card to the guest, yes.
>
      Now, when I ask for the list of pci devices available in the
compute nodes, why are they listed as type-PF? I am a bit concerned
because it feels like it will be anxiously trying to virtualize it
instead of just leaving said card alone, which I would expect with
type-PCI.

>
> > Q2: Is there anywhere I can look for clues to why the libvirt xml
> > file for the instance is not being populated with the pci card info? So
> > far I only looked in the controller node's nova_scheduler.log file.
> there are several things to check.
> first i would check the nova compute agent log and see if there are any tracebacks or errors
> second, in the nova cell db, often called just nova or nova_cell1 (not nova_cell0), check the pci_devices
> table and see if the devices are listed.
> >
>
      Well, logs I can understand (we are talking about the
nova-compute.log, right?) but I guess this is where my complete
cluelessness shows up in grand style: I do not know where to look for
that in the database. Nor could I figure out how to talk to the REST
interface using curl other than getting a token. So, I did a kludgy
workaround and got the pci pool associated with each node, say

[{'count': 1, 'product_id': u'4000', u'dev_type': u'type-PF',
'numa_node': 0, 'vendor_id': u'19fg'}]

My ASSumption (yes, I know what they say about them) here is that when
the object defining the compute node is updated, the database entry
associated with it gets fed the pci pool I am seeing. In other words,
that is the list of pci devices openstack thinks the node has.

I guess this is the time I have to sheepishly admit that while one of
the nodes has a single card, the other one has two; they are
identified by being 'numa_node': 0 and 'numa_node': 1. Hopefully that
will not cause issues.

Then, I compared it to the pci request made before the instance is created:

(alias_name='testnic',count=1,is_new=<?>,numa_policy='legacy',request_id=None,requester_id=<?>,spec=[{dev_type='type-PF',product_id='4000',vendor_id='19fg'}])

Since they both match, they satisfied the pci passthrough filter test
and the instance was allowed to be spawned. That is as far as I went.
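
To spell out what I mean by "they both match", here is a simplified Python
sketch of the comparison I did by eye. It is not nova's actual
PciPassthroughFilter code; the function names are hypothetical and the logic
is only my reading of the two structures above: every key in the request spec
must equal the same key in a pool, and the matching pools must hold at least
"count" devices.

```python
def pool_matches_spec(pool, spec):
    """True if every key/value pair in the request spec is also
    present, with the same value, in the pool dict."""
    return all(pool.get(key) == value for key, value in spec.items())


def request_satisfiable(pools, specs, count):
    """True if the pools matching any of the specs together hold at
    least `count` devices. Assumption: simplified, ignores NUMA policy."""
    available = sum(
        pool["count"]
        for pool in pools
        if any(pool_matches_spec(pool, spec) for spec in specs)
    )
    return available >= count


# The pool and request spec quoted above, rewritten as plain dicts.
pools = [{"count": 1, "product_id": "4000", "dev_type": "type-PF",
          "numa_node": 0, "vendor_id": "19fg"}]
specs = [{"dev_type": "type-PF", "product_id": "4000", "vendor_id": "19fg"}]

matches = request_satisfiable(pools, specs, 1)
```

With a spec asking for dev_type 'type-PCI' instead, the same pool would not
match, which is why the type-PF question above matters to me.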