[placement] update 19-19


HTML: https://anticdent.org/placement-update-19-19.html

Woo! Placement update 19-19. First one post PTG and Summit. Thanks
to everyone who helped make it a useful event for Placement. Having
the pre-PTG meant that we had addressed most issues before getting
there, so people were freed up to work in other areas and
the discussions we did have were highly coherent.

Thanks, also, to everyone involved in getting placement deleted from
nova. We did that while at the PTG and had a little
[celebration](https://tank-binaries.s3.amazonaws.com/8e922a32c7ff4116a68d7309ec079ec4.jpe).

# Most Important

We're still working on narrowing priorities and focusing the details
of those priorities. There's an
[etherpad](https://etherpad.openstack.org/p/placement-ptg-train-rfe-voter)
where we're taking votes on what's important. Three specs in
progress have come out of that and need review and refinement. Two
others have been put on the back burner (see specs section
below).

# What's Changed

* We're now [running a
   subset](https://review.opendev.org/657077) of nova's
   functional tests in placement's gate.

* osc-placement is using the PlacementFixture to run its functional
   tests making them _much_ faster.

* There's a set of StoryBoard
   [worklists](https://docs.openstack.org/placement/latest/contributor/contributing.html#storyboard)
   that can be used to help find in progress work and new bugs. That
   section also describes how tags are used.

* There's a [summary of summaries](http://lists.openstack.org/pipermail/openstack-discuss/2019-May/006278.html)
   email message that summarizes and links to various results from
   the PTG.

# Specs/Features

As the summary of summaries points out, we have two major features
this cycle, one of which is _large_: getting consumer types going
and getting a whole suite of features going to support nested
providers in a more effective fashion.

* <https://review.opendev.org/654799>
   Support Consumer Types. This is very close with a few details to
   work out on what we're willing and able to query on. It only has
   reviews from me so far.

* <https://review.opendev.org/658510>
   Spec for Nested Magic. This is associated with a [lengthy
   story](https://storyboard.openstack.org/#!/story/2005575) that
   includes visual artifacts from the PTG. It covers several related
   features to enable nested-related requirements from nova and
   neutron. It is a work in progress, with several unanswered
   questions. It is also something that efried started but will be
   unable to finish so the rest of us will need to finish it up as
   the questions get answered. And it also mostly subsumes a previous
   spec on [subtree
   affinity](https://review.opendev.org/#/c/650476/). (Eric, please
   correct me if I'm wrong on that.)

* <https://review.opendev.org/657582>
   Resource provider - request group mapping in allocation candidate.
   This spec was copied over from nova. It is a requirement of the
   overall nested magic theme. While it has a well-defined and
   refined design, there's currently no one on the hook to implement
   it.

There are also two specs that are still live but de-prioritized:

* <https://review.openstack.org/649992>
   support any trait in allocation candidates

* <https://review.openstack.org/649368>
   support mixing required traits with any traits

These and other features being considered can be found on the
[feature
worklist](https://storyboard.openstack.org/#!/worklist/594).

Some non-placement specs are listed in the Other section below.

# Stories/Bugs

There are 23 stories in [the placement
group](https://storyboard.openstack.org/#!/project_group/placement).
0 are [untagged](https://storyboard.openstack.org/#!/worklist/580).
4 are [bugs](https://storyboard.openstack.org/#!/worklist/574). 5 are
[cleanups](https://storyboard.openstack.org/#!/worklist/575). 12 are
[rfes](https://storyboard.openstack.org/#!/worklist/594). 2 are
[docs](https://storyboard.openstack.org/#!/worklist/637).

If you're interested in helping out with placement, those stories
are good places to look.

On launchpad:

* Placement related nova [bugs not yet in progress](https://goo.gl/TgiPXb)
   on launchpad: 16. +3

* [In progress placement bugs](https://goo.gl/vzGGDQ) on launchpad: 6.
   +2. These are placement-related, in nova.

Of those there are two interesting ones to note:

* <https://bugs.launchpad.net/nova/+bug/1829062>
   nova placement api non-responsive due to eventlet error.
   When using placement-in-nova in stein, recent eventlet changes can
   cause issues. As I've mentioned on the bug the best way out of
   this problem is to use placement-in-placement but there are other
   solutions.

* <https://bugs.launchpad.net/nova/+bug/1829479>
   The allocation table has residual records when instance is evacuated
   and the source physical node is removed.
   This appears to be yet another issue related to orphaned
   allocations during one of the several move operations. The impact
   they are most concerned with, though, seems to be the common "When
   I bring up a new compute node with the same name there's an
   existing resource provider in the way" that happens because of the
   unique constraint on the rp name column.

I'm still not sure that constraint is the right thing unless we want
to make people's lives hard when they leave behind allocations. We
may want to make it hard because it will impact quota...
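
To make the name clash concrete, here's a minimal sketch using SQLite as a stand-in. The table and column names are simplified assumptions for illustration, not placement's actual schema, and `create_provider` is a hypothetical helper:

```python
import sqlite3


def create_provider(conn, uuid, name):
    """Insert a provider row; return True on success, False on a name clash."""
    try:
        # Using the connection as a context manager commits on success
        # and rolls back if the insert raises.
        with conn:
            conn.execute(
                "INSERT INTO resource_providers (uuid, name) VALUES (?, ?)",
                (uuid, name),
            )
        return True
    except sqlite3.IntegrityError:
        # The UNIQUE constraint on name rejected the row.
        return False


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE resource_providers ("
    " uuid TEXT PRIMARY KEY,"
    " name TEXT NOT NULL UNIQUE)"
)

# The old compute node left its provider behind (for example because
# orphaned allocations prevented its removal).
first = create_provider(conn, "uuid-old", "compute-1")
# A replacement node with the same hostname but a new uuid is refused.
second = create_provider(conn, "uuid-new", "compute-1")
print(first, second)
```

In this sketch the second insert fails even though the uuid differs, which is the "existing resource provider in the way" situation described in the bug.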

# osc-placement

osc-placement is currently behind by 11 microversions. No change
since the last report.

Pending changes:

_Note: a few of these have been sitting for some time with my +2
awaiting review by some other placement core. Please remember
osc-placement when reviewing._

* <https://review.openstack.org/#/c/640898/>
   Add 'resource provider inventory update' command (that helps with
   aggregate allocation ratios).

* <https://review.openstack.org/#/c/651783/>
   Add support for 1.22 microversion

* <https://review.openstack.org/586056>
   Provide a useful message in the case of 500-error

* <https://review.openstack.org/650257>
   Remove unused cruft from doc and releasenotes config

* <https://review.openstack.org/652100>
   Improve aggregate version check error messages with min_version

* <https://review.opendev.org/653285>
   Expose version error message generically

# Main Themes

Now that the PTG has passed, some themes have emerged. Since the
Nested Magic one is rather all encompassing and Cleanup is a
catchall, I think we can consider three enough. If there's some
theme that you think is critical that is being missed, let me know.

For people coming from the nova-side of the world who need or want
something like review runways to know where they should be focusing
their review energy, consider these themes and the links within them
as a runway. But don't forget bugs and everything else.

## Nested Magic

At the PTG we decided that it was worth the effort, in both Nova and
Placement, to make the push to make better use of nested providers
(things like NUMA layouts, multiple devices, networks) while keeping
the "simple" case working well. The general ideas for this are
described in a [story](https://storyboard.openstack.org/#!/story/2005575)
and an evolving [spec](https://review.opendev.org/658510).

Some code has started, mostly to reveal issues:

* <https://review.opendev.org/657419>
   Changing request group suffix to string

* <https://review.opendev.org/657510>
   WIP: Allow RequestGroups without resources

* <https://review.opendev.org/657463>
   Add NUMANetworkFixture for gabbits

* <https://review.opendev.org/658192>
   Gabbi test cases for can_split

## Consumer Types

Adding a type to consumers will allow them to be grouped for various
purposes, including quota accounting. A
[spec](https://review.opendev.org/654799) has started. There are
some questions about request and response details that need to be
resolved, but the overall concept is sound.
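
As a rough illustration of why a type on consumers helps with quota accounting, here's a small sketch that totals usage per consumer type. The record layout and the type names (`INSTANCE`, `RESERVATION`) are made up for the example; the spec has not settled the request and response details:

```python
from collections import defaultdict

# Hypothetical allocation records:
# (consumer_uuid, consumer_type, resource_class, amount)
ALLOCATIONS = [
    ("c1", "INSTANCE", "VCPU", 2),
    ("c1", "INSTANCE", "MEMORY_MB", 2048),
    ("c2", "INSTANCE", "VCPU", 4),
    ("c3", "RESERVATION", "VCPU", 8),
]


def usage_by_consumer_type(allocations):
    """Sum allocated amounts per resource class, grouped by consumer type."""
    totals = defaultdict(lambda: defaultdict(int))
    for _consumer, consumer_type, resource_class, amount in allocations:
        totals[consumer_type][resource_class] += amount
    # Convert to plain dicts so the result is easy to compare and print.
    return {ctype: dict(classes) for ctype, classes in totals.items()}


usage = usage_by_consumer_type(ALLOCATIONS)
print(usage)
```

Without a type, all four records would be indistinguishable usage; with one, a caller could count only the group it cares about.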

## Cleanup

As we explore and extend nested functionality we'll need to do some
work to make sure that the code is maintainable and has suitable
performance. There's some work in progress for this that's important
enough to call out as a theme:

* <https://storyboard.openstack.org/#!/story/2005712>
   Some work from Tetsuro exploring ways to remove redundancies in
   the code. There's a [related WIP](https://review.opendev.org/658778)

* <https://review.opendev.org/659522>
   Enhance debug logging in allocation candidate handling

* <https://review.opendev.org/658164>
   Start of a stack that will allow us to remove the protections
   against null root providers (which turn out to be a pretty
   significant performance hit).

* <https://review.opendev.org/643269>
   WIP: Optionally run a wsgi profiler when asked.
   This was used to find some of the above issues. Should we make it
   generally available or is it better as a thing to base off when
   exploring?
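
For anyone unfamiliar with the idea of an optional wsgi profiler, here's a minimal sketch using only the standard library. This is not the code in the WIP patch above; `profiling_middleware`, its `enabled` flag and the `report` callback are assumptions made for illustration:

```python
import cProfile
import io
import pstats
from wsgiref.util import setup_testing_defaults


def profiling_middleware(app, enabled, report):
    """Wrap a WSGI app; when enabled, profile each call and report the stats."""
    if not enabled:
        # Profiling is opt-in: hand back the original app untouched.
        return app

    def wrapped(environ, start_response):
        profiler = cProfile.Profile()
        # runcall returns whatever the wrapped app returns. Only the call
        # itself is profiled, not later iteration over the response body.
        body = profiler.runcall(app, environ, start_response)
        out = io.StringIO()
        stats = pstats.Stats(profiler, stream=out)
        stats.sort_stats("cumulative").print_stats(10)
        report(out.getvalue())
        return body

    return wrapped


def demo_app(environ, start_response):
    """A trivial WSGI app standing in for the real service."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]


reports = []
app = profiling_middleware(demo_app, enabled=True, report=reports.append)
environ = {}
setup_testing_defaults(environ)
result = app(environ, lambda status, headers: None)
```

Keeping the wrapper behind a flag means there is no overhead unless someone asks for it, which matters if it were to be made generally available.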

Ed Leafe has also been doing some intriguing work on using graph
databases with placement. It's not yet clear if or how it could be
integrated with mainline placement, but there are likely many things
to be learned from the experiment.

# Other Placement

* <https://review.opendev.org/#/q/topic:refactor-classmethod-diaf>
   A suite of refactorings that, given their lack of attention,
   perhaps we don't need or want. If that is indeed the case, let's
   be explicit about it rather than ignoring the patches.

* <https://review.opendev.org/645255>
   A start at some unit tests for the PlacementFixture which got lost
   in the run up to the PTG. They may be less of a requirement now
   that placement is running nova's functional tests. But again, we
   should be explicit about that decision.

# Other Service Users

New discoveries are added to the end. Merged stuff is removed.

* <https://review.openstack.org/552924>
   Nova: Spec: Proposes NUMA topology with RPs

* <https://review.openstack.org/622893>
   Nova: Spec: Virtual persistent memory libvirt driver
   implementation

* <https://review.openstack.org/641899>
   Nova: Check compute_node existence in when nova-compute reports
   info to placement

* <https://review.openstack.org/601596>
   Nova: spec: support virtual persistent memory

* <https://review.openstack.org/#/q/topic:bug/1790204>
   Workaround doubling allocations on resize

* <https://review.openstack.org/645316>
   Nova: Pre-filter hosts based on multiattach volume support

* <https://review.openstack.org/647396>
   Nova: Add flavor to requested_resources in RequestSpec

* <https://review.openstack.org/633204>
   Blazar: Retry on inventory update conflict

* <https://review.openstack.org/#/q/topic:bp/count-quota-usage-from-placement>
   Nova: count quota usage from placement

* <https://review.openstack.org/#/q/topic:bug/1819923>
   Nova: nova-manage: heal port allocations

* <https://review.openstack.org/648665>
   Nova: Spec for a new nova virt driver to manage an RSD

* <https://review.openstack.org/625284>
   Cyborg: Initial readme for nova pilot

* <https://review.openstack.org/629142>
   Tempest: Add QoS policies and minimum bandwidth rule client

* <https://review.openstack.org/648687>
   Nova-spec: Add PENDING vm state

* <https://review.openstack.org/650188>
   nova-spec: Allow compute nodes to use DISK_GB from shared storage RP

* <https://review.openstack.org/651024>
   nova-spec: RMD Plugin: Energy Efficiency using CPU Core P-State control

* <https://review.openstack.org/651455>
   puppet: Debian: Add support for placement-api over uwsgi

* <https://review.openstack.org/650963>
   nova-spec: Proposes NUMA affinity for vGPUs. This describes a
   legacy way of doing things because affinity in placement may be a
   ways off. But it also [may not
   be](https://review.openstack.org/650476).

* <https://review.openstack.org/#/q/topic:heal_allocations_dry_run>
   Nova: heal allocations, --dry-run

* <https://review.openstack.org/642527>
   Neutron: Fullstack test for placement sync

* <https://review.opendev.org/656448>
   Watcher spec: Add Placement helper

* <https://review.opendev.org/659233>
   Cyborg: Placement report

* <https://review.opendev.org/657884>
   Nova: Spec to pre-filter disabled computes with placement

* <https://review.opendev.org/657801>
   rpm-packaging: placement service

* <https://review.opendev.org/657016>
   Delete resource providers for all nodes when deleting compute service

# End

I'm out of practice on these things. This one took a long time.


-- 
Chris Dent                       ٩◔̯◔۶           https://anticdent.org/
freenode: cdent