[TripleO] Scaling node counts with only Ansible (N=1)


On Wed, Jul 10, 2019 at 4:24 PM James Slagle <james.slagle at gmail.com> wrote:

> There's been a fair amount of recent work around simplifying our Heat
> templates and migrating the software configuration part of our
> deployment entirely to Ansible.
>
> As part of this effort, it became apparent that we could render much
> of the data that we need out of Heat in a way that is generic per
> node, and then have Ansible render the node specific data during
> config-download runtime.
>
> To illustrate the point, consider what happens when we specify
> ComputeCount:10 in our templates: much of the work that Heat does
> across those 10 sets of resources, one per Compute node, is
> duplication. However, it has been necessary so that Heat can render
> data structures such as lists of IPs, lists of hostnames, contents of
> /etc/hosts files, etc. If all of that were driven by Ansible using host
> facts, Heat would not need to create those 10 sets of resources at all.
>
> The goal is to get to a point where we can deploy the Heat stack with
> a count of 1 for each role, and then deploy any number of nodes per
> role using Ansible. To that end, I've been referring to this effort as
> N=1.
>
> The value in this work is that it directly addresses our scaling
> issues with Heat (by just deploying a much smaller stack). Obviously
> we'd still be relying heavily on Ansible to scale to the required
> levels, but I feel that is a much better understood challenge at this
> point in the evolution of configuration tools.
>
> With the patches that we've been working on recently, I've got a POC
> running where I can deploy additional compute nodes with just Ansible.
> This is done by just adding the additional nodes to the Ansible
> inventory with a small set of facts to include IP addresses on each
> enabled network and a hostname.
>
> These patches are at
> https://review.opendev.org/#/q/topic:bp/reduce-deployment-resources
> and reviews/feedback are welcome.
>
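
As a rough sketch of the idea quoted above (adding nodes to the Ansible
inventory with a hostname and an IP address on each enabled network, then
deriving things like /etc/hosts entries from those facts), something like
the following could work. The role name, the "<network>_ip" variable naming,
the helper functions, and the addresses are all illustrative assumptions,
not the actual schema or code used by the patches.

```python
import json


def add_node(inventory, role, hostname, network_ips):
    """Add one host to a role with a hostname and one IP per enabled network.

    The "<network>_ip" variable names are hypothetical, chosen for
    illustration only. A "ctlplane" entry is assumed to exist and is
    used as the address Ansible connects to.
    """
    hosts = inventory.setdefault(role, {"hosts": {}})["hosts"]
    if hostname in hosts:
        raise ValueError(f"{hostname} is already defined in role {role}")
    host_vars = {"ansible_host": network_ips["ctlplane"]}
    for network, ip in network_ips.items():
        host_vars[f"{network}_ip"] = ip
    hosts[hostname] = host_vars
    return inventory


def render_hosts_entries(inventory, network):
    """Build /etc/hosts-style lines for one network from per-host facts."""
    lines = []
    for role in sorted(inventory):
        for hostname, host_vars in sorted(inventory[role]["hosts"].items()):
            ip = host_vars.get(f"{network}_ip")
            if ip is not None:
                lines.append(f"{ip} {hostname}")
    return lines


if __name__ == "__main__":
    inventory = {}
    # Scale out by adding hosts to the inventory only; no Heat resources.
    for index in range(3):
        add_node(
            inventory,
            "Compute",
            f"compute-{index}",
            {
                "ctlplane": f"192.168.24.{10 + index}",
                "internal_api": f"172.16.2.{10 + index}",
            },
        )
    print(json.dumps(inventory, indent=2))
    print("\n".join(render_hosts_entries(inventory, "internal_api")))
```

The point of the sketch is that the per-node data lives only in the
inventory, so adding a node is a data change rather than a stack update.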

This is a fabulous proposal in my opinion.
I've added (and will continue to add) TODO ideas in the etherpad.
Anyone willing to help, please ping us if needed.

Another point, somewhat related: I took the opportunity of this work to
reduce the complexity around the number of hieradata files.
I would like to investigate if we can generate one data file which would be
loaded by both Puppet and Ansible for doing the configuration management.
I'll create a separate thread on that effort very soon.


> Other points:
>
> - Baremetal provisioning and port creation are presently handled by
> Heat. With the ongoing efforts to migrate baremetal provisioning out
> of Heat (nova-less deploy), I think these efforts are very
> complementary. Eventually, we get to a point where Heat is not
> actually creating any other OpenStack API resources. For now, the
> patches only work when using pre-provisioned nodes.
>
> - We need to consider how we'd manage the Ansible inventory going
> forward if we open up an interface for operators to manipulate it
> directly. That's something we'd want to manage and preserve (version
> control) as it's critical data for the deployment.
>
> Given the progress that we've made with the POC, my sense is that
> we'll keep pushing in this overall direction. I'd like to get some
> feedback on the approach. We have an etherpad we are using to track
> some of the work at a high level:
>
> https://etherpad.openstack.org/p/tripleo-reduce-deployment-resources
>
> I'll be adding some notes on how I set up the POC to that etherpad if
> others would like to try it out.
>
> --
> -- James Slagle
> --
>
>

-- 
Emilien Macchi