
[ironic][ops] Taking ironic nodes out of production

Dear all,

One of the discussions at the PTG in Denver raised the need for
a mechanism to take ironic nodes out of production (a task for
which the currently available 'maintenance' flag does not seem
appropriate [1]).

The use case there is an unhealthy physical node in state 'active',
i.e. associated with an instance. The request is then to enable an
admin to mark such a node as 'faulty' or 'in quarantine' with the
aim of not returning the node to the pool of available nodes once
the hosted instance is deleted.

A very similar use case which came up independently is node
retirement: it should be possible to mark nodes ('active' or not)
as being 'up for retirement' to prepare for their eventual removal from
ironic. As in the example above, ('active') nodes marked this way
should not become eligible for instance scheduling again, but
automatic cleaning, for instance, should still be possible.
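
To make the intended behaviour concrete, here is a minimal Python
sketch. Everything in it is hypothetical: the Node class, the
'out_of_service' flag, and the choice to park flagged nodes in
'manageable' after cleaning are assumptions for illustration only,
not existing ironic API or a proposed implementation.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical stand-in for an ironic node (not the real model)."""
    name: str
    provision_state: str = "available"
    out_of_service: bool = False  # hypothetical quarantine/retirement flag


def schedulable(node: Node) -> bool:
    """A node may host a new instance only if available and not flagged."""
    return node.provision_state == "available" and not node.out_of_service


def release(node: Node) -> Node:
    """Model instance deletion: clean the node, then decide where it lands."""
    node.provision_state = "cleaning"
    # Automatic cleaning would run here, for flagged nodes as well.
    if node.out_of_service:
        # Assumption: keep the node out of the pool of available nodes.
        node.provision_state = "manageable"
    else:
        node.provision_state = "available"
    return node
```

In this sketch a flagged 'active' node keeps serving its instance,
is still cleaned on deletion, and simply never passes the
schedulable() check again.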

In an effort to cover these use cases with a more general
"quarantine/retirement" feature:

- are there additional use cases which could profit from such a
   "take a node out of service" mechanism?

- would these use cases put additional constraints on what the
   feature should look like (e.g.: "should not prevent cleaning")?

- are there other characteristics such a feature should have
   (e.g.: "finding these nodes should be supported by the CLI")?

Let me know if you have any thoughts on this.


[1], l. 360