git.net

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[placement] db query analysis


On 6/19/19 8:39 AM, Chris Dent wrote:
> One of the queries that has come up recently with placement
> performance is whether there may be opportunities to gain some
> improvement by making fewer queries to the db. That is:
> 
> * are there redundant queries
> * are there places where data is gathered in multiple queries that
>    could be in one (or at least fewer)
> * are there queries that are not doing what we think
> 
> to that end I've done some analysis of logs produced when
> [placement_database]/connection_debug is set to 50 (which dumps SQL
> queries to the INFO log).
> 
> The collection of queries made during a single request to GET
> /allocation_candidates is at http://paste.openstack.org/show/753183/
> 
> The data set is a single resource resource provider of the same form
> as used in the placement-perfload job (where 1000 providers are
> used). Only 1 is used in this case as several queries use 'IN'
> statements that list all the resource provider ids currently in
> play, and that gets dumped to the log making it inscrutable. I've
> noted in the paste where this happens.
> 
> Each block of SQL is associated with the method that calls it. The
> queries are in the order they happen. One query that happens three
> times (once for each resource class requested) is listed once.
> 
> Observations:
> 
> * The way we use IN could be improved using a bindparam:
>    
> https://docs.sqlalchemy.org/en/13/core/sqlelement.html?highlight=expanding%20bindparam#sqlalchemy.sql.operators.ColumnOperators.in_ 
> 
> 
> * That we use IN in that fashion at all, where we are carrying lists
>    of rp ids around and making multiple queries, instead of one giant
>    one, might be an area worth exploring.
> 
> * There are a couple of places where get get a trait id (via name) in
>    a separate query from using the trait id.
> 
> * What can you see?
> 
> Please have a look to see if anything looks odd, wrong, etc.
> Basically what we're after is trying to find things that violate our
> expectations.
> 
> Note that this is just one of several paths through the database.
> When there are sharing or nested providers things change. I didn't
> bother to do a more complex set of queries at this time as it seemed
> starting simple would help us tease out how best to communicate
> these sorts of things.
> 
> Related to that, I've started working on a nested-perfload at
> https://review.opendev.org/665695

Please note that there used to be fewer queries performed in the 
allocation candidate and get resource provider functions. We replaced 
the giant SQL statements with multiple smaller SQL statements to assist 
in debuggability and tracing.

Best,
-jay