Derive Additional Filters
there's the following comment
// TODO - add logic to derive additional filters. E.g., from
// (t1.a = 1 AND t2.a = 2) OR (t1.b = 3 AND t2.b = 4), you can
// derive table filters:
// (t1.a = 1 OR t1.b = 3)
// (t2.a = 2 OR t2.b = 4)
When perf-testing TPC-H/DS queries, we found that such additional filters
would have been very profitable in several cases.
Has anybody worked on this issue or is there already a util to that effect?
The following algorithm is what I am thinking; please comment if interested.
Given a predicate p, we want to derive an implied predicate q, i.e. p->q,
and q only references columns on a specified side of a join. Then we can
transform p to AND(p, q); q will be pushed down at a later stage.
Since we are dealing with 3-value booleans, we need to be very careful.
To be precise, we need a function T(p) which returns a predicate such that
for any row, if p evaluates to TRUE, T(p) must also evaluate to TRUE.
To handle NOT, we also need a function F(p) such that for any row,
if p evaluates to FALSE, F(p) must also evaluate to FALSE.
T(p) can be defined as such --
T( p ) => p, if p only references a specified side of a join
T( OR(a,b) ) => OR(T(a), T(b))
T( AND(a,b) ) => AND(T(a), T(b))
T( NOT(p) ) => NOT( F(p) )
T( p ) => TRUE, for all other cases, as the fallback
// not a present concern, but more rules can be plugged into T(p) if
// desired in some applications, e.g. T( x*x+y*y<100 ) => x*x<100
F(p) is defined similarly --
F( p ) => p, if p only references a specified side of a join
F( OR(a,b) ) => OR(F(a), F(b))
F( AND(a,b) ) => AND(F(a), F(b))
F( NOT(p) ) => NOT( T(p) )
F( p ) => FALSE, for all other cases, as the fallback
We can unify T(p) and F(p) as one function X(v,p), where v is TRUE/FALSE.
For any row, if p evaluates to v, then X(v,p) also evaluates to v.
(The case of UNKNOWN seems very difficult, I'm not considering it here.)