Calcite IN operator handling


Hello Calcite devs!
My name is Mykola, and I am a software engineer. We actively use the Calcite
framework in our project and recently ran into the following issue:
Some of our SQL queries have huge IN lists (more than 1500 values). Right now
Calcite replaces IN with ORs or sub-queries, depending on the
inSubQueryThreshold.
With the OR transformation, the SQL becomes really huge and complex, and some
databases (in our case, Vertica and Amazon Athena) cannot execute such
queries. Moreover, this transformation
can cause a StackOverflowError.
In the other case, IN is converted in a way that leads to full table scans,
which is not acceptable for our requirements.
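
To make the two outcomes concrete, here is a minimal sketch of the threshold
decision using plain strings. It is an illustration only, not Calcite code:
the class and method names are made up, and it assumes that lists shorter
than the threshold take the OR path while longer ones become an inline table.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Illustration only: mimics the threshold decision with plain strings.
// None of these names exist in Calcite.
final class InListRewriteSketch {

    // Short list: expand "col IN (a, b)" into "col = a OR col = b".
    static String toOrChain(String column, List<Integer> values) {
        return values.stream()
                .map(v -> column + " = " + v)
                .collect(Collectors.joining(" OR "));
    }

    // Long list: treat the values as an inline table used by a sub-query.
    static String toInlineTable(String column, List<Integer> values) {
        String rows = values.stream()
                .map(v -> "(" + v + ")")
                .collect(Collectors.joining(", "));
        return column + " IN (SELECT * FROM (VALUES " + rows + ") AS t(v))";
    }

    // Assumption: lists shorter than the threshold take the OR path.
    static String rewrite(String column, List<Integer> values, int inSubQueryThreshold) {
        return values.size() < inSubQueryThreshold
                ? toOrChain(column, values)
                : toInlineTable(column, values);
    }

    public static void main(String[] args) {
        System.out.println(rewrite("id", Arrays.asList(1, 2, 3), 20));

        // Raising the threshold pushes a 1500-value list down the OR path.
        List<Integer> big = IntStream.rangeClosed(1, 1500).boxed().collect(Collectors.toList());
        String orForm = rewrite("id", big, Integer.MAX_VALUE);
        System.out.println(orForm.length() + " characters in the OR form");
    }
}
```

Raising the threshold avoids the sub-query but yields one comparison per
value, and lowering it avoids the OR chain but yields the inline table, so
neither setting keeps the original IN.
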
So we would like to have an option to leave IN unconverted.
I found a somewhat similar question here:
https://www.mail-archive.com/dev@xxxxxxxxxxxxxxxxxx/msg06929.html
As I understand it, this could be related to some expression
simplification problem.
This change is very important to us, so we want to create a fix.
At a quick look, we see that the conversion happens in SqlToRelConverter,
where a check decides whether to do the OR conversion or translate the value
list into an inline table. We want to add an additional flow there that
assigns to subQuery.expr a RexNode that leaves IN "as is".
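
A rough sketch of the extra branch we have in mind is below. Everything in it
is hypothetical: the enum, the method, and the keepInAsIs flag are
placeholders for illustration and are not existing Calcite names or options.

```java
// Hypothetical sketch of the proposed extra branch; not Calcite code.
final class InListStrategySketch {

    enum Strategy { OR_CHAIN, INLINE_TABLE, KEEP_IN }

    // keepInAsIs is a made-up flag standing in for the proposed option.
    static Strategy choose(int listSize, int inSubQueryThreshold, boolean keepInAsIs) {
        if (keepInAsIs) {
            return Strategy.KEEP_IN;       // new flow: leave IN untouched
        }
        return listSize < inSubQueryThreshold
                ? Strategy.OR_CHAIN        // existing flow: expand into ORs
                : Strategy.INLINE_TABLE;   // existing flow: inline values table
    }

    public static void main(String[] args) {
        System.out.println(choose(1500, 20, false));
        System.out.println(choose(1500, 20, true));
    }
}
```

With the flag off, the existing behaviour stays unchanged, so the option
would be opt-in.
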
It would be nice to get some help or suggestions from you to make this
possible. Maybe there are other places we have to pay attention to.
If you think this fix would be useful for Calcite, we can contribute it so
that it is available in upcoming Calcite releases.

Thank you.
Mykola Zerniuk