git.net

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: SqlValidator can't prohibit the data type of ORDER BY with RANGE prober.


That makes sense.

> On May 14, 2018, at 8:08 PM, John Fang <hustfxj@xxxxxxxxx> wrote:
> 
> I think calcite shouldn't prohibit the compound ORDER BY for  the window(
> *UNBOUNDED PRECEDING and current row*).
> 
> John Fang <hustfxj@xxxxxxxxx> 于2018年5月15日周二 上午11:05写道:
> 
>> *Thanks for Julian.*
>> 
>> *I test those query in Oracle for the table test(id int, name varchar(50),
>> score int).*
>> 
>> 
>> * For range based vachar type of ORDER BY, those query's behavior is same
>> with Oracle.*
>> 
>> *In Oracle:*
>> query-1:  select id, count(1) over (partition by id order by name range
>> between UNBOUNDED PRECEDING and current row) from test;  ----> ok
>> query-2: select id, count(1) over (partition by id order by name) from
>> test;  ----> ok
>> query-3: select id, count(1) over (partition by id order by name range between
>> UNBOUNDED PRECEDING and 0 following) from test;  ----> fail("invalid
>> datatype")
>> *In Calcite:*
>> query-1:  select id, count(1) over (partition by id order by name range
>> between UNBOUNDED PRECEDING and current row) from test;  ----> ok
>> query-2: select id, count(1) over (partition by id order by name) from
>> test;  ----> ok
>> query-3: select id, count(1) over (partition by id order by name range between
>> UNBOUNDED PRECEDING and 0 following) from test;  ----> fail("Data type of
>> ORDER BY prohibits use of RANGE clause")
>> 
>> *But for range based compound ORDER BY , those query's behavior is
>> different from Oracle.*
>> *In Oracle:*
>> query-1:  select id, count(1) over (partition by id order by id, score
>> range between
>> *UNBOUNDED PRECEDING and current row) from test;  ----> ok*
>> query-2: select id, count(1) over (partition by id order by  id, score )
>> from test;  ----> ok
>> query-3: select id, count(1) over (partition by id order by  id, score  between
>> UNBOUNDED PRECEDING and 0 following) from test;  ----> fail("invalid
>> window aggregation group in the window specification")
>> *In Calcite:*
>> query-1:  select id, count(1) over (partition by id order by id, score
>> range between
>> *UNBOUNDED PRECEDING and current row) from test;  ----> fail("RANGE clause
>> cannot be used with compound ORDER BY clause")*
>> query-2: select id, count(1) over (partition by id order by  id, score )
>> from test;  ----> ok
>> query-3: select id, count(1) over (partition by id order by  id, score  between
>> UNBOUNDED PRECEDING and 0 following) from test;  ----> fail("RANGE clause
>> cannot be used with compound ORDER BY clause")
>> 
>> 
>> 
>> Julian Hyde <jhyde@xxxxxxxxxx> 于2018年5月15日周二 上午12:51写道:
>> 
>>> I still think that the first query is invalid and the second is valid.
>>> The first query requires not just a range but a distance. You can compute
>>> the distance between two numeric or timestamp values (by subtracting them)
>>> but not between two varchar values. The second query requires a “unbounded
>>> preceding to current row” range but there is no distance in that
>>> specification.
>>> 
>>> For further proof (I haven’t checked) I would see what the SQL standard
>>> says, and what Oracle does.
>>> 
>>> It is possible that they would use a range “unbounded preceding to
>>> unbounded following”. In which case we need to comply with the standard -
>>> but we could make an exception for streaming queries.
>>> 
>>> Julian
>>> 
>>> 
>>>> On May 14, 2018, at 12:26 AM, John Fang <hustfxj@xxxxxxxxx> wrote:
>>>> 
>>>> win("window w as (order by ename range ^100^ preceding)")
>>>>   .fails("Data type of ORDER BY prohibits use of RANGE clause");
>>>> 
>>>> The query will be prohibited in calcite because the ename's type is
>>>> varchar(20). But the followed query will be allowed in calcite.
>>>> 
>>>> win("window w as (partition by ename order by ename)").ok();
>>>> 
>>>> The former window still be range window frame  because RANGE UNBOUNDED
>>>> PRECEDING AND CURRENT ROW is used as default for window frame if
>>> ROWS/RANGE
>>>> is not specified but ORDER BY is specified. So I think the former query
>>>> should also be prohibited.
>>>> 
>>>> --
>>>> 
>>>> Regards
>>>> 
>>>> John Fang
>>> 
>>> 
>> 
>> --
>> 
>> Regards
>> 
>> John Fang
>> 
>> 
>> 
> 
> 
> -- 
> 
> Regards
> 
> John Fang