Re: Gandiva


That's great! If you could create a JIRA case to track your progress, that
would be helpful for others who might want to follow along or contribute.
Thanks!

--
Michael Mior
mmior@xxxxxxxxxx



On Tue, Jun 26, 2018 at 10:36, Masayuki Takahashi <masayuki038@xxxxxxxxx>
wrote:

> Hi Julian,
>
> > Masayuki Takahashi has started to develop an Arrow adapter for
> > Calcite[2], but a lot of work remains to implement all SQL built-in
> > functions and basic relational operators. Building on top of Gandiva we
> > could save a lot of this effort.
>
> I will set up a Gandiva development environment and look into how to
> incorporate it.
>
> thanks.
>
>
>
> On Sat, Jun 23, 2018 at 3:54, Julian Hyde <jhyde@xxxxxxxxxx> wrote:
> >
> > Suppose a company wishes to build a graph database using their own
> > innovative graph index data structure. They nevertheless need to implement
> > core relational algebra, core data types, and core built-in functions (+,
> > CASE, SUM, SUBSTRING). And they want to implement these on a
> > memory-efficient data structure (tens of thousands of rows, stored
> > column-oriented, per memory block). This is a massive effort.
> >
> > With Calcite+Gandiva+Arrow they just need to create a sequence of
> > relational operators (using RelBuilder, say) and efficient machine code is
> > generated. They can then start adding their own data types, built-in
> > functions, and relational operators, using the same architecture.
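
As a rough illustration of the column-oriented layout described above (not
part of the original message): the sketch below uses plain Python rather than
Arrow or Gandiva, and the names ColumnBatch, make_batch, add_columns, and
sum_column are hypothetical. It only shows why built-in functions such as +
and SUM reduce to simple loops over one contiguous array per column.

```python
from array import array
from dataclasses import dataclass


@dataclass
class ColumnBatch:
    """One memory block: each column is a contiguous typed array."""
    columns: dict  # column name -> array of 64-bit integers
    num_rows: int


def make_batch(rows, names):
    """Pivot row tuples into per-column arrays of 64-bit integers."""
    cols = {name: array("q") for name in names}
    for row in rows:
        for name, value in zip(names, row):
            cols[name].append(value)
    return ColumnBatch(cols, len(rows))


def add_columns(batch, left, right):
    """Evaluate `left + right` for every row, one column pair at a time."""
    a, b = batch.columns[left], batch.columns[right]
    return array("q", (x + y for x, y in zip(a, b)))


def sum_column(batch, name):
    """SUM aggregate over a single column."""
    return sum(batch.columns[name])


# Hypothetical three-row batch with two integer columns.
batch = make_batch([(1, 10), (2, 20), (3, 30)], ["a", "b"])
total = add_columns(batch, "a", "b")
```

A real engine would hold tens of thousands of rows per batch and compile the
per-column loop to machine code; the interpreted loop here only stands in for
that step.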
> >
> > Julian
> >
> >
> > > On Jun 22, 2018, at 11:33 AM, Xiening Dai <xndai.git@xxxxxxxx> wrote:
> > >
> > > I was in a talk regarding Gandiva yesterday. Impressive work!
> > >
> > > But I am not sure why Calcite would want to integrate with it. To me,
> > > Gandiva is on the execution side; in which scenarios would a query
> > > planner need an Arrow engine? I read the original Jira about
> > > implementing a file enumerator, but the intent is still not clear to
> > > me. I would appreciate it if you could elaborate. Thanks.
> > >
> > >
> > >> On Jun 22, 2018, at 11:20 AM, Julian Hyde <jhyde@xxxxxxxxxx> wrote:
> > >>
> > >> There is a discussion on dev@arrow about Gandiva, a kernel for
> > >> Arrow[1].
> > >>
> > >> I think it would be an interesting library on which to build our
> > >> Arrow engine. (Without a kernel, Arrow is just a data format, but with
> > >> Gandiva it becomes an engine upon which we can implement all relational
> > >> operations, albeit on a multi-threaded single node. Potentially this
> > >> approach can process each row in a few machine cycles, i.e. billions of
> > >> records per second. Therefore single-node would be sufficient for many
> > >> queries.)
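
The throughput estimate above can be checked with back-of-envelope arithmetic
(this sketch is not part of the original message). It assumes a 3 GHz clock,
3 cycles per row, and up to 4 cores; these figures are illustrative
assumptions, not measurements from the thread, and rows_per_second is a
hypothetical helper.

```python
def rows_per_second(clock_hz, cycles_per_row, cores=1):
    """Upper bound on rows processed per second, ignoring memory stalls."""
    return clock_hz / cycles_per_row * cores


# Assumed figures: 3 GHz clock, 3 cycles per row.
single_core = rows_per_second(3e9, 3)      # one core
four_cores = rows_per_second(3e9, 3, 4)    # four cores, perfect scaling
```

Under these assumptions one core reaches about one billion rows per second,
which matches the "billions of records per second" order of magnitude for a
multi-threaded single node.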
> > >>
> > >> Masayuki Takahashi has started to develop an Arrow adapter for
> > >> Calcite[2], but a lot of work remains to implement all SQL built-in
> > >> functions and basic relational operators. Building on top of Gandiva we
> > >> could save a lot of this effort.
> > >>
> > >> Julian
> > >>
> > >> [1] https://lists.apache.org/thread.html/f099b3d1e2aaf9803c5c756f872a594baf17e9f25974e3496c9706d9@%3Cdev.arrow.apache.org%3E
> > >>
> > >> [2] https://issues.apache.org/jira/browse/CALCITE-2173
> > >
> >
>
>
> --
> 高橋 真之
>