Re: Gandiva Initiative


Hi,

I think JIT-compiling of kernels operating on Arrow data is an important
development path, but just for the record, LLVM doesn't have a stable
C++ API (the API changes at each feature release).  Just something to
keep in mind for the ensuing packaging discussions ;-)

(it also raises interesting questions such as "what happens if a user
wants to use both PyArrow and Numba in a given process, and they don't
target the same LLVM API version")

Regards

Antoine.


On 22/06/2018 at 01:26, Wes McKinney wrote:
> hi Jacques,
> 
> This is very exciting! LLVM codegen for Arrow has been on my wishlist
> since the early days of the project. I always considered it a question
> of "when" rather than "if".
> 
> I will take a closer look at the codebase to make some comments, but
> my biggest initial question is whether we could work to make Gandiva
> the official community-supported LLVM framework for creating
> JIT-compiled Arrow kernels. In the Ursa Labs (a new lab I am building
> to focus 90+% on Apache Arrow development) tech roadmap we discussed
> the need for a subgraph compiler using LLVM:
> https://ursalabs.org/tech/#subgraph-compilation-code-generation.
> 
> I would be interested in getting involved in the project, and I
> expect in time many others will, as well. An obvious question would be
> whether you would be interested in donating the project to Apache
> Arrow and continuing the work there. We would benefit from common
> build, testing/CI, and packaging/deployment infrastructure. I'm keen
> to see JIT-powered predicate pushdown in Parquet files, for example.
> Phillip and I could look into building a Gandiva backend for compiling
> a subset of expressions originating from Ibis, a lazy-evaluation DSL
> system with an API similar to pandas
> (https://github.com/ibis-project/ibis).
> 
> best
> Wes
> 
> On Thu, Jun 21, 2018 at 4:13 PM, Dimitri Vorona
> <alendit@xxxxxxxxxxxxxx.invalid> wrote:
>> Hey Jacques,
>>
>> Great stuff! I'm actually researching the integration of Arrow and Flight
>> into a main memory database which also uses LLVM for dynamic query
>> generation! Excited to have a more detailed look at Gandiva!
>>
>> Cheers,
>> Dimitri.
>>
>> On Thu, Jun 21, 2018, 21:15 Jacques Nadeau <jacques@xxxxxxxxxx> wrote:
>>
>>> Hey Guys,
>>>
>>> Dremio just open sourced a new framework for processing data in Arrow data
>>> structures [1], built on top of the Apache Arrow C++ APIs and leveraging
>>> LLVM (Apache licensed). It also includes Java APIs that leverage the Apache
>>> Arrow Java libraries. I expect the developers who have been working on this
>>> will introduce themselves soon. To read more about it, take a look at
>>> Ravindra's blog post (he's the lead developer driving this work): [2].
>>> Hopefully people will find this interesting/useful.
>>>
>>> Let us know what you all think!
>>>
>>> thanks,
>>> Jacques
>>>
>>>
>>> [1] https://github.com/dremio/gandiva
>>> [2] https://www.dremio.com/announcing-gandiva-initiative-for-apache-arrow/
>>>


