Re: Refactoring the Rust API


Here's my blog post explaining the refactor:
https://andygrove.io/2018/05/apache-arrow-traits-generics/

The Reddit thread is going to be here for anyone wanting to see the
feedback:
https://www.reddit.com/r/rust/comments/8gy45t/refactoring_apache_arrow_to_use_traits_and/

Thanks,

Andy.

On Wed, May 2, 2018 at 5:10 PM, Andy Grove <andygrove73@xxxxxxxxx> wrote:

> There was an interesting blog post posted to Reddit a couple days ago that
> is very relevant to this refactor. The author is building a dataframe
> library in Rust and started out with an enum to represent arrays and then
> moved to using generic traits with the enum.
>
> https://www.reddit.com/r/rust/comments/8g2274/dataframes_traits_enums_generics_and_dynamic/
>
> I don't think that approach would work for us and I'm tempted to write up a
> blog post myself explaining the current refactor and why it is needed.
> I think I'll try and do that this weekend. I'm keen to get a discussion
> going around the refactor to make sure we don't need to do another refactor
> in the future.
>
> Andy.
>
>
>
>
>
> On Sun, Apr 29, 2018 at 9:59 AM, Andy Grove <andygrove73@xxxxxxxxx> wrote:
>
>> So it turns out this refactor isn't as disruptive as I thought and I
>> mostly have it working already.
>>
>> The buffer/builder/list types barely change at all other than the fact
>> that we no longer need all those macros after moving to generics.
>>
>> It really is only array.rs that is pretty much a rewrite.
>>
>> Also, in my earlier email I got my dates wrong. I am aiming to have this
>> PR ready by Monday May 7th. The real test for me is integrating it with
>> DataFusion to make sure I haven't missed anything.
>>
>> Here's the branch where I'm working on this:
>> https://github.com/andygrove/arrow/tree/refactor_rust_api
>>
>> Thanks,
>>
>> Andy.
>>
>>
>>
>>
>> On Sat, Apr 28, 2018 at 2:10 PM, Andy Grove <andygrove73@xxxxxxxxx>
>> wrote:
>>
>>> I filed a JIRA issue to track this
>>> (https://issues.apache.org/jira/browse/ARROW-2521) but thought it was worth
>>> raising on the mailing list too.
>>>
>>> I am now running into limitations in the way that Array is represented
>>> as an enum, and I am unable to implement List<List<T>> with the current
>>> design.
>>>
>>> When Krisztian Szucs and I were working on the initial code we had two
>>> different approaches and we went with this enum approach at the time
>>> because we weren't able to make the other approach (traits + generics) work.
>>>
>>> Now that I'm further along the Rust learning curve, I can make the trait
>>> + generic approach work and I'm currently prototyping in a separate repo,
>>> and it is looking good so far. I have been able to create a struct array
>>> containing different type fields including List<List<T>>.
>>>
>>> I think I'm ready to start the refactor for real in my fork. We only
>>> have ~1k LOC so I don't think it will take too long, but because I'm doing
>>> this in my spare time I am going to estimate that I will have it complete
>>> in just over one week, aiming for having it complete by 4/30.
>>>
>>> I think it's fine to continue merging small PRs in the meantime, but I
>>> think we should hold off on any major changes in the coming week.
>>>
>>> Thanks,
>>>
>>> Andy.
>>>
>>>
>>>
>>>
>>>
>>>
>>
>
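
For readers following the thread, here is a minimal sketch of the difference between the two designs discussed above. It is not the code from the refactor branch. Every name in it (EnumArray, Array, PrimitiveArray, ListArray, StructArray, build_example) is hypothetical and chosen only for illustration, and the list type stores its child arrays in a plain Vec rather than using Arrow's offset buffers. With an enum, each nesting depth needs its own variant. With a trait and generics, List<List<T>> follows from composing the same generic type with itself, and a struct array can hold fields of different types behind trait objects.

```rust
use std::any::Any;
use std::sync::Arc;

// Enum design (illustrative only): one variant per concrete array type.
// Supporting List<List<i32>> would need yet another variant, and so on
// for every element type and nesting depth.
#[allow(dead_code)]
enum EnumArray {
    Int32(Vec<i32>),
    Float64(Vec<f64>),
    ListOfInt32(Vec<Vec<i32>>),
}

// Trait + generics design (illustrative only): every array type
// implements one common trait.
trait Array {
    fn len(&self) -> usize;
    // Lets callers recover the concrete type from a trait object.
    fn as_any(&self) -> &dyn Any;
}

// Array of primitive values of any element type T.
struct PrimitiveArray<T> {
    values: Vec<T>,
}

impl<T: 'static> Array for PrimitiveArray<T> {
    fn len(&self) -> usize {
        self.values.len()
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
}

// List whose items are themselves arrays. Because ListArray<A> implements
// Array, it can be used as the item type of another ListArray.
// Assumption: a Vec of child arrays stands in for Arrow's real layout.
struct ListArray<A: Array> {
    items: Vec<A>,
}

impl<A: Array + 'static> Array for ListArray<A> {
    fn len(&self) -> usize {
        self.items.len()
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
}

// Struct array: fields of different types stored as trait objects.
struct StructArray {
    columns: Vec<Arc<dyn Array>>,
}

// Builds a two-row struct array with a Float64 field and a
// List<List<Int32>> field.
fn build_example() -> StructArray {
    let prices = PrimitiveArray { values: vec![1.5_f64, 2.5] };
    let nested = ListArray {
        items: vec![
            ListArray {
                items: vec![
                    PrimitiveArray { values: vec![1_i32, 2] },
                    PrimitiveArray { values: vec![3_i32] },
                ],
            },
            ListArray {
                items: vec![PrimitiveArray { values: vec![4_i32] }],
            },
        ],
    };
    StructArray {
        columns: vec![
            Arc::new(prices) as Arc<dyn Array>,
            Arc::new(nested) as Arc<dyn Array>,
        ],
    }
}
```

Calling build_example() returns a struct array whose second column can be downcast through as_any() to ListArray<ListArray<PrimitiveArray<i32>>>. The trade-off in this sketch is that code consuming a StructArray has to downcast at runtime to get the concrete type back, where the enum design would use a match.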

