
Re: Refactoring the Rust API


An interesting blog post appeared on Reddit a couple of days ago that is
very relevant to this refactor. The author is building a dataframe library
in Rust, started out with an enum to represent arrays, and then moved to
using generic traits together with the enum.

https://www.reddit.com/r/rust/comments/8g2274/dataframes_traits_enums_generics_and_dynamic/

I don't think that approach would work for us, and I'm tempted to write up
a blog post myself explaining the current refactor and why it is needed. I
think I'll try to do that this weekend. I'm keen to get a discussion going
around the refactor to make sure we don't need to do another one in the
future.
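
To make the enum versus traits + generics trade-off concrete, here is a
minimal sketch of the two designs. It is illustrative only: every name in it
(ArrayData, Array, PrimitiveArray, ListArray, StructArray, nested_example,
struct_example) is hypothetical and is not taken from the refactor branch or
the Arrow crate, and the usize offsets are a simplifying assumption.

```rust
// Illustrative sketch only: every name here is hypothetical, not the Arrow API.

/// Enum approach: each supported shape needs its own variant, so nesting
/// (a list of lists) means adding a new variant per combination.
pub enum ArrayData {
    Int32(Vec<i32>),
    Float64(Vec<f64>),
    ListInt32(Vec<Vec<i32>>),
    // ListListInt32, ListListFloat64, ... would each need another variant.
}

/// Trait + generics approach: anything array-like implements one trait.
pub trait Array {
    /// Number of logical slots in the array.
    fn len(&self) -> usize;
}

/// Flat array of primitive values.
pub struct PrimitiveArray<T> {
    pub values: Vec<T>,
}

impl<T> Array for PrimitiveArray<T> {
    fn len(&self) -> usize {
        self.values.len()
    }
}

/// List array: slot `i` covers child slots `offsets[i]..offsets[i + 1]`.
/// Offsets are `usize` here for simplicity; that is an assumption.
pub struct ListArray<A: Array> {
    pub offsets: Vec<usize>,
    pub values: A,
}

impl<A: Array> ListArray<A> {
    /// Start (inclusive) and end (exclusive) child indices for slot `i`.
    pub fn value_range(&self, i: usize) -> (usize, usize) {
        (self.offsets[i], self.offsets[i + 1])
    }
}

impl<A: Array> Array for ListArray<A> {
    fn len(&self) -> usize {
        // n slots are described by n + 1 offsets.
        self.offsets.len().saturating_sub(1)
    }
}

/// Struct array: columns of different concrete types behind trait objects.
pub struct StructArray {
    pub columns: Vec<Box<dyn Array>>,
}

impl Array for StructArray {
    fn len(&self) -> usize {
        // Assumes all columns have the same length; reports the first one.
        self.columns.first().map_or(0, |c| c.len())
    }
}

/// Builds the logical value [[[1, 2], [3]], [[4, 5, 6]]].
pub fn nested_example() -> ListArray<ListArray<PrimitiveArray<i32>>> {
    let inner = ListArray {
        offsets: vec![0, 2, 3, 6],
        values: PrimitiveArray { values: vec![1, 2, 3, 4, 5, 6] },
    };
    ListArray { offsets: vec![0, 2, 3], values: inner }
}

/// Builds a struct array with an f64 column and a List<List<i32>> column.
pub fn struct_example() -> StructArray {
    let floats: Box<dyn Array> = Box::new(PrimitiveArray { values: vec![1.0_f64, 2.0] });
    let nested: Box<dyn Array> = Box::new(nested_example());
    StructArray { columns: vec![floats, nested] }
}
```

The point of the sketch is that ListArray<ListArray<PrimitiveArray<i32>>>
falls out of composing the generic types, whereas the enum would need a
dedicated variant for each nesting combination.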

Andy.





On Sun, Apr 29, 2018 at 9:59 AM, Andy Grove <andygrove73@xxxxxxxxx> wrote:

> So it turns out this refactor isn't as disruptive as I thought and I
> mostly have it working already.
>
> The buffer/builder/list types barely change at all other than the fact
> that we no longer need all those macros after moving to generics.
>
> It really is only array.rs that is pretty much a rewrite.
>
> Also, in my earlier email I got my dates wrong. I am aiming to have this
> PR ready by Monday May 7th. The real test for me is integrating it with
> DataFusion to make sure I haven't missed anything.
>
> Here's the branch where I'm working on this:
> https://github.com/andygrove/arrow/tree/refactor_rust_api
>
> Thanks,
>
> Andy.
>
>
>
>
> On Sat, Apr 28, 2018 at 2:10 PM, Andy Grove <andygrove73@xxxxxxxxx> wrote:
>
>> I filed a JIRA issue to track this
>> (https://issues.apache.org/jira/browse/ARROW-2521) but thought it was
>> worth raising on the mailing list too.
>>
>> I am now running into limitations of the way that Array is represented as
>> an enum, and I am unable to implement List<List<T>> with the current design.
>>
>> When Krisztian Szucs and I were working on the initial code we had two
>> different approaches and we went with this enum approach at the time
>> because we weren't able to make the other approach (traits + generics) work.
>>
>> Now that I'm further along the Rust learning curve, I can make the trait
>> + generic approach work and I'm currently prototyping in a separate repo,
>> and it is looking good so far. I have been able to create a struct array
>> containing different type fields including List<List<T>>.
>>
>> I think I'm ready to start the refactor for real in my fork. We only have
>> ~1k LOC so I don't think it will take too long, but because I'm doing this
>> in my spare time I am going to estimate that I will have it complete in
>> just over one week, aiming for having it complete by 4/30.
>>
>> I think it's fine to continue merging small PRs in the meantime, but I
>> think we should hold off on any major changes in the coming week.
>>
>> Thanks,
>>
>> Andy.
>>
>>
>>
>>
>>
>>
>