Re: Sync Call Notes

Hi Uwe,

On 19/04/2018 at 18:42, Uwe L. Korn wrote:
>> 1) are we ok with paying the cost of pimpls? (mostly the indirection
>> cost I guess, and the fact that we can't have inline methods/accessors
>> anymore)
> I'm not sure how much of the cost we're ready to pay. There is certainly value in keeping a stable ABI (this is done fantastically by the NumPy people): you can do patch releases without consumers worrying about whether they need to rebuild their binaries.
> The indirection on paths that call expensive functions is certainly no problem. For example, selecting a column from a table is an operation you don't do often, so I think the overhead is acceptable there. On the other hand, accessing the null_count or the length of an array is definitely an operation that is performed quite often. These accessors should be as fast as possible.
> I cannot give you a definite answer yet. Once I have the time, I'll try to implement and profile some of the possible approaches.
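
To make the trade-off in 1) concrete, here is a minimal sketch of the
two styles. The class names (PimplArray, InlineArray) are hypothetical
and are not Arrow's actual classes; everything is in one file only so
that the sketch is self-contained. In a real library, the part marked
"implementation file" would be hidden from consumers.

```cpp
#include <cstdint>
#include <memory>

// Hypothetical pimpl'd class (not Arrow's actual API).
// Consumers only ever see a single pointer member, so fields can be
// added to Impl later without changing the layout they compiled against.
class PimplArray {
 public:
  PimplArray(int64_t length, int64_t null_count);
  ~PimplArray();  // must be defined where Impl is a complete type

  // Out-of-line accessors: each call is a real function call plus a
  // pointer indirection, and consumers cannot inline it.
  int64_t length() const;
  int64_t null_count() const;

 private:
  struct Impl;  // defined only in the implementation file
  std::unique_ptr<Impl> impl_;
};

// Hypothetical non-pimpl class for comparison.
// The accessors can be inlined, but the member layout is baked into every
// consumer binary, so changing the members breaks the ABI.
class InlineArray {
 public:
  InlineArray(int64_t length, int64_t null_count)
      : length_(length), null_count_(null_count) {}

  int64_t length() const { return length_; }
  int64_t null_count() const { return null_count_; }

 private:
  int64_t length_;
  int64_t null_count_;
};

// ---- implementation file (hidden from consumers in a real library) ----
struct PimplArray::Impl {
  int64_t length;
  int64_t null_count;
};

PimplArray::PimplArray(int64_t length, int64_t null_count)
    : impl_(new Impl{length, null_count}) {}

PimplArray::~PimplArray() = default;

int64_t PimplArray::length() const { return impl_->length; }
int64_t PimplArray::null_count() const { return impl_->null_count; }
```

Both classes expose the same interface; the difference is only where
the cost lands. Whether the extra call and indirection in
PimplArray::length() matters in practice is exactly what profiling
would have to show.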
>> 2) how do we handle things like ArrayData, which seems publicly
>> exposed by design?
> ArrayData is marked as internal and thus I would feel ok to break its ABI between non-major releases. If people really depend on its usage, then we should think of a clear way to make it public / non-internal.

Perhaps we need a three-tiered approach?

1) a public and stable namespace ("arrow") with the goal to reach ABI
stability post-1.0;

2) a public but still moving namespace ("arrow::unstable"?) where we
generally try not to remove existing functionality and to honor API
compatibility, but do not guarantee any sort of ABI stability;
(this could have ArrayData, PrimitiveArray...)

3) an internal-use namespace ("arrow::internal"), which third-party
projects can use at their own risk.
(this should get all our internal helpers, including almost all CPython