Re: Sync Call Notes
On 19/04/2018 at 18:42, Uwe L. Korn wrote:
>> 1) are we ok with paying the cost of pimpls? (mostly the indirection
>> cost I guess, and the fact that we can't have inline methods/accessors)
> I'm not sure how much of the cost we're ready to pay. There is certainly value in keeping a stable ABI (the NumPy people do this fantastically): you can do patch releases without consumers worrying about whether they need to rebuild their binaries.
> The indirection on paths that call expensive functions is certainly no problem. For example, if you have a table and select a column, this is an operation you don't do often, so I think the overhead is acceptable. On the other hand, accessing the null_count or the length of an array is definitely an operation that is performed quite often. These should be as fast as possible.
> I cannot give you a definite answer yet; once I have the time, I'll try to implement and profile some of the possible approaches.
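
To make the trade-off above concrete, here is a rough sketch of one possible middle ground. It is only an illustration under my own assumptions, not how Arrow is implemented: the class name, the sketch namespace and the type_name member are all hypothetical. The hot accessors (length, null_count) stay inline and read plain members, while everything else hides behind a pimpl.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>

namespace sketch {  // hypothetical namespace, not part of Arrow

// Hypothetical array class: hot fields are stored directly, the rest is
// hidden behind a pointer to an implementation class (pimpl).
class Array {
 public:
  Array(std::string type_name, int64_t length, int64_t null_count);
  ~Array();  // defined out of line, where Impl is a complete type

  // Hot accessors: inline, no pointer chase. The price is that these two
  // members become part of the ABI and cannot move without a rebuild.
  int64_t length() const { return length_; }
  int64_t null_count() const { return null_count_; }

  // Cold accessor: forwards to the pimpl, so its storage can change freely.
  const std::string& type_name() const;

 private:
  class Impl;
  int64_t length_;
  int64_t null_count_;
  std::unique_ptr<Impl> impl_;
};

// Everything below would normally live in the .cc file.
class Array::Impl {
 public:
  explicit Impl(std::string type_name) : type_name_(std::move(type_name)) {}
  const std::string& type_name() const { return type_name_; }

 private:
  std::string type_name_;
};

Array::Array(std::string type_name, int64_t length, int64_t null_count)
    : length_(length),
      null_count_(null_count),
      impl_(new Impl(std::move(type_name))) {
  assert(null_count_ >= 0 && null_count_ <= length_);
}

Array::~Array() = default;

const std::string& Array::type_name() const { return impl_->type_name(); }

}  // namespace sketch
```

With this layout, a call such as a.length() compiles down to a plain member load, whereas a.type_name() pays one extra indirection through impl_. Whether that split is worth freezing two fields into the ABI is exactly the kind of thing that would need profiling.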
>> 2) how do we handle things like ArrayData, which seems publicly exposed
>> by design?
> ArrayData is marked as internal and thus I would feel ok to break its ABI between non-major releases. If people really depend on its usage, then we should think of a clear way to make it public / non-internal.
Perhaps we need a three-tiered approach?
1) a public and stable namespace ("arrow") with the goal to reach ABI
stability;
2) a public but still moving namespace ("arrow::unstable"?) where we
generally try not to remove existing functionality and to honor API
compatibility, but do not guarantee any sort of ABI stability;
(this could have ArrayData, PrimitiveArray...)
3) an internal-use namespace ("arrow::internal"), which third-party
projects can use at their own risk.
(this should get all our internal helpers, including almost all CPython