git.net

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)


Hi,

The current C API of Python is both a strength and a weakness of the
Python ecosystem as a whole. It's a strength because it allows to
quickly reuse a huge number of existing libraries by writing a glue
for them. It made numpy possible and this project is a big sucess!
It's a weakness because of its cost on the maintenance, it prevents
optimizations, and more generally it prevents to experiment modifying
Python internals.

For example, CPython cannot use tagged pointers, because the existing
C API is heavily based on the ability to dereference a PyObject*
object and access directly members of objects (like PyTupleObject).
For example, Py_INCREF() modifies *directly* PyObject.ob_refcnt. It's
not possible neither to use a Python compiled in debug mode on C
extensions (compiled in release mode), because the ABI is different in
debug mode. As a consequence, nobody uses the debug mode, whereas it
is very helpful to develop C extensions and investigate bugs.

I also consider that the C API gives too much work to PyPy (for their
"cpyext" module). A better C API (not leaking implementation) details
would make PyPy more efficient (and simplify its implementation in the
long term, when the support for the old C API can be removed). For
example, PyList_GetItem(list, 0) currently converts all items of the
list to PyObject* in PyPy, it can waste memory if only the first item
of the list is needed. PyPy has much more efficient storage than an
array of PyObject* for lists.

I wrote a website to explain all these issues with much more details:

   https://pythoncapi.readthedocs.io/

I identified "bad APIs" like using borrowed references or giving
access to PyObject** (ex: PySequence_Fast_ITEMS).

I already wrote an (incomplete) implementation of a new C API which
doesn't leak implementation details:

   https://github.com/pythoncapi/pythoncapi

It uses an opt-in option (Py_NEWCAPI define -- I'm not sure about the
name) to get the new API. The current C API is unchanged.

Ah, important points. I don't want to touch the current C API nor make
it less efficient. And compatibility in both directions (current C API
<=> new C API) is very important for me. There is no such plan as
"Python 4" which would break the world and *force* everybody to
upgrade to the new C API, or stay to Python 3 forever. No. The new C
API must be an opt-in option, and current C API remains the default
and not be changed.

I have different ideas for the compatibility part, but I'm not sure of
what are the best options yet.

My short term for the new C API would be to ease the experimentation
of projects like tagged pointers. Currently, I have to maintain the
implementation of a new C API which is not really convenient.

--

Today I tried to abuse the Py_DEBUG define for the new C API, but it
seems to be a bad idea:

   https://github.com/python/cpython/pull/10435

A *new* define is needed to opt-in for the new C API.

Victor