[Python-Dev] PEP 580 and PEP 590 comparison.


Hi Petr,

Thanks for spending time on this.

I think the comparison of the two PEPs falls into two broad categories, 
performance and capability.

I'll address capability first.

Let's try a thought experiment.
Consider PEP 580. It uses the old `tp_print` slot as an offset to mark 
the location of the CCall structure within the callable. Now suppose 
instead that it uses a bit in `tp_flags` to mark the presence of an 
offset field, and that the offset field is moved to the end of the 
PyTypeObject. This would not impact the capabilities of PEP 580.
Now add a single line
nargs &= ~PY_VECTORCALL_ARGUMENTS_OFFSET;
here
https://github.com/python/cpython/compare/master...jdemeyer:pep580#diff-1160d7c87cbab324fda44e7827b36cc9R570
which would make PyCCall_FastCall compatible with the PEP 590 vectorcall 
protocol.
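
To illustrate why that one line matters: under PEP 590 the caller may set a flag bit in nargs, so the callee has to clear it before using nargs as a count. The following is only a standalone sketch, not CPython code. `VECTORCALL_ARGUMENTS_OFFSET` and `positional_count` are stand-in names, and the bit position of the flag is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for PY_VECTORCALL_ARGUMENTS_OFFSET. Assumption: the flag is the
   top bit of a size_t; the real definition lives in the CPython headers. */
#define VECTORCALL_ARGUMENTS_OFFSET ((size_t)1 << (8 * sizeof(size_t) - 1))

/* Hypothetical helper: recover the positional-argument count from an nargs
   value that may or may not carry the flag bit. */
static size_t positional_count(size_t nargs)
{
    nargs &= ~VECTORCALL_ARGUMENTS_OFFSET;  /* clear the flag, keep the count */
    assert((nargs & VECTORCALL_ARGUMENTS_OFFSET) == 0);
    return nargs;
}
```

With the mask in place, a function that previously expected a plain count accepts both flagged and unflagged calls.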
Now rebase the PEP 580 reference code on top of the PEP 590 minimal 
implementation and make the vectorcall field of CFunction point to 
PyCCall_FastCall.
The resulting hybrid is both a PEP 590-conformant implementation and 
at least as capable as the reference PEP 580 implementation.
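
The shape of that hybrid can be sketched in standalone C. This is a toy model under stated assumptions, not the reference implementation: `mock_type`, `mock_callable`, `HAVE_CALL_OFFSET`, `mock_call` and `sum_args` are all hypothetical names, and arguments are plain `long` values rather than Python objects.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for PY_VECTORCALL_ARGUMENTS_OFFSET (assumed to be the top bit). */
#define VECTORCALL_ARGUMENTS_OFFSET ((size_t)1 << (8 * sizeof(size_t) - 1))
/* Hypothetical type flag: "instances carry a call slot at call_offset". */
#define HAVE_CALL_OFFSET 1u

typedef long (*call_func)(void *callable, const long *args, size_t nargs);

typedef struct {
    unsigned flags;
    size_t call_offset;   /* byte offset of the call_func slot in instances */
} mock_type;

typedef struct {
    const mock_type *type;
    call_func call;       /* plays the role of the vectorcall field */
} mock_callable;

/* Callee: masks the flag before treating nargs as a count. */
static long sum_args(void *callable, const long *args, size_t nargs)
{
    (void)callable;
    nargs &= ~VECTORCALL_ARGUMENTS_OFFSET;
    long total = 0;
    for (size_t i = 0; i < nargs; i++) {
        total += args[i];
    }
    return total;
}

/* Caller: checks the type flag, then loads the function pointer from the
   offset recorded in the type and calls through it. */
static long mock_call(mock_callable *obj, const long *args, size_t nargs)
{
    const mock_type *tp = obj->type;
    assert(tp->flags & HAVE_CALL_OFFSET);
    call_func f = *(call_func *)((char *)obj + tp->call_offset);
    return f(obj, args, nargs);
}

static const mock_type sum_type = {
    HAVE_CALL_OFFSET, offsetof(mock_callable, call)
};
```

The caller only needs the flag and the offset, so whatever sits behind the function pointer is free to do as much or as little as it likes.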

Therefore PEP 590 must be at least as capable as PEP 580.


Now performance.

Currently the PEP 590 implementation is intentionally minimal. It does 
nothing for performance. The benchmark Jeroen provides is a 
micro-benchmark that calls the same functions repeatedly. This is 
trivial and unrealistic. So, there is no real evidence either way. I 
will try to provide some.

The point of PEP 590 is that it allows performance improvements by 
allowing callables more freedom of implementation. To repeat an example 
from an earlier email, which may have been overlooked, this code reduces 
the time to create ranges and small lists by about 30%:

https://github.com/markshannon/cpython/compare/vectorcall-minimal...markshannon:vectorcall-examples
https://gist.github.com/markshannon/5cef3a74369391f6ef937d52cca9bfc8

Speeding up calls to builtin functions by a measurable amount will need 
some work on Argument Clinic. I plan to have that done before PyCon in May.


Cheers,
Mark.