Access violation in Python garbage collector (visit_decref) - how to debug?
Geoff Bache <geoff.bache at gmail.com> writes:
> We are running Python embedded in our C++ product and are now experiencing
> crashes (access violation reading 0xffffffffff on Windows) in the Python
> garbage collector.
Errors like this are very difficult to analyse. The main reason:
the memory corruption has likely happened far away from (and long
before) the point where it is finally detected (by an access
violation in your case).
Python can be built in a special way to add marks to
its allocated memory blocks and verify their validity.
This increases the chance of detecting a memory corruption earlier
and thereby makes it easier to find the root cause.
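
A minimal sketch of a related approach that does not need a special
build, assuming CPython 3.7 or later: there, the PYTHONMALLOC=debug
environment variable installs the allocator debug hooks at startup,
and "-X faulthandler" makes Python dump a traceback on a fatal
error such as an access violation. The helper name
run_with_debug_allocator is made up for this illustration. For an
embedded interpreter, the variable would have to be in the host
process's environment before Python is initialized, and the host
must not tell Python to ignore environment variables.

```python
import os
import subprocess
import sys


def run_with_debug_allocator(code):
    """Run `code` in a child interpreter with memory debug hooks enabled.

    Hypothetical helper for illustration; not part of any library.
    """
    env = dict(os.environ)
    # Ask CPython to wrap its allocators with the debug hooks, which
    # pad blocks with marker bytes and check them when memory is freed.
    env["PYTHONMALLOC"] = "debug"
    result = subprocess.run(
        # faulthandler prints the Python traceback on a fatal error.
        [sys.executable, "-X", "faulthandler", "-c", code],
        env=env,
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout, result.stderr


returncode, out, err = run_with_debug_allocator(
    "print('allocator checks active')"
)
print(returncode, out.strip())
```

If the hooks find a damaged block, the child aborts with a fatal
error report on stderr instead of crashing much later in the
garbage collector.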
There are tools for the analysis of memory management problems
(e.g. "valgrind", though this may be available only for Linux).
In my experience, even with those tools, the analysis is very
difficult.
I have several times successfully analysed memory corruption
problems. In those cases, I have been lucky that the corruption
was reproducible and typically affected the same address.
Thus, I could put a (hardware) memory breakpoint at this address,
which stopped the program as soon as this address was written to, and
then analyse the state in the debugger. This way, I could detect
precisely which code caused the corruption. However,
this was quite a long time ago; nowadays, modern operating systems
employ address randomization, which significantly reduces the chance
that the corruption affects the same address (which may mean that
you need to deactivate address randomization to get a better chance
for this kind of analysis).
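
To judge whether a memory breakpoint has any chance of working, it
can help to check first whether addresses repeat between runs. The
following is only a rough sketch of that check, using a plain
Python child process rather than your embedded setup; the names
CHILD and sample_addresses are made up for this illustration. It
relies on the CPython detail that id() returns the object's memory
address.

```python
import os
import subprocess
import sys

# Child program: allocate one object and print its address.
# In CPython, id() is the address of the object in memory.
CHILD = "x = bytearray(64); print(id(x))"


def sample_addresses(runs=3):
    """Start the child several times and collect the printed addresses."""
    env = dict(os.environ)
    # Fix the hash seed to remove one unrelated source of variation.
    env["PYTHONHASHSEED"] = "0"
    addresses = []
    for _ in range(runs):
        completed = subprocess.run(
            [sys.executable, "-c", CHILD],
            env=env,
            capture_output=True,
            text=True,
            check=True,
        )
        addresses.append(int(completed.stdout))
    return addresses


addresses = sample_addresses()
print([hex(a) for a in addresses])
if len(set(addresses)) == 1:
    print("address is stable across runs")
else:
    print("address varies between runs")
```

If the addresses vary, a breakpoint on a fixed address is unlikely
to hit, and deactivating address randomization for the process
would be the first thing to try.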