keying by identity in dict and set

Steve White wrote:

> On Sun, Oct 20, 2019 at 7:57 PM Peter Otten <__peter__ at> wrote:
>> Steve White wrote:
>> >
>> > The point is, I don't think __eq__() is ever called in a situation as
>> > described in my post, yet the Python documentation states that if
>> > instances are to be used as keys, it must not be used to determine if
>> > non-identical instances are equivalent.
>> What else should be used if not __eq__()? Where in the docs did you
>> see such a statement?
> Nothing.  Nowhere.
> As I said, it appears that if __hash__ returns values from the id()
> function, no collision ever occurs, so there is no tie to break in a lookup,
> so in particular there is no need for the heuristic of calling __eq__.
> And again, the problem here is that the documentation is a little weak
> on just how these things really interact.  We're having to guess,
> experiment, and even look at C code of the Python implementations.
>> The only limitation for a working dict or set is that for its keys or
>> elements
>> (1) a == b implies hash(a) == hash(b)
>> (2) a == a
>> Does your custom class violate one of the above?
> Yes.  It's all in the original post, including code.
> Give it a shot!
> Thanks!


$ cat
class A:
    def __init__(self, hash):
        self._hash = hash
    def __eq__(self, other):
        raise Exception("__eq__")
    def __hash__(self):
        return self._hash

def same_hash(x):
    return hash(A(x)) == x
$ python3 -i
>>> same_hash(2**63)
False
>>> same_hash(2**63-1)
True
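
The same cutoff can be probed without hard-coding 2**63. This is only a sketch and assumes CPython, where sys.maxsize is the largest value that fits the C-level hash type and a larger __hash__() result is replaced by the hash of the returned int. The names at_limit, above_limit and reduced are mine, chosen for illustration.

```python
import sys


class A:
    def __init__(self, hash):
        self._hash = hash

    def __hash__(self):
        return self._hash


def same_hash(x):
    # True if hash() hands back exactly what __hash__() returned.
    return hash(A(x)) == x


# Largest value expected to pass through unchanged (assumption: CPython).
at_limit = same_hash(sys.maxsize)

# One above the limit: the returned int no longer fits and gets rehashed.
above_limit = same_hash(sys.maxsize + 1)

# The replacement value is the ordinary hash of the returned int.
reduced = hash(A(sys.maxsize + 1)) == hash(sys.maxsize + 1)

print(at_limit, above_limit, reduced)
```

On a 64-bit build sys.maxsize is 2**63-1, which matches the two calls in the session above.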

The limit below which the result of __hash__() is passed through unchanged
seems to be 2**63 for 64-bit CPython. As id() in CPython is the object's
memory address, you would need objects allocated both above and below that
limit to produce a hash collision and trigger __eq__(). That appears to be
rather unlikely.
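
To make the collision case concrete, here is a rough sketch that forces two distinct keys to share a hash and checks whether the dict falls back to __eq__(). The helper eq_called() is hypothetical, written just for this illustration, and the behaviour shown is what I would expect from CPython's dict rather than something the language reference spells out in detail.

```python
class A:
    def __init__(self, hash):
        self._hash = hash

    def __eq__(self, other):
        raise Exception("__eq__")

    def __hash__(self):
        return self._hash


def eq_called(h1, h2):
    """Return True if storing two distinct keys with these hashes hits __eq__."""
    d = {A(h1): "first"}
    try:
        d[A(h2)] = "second"
    except Exception:
        return True
    return False


# Different hashes: the stored hashes differ, so __eq__ is not consulted.
different_hashes = eq_called(1, 2)

# Same hash, different objects: the dict has to ask __eq__ to break the tie.
equal_hashes = eq_called(1, 1)

# Looking up the very same object succeeds without __eq__,
# because the identity check comes first.
a = A(5)
d = {a: "value"}
found = d[a]

print(different_hashes, equal_hashes, found)
```

With id() as the hash, only the first and third situations should normally arise, which would explain why __eq__() never fires in the original post.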

But even if I have this right, it looks like you are relying on internals, and
I lack the imagination to come up with a compelling use case.