Fwd: keying by identity in dict and set


---------- Forwarded message ---------
From: Steve White <stevan.white at gmail.com>
Date: Sun, Oct 20, 2019 at 11:38 PM
Subject: Re: keying by identity in dict and set
To: Peter Otten <__peter__ at web.de>


Hi Peter,

Thanks, that does seem to indicate something.
(But there was no need to define a class... you're basically saying
>>> hash(2**63-1) == 2**63-1
True
>>> hash(2**63) == 2**63
False
)
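
For reference, here is a small sketch, assuming CPython, of how hash() treats a plain int compared with a value returned from a user-defined __hash__. As far as I can tell, plain ints are reduced modulo sys.hash_info.modulus, while a __hash__ return value is passed through unchanged as long as it fits in a signed machine word, so the class in the quoted example may not be entirely redundant. The class name ReturnsGivenHash is made up for this illustration.

```python
import sys


class ReturnsGivenHash:
    """Hypothetical helper: __hash__ hands back whatever int it was given."""

    def __init__(self, value):
        self._value = value

    def __hash__(self):
        return self._value


modulus = sys.hash_info.modulus  # 2**61 - 1 on 64-bit CPython
largest = sys.maxsize            # 2**63 - 1 on 64-bit CPython

# A plain non-negative int hashes to its value modulo sys.hash_info.modulus.
plain_is_reduced = hash(largest) == largest % modulus

# A value returned from __hash__ is kept while it fits in a signed word ...
custom_fits = hash(ReturnsGivenHash(largest)) == largest

# ... and is itself hashed as an int once it no longer fits.
custom_overflows = hash(ReturnsGivenHash(largest + 1)) == largest + 1

print(plain_is_reduced, custom_fits, custom_overflows)
```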

As was pointed out in a previous posting though, hash() really doesn't
come into the question.
It was perhaps a mistake on my part to bring it up...

As to fantasy... I am a little surprised that some people find such a
thing surprising.
In other environments, I've often referred to objects directly by
their identity...
in C and C++ this is often handled simply by using pointers.

Anyway... does anybody have a suggestion of how best to bring this up
to the powers that be?  A little further explanation would have saved
me a great deal of time (even if the answer is that for some reason
one must never never do such a thing.)

Thanks!

On Sun, Oct 20, 2019 at 10:15 PM Peter Otten <__peter__ at web.de> wrote:
>
> Steve White wrote:
>
> > On Sun, Oct 20, 2019 at 7:57 PM Peter Otten <__peter__ at web.de> wrote:
> >>
> >> Steve White wrote:
> >> >
> >> > The point is, I don't think __eq__() is ever called in a situation as
> >> > described in my post, yet the Python documentation states that if
> >> > instances are to be used as keys, it must not be used to determine if
> >> > non-identical instances are equivalent.
> >>
> >> What else should be used if not __eq__()? Where in the docs did you
> >> see such a statement?
> >>
> > Nothing.  Nowhere.
> >
> > As I said, it appears that if __hash__ returns values from the id()
> > function, no collision ever occurs, so there is no tie to break in a lookup,
> > so in particular there is no need for the heuristic of calling __eq__.
> >
> > And again, the problem here is that the documentation is a little weak
> > on just how these things really interact.  We're having to guess,
> > experiment, and even look at C code of the Python implementations.
> >
> >> The only limitation for a working dict or set is that for its keys or
> >> elements
> >>
> >> (1) a == b implies hash(a) == hash(b)
> >> (2) a == a
> >>
> >> Does your custom class violate one of the above?
> >>
> > Yes.  It's all in the original post, including code.
> > Give it a shot!
> >
> > Thanks!
>
> OK.
>
> $ cat tmp.py
> class A:
>     def __init__(self, hash):
>         self._hash = hash
>     def __eq__(self, other):
>         raise Exception("__eq__")
>     def __hash__(self):
>         return self._hash
>
> def same_hash(x):
>     return hash(A(x)) == x
> $ python3 -i tmp.py
> >>> same_hash(2**63)
> False
> >>> same_hash(2**63-1)
> True
>
> The limit for smallish integers seems to be 2**63 for 64-bit CPython. As
> id() in CPython is the object's memory address you need data allocated above
> and below that limit to produce a hash collision and trigger __eq__. That
> appears to be rather unlikely.
>
> But even if I have this right it looks like you are relying on internals and
> I lack the fantasy to imagine a compelling use case.
>
> --
> https://mail.python.org/mailman/listinfo/python-list
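
To make the point about __eq__ concrete, here is a minimal sketch, assuming CPython's dict implementation, built on the same class A as in the quoted message (the parameter is renamed to hash_value here). It suggests that a lookup with the very same object succeeds on the identity check without calling __eq__, and that __eq__ is only reached when a different object happens to have an equal hash. The variable names are made up for this illustration.

```python
class A:
    """Same shape as the quoted class: __eq__ raises so any call is visible."""

    def __init__(self, hash_value):
        self._hash = hash_value

    def __eq__(self, other):
        raise Exception("__eq__")

    def __hash__(self):
        return self._hash


first = A(1)
second = A(1)  # a distinct object with the same hash: a forced collision

table = {first: "first"}

# Same object: the key is found by identity, so __eq__ is never called.
found = table[first]

# Different object, equal hash: the lookup has to fall back to __eq__.
try:
    table[second]
    eq_called = False
except Exception as exc:
    eq_called = str(exc) == "__eq__"

print(found, eq_called)
```

If __hash__ returned id(self) instead, two live objects would not normally share a hash, which would fit the observation that __eq__ never seems to be called in that setup.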