diff --git a/docs/hashing.rst b/docs/hashing.rst new file mode 100644 index 000000000..8bfd0413f --- /dev/null +++ b/docs/hashing.rst @@ -0,0 +1,56 @@ +Hashing +======= + +.. warning:: + + The overarching theme is to never set the ``@attr.s(hash=X)`` parameter yourself. + Leave it at ``None`` which means that ``attrs`` will do the right thing for you, depending on the other parameters: + + - If you want to make objects hashable by value: use ``@attr.s(frozen=True)``. + - If you want hashing and comparison by object identity: use ``@attr.s(cmp=False)`` + + Setting ``hash`` yourself can have unexpected consequences so we recommend to tinker with it only if you know exactly what you're doing. + +Under certain circumstances, it's necessary for objects to be *hashable*. +For example if you want to put them into a :class:`set` or if you want to use them as keys in a :class:`dict`. + +The *hash* of an object is an integer that represents the contents of an object. +It can be obtained by calling :func:`hash` on an object and is implemented by writing a ``__hash__`` method for your class. + +``attrs`` will happily write a ``__hash__`` method you [#fn1]_, however it will *not* do so by default. +Because according to the definition_ from the official Python docs, the returned hash has to fullfil certain constraints: + +#. Two objects that are equal, **must** have the same hash. + This means that if ``x == y``, it *must* follow that ``hash(x) == hash(y)``. + + By default, Python classes are compared *and* hashed by their :func:`id`. + That means that every instance of a class has a different hash, no matter what attributes it carries. + + It follows that the moment you (or ``attrs``) change the way equality is handled by implementing ``__eq__`` which is based on attribute values, this constraint is broken. + For that reason Python 3 will make a class that has customized equality unhashable. + Python 2 on the other hand will happily let you shoot your foot off. + Unfortunately ``attrs`` currently mimics Python 2's behavior for backward compatibility reasons if you set ``hash=False``. + + The *correct way* to achieve hashing by id is to set ``@attr.s(cmp=False)``. + Setting ``@attr.s(hash=False)`` (that implies ``cmp=True``) is almost certainly a *bug*. + +#. If two object are not equal, their hash **should** be different. + + While this isn't a requirement from a standpoint of correctness, sets and dicts become less effective if there are a lot of identical hashes. + The worst case is when all objects have the same hash which turns a set into a list. + +#. The hash of an object **must not** change. + + If you create a class with ``@attr.s(frozen=True)`` this is fullfilled by definition, therefore ``attrs`` will write a ``__hash__`` function for you automatically. + You can also force it to write one with ``hash=True`` but then it's *your* responsibility to make sure that the object is not mutated. + + This point is the reason why mutable structures like lists, dictionaries, or sets aren't hashable while immutable ones like tuples or frozensets are: + point 1 and 2 require that the hash changes with the contents but point 3 forbids it. + +For a more thorough explanation of this topic, please refer to this blog post: `Python Hashes and Equality`_. + + +.. [#fn1] The hash is computed by hashing a tuple that consists of an unique id for the class plus all attribute values. + +.. _definition: https://docs.python.org/3/glossary.html#term-hashable +.. _`Python Hashes and Equality`: https://hynek.me/articles/hashes-and-equality/ diff --git a/docs/index.rst b/docs/index.rst index bb24cd773..614decf57 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -36,6 +36,8 @@ Day-to-Day Usage ================ - Once you're comfortable with the concepts, our :doc:`api` contains all information you need to use ``attrs`` to its fullest. +- If you want to put objects into sets or use them as keys in dictionaries, they have to be hashable. + The simplest way to do that is to use frozen classes, but the topic is more complex than it seems and :doc:`hashing` will give you a primer on what to look out for. - ``attrs`` is built for extension from the ground up. :doc:`extending` will show you the affordances it offers and how to make it a building block of your own projects. @@ -68,6 +70,7 @@ Full Table of Contents overview why examples + hashing api extending how-does-it-work