Add a narrative chapter on object hashing #369

Merged
merged 5 commits into from
Apr 16, 2018

28 changes: 19 additions & 9 deletions docs/hashing.rst
@@ -4,7 +4,10 @@ Hashing
.. warning::

The overarching theme is to never set the ``@attr.s(hash=X)`` parameter yourself.
Leave it at ``None`` which means that ``attrs`` will do the right thing for you, depending on the other parameters.
Leave it at ``None`` which means that ``attrs`` will do the right thing for you, depending on the other parameters:

- If you want to make objects hashable by value: use ``@attr.s(frozen=True)``.
- If you want hashing and comparison by object identity: use ``@attr.s(cmp=False)``.

Setting ``hash`` yourself can have unexpected consequences, so we recommend tinkering with it only if you know exactly what you're doing.

@@ -15,15 +18,10 @@ The *hash* of an object is an integer that represents the contents of an object.
It can be obtained by calling :func:`hash` on an object and is implemented by writing a ``__hash__`` method for your class.
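As a minimal sketch of what such a hand-written ``__hash__`` method might look like, the plain Python class below (``Point`` is a hypothetical name and does not use ``attrs``) derives its hash from the same fields it compares in ``__eq__``:

```python
class Point:
    """Hypothetical example class; it is not part of attrs."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Hash the same fields that __eq__ compares, so equal
        # objects always produce equal hashes.
        return hash((self.x, self.y))


p = Point(1, 2)
q = Point(1, 2)

# hash() calls __hash__ and always returns an integer.
assert isinstance(hash(p), int)

# Equal objects share a hash, so either one works as a dict key.
assert p == q and hash(p) == hash(q)
lookup = {p: "found"}
assert lookup[q] == "found"
```

Nothing in this sketch stops a caller from assigning to ``x`` or ``y`` later, which is the mutation problem that makes hand-written hashing error-prone.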

``attrs`` will happily write a ``__hash__`` method for you [#fn1]_; however, it will *not* do so by default.
Because according to the definition_ from the official Python docs, the returned hash has to fulfill two constraints:

#. The hash of an object *must not* change.
This is the reason why mutable structures like lists, dictionaries, or sets aren't hashable while immutable ones like tuples or frozensets are.
Because according to the definition_ from the official Python docs, the returned hash has to fulfill certain constraints:

If you create a class with ``@attr.s(frozen=True)`` this is fulfilled by definition therefore ``attrs`` will write a ``__hash__`` function for you automatically.
You can also force it to write one with ``hash=True`` but then it's *your* responsibility to make sure that the object is not mutated.

#. Two objects that are equal, must have the same hash. This means that if ``x == y``, it *must* follow that ``hash(x) == hash(y)``.
#. Two objects that are equal, **must** have the same hash.
This means that if ``x == y``, it *must* follow that ``hash(x) == hash(y)``.

By default, Python classes are compared *and* hashed by their :func:`id`.
That means that every instance of a class has a different hash, no matter what attributes it carries.
@@ -36,6 +34,18 @@ Because according to the definition_ from the official Python docs, the returned
The *correct way* to achieve hashing by id is to set ``@attr.s(cmp=False)``.
Setting ``@attr.s(hash=False)`` (that implies ``cmp=True``) is almost certainly a *bug*.

#. If two objects are not equal, their hash **should** be different.

While this isn't a requirement from a standpoint of correctness, sets and dicts become less effective if there are a lot of identical hashes.

The worst case is when all objects have the same hash which turns a set into a list.

#. The hash of an object **must not** change.

If you create a class with ``@attr.s(frozen=True)`` this is fulfilled by definition, therefore ``attrs`` will write a ``__hash__`` function for you automatically.
You can also force it to write one with ``hash=True`` but then it's *your* responsibility to make sure that the object is not mutated.

This point is the reason why mutable structures like lists, dictionaries, or sets aren't hashable while immutable ones like tuples or frozensets are:
points 1 and 2 require that the hash changes with the contents, but point 3 forbids it.
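The consequences of points 2 and 3 can be sketched in plain Python. Both classes below are hypothetical illustrations that do not use ``attrs``, and the failed lookup after mutation describes CPython's behavior for this particular case:

```python
class ConstantHash:
    """Hypothetical class whose instances all share one hash."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, ConstantHash):
            return NotImplemented
        return self.value == other.value

    def __hash__(self):
        # Legal but slow: every lookup has to fall back to
        # comparing against the colliding entries with __eq__.
        return 42


class MutableKey:
    """Hypothetical class that hashes by value but stays mutable."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, MutableKey):
            return NotImplemented
        return self.value == other.value

    def __hash__(self):
        return hash(self.value)


# Point 2: identical hashes are still correct, only inefficient.
distinct = {ConstantHash(n) for n in range(3)}
assert len(distinct) == 3
assert ConstantHash(1) in distinct

# Point 3: mutating an object after it was stored changes its hash.
key = MutableKey(1)
seen = {key}
assert key in seen

key.value = 2

# The set filed the object under its old hash, so the lookup
# with the new hash misses it ...
found_after_mutation = key in seen

# ... although the very same object is still stored in the set.
still_stored = any(item is key for item in seen)

assert not found_after_mutation
assert still_stored
```

An object that is "lost" inside a set like this is the kind of bug that an immutable class rules out.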

For a more thorough explanation of this topic, please refer to this blog post: `Python Hashes and Equality`_.

2 changes: 1 addition & 1 deletion docs/index.rst
@@ -37,7 +37,7 @@ Day-to-Day Usage

- Once you're comfortable with the concepts, our :doc:`api` contains all information you need to use ``attrs`` to its fullest.
- If you want to put objects into sets or use them as keys in dictionaries, they have to be hashable.
Unfortunately the topic is more complex than it seems but :doc:`hashing` will give you a primer on what to look out for.
The simplest way to do that is to use frozen classes, but the topic is more complex than it seems and :doc:`hashing` will give you a primer on what to look out for.
- ``attrs`` is built for extension from the ground up.
:doc:`extending` will show you the affordances it offers and how to make it a building block of your own projects.
