Visibility problem when cluster.routing.allocation.awareness.attributes is misconfigured #16195
Labels
:Distributed/Allocation
All issues relating to the decision making around placing a shard (both master logic & on the nodes)
>enhancement
help wanted
adoptme
Comments
The shard allocation explain API (#14593) would be a big win here, but I agree that the logging and failure messages can be improved.
lcawl added the :Distributed/Distributed label and removed the :Allocation label on Feb 13, 2018.
clintongormley added the :Distributed/Allocation label and removed the :Distributed/Distributed label on Feb 14, 2018.
I think this is fixed by #14593. In 6.2.1,
This seems sufficient.
Scenario: If a cluster is misconfigured such that all nodes have
cluster.routing.allocation.awareness.attributes=some_missing_attribute
and zero nodes actually set this attribute, then shard allocation will fail and new indexes will have unallocated shards.
The symptoms are:
The challenge is that this scenario is very difficult to debug. @dakrone kindly pointed me at a nifty (advanced!) trick for asking Elasticsearch why it made its allocation decision, but the explanation it returns is unhelpful (details below).
To reproduce this:
cluster.routing.allocation.awareness.attributes=foobar
Troubleshooting: I checked the Elasticsearch logs (at the default level) and didn't see any information hinting at allocation issues.
Debugging: Try a dry run allocation via _cluster/reroute:
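A minimal sketch of how such a dry-run request could be built, in Python, using the `dry_run` and `explain` query parameters of `_cluster/reroute`. The helper name `build_reroute_dry_run`, the host, and the index/node names are hypothetical. The command name is also an assumption that depends on the Elasticsearch version (as far as I know, 2.x releases use `allocate`, while later releases use `allocate_replica`), so adjust it for your cluster.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_reroute_dry_run(base_url, index, shard, node, command="allocate_replica"):
    """Build (but do not send) a dry-run reroute request that asks for explanations.

    `command` is an assumption: pick the allocate command your version supports.
    """
    # dry_run: compute the outcome without applying it.
    # explain: include the per-decider reasoning for each command.
    query = urlencode({"dry_run": "true", "explain": "true"})
    body = {"commands": [{command: {"index": index, "shard": shard, "node": node}}]}
    return Request(
        url=f"{base_url.rstrip('/')}/_cluster/reroute?{query}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical host, index, and node names, for illustration only.
    req = build_reroute_dry_run("http://localhost:9200", "logs", 0, "node-1")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

The request can then be sent with `urllib.request.urlopen(req)`; the deciders listed in the response explain why the shard was not placed.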
Overall, I believe Elasticsearch is acting correctly (it has nowhere to route shards because of our configuration!). However, my concern is the lack of visibility into this issue for users:
_cluster/reroute
uses phrasing that I interpret to mean that allocation is disabled.
For a user, the correction would be to have at least one data node with
node.foobar: whatever
(more generally, an awareness attribute must exist on at least one data node), along with clearer logging and/or response hints that tell users something along the lines of "Could not allocate shard on any nodes because no nodes match the criteria: has attribute foobar".
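To illustrate the kind of check being asked for, here is a rough Python sketch that reports awareness attributes which no data node defines. It is not how Elasticsearch implements allocation. The function names and the simplified `nodes` shape are hypothetical; in practice the attributes could be collected from the nodes info API.

```python
def missing_awareness_attributes(awareness_attributes, nodes):
    """Return the awareness attributes that no data node defines.

    `nodes` maps a node name to {"data": bool, "attributes": {...}}.
    This is a hypothetical, simplified shape, not a raw API response.
    """
    # Only data nodes can hold shards, so only their attributes matter.
    data_nodes = [n for n in nodes.values() if n.get("data", True)]
    return [
        attr
        for attr in awareness_attributes
        if not any(attr in n.get("attributes", {}) for n in data_nodes)
    ]


def describe(awareness_attributes, nodes):
    """Format one hint per missing attribute, using the wording proposed above."""
    return [
        "Could not allocate shard on any nodes because no nodes match "
        f"the criteria: has attribute {attr}"
        for attr in missing_awareness_attributes(awareness_attributes, nodes)
    ]


if __name__ == "__main__":
    # Hypothetical cluster: no data node sets "foobar".
    nodes = {
        "n1": {"data": True, "attributes": {"rack": "r1"}},
        "n2": {"data": False, "attributes": {"foobar": "x"}},
    }
    for line in describe(["foobar", "rack"], nodes):
        print(line)
```

Running a check like this against the configured `cluster.routing.allocation.awareness.attributes` value would surface the misconfiguration before any index is created.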