Report more details of unobtainable ShardLock #61255

Merged (3 commits) on Aug 19, 2020

Commits on Aug 18, 2020

  1. Report more details of unobtainable ShardLock

    Today a common reason for a `ShardLockObtainFailedException` is when a
    shard is removed from a node and then assigned straight back to it again
    before the node has had a chance to shut the previous shard instance
    down. For instance, this can happen if a node briefly leaves the cluster
    holding a primary with no in-sync replicas.
    
    The message in this case is typically as follows:
    
        obtaining shard lock timed out after 5000ms, previous lock details: [shard creation] trying to lock for [shard creation]
    
    This is pretty hard to interpret, and doesn't raise the important
    question: "why didn't the shard shut down sooner?"
    
    With this change we reword the message a bit, report the age of the
    shard lock, and adjust the details to report that the lock is held by a
    closing shard:
    
        obtaining shard lock for [starting shard] timed out after [5000ms], lock already held for [closing shard] with age [12345ms]
    
    Relates elastic#38807
    DaveCTurner committed Aug 18, 2020
    Commit 00666f4
  2. Commit ab94d42
  3. CR

    DaveCTurner committed Aug 18, 2020
    Commit 48bf874
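
The reworded message in the first commit can be illustrated with a minimal sketch. This is not the Elasticsearch implementation: `ShardLockSketch`, `Holder`, `setDetails`, and the injected nanosecond clock are hypothetical names chosen for illustration, and `IllegalStateException` stands in for `ShardLockObtainFailedException`. The sketch assumes that relabelling the holder (for example when a shard starts closing) also restarts the reported age.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

/** Illustrative sketch only; names and behaviour are assumptions, not Elasticsearch code. */
final class ShardLockSketch {

    /** Immutable record of why the lock is held and when that reason was set. */
    private static final class Holder {
        final String details;
        final long sinceNanos;

        Holder(String details, long sinceNanos) {
            this.details = details;
            this.sinceNanos = sinceNanos;
        }
    }

    private final Semaphore mutex = new Semaphore(1);
    private final LongSupplier nanoClock; // injected so the age is testable; e.g. System::nanoTime
    private volatile Holder holder;       // read without the mutex when building the error message

    ShardLockSketch(LongSupplier nanoClock) {
        this.nanoClock = nanoClock;
    }

    /** Tries to take the lock, failing with a descriptive message on timeout. */
    void acquire(String details, long timeoutMillis) {
        boolean acquired;
        try {
            acquired = mutex.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while obtaining shard lock for [" + details + "]", e);
        }
        if (!acquired) {
            throw new IllegalStateException(timeoutMessage(details, timeoutMillis));
        }
        holder = new Holder(details, nanoClock.getAsLong());
    }

    /** Relabels the current holder, e.g. to "closing shard" (assumption: this restarts the age). */
    void setDetails(String details) {
        holder = new Holder(details, nanoClock.getAsLong());
    }

    void release() {
        mutex.release();
    }

    /** Builds a message in the shape quoted in the commit message above. */
    private String timeoutMessage(String wanted, long timeoutMillis) {
        Holder current = holder; // may be null if the holder has not recorded its details yet
        String heldFor = current == null ? "unknown" : current.details;
        long ageMillis = current == null
            ? 0L
            : TimeUnit.NANOSECONDS.toMillis(nanoClock.getAsLong() - current.sinceNanos);
        return "obtaining shard lock for [" + wanted + "] timed out after [" + timeoutMillis
            + "ms], lock already held for [" + heldFor + "] with age [" + ageMillis + "ms]";
    }
}
```

Recording the holder's details together with a timestamp is what lets the failure message say both who holds the lock and for how long, which points directly at the question of why the previous shard instance has not finished shutting down.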