[Remote Store] Add support to timeout segment uploads #13783

Closed
linuxpi opened this issue May 22, 2024 · 0 comments · Fixed by #13679
Assignees: linuxpi
Labels: enhancement, Storage:Resiliency, Storage, v2.15.0

Comments

@linuxpi
Collaborator

linuxpi commented May 22, 2024

Is your feature request related to a problem? Please describe

  • Segment uploads for Remote Store run without any timeout. If an error is silently swallowed by the code, the uploading thread can block on latch.await() forever.
  • Because the future never completes, the shard stays stuck and can never retry the upload.
  • As a result, the remote time lag keeps increasing and does not come down until the process is restarted.
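
The failure mode described above can be illustrated with a minimal sketch. This is not the OpenSearch code: `StuckUploadSketch`, `uploadAsync`, and `uploadCompleted` are hypothetical names, and the upload failure is simulated. The sketch only shows that when an error is swallowed before the completion callback runs, the latch is never counted down, so an unbounded `latch.await()` would never return.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class StuckUploadSketch {

    // Hypothetical stand-in for an async uploader that drops an error without notifying anyone.
    static void uploadAsync(Runnable onDone) {
        try {
            throw new IllegalStateException("simulated upload failure");
        } catch (IllegalStateException e) {
            // Error swallowed: onDone is never invoked, so the latch is never counted down.
        }
    }

    // Returns true only if the upload signalled completion within waitMillis.
    static boolean uploadCompleted(long waitMillis) {
        CountDownLatch latch = new CountDownLatch(1);
        uploadAsync(latch::countDown);
        try {
            // A plain latch.await() here would block forever.
            // The bounded form is used only so that this sketch terminates.
            return latch.await(waitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("completed = " + uploadCompleted(100)); // prints "completed = false"
    }
}
```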

Describe the solution you'd like

  • Add a timeout to segment uploads so that a stuck upload is abandoned once the timeout is reached and the shard recovers automatically from such situations.
  • Make this timeout configurable via a dynamic cluster setting.
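
One way the proposal above could look is sketched below, under stated assumptions. `TimedUploadSketch`, `AsyncUploader`, `awaitUpload`, and `uploadTimeoutMillis` are hypothetical names, not the API introduced by the actual fix. The `AtomicLong` stands in for a value that a dynamic cluster-settings listener would update at runtime, and the default of 60 seconds is an arbitrary placeholder.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

class TimedUploadSketch {

    // Hypothetical holder for the timeout; assumed to be updated when the cluster setting changes.
    static final AtomicLong uploadTimeoutMillis = new AtomicLong(60_000);

    // Hypothetical async upload API: implementations call onDone when the upload finishes.
    interface AsyncUploader {
        void upload(Runnable onDone);
    }

    // Returns true if the upload completed in time, false if it timed out or the wait was interrupted.
    static boolean awaitUpload(AsyncUploader uploader) {
        CountDownLatch latch = new CountDownLatch(1);
        uploader.upload(latch::countDown);
        // Read the timeout on every call so a changed setting applies to the next upload.
        long timeout = uploadTimeoutMillis.get();
        try {
            return latch.await(timeout, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        uploadTimeoutMillis.set(100); // simulate a dynamic settings update
        System.out.println(awaitUpload(onDone -> onDone.run())); // completes: prints "true"
        System.out.println(awaitUpload(onDone -> { }));          // never completes: prints "false"
    }
}
```

A `false` result gives the caller a point at which to fail the current attempt and schedule a retry, instead of holding the thread indefinitely.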

Related component

Storage:Performance

Describe alternatives you've considered

No response

Additional context

No response

linuxpi added the enhancement and untriaged labels on May 22, 2024
linuxpi self-assigned this on May 22, 2024
linuxpi added the Storage:Resiliency, Storage, and v2.15.0 labels and removed the untriaged and Storage:Performance labels on May 22, 2024
Projects
Status: ✅ Done