[META] Insufficient guardrails leading to disk going full on nodes #5712

Open
5 of 8 tasks
RS146BIJAY opened this issue Jan 5, 2023 · 0 comments
Labels
distributed framework enhancement Enhancement or improvement to existing feature or request Meta Meta issue, not directly linked to a PR Roadmap:Stability/Availability/Resiliency Project-wide roadmap label

RS146BIJAY commented Jan 5, 2023

Is your feature request related to a problem? Please describe.
We are currently seeing multiple instances where the data volume fills to 100% on one or more nodes. Guardrails such as the flood stage watermark are meant to ensure that OpenSearch applies blocks in time and that enough space remains for internal operations (segment merges, cluster state updates, etc.). Even so, we sometimes observe the available space on a few or all data nodes of a domain dropping to 0. This can cause a node to be removed from the cluster (either by FSHealthService checks or as a result of a cluster state update), which may ultimately result in a red cluster if the node holds an active primary.
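
For readers less familiar with the watermarks, here is a minimal Python sketch of the idea, not the actual OpenSearch implementation. It assumes the percentage-based defaults of the `cluster.routing.allocation.disk.watermark.low`, `.high` and `.flood_stage` settings (85%, 90%, 95%). The helper names (`Watermarks`, `classify`, `reserved_headroom_bytes`) are hypothetical, and treating a value exactly at a threshold as not yet crossed is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Watermarks:
    """Used-disk percentages; values assume the documented defaults."""
    low: float = 85.0
    high: float = 90.0
    flood_stage: float = 95.0


def used_percent(total_bytes: int, free_bytes: int) -> float:
    """Percentage of the data volume that is currently in use."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * (total_bytes - free_bytes) / total_bytes


def classify(total_bytes: int, free_bytes: int,
             marks: Watermarks = Watermarks()) -> str:
    """Return the highest watermark that the node's disk usage exceeds."""
    pct = used_percent(total_bytes, free_bytes)
    if pct > marks.flood_stage:
        return "flood_stage"  # write blocks are expected at this point
    if pct > marks.high:
        return "high"
    if pct > marks.low:
        return "low"
    return "ok"


def reserved_headroom_bytes(total_bytes: int,
                            marks: Watermarks = Watermarks()) -> int:
    """Space left between the flood stage watermark and a full disk."""
    return int(total_bytes * (100.0 - marks.flood_stage) / 100.0)
```

On a 1000-byte volume with default settings, `reserved_headroom_bytes(1000)` is 50 bytes: that is all the room left for merges and cluster state writes once the flood stage block is in place, which is why anything that keeps writing past the block can take the disk to 0.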

Describe the solution you'd like
OpenSearch should ensure that guardrails such as the flood stage watermark are applied correctly and that enough space always remains for internal operations (segment merges, cluster state updates, etc.).

OpenSearch Subtasks

@RS146BIJAY RS146BIJAY added enhancement Enhancement or improvement to existing feature or request untriaged labels Jan 5, 2023
@Bukhtawar Bukhtawar added Meta Meta issue, not directly linked to a PR distributed framework and removed untriaged labels Jan 5, 2023
@Bukhtawar Bukhtawar changed the title [META] Disk space full issue in OpenSearch [META] Insufficient Guardrails leading to disk going full on nodes Jan 5, 2023
@Bukhtawar Bukhtawar changed the title [META] Insufficient Guardrails leading to disk going full on nodes [META] Insufficient guardrails leading to disk going full on nodes Jan 5, 2023
@andrross andrross added the Roadmap:Stability/Availability/Resiliency Project-wide roadmap label label May 31, 2024