[Fleet] Handle initialization timeout better #81024

Closed
mostlyjason opened this issue Oct 19, 2020 · 4 comments
Labels
bug: Fixes for quality problems that affect the customer experience
research
Team:Fleet: Team label for Observability Data Collection Fleet team

Comments

@mostlyjason
Contributor

I just launched a new deployment on staging with 7.10 and I got this error:

Unable to initialize Fleet
[process_cluster_event_timeout_exception] failed to process cluster event (put-pipeline-logs-endpoint.events.library-0.16.0) within 30s response from /_ingest/pipeline/logs-endpoint.events.library-0.16.0: {"error":{"root_cause":[{"type":"process_cluster_event_timeout_exception","reason":"failed to process cluster event (put-pipeline-logs-endpoint.events.library-0.16.0) within 30s"}],"type":"process_cluster_event_timeout_exception","reason":"failed to process cluster event (put-pipeline-logs-endpoint.events.library-0.16.0) within 30s"},"status":503}

It worked when I refreshed the page a few minutes later, which suggests a background process timed out and then resolved on its own.

This looks like a bad error because it happens during the getting started flow, it's ugly, and it doesn't tell the user what to do to overcome the problem. It'd be better if we could extend the timeout to make it less likely that users will encounter it. Alternatively, we could update our error message for 503s to say "please try again in a few minutes".
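To make the second suggestion concrete, here is a minimal sketch of mapping the 503 response above to a friendlier message. The `EsErrorBody` interface and the `describeSetupError` helper are hypothetical names made up for illustration and are not part of Kibana or Fleet; only the `status`, `error.type`, and `error.reason` fields are taken from the error body quoted in this report.

```typescript
// Hypothetical shape, modeled on the error body quoted in this issue.
interface EsErrorBody {
  error?: { type?: string; reason?: string };
  status?: number;
}

// Hypothetical helper: turns a failed setup response into a user-facing message.
function describeSetupError(body: EsErrorBody): string {
  const type = body.error?.type;
  if (body.status === 503 && type === 'process_cluster_event_timeout_exception') {
    // The cluster was too busy to apply the change in time; retrying later may work.
    return 'Fleet setup is taking longer than expected. Please try again in a few minutes.';
  }
  if (body.status === 503) {
    return 'Elasticsearch is temporarily unavailable. Please try again in a few minutes.';
  }
  // For anything else, keep the raw reason so no information is lost.
  return body.error?.reason ?? 'Unable to initialize Fleet';
}

const sample: EsErrorBody = {
  error: {
    type: 'process_cluster_event_timeout_exception',
    reason:
      'failed to process cluster event (put-pipeline-logs-endpoint.events.library-0.16.0) within 30s',
  },
  status: 503,
};

console.log(describeSetupError(sample));
```

For the first suggestion, Elasticsearch's put pipeline API accepts a `master_timeout` query parameter, which defaults to 30s and matches the "within 30s" in the error. Whether Fleet's setup code can pass a larger value through is an assumption that would need checking.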

mostlyjason added the Team:Fleet label Oct 19, 2020
@elasticmachine
Contributor

Pinging @elastic/ingest-management (Team:Ingest Management)

ph added the bug label Oct 19, 2020
@Aqualie

Aqualie commented Mar 26, 2021

Facing this issue after upgrading from 7.11.2 to 7.12.0:

Unable to initialize Fleet
[process_cluster_event_timeout_exception] failed to process cluster event (put-pipeline-logs-system.system-0.10.9) within 30s response from /_ingest/pipeline/logs-system.system-0.10.9: {"error":{"root_cause":[{"type":"process_cluster_event_timeout_exception","reason":"failed to process cluster event (put-pipeline-logs-system.system-0.10.9) within 30s"}],"type":"process_cluster_event_timeout_exception","reason":"failed to process cluster event (put-pipeline-logs-system.system-0.10.9) within 30s"},"status":503}

Does not work when refreshing the page minutes or hours later.

@jen-huang
Contributor

This might be alleviated with the recent changes we made to not have /setup errors block the UI (#97404). We need to double check.
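As a rough illustration of the general idea of setup errors not blocking the UI, the sketch below collects failures from individual setup steps and returns them as data instead of throwing. This is not the change made in #97404; `SetupResult`, `runSetup`, and the notion of a list of "steps" are all hypothetical names invented for this example.

```typescript
// Hypothetical result type: failures are reported as data rather than thrown.
interface SetupResult {
  nonFatalErrors: string[];
}

// Hypothetical wrapper: runs each step in order and records any failure,
// so one failing step (e.g. a pipeline install timeout) does not abort the rest.
async function runSetup(steps: Array<() => Promise<void>>): Promise<SetupResult> {
  const nonFatalErrors: string[] = [];
  for (const step of steps) {
    try {
      await step();
    } catch (err) {
      nonFatalErrors.push(err instanceof Error ? err.message : String(err));
    }
  }
  return { nonFatalErrors };
}
```

A caller could then render the page as usual and show `nonFatalErrors` in a dismissible notice, rather than replacing the whole page with "Unable to initialize Fleet".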

@jlind23
Contributor

jlind23 commented Apr 4, 2023

Closing as not relevant anymore

jlind23 closed this as not planned Apr 4, 2023