[Fleet] Fleet takes 10 minutes from install of Agent & policy change to show Endpoint state 'Running' #80930
Comments
Pinging @elastic/ingest-management (Team:Ingest Management) |
@ph @nchaulet @ferullo @gogochan - Dan, do you think Endpoint could have any part to play in this, or is it all Agent side? I don't know if anyone can help break down the timing of what is sent and when by the Agent or Endpoint... If this is all just 'showing it in the logs' I still feel it doesn't show as well as we'd like. |
I think this might be more on the Kibana side; the changes on the Agent and Endpoint should be almost instant. I've assigned @nchaulet to it and marked it as a bug. |
Did you mean 7.11 or 7.10? I also doubt this is an Agent or Endpoint issue. Watching what happens on the host and/or sharing logs from Agent and Endpoint would confirm/dispute if it is a Kibana or host-side issue. |
I've moved it back to 7.10. I also think this is a Kibana issue. @nchaulet, any progress on this? |
Yes I was just investigating it, there is a bug for Kibana for sure, looks like the action |
After more investigation, there are two problems here: First problem Second problem |
There is a workaround: install the Agent with the Endpoint integration already in the policy, instead of switching to it afterwards. This reduces the wait to see the logs to roughly 2 minutes, which is tolerable, if not fully optimized. It is harder to judge without seeing it in action, but knowing that we have a confirmed bug with an open PR that will shorten the cited scenario by a significant margin, I would be OK with getting that PR (#81376) merged for 7.10 and calling it sufficient. I submit we can accept this until the refactor in the 7.11 cycle. Dan, I was using 7.11 as my test ground for some reason (I can't remember why); since the code I wanted to assess was there, it was an OK testing ground. I'm pleased we can get this improved for 7.10. Nice work. |
Bug Conversion: Created 01 Testcase for this ticket: |
We did some performance tuning and increased the general polling interval to 5 minutes. I believe this behavior is related to that change:
#75552
Then we intended at least a partial fix here: #78493
I'm not sure that, between the two, we have the user experience we want. I have recorded a movie and marked the relevant times in screen captures to tell the story.
I feel this is hurting the initial user experience (and demo experience) and I think it may be causing automated tests to fail that were written prior to the 5 min change.
Kibana version:
7.11 snapshot with 7.11 Observability CI agent deployed to cloud
Browser version:
Chrome on Mac
Describe the bug:
When I install Agent, it is fairly quick to show some progress in the Activity Details. But when I change the policy, it takes a full 5 minutes to see updates in the log showing that Agent has the policy change, and then 5 more minutes to show Endpoint running.
Steps to reproduce:
Expected behavior:
My original thought (if this is at all feasible) was to make the check-in dynamic after a policy change comes in: perhaps for the next 5 minutes it checks in every minute, and after that it goes back to the 5-minute poll. The temporary 1-minute poll could also run for a shorter window if the full 5 minutes isn't needed. I'm not 100% sure what is happening between Endpoint and Agent here; I just know it feels long.
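The dynamic check-in idea above can be sketched roughly as follows. This is only an illustration of the proposal, not how Agent or Fleet actually schedules check-ins: the `CheckinScheduler` class, its method names, and its parameters are all hypothetical, and the 300 s / 60 s values are taken from the numbers mentioned in this issue.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CheckinScheduler:
    """Hypothetical scheduler: poll faster for a while after a policy change."""

    normal_interval_s: float = 300.0  # the general 5-minute poll
    fast_interval_s: float = 60.0     # proposed temporary 1-minute poll
    fast_window_s: float = 300.0      # how long to stay in fast mode
    _fast_until: Optional[float] = None

    def on_policy_change(self, now: float) -> None:
        # Enter (or extend) fast mode starting from the time of the change.
        self._fast_until = now + self.fast_window_s

    def next_interval(self, now: float) -> float:
        # Inside the fast window, use the short interval.
        if self._fast_until is not None and now < self._fast_until:
            return self.fast_interval_s
        # Window expired (or never started): fall back to the normal poll.
        self._fast_until = None
        return self.normal_interval_s


scheduler = CheckinScheduler()
scheduler.on_policy_change(now=10.0)
print(scheduler.next_interval(now=10.0))   # inside the fast window
print(scheduler.next_interval(now=400.0))  # fast window has expired
```

Shortening `fast_window_s` corresponds to the "shorter interval if it isn't needed" variant mentioned above.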
The movie is boring but tells a good story. For the sake of posting to the issue, I have captured the moments where something happens and is shown to the user in the Activity Log.
sequence:
1 - 0 mins 21 seconds: enrolling
2 - 0 mins 29 seconds: online
3 - 0 mins 53 seconds: running, and start to change policy
4 - 1 min 0 seconds: policy reassigned toast
5 - 5 mins 33 seconds: no Endpoint activity
6 - 10 mins 13 seconds: no Endpoint activity yet
7 - 5 mins 44 seconds: a policy change shows up
8 - 10 mins 47 seconds: 3 security check-ins show up together
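The waits implied by this sequence can be worked out from the timestamps of items 4, 7, and 8. The short sketch below only does that arithmetic; the helper and variable names are made up for illustration.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a minutes/seconds timestamp from the recording to seconds."""
    return minutes * 60 + seconds


def fmt(total_seconds: int) -> str:
    """Format a duration in seconds as e.g. '4m44s'."""
    m, s = divmod(total_seconds, 60)
    return f"{m}m{s:02d}s"


# Timestamps taken from the sequence (items 4, 7, and 8).
policy_reassigned = to_seconds(1, 0)
policy_change_shown = to_seconds(5, 44)
endpoint_checkins_shown = to_seconds(10, 47)

wait_for_policy_change = policy_change_shown - policy_reassigned
wait_for_endpoint = endpoint_checkins_shown - policy_change_shown
total_wait = endpoint_checkins_shown - policy_reassigned

print(fmt(wait_for_policy_change))  # 4m44s
print(fmt(wait_for_endpoint))       # 5m03s
print(fmt(total_wait))              # 9m47s
```

Both gaps are close to 5 minutes, which is consistent with each step waiting on one 5-minute poll.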
I'll post the movie to our internal google drive:
https://drive.google.com/file/d/1SaATukD3Y6UkThZTfrNGiAtJZiazaeDH