Add test for elasticsearch re-connection after network error & allow graceful shutdown #40794
Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)
LGTM, but I have a question. I'll approve once it's answered
When the Elasticsearch client fails to publish events, it ends up calling `Close` on the connection (which is reused). To cancel the in-flight requests, the context is cancelled and a new one is created to be used in future requests. The callback to check the version holds a reference to the connection via a closure. Now the Elasticsearch client holds a pointer to that connection, so whenever `Close` is called, the callback can create a request with the new, non-cancelled context. An integration test is added to ensure the ES output can always recover from network errors.
This commit moves the creation of the request context to the connect method.
There are some cases where the Connection will be used without calling Connect, so we initialise reqsContext and cancelReqs in the NewConnection function to avoid panics.
Connection.Connect now accepts a context to control the life cycle of its requests.
Add a context to outputs.Connectable.Connect to correctly manage the life cycle of the connection and its requests.
Proposed commit message
This commit reworks the `eslegclient.Connection` to accept a context in its `Connect` method. This allows the caller to cancel any in-flight requests made by the connection by cancelling the context.

The libbeat `outputs.Connectable` interface (used by `outputs.NetworkClient`) had to be updated to accept the context, which required refactoring most of the outputs to also accept a context on connect.

The worker from `libbeat/publisher/pipeline/client_worker.go` now uses a context for its cancellation instead of a channel; this context is also used when creating a connection to Elasticsearch.

An integration test is added to ensure the ES output can always recover from network errors.
Checklist
- [ ] I have made corresponding changes to the documentation
- [ ] I have made corresponding changes to the default configuration files

`CHANGELOG.next.asciidoc` or `CHANGELOG-developer.next.asciidoc`.

Disruptive User Impact
It's a bug fix, there is no disruptive user impact
## Author's Checklist
How to test this PR locally
Related issues
## Use cases
## Screenshots
## Logs