x-pack/filebeat: AWS test failure #40503

Open
2 tasks
oakrizan opened this issue Aug 13, 2024 · 3 comments
Labels
flaky-test Unstable or unreliable test cases. Team:Security-Service Integrations Security Service Integrations Team

Comments

@oakrizan
Contributor

oakrizan commented Aug 13, 2024

Flaky Test

On the 7.17 branch, AWS tests were enabled for a specific changeset and were successful (e.g. #35885).
After the Beats migration from Jenkins to Buildkite, AWS tests were temporarily re-enabled for x-pack/filebeat on the 8.*/main branches for validation purposes.

AWS test failure context: #36425

  • main/8.* branches
    On main/8.* branches, the previously reported github.com/elastic/beats/v7/x-pack/filebeat/input/cel test was still occasionally failing with the same error. Once retried, the test was successful.
    However, the build still failed while it was running the cometd container for the localstack integration tests.

A similar issue was opened for running the tests on Windows: #39657. It was fixed by increasing the timeout from 5 to 10 (#39713). Apparently that is sometimes not enough when running the tests on AWS.

I have created #40162, where the timeout is 20 and the tests seem to be successful when executed on AWS: https://buildkite.com/elastic/beats-xpack-filebeat/builds/4118. Basically, this problem can be bypassed either by enabling retry for the AWS step or by increasing the timeout again.

beats-xpack-filebeat_build_3351_ubuntu-x-pack-slash-filebeat-aws-tests.log
beats-xpack-filebeat_build_3351_ubuntu-x-pack-slash-filebeat-aws-tests-retry.log

  • Add the aws label to a Beats-related PR
  • Commit changes for x-pack/filebeat; a build will be triggered in Buildkite

Tasks

Stack Trace

=== Failed
=== FAIL: x-pack/filebeat/input/cel TestInput/retry_failure (0.01s)
    input_test.go:1602: unexpected result for event 0: got:- want:+
          mapstr.M{
        - 	"error": map[string]any{
        - 		"message": string("failed eval: ERROR: <input>:2:19: failed to unmarshal JSON message: unexpected end of JSON input\n |  get(state.url).as(resp, {\n "...),
        - 	},
        + 	"hello": string("world"),
          }
=== FAIL: x-pack/filebeat/input/cel TestInput (38.59s)
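
The "unexpected end of JSON input" text in the trace above is the message Go's `encoding/json` package returns for an empty or truncated document. The sketch below reproduces just that message; it assumes, without having verified it against the cel input code, that the response body is decoded with `encoding/json`. The `decode` helper is hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decode is a hypothetical helper that parses a response body into a map,
// roughly what a JSON-consuming client has to do with an HTTP response.
func decode(body []byte) (map[string]any, error) {
	var v map[string]any
	if err := json.Unmarshal(body, &v); err != nil {
		return nil, err
	}
	return v, nil
}

func main() {
	// A complete body, a truncated body, and an empty body.
	for _, body := range []string{`{"hello":"world"}`, `{"hello":`, ``} {
		v, err := decode([]byte(body))
		fmt.Printf("body=%q value=%v err=%v\n", body, v, err)
	}
}
```

Both the truncated and the empty body produce the same error, so the message alone does not say whether the test server returned nothing or an incomplete response.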
@oakrizan oakrizan added the flaky-test Unstable or unreliable test cases. label Aug 13, 2024
@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Aug 13, 2024
@ycombinator ycombinator added the Team:Security-Service Integrations Security Service Integrations Team label Aug 13, 2024
@elasticmachine
Collaborator

Pinging @elastic/security-service-integrations (Team:Security-Service Integrations)

@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Aug 13, 2024
@rowlandgeoff
Contributor

During the migration of beats-ci from Jenkins to Buildkite, a number of tests were failing consistently due to issues unrelated to the migration. Those tests were disabled to stabilize the CI, with the intent to revisit them post-migration. @oakrizan has reviewed them all in her draft PRs linked above in the description, and has opened tickets such as this one to highlight to the product teams the tests that are currently still disabled and could use some attention.

@oakrizan
Contributor Author

I have potentially fixed the problem with CURL by updating the version of observability/stream from v0.6.1 to v0.7.0 in testing/environments/docker/cometd/Dockerfile. But now the AWS test step fails with an AuthorizationError: https://buildkite.com/elastic/beats-xpack-filebeat/builds/5127#01915b0f-c5ba-41ba-9c70-453222f11cf3.
I will look for a solution for that.

In addition, I removed the retry in initCloudEnv.sh, since there is no log when terraformApply fails.
Related PR: #40549
