Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Filebeat] kafka v2 using parsers #27335

Merged
merged 10 commits into from
Sep 6, 2021

Conversation

mjmbischoff
Copy link
Contributor

@mjmbischoff mjmbischoff commented Aug 12, 2021

What does this PR do?

Moving it forward to version 2 of the input code removes tech dept. Adding parsers allows the use of parsers, allowing the payload to be interpenetrated in a flexible way which for example fixes the issue outlined in #26833 See also #26130 & #15324

Why is it important?

Addresses tech dept, allows the kafka input to be used in more setups.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

How to test this PR locally

See the integration tests for the kafka input

Related issues

Use cases

Having parsers allows the payload to be interpreted in a more flexible way, for example json can be picked up as structured data. This means that if the data is in the correct structure it can be picked up by a module. #27154 and #26862 would also allows us to further correct things on the FB side but this PR allows preprocessing to happen in an external logstash or filebeat in front of the filebeat with the module

@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Aug 12, 2021
@mjmbischoff mjmbischoff requested a review from kvch August 12, 2021 11:58
@kvch kvch requested a review from faec August 12, 2021 12:20
@elasticmachine
Copy link
Collaborator

elasticmachine commented Aug 12, 2021

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Start Time: 2021-09-02T13:53:35.460+0000

  • Duration: 100 min 48 sec

  • Commit: f4977e7

Test stats 🧪

Test Results
Failed 0
Passed 15165
Skipped 2315
Total 17480

Trends 🧪

Image of Build Times

Image of Tests

💚 Flaky test report

Tests succeeded.

Expand to view the summary

Test stats 🧪

Test Results
Failed 0
Passed 15165
Skipped 2315
Total 17480

Copy link
Contributor

@faec faec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really good! The only obvious thing I see that needs work (other than updating generated files to make the linter happy) is handling of expandEventListFromField. Otherwise everything looks like a clear improvement over the previous version :-)

@jsoriano jsoriano added the Team:Elastic-Agent Label for the Agent team label Aug 20, 2021
@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Aug 20, 2021
@mjmbischoff
Copy link
Contributor Author

Think should be pretty good now, I do still need to add documentation and update the description of the PR

I adapted the Ack for expandEventListFromField to only ack when all messages are send to elasticsearch. This to avoid it not sending at least once if messages spawn across two batches and filebeat fails. Also we seem to commit the offset multiple times which requires us to obtain ownership of a mutex, whcih feels a little expensive also as a coincidence https://twitter.com/gwenshap/status/1429147326455508992 showed up on my feed which gives me an idea for another PR.

Picked a reader over a parser because if we introduce generic behavior we might want different name / location expandEventListFromField config.

@mjmbischoff mjmbischoff changed the title Filebeat kafka v2 using parsers [Filebeat] kafka v2 using parsers Aug 22, 2021
@mjmbischoff
Copy link
Contributor Author

couldn't reuse input-filestream-reader-options.asciidoc as it wording is a little log specific

@mjmbischoff mjmbischoff marked this pull request as ready for review August 22, 2021 23:34
@elasticmachine
Copy link
Collaborator

Pinging @elastic/agent (Team:Agent)

@mjmbischoff
Copy link
Contributor Author

Pinging @faec / @kvch I think it's complete let me know if I missed something.

filebeat/input/kafka/input.go Outdated Show resolved Hide resolved
filebeat/input/kafka/input.go Outdated Show resolved Hide resolved
@mjmbischoff mjmbischoff reopened this Aug 31, 2021
Copy link
Contributor

@kvch kvch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments, we are going in the right direction. Thanks!

Copy link
Contributor

@faec faec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great, thank you!

@mjmbischoff mjmbischoff merged commit 20d6038 into elastic:master Sep 6, 2021
@mjmbischoff
Copy link
Contributor Author

mjmbischoff commented Sep 6, 2021

Thank you both, @faec & @kvch !

@mjmbischoff mjmbischoff deleted the filebeat-kafka-using-parsers branch September 6, 2021 09:27
@kvch kvch added the backport-v7.16.0 Automated backport with mergify label Sep 6, 2021
mergify bot pushed a commit that referenced this pull request Sep 6, 2021
* using input v2
adding config for parsers

* implemented reader and parsers

* implemented expandEventListFromField in separate reader
Fixing lint issues

* Adding documentation

* Adding plugin test method

* making parseMultipleMessages a function of listFromFieldReader

* changing order of return values

* Adding replacement for input v1

* lint :(

(cherry picked from commit 20d6038)
mjmbischoff added a commit that referenced this pull request Sep 6, 2021
* using input v2
* implemented reader and parsers

(cherry picked from commit 20d6038)

Co-authored-by: Michael Bischoff <[email protected]>
mdelapenya added a commit to mdelapenya/beats that referenced this pull request Sep 9, 2021
* master: (39 commits)
  [Heartbeat] Move JSON tests from python->go (elastic#27816)
  docs: simplify permissions for Dockerfile COPY (elastic#27754)
  Osquerybeat: Fix osquery logger plugin severy levels mapping (elastic#27789)
  [Filebeat] Update compatibility function to remove processor description on ES < 7.9.0 (elastic#27774)
  warn log entry and no validation failure when both queue_url and buck… (elastic#27612)
  libbeat/cmd/instance: ensure test config file has appropriate permissions (elastic#27178)
  [Heartbeat] Add httpcommon options to ZipURL (elastic#27699)
  Add a header round tripper option to httpcommon (elastic#27509)
  [Elastic Agent] Add validation to ensure certificate paths are absolute. (elastic#27779)
  Rename dashboards according to module.yml files for master (elastic#27749)
  Refactor vagrantfile, add scripts for provisioning with docker/kind (elastic#27726)
  Accept syslog dates with leading 0 (elastic#27775)
  [Filebeat] Add timezone config option to decode_cef and syslog input (elastic#27727)
  [Filebeat] Threatintel compatibility updates (elastic#27323)
  Add support for ephemeral containers in elastic agent dynamic provider (elastic#27707)
  [Filebeat] Integration tests in CI for AWS-S3 input (elastic#27491)
  Fix flakyness of TestFilestreamEmptyLine (elastic#27705)
  [Filebeat] kafka v2 using parsers (elastic#27335)
  Update Kafka version parsing / supported range (elastic#27720)
  Update Sarama to 1.29.1 (elastic#27717)
  ...
Icedroid pushed a commit to Icedroid/beats that referenced this pull request Nov 1, 2021
* using input v2
adding config for parsers

* implemented reader and parsers

* implemented expandEventListFromField in separate reader
Fixing lint issues

* Adding documentation

* Adding plugin test method

* making parseMultipleMessages a function of listFromFieldReader

* changing order of return values

* Adding replacement for input v1

* lint :(
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-v7.16.0 Automated backport with mergify enhancement Team:Elastic-Agent Label for the Agent team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants