Change fields.ecs.yml to ECS 1.11.0 #27107

Merged: 4 commits, Aug 17, 2021

Conversation

Contributor

@leehinman leehinman commented Jul 28, 2021

What does this PR do?

  • fields.ecs.yml to ECS 1.11.0-dev
  • bump ecs.version in configs for modules that don't require changes.

Why is it important?

  • keep up to date with ECS changes
  • allows modules to use new ECS fields

Checklist

  • My code follows the style guidelines of this project
    - [ ] I have commented my code, particularly in hard-to-understand areas
    - [ ] I have made corresponding changes to the documentation
    - [ ] I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

How to test this PR locally

cd filebeat && mage pythonIntegTest
cd x-pack/filebeat && mage pythonIntegTest

Related issues

@elasticmachine
Collaborator

elasticmachine commented Jul 28, 2021

💚 Build Succeeded

Build stats

  • Start Time: 2021-08-16T23:30:00.134+0000

  • Duration: 233 min 23 sec

  • Commit: 0df0ce5

Test stats 🧪

Test Results
Failed 0
Passed 53106
Skipped 5320
Total 58426

💚 Flaky test report

Tests succeeded.

@leehinman leehinman force-pushed the beats_ecs_1.11 branch 5 times, most recently from 43f02f5 to 36cc6f9 Compare July 28, 2021 22:22
@leehinman leehinman marked this pull request as ready for review July 29, 2021 03:27
@leehinman leehinman requested a review from a team as a code owner July 29, 2021 03:27
@elasticmachine
Collaborator

Pinging @elastic/security-external-integrations (Team:Security-External Integrations)

@mergify
Contributor

mergify bot commented Jul 30, 2021

This pull request is now in conflicts. Could you fix it? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b beats_ecs_1.11 upstream/beats_ecs_1.11
git merge upstream/master
git push upstream beats_ecs_1.11

Member

@andrewkroh andrewkroh left a comment

I'm curious how many new fields does this release add?

(My concern is that we add new fields to every beat with each ECS update (even if they are not utilized). This increases the mappings stored in cluster state and grows the response sizes in Kibana from the fields API.)

@leehinman leehinman force-pushed the beats_ecs_1.11 branch 4 times, most recently from f5d3efa to 2b8a519 Compare August 13, 2021 03:58
@mergify
Contributor

mergify bot commented Aug 13, 2021

This pull request is now in conflicts. Could you fix it? 🙏

Contributor

@andrewstucki andrewstucki left a comment

Some comments/questions.


nodeMap, err := kubemeta.GetValue(kind)
if err == nil {
	node, _ := nodeMap.(common.MapStr)
Contributor

Does this need to be cast/nil checked? If you want to do an unconditional cast, I'd remove the _ -- but assuming the above GetValue call returns a nil you probably want to be more explicit about the cast and nil checking here.
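
To make the distinction in this comment concrete, here is a minimal sketch of a checked type assertion. `MapStr` is a local stand-in for libbeat's `common.MapStr`, and `nodeName` is a hypothetical helper invented for illustration; neither is the code from this PR.

```go
package main

import "fmt"

// MapStr is a local stand-in for libbeat's common.MapStr, defined here only
// so the sketch compiles on its own.
type MapStr map[string]interface{}

// nodeName (hypothetical) returns the "name" entry of the nested map stored
// under kind. The bool is false if any step is missing or has the wrong type.
func nodeName(meta MapStr, kind string) (string, bool) {
	raw, found := meta[kind]
	if !found || raw == nil {
		return "", false
	}
	// Checked assertion: ok reports failure explicitly, instead of the
	// `node, _ :=` form silently leaving node as a nil map.
	node, ok := raw.(MapStr)
	if !ok {
		return "", false
	}
	name, ok := node["name"].(string)
	return name, ok
}

func main() {
	meta := MapStr{"pod": MapStr{"name": "nginx-1"}, "node": "not-a-map"}
	fmt.Println(nodeName(meta, "pod"))
	fmt.Println(nodeName(meta, "node"))
	fmt.Println(nodeName(meta, "missing"))
}
```

With the `_` form, reading from a nil map is safe in Go but writing to it panics, which is one reason to make the check explicit.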

Contributor Author

> Some comments/questions.
>
> * I'm thinking we should probably add orchestrator data to the processors as well:
>
>   * https://github.com/elastic/beats/tree/master/libbeat/processors/add_kubernetes_metadata
>   * https://github.com/elastic/beats/tree/master/x-pack/libbeat/processors/add_nomad_metadata

Added

> * For the docker autodiscover/processor do you know if there's a way via Docker's API to detect if you're running in Swarm? If so maybe we should consider adding there as well? If it's too much of a heavy lift maybe we can add that in later as a separate enhancement though.

It looks like you might be able to "inspect swarm", and know if you are in a swarm or not. But I'm thinking a separate enhancement might be best.

Contributor Author

> Does this need to be cast/nil checked? If you want to do an unconditional cast, I'd remove the _ -- but assuming the above GetValue call returns a nil you probably want to be more explicit about the cast and nil checking here.

Changed this so we don't get intermediate maps. Cleaner all around; I don't know what I was thinking.

@@ -435,14 +437,16 @@ func (p *pod) podEvent(flag string, pod *kubernetes.Pod, ports common.MapStr, in
	if len(namespaceAnnotations) != 0 {
		kubemeta["namespace_annotations"] = namespaceAnnotations
	}
	orchestrator := genOrchestratorFields(kubemeta, "pod")
Contributor

Is there some guidance around the orchestrator.resource.type field? I just want to make sure it makes sense to put pod here. The examples in the docs all use service, and I'm not sure whether there was a desire to make the language orchestrator-agnostic.

Member

The Kubernetes autodiscover provider can discover different kinds of resources, depending on the resource setting. With resource: service it collects events from services, and orchestrator.resource.type: service is fine there. Here it is discovering pods and containers, so I think it is OK to use pod.

if err != nil {
	return common.MapStr{}
}
nomad, _ := nomadMap.(common.MapStr)
Contributor

Same as the previous casting comments on unconditional vs. checked casts.


taskMap, err := nomad.GetValue("task")
if err == nil {
	task, _ := taskMap.(common.MapStr)
Contributor

Same as the previous casting comments on unconditional vs. checked casts.

@leehinman
Contributor Author

> I'm curious how many new fields does this release add?
>
> (My concern is that we add new fields to every beat with each ECS update (even if they are not utilized). This increases the mappings stored in cluster state and grows the response sizes in Kibana from the fields API.)

I counted 792 fields in ECS 1.10.0 fields.ecs.yml and 1207 in 1.11.0, for a difference of 415 fields added. That was after flattening the field names.

Diff is:
1.10-1.11_diff.txt
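
The comment does not say how the count was produced. As one possible approach, the sketch below flattens nested field definitions into dotted names and lists the names that appear only in the newer version. The `Field` type is a simplified, hypothetical model of a fields.yml entry, the sample data is made up, and counting only leaf fields is an assumption.

```go
package main

import "fmt"

// Field is a simplified, hypothetical model of one fields.yml entry:
// a name plus optional nested fields.
type Field struct {
	Name   string
	Fields []Field
}

// flatten returns the dotted name of every leaf field under prefix.
// Counting only leaves (not groups) is an assumption of this sketch.
func flatten(prefix string, fields []Field) []string {
	var names []string
	for _, f := range fields {
		full := f.Name
		if prefix != "" {
			full = prefix + "." + f.Name
		}
		if len(f.Fields) == 0 {
			names = append(names, full)
			continue
		}
		names = append(names, flatten(full, f.Fields)...)
	}
	return names
}

// added returns the names present in newer but not in older, in newer's order.
func added(older, newer []string) []string {
	seen := make(map[string]bool, len(older))
	for _, n := range older {
		seen[n] = true
	}
	var out []string
	for _, n := range newer {
		if !seen[n] {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	// Made-up sample data, not the real ECS field sets.
	v1 := []Field{{Name: "host", Fields: []Field{{Name: "name"}}}}
	v2 := []Field{
		{Name: "host", Fields: []Field{{Name: "name"}, {Name: "ip"}}},
		{Name: "orchestrator", Fields: []Field{{Name: "cluster", Fields: []Field{{Name: "name"}}}}},
	}
	a, b := flatten("", v1), flatten("", v2)
	fmt.Println(len(a), len(b), added(a, b))
}
```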

@leehinman leehinman changed the title Change fields.ecs.yml to ECS 1.11.0-dev Change fields.ecs.yml to ECS 1.11.0 Aug 13, 2021
@mergify
Contributor

mergify bot commented Aug 13, 2021

This pull request is now in conflicts. Could you fix it? 🙏

@leehinman
Contributor Author

@jsoriano & @kaiyan-sheng, could you take a look at the changes to the kubernetes & nomad autodiscover and add_metadata processors in this PR? It is a start on adding the orchestrator fields, and I'd really like a sanity check.

Member

@jsoriano jsoriano left a comment

Reviewed the nomad and kubernetes parts, added some comments about the types used, and also about possible interactions with the code that adds orchestrator.cluster.name/url fields in some cases.

As there may be more discussion on these parts, I would suggest moving them to a different PR if you want to move forward with the ECS upgrade.

ccing @MichaelKatsoulis for the kubernetes changes too.

taskName, err := meta.GetValue("nomad.task.name")
if err == nil {
	orchestrator.Put("resource.name", taskName)
	orchestrator.Put("resource.type", "task")
Member

Events are generated at the allocation level, which is one level more specific than tasks (job -> task -> allocation), so I wonder if we should use allocations here instead of tasks.

For example, with filebeat, for a job with an nginx task in a task group with multiple instances I have events with:

  "nomad": {
    "allocation": {
      "id": "afb658cf-5a5f-aeef-8fa7-0755906919d1",
      "status": "running",
      "name": "redis.server[0]"
    },
    ...
    "job": {
      "name": "redis",
      "type": "service"
    },
    "task": {
      "name": "nginx",
      "service": {
    ...

And:

  "nomad": {
    "allocation": {
      "status": "running",
      "name": "redis.server[1]",
      "id": "feeddc6b-6f28-0cb6-8351-9dfec5dfcaac"
    },
    "job": {
      "name": "redis",
      "type": "service"
    },
    "region": "global",
    "task": {
      "name": "nginx",
      "service": {
      ...

if err == nil {
	orchestrator.Put("namespace", namespace)
}

Member

@jsoriano jsoriano Aug 16, 2021

Interestingly, add_nomad_metadata doesn't collect task-level metadata; this should probably be improved. It does collect allocation metadata, though, and I think we should use that information if we want to fill type and name (consistent with my other comment).

An example of the nomad metadata collected by add_nomad_metadata:

  "nomad": {
    "region": "global",
    "allocation": {
      "id": "feeddc6b-6f28-0cb6-8351-9dfec5dfcaac",
      "status": "running",
      "name": "redis.server[1]"
    },
    "job": {
      "name": "redis",
      "type": "service"
    },
    "namespace": "default",
    "datacenter": [
      "dc1"
    ]
  },
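
The suggestion above could be sketched roughly as follows: derive the orchestrator fields from the allocation metadata rather than from the task. This is only an illustration using plain maps. `lookup` and `orchestratorFromNomad` are hypothetical helpers, not functions from this PR or from libbeat, and using "allocation" as the resource type is the reviewer's proposal rather than settled behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// lookup (hypothetical) walks a dotted key such as "nomad.allocation.name"
// through nested maps and reports whether the full path exists.
func lookup(m map[string]interface{}, key string) (interface{}, bool) {
	var cur interface{} = m
	for _, part := range strings.Split(key, ".") {
		next, ok := cur.(map[string]interface{})
		if !ok {
			return nil, false
		}
		cur, ok = next[part]
		if !ok {
			return nil, false
		}
	}
	return cur, true
}

// orchestratorFromNomad (hypothetical) builds orchestrator fields from the
// allocation-level metadata shown in the example event above.
func orchestratorFromNomad(meta map[string]interface{}) map[string]interface{} {
	out := map[string]interface{}{}
	if ns, ok := lookup(meta, "nomad.namespace"); ok {
		out["namespace"] = ns
	}
	if name, ok := lookup(meta, "nomad.allocation.name"); ok {
		out["resource"] = map[string]interface{}{
			"type": "allocation", // assumption: the value proposed in review
			"name": name,
		}
	}
	return out
}

func main() {
	meta := map[string]interface{}{
		"nomad": map[string]interface{}{
			"namespace":  "default",
			"allocation": map[string]interface{}{"name": "redis.server[1]"},
		},
	}
	fmt.Println(orchestratorFromNomad(meta))
}
```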

@@ -383,6 +383,7 @@ func (p *pod) containerPodEvents(flag string, pod *kubernetes.Pod, c *containerI
	if len(namespaceAnnotations) != 0 {
		kubemeta["namespace_annotations"] = namespaceAnnotations
	}
	orchestrator := genOrchestratorFields(kubemeta, "pod")
Member

Events here are emitted at the container level. Should we use the container fields instead?

Comment on lines 408 to 409
"meta": meta,
"orchestrator": orchestrator,
Member

I think that to actually enrich the events, the new fields should be set in meta. I think that the other fields are used only in autodiscover (for templates, conditions and so on) but don't end up in the final documents. (But not 100% sure, I always get confused by that).

event.Fields.DeepUpdate(kubeMeta)
event.Fields.DeepUpdate(orchestrator)
Member

When beats are run in GKE, add_cloud_metadata adds the orchestrator.cluster.name/url fields, but I think that only if the orchestrator field doesn't exist (unless fields are flattened when they reach here). We will have to check how well both things play together.

Same problem may exist with the autodiscover provider.

For context, the name/url fields were added in #26056 and #26438.
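
The interaction being worried about here can be illustrated with plain maps. The sketch assumes, as the comment only suspects, that the cluster name is added solely when no top-level orchestrator key exists yet. `deepUpdate` and `addIfAbsent` are written for this example and are not libbeat functions; "gke-1" is a made-up cluster name.

```go
package main

import "fmt"

// M is a shorthand alias used only in this sketch.
type M = map[string]interface{}

// deepUpdate recursively copies src into dst, overwriting leaves on conflict.
// Written for this sketch; libbeat's DeepUpdate may differ in detail.
func deepUpdate(dst, src M) {
	for k, v := range src {
		srcChild, srcIsMap := v.(M)
		dstChild, dstIsMap := dst[k].(M)
		if srcIsMap && dstIsMap {
			deepUpdate(dstChild, srcChild)
			continue
		}
		dst[k] = v
	}
}

// addIfAbsent sets dst[key] only when the top-level key is missing,
// modeling the behavior the reviewer suspects (an assumption).
func addIfAbsent(dst M, key string, value interface{}) bool {
	if _, exists := dst[key]; exists {
		return false
	}
	dst[key] = value
	return true
}

func main() {
	// Orchestrator resource fields arrive first, so the cluster name is skipped.
	event := M{}
	deepUpdate(event, M{"orchestrator": M{"resource": M{"type": "pod"}}})
	added := addIfAbsent(event, "orchestrator", M{"cluster": M{"name": "gke-1"}})
	fmt.Println(added, event)

	// In the opposite order, the deep merge keeps both sets of fields.
	event = M{}
	addIfAbsent(event, "orchestrator", M{"cluster": M{"name": "gke-1"}})
	deepUpdate(event, M{"orchestrator": M{"resource": M{"type": "pod"}}})
	fmt.Println(event)
}
```

If the real code behaves like `addIfAbsent`, the result would depend on processor ordering, which is the kind of interaction that would need checking.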

Contributor Author

@jsoriano Thank you for reviewing. You've confirmed my suspicion that I should pull the orchestrator parts out and do that in a separate PR.

@leehinman leehinman force-pushed the beats_ecs_1.11 branch 3 times, most recently from 6c45afc to a811822 Compare August 16, 2021 20:34
@mergify
Contributor

mergify bot commented Aug 16, 2021

This pull request is now in conflicts. Could you fix it? 🙏

- fields.ecs.yml to ECS 1.11.0-dev
- bump ecs.version in configs for modules that don't require changes.

Relates elastic#26967
will do as separate PR
@leehinman leehinman merged commit 5869e1f into elastic:master Aug 17, 2021
leehinman added a commit that referenced this pull request Aug 17, 2021
* Change fields.ecs.yml to ECS 1.11.0

- fields.ecs.yml to ECS 1.11.0
- bump ecs.version in configs for modules that don't require changes.

Relates #26967

(cherry picked from commit 5869e1f)
leehinman added a commit that referenced this pull request Aug 17, 2021
* Change fields.ecs.yml to ECS 1.11.0

- fields.ecs.yml to ECS 1.11.0
- bump ecs.version in configs for modules that don't require changes.

Relates #26967

(cherry picked from commit 5869e1f)

Co-authored-by: Lee Hinman <[email protected]>
@leehinman leehinman deleted the beats_ecs_1.11 branch August 17, 2021 21:15
@simitt
Contributor

simitt commented Aug 31, 2021

This PR breaks compatibility with older ES versions (tested with apm-server) that do not support the flattened type. According to the support matrix, 7.x Beats are supposed to support the whole range of ES 7.x.

update:

@leehinman
Copy link
Contributor Author

> This PR breaks compatibility with older ES versions (tested with apm-server) that do not support the flattened type. According to the support matrix, 7.x Beats are supposed to support the whole range of ES 7.x.

Thanks for catching this.

Labels: backport-v7.15.0, enhancement, Filebeat, libbeat