Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Azure] Migrate Azure logs package to adopt ecs@mappings #10224

Merged
merged 7 commits into from
Jul 8, 2024

Conversation

harnish-elastic
Copy link
Contributor

@harnish-elastic harnish-elastic commented Jun 24, 2024

  • Enhancement

Proposed commit message

Command

  go run github.com/andrewkroh/go-examples/ecs-update@014b35dfe4c9832b51e7c909a39a48257d6a005d \
    -ecs-version=8.11.0 \
    -ecs-git-ref=v8.11.0 \
    -fields-yml-drop-ecs \
    -kibana-version=^8.13.0 \
    -drop-import-mappings \
    packages/azure

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.
  • I have added an entry to my package's changelog.yml file.
  • I have verified that Kibana version constraints are current according to guidelines.

@harnish-elastic harnish-elastic self-assigned this Jun 24, 2024
@ishleenk17
Copy link
Contributor

As part of migrating azure package to ecs@mappings, the below 2 datastreams have pipeline failures.

Firewall Logs:

Issue:
azure/firewall_logs test-dnsproxyrules-raw.log:
[0] parsing field value failed: field "dns.header_flags"'s value "QR" is not one of the expected values (AA, TC, RD, RA, AD, CD, DO)
The allowed values are [these](https://www.elastic.co/guide/en/ecs/current/ecs-dns.html#field-dns-header-flags). 

Solution: Considering that qr can't be a header flag, change the pipeline logs with appropriate flags

Platform Logs:

Issue: 
azure/platformlogs test-platformlogs-edgecases.log:
[0] parsing field value failed: field "event.outcome"'s value "succeeded" is not one of the allowed values (failure, success, unknown)
azure/platformlogs test-platformlogs-raw.log:
[0] parsing field value failed: field "event.outcome"'s value "succeeded" is not one of the allowed values (failure, success, unknown)

Azure platformlogs, the value for event.outcome is being set from the following processors.  [Refer](https:/elastic/integrations/blob/main/packages/azure/data_stream/platformlogs/elasticsearch/ingest_pipeline/default.yml#L174-L188)

  - convert:
      field: azure.platformlogs.Status
      target_field: event.outcome
      type: string
      if: "ctx?.event?.outcome == null && ctx?.azure?.platformlogs?.Status != null && ctx?.azure?.platformlogs?.Status instanceof String && ['success', 'failure', 'unknown', 'Succeeded', 'Failed'].contains(ctx.azure?.platformlogs?.Status)"

Solution: Remove the Succeeded, Failed. But, are those codes coming from the azure platform logs directly. Removing these might break things if user is already using these values. 

cc: @zmoog, @muthu-mps

@harnish-elastic harnish-elastic changed the title [Azure] Migrate ECS version to [email protected] [Azure] Migrate Azure package to ecs@mappings Jun 25, 2024
@muthu-mps
Copy link
Contributor

As part of migrating azure package to ecs@mappings, the below 2 datastreams have pipeline failures.

Firewall Logs:

Issue:
azure/firewall_logs test-dnsproxyrules-raw.log:
[0] parsing field value failed: field "dns.header_flags"'s value "QR" is not one of the expected values (AA, TC, RD, RA, AD, CD, DO)
The allowed values are [these](https://www.elastic.co/guide/en/ecs/current/ecs-dns.html#field-dns-header-flags). 

Solution: Considering that qr can't be a header flag, change the pipeline logs with appropriate flags
  • The QR is one of the header values for the Azure Firewall log. We cannot ignore this value for the Firewall logs.
  • Not much information available, But you can find the details on the QR header value here.

@muthu-mps
Copy link
Contributor

muthu-mps commented Jun 25, 2024

Platform Logs:

Issue: 
azure/platformlogs test-platformlogs-edgecases.log:
[0] parsing field value failed: field "event.outcome"'s value "succeeded" is not one of the allowed values (failure, success, unknown)
azure/platformlogs test-platformlogs-raw.log:
[0] parsing field value failed: field "event.outcome"'s value "succeeded" is not one of the allowed values (failure, success, unknown)

Azure platformlogs, the value for event.outcome is being set from the following processors.  [Refer](https:/elastic/integrations/blob/main/packages/azure/data_stream/platformlogs/elasticsearch/ingest_pipeline/default.yml#L174-L188)

  - convert:
      field: azure.platformlogs.Status
      target_field: event.outcome
      type: string
      if: "ctx?.event?.outcome == null && ctx?.azure?.platformlogs?.Status != null && ctx?.azure?.platformlogs?.Status instanceof String && ['success', 'failure', 'unknown', 'Succeeded', 'Failed'].contains(ctx.azure?.platformlogs?.Status)"

Solution: Remove the Succeeded, Failed. But, are those codes coming from the azure platform logs directly. Removing these might break things if user is already using these values. 

cc: @zmoog, @muthu-mps

  • In this case, When looking into the different Azure log integrations the event.outcome is not consistent from the Azure logs and when we take the sign-in logs as example the status_code:1 is considered as event.outcome: success and the status_code:0 as event.outcome:failure.
  • Ideally, we should convert the succeeded and failed values to success and failure respectively in the ingest pipeline.

Impact

  • By changing the status values if the user is using them in the filter they won't see any results for succeeded or failed. Instead they will be added to the success or failure. This is fine IMO.
  • If the chart groups values by event.outcome this will not include the succeeded value anymore.

@zmoog - What do you think?

@ishleenk17
Copy link
Contributor

  • The QR is one of the header values for the Azure Firewall log. We cannot ignore this value for the Firewall logs.

@harnish-elastic : To add this value to the expected values of DNS header flags, let's override the expected values of this field in ecs.yml, something like this example. Let's see if things work.

  • Ideally, we should convert the succeeded and failed values to success and failure respectively in the ingest pipeline.

@zmoog @muthu-mps : The question here is even if you convert succeeded -> success internally, will the user be expecting this status to be succeeded only?
IMO, the time it communicates the same meaning, it should not be a big deal. We can change these to expected values in the ingest pipeline. succeeded->success and failed->failure

Another option is overriding the event.outcome in ecs.yml like this example

@harnish-elastic
Copy link
Contributor Author

  • The QR is one of the header values for the Azure Firewall log. We cannot ignore this value for the Firewall logs.

@harnish-elastic : To add this value to the expected values of DNS header flags, let's override the expected values of this field in ecs.yml, something like this example. Let's see if things work.

@ishleenk17 I have checked by adding expected values like below and it worked!

- name: dns.header_flags
  external: ecs
  expected_values:
    - AA
    - TC
    - RD
    - RA
    - AD
    - CD
    - DO
    - QR

@zmoog @muthu-mps The question here is, is there a chance that we can also get the values out of the above mentioned list from azure firewall_logs?

@ishleenk17
Copy link
Contributor

@tommyers-elastic : As part of the PR, you moved the Geo ECS fields from ecs.yml to fields.yml.
As part migrating to ecs@mappings, ideally we should remove these fields as well and use the default ecs definition of geo fields. DO you see an issue here ?

@ishleenk17
Copy link
Contributor

@zmoog @muthu-mps The question here is, is there a chance that we can also get the values out of the above mentioned list from azure firewall_logs?

@muthu-mps @zmoog : Are there any concerns here else we will move on with these option overriding the default values for now.

@tommyers-elastic : As part of the https:/elastic/integrations/pull/8050, you moved the Geo ECS fields from ecs.yml to fields.yml.
As part migrating to ecs@mappings, ideally we should remove these fields as well and use the default ecs definition of geo fields. DO you see an issue here ?

@tommyers-elastic : WDYT ?

@muthu-mps
Copy link
Contributor

  • The QR is one of the header values for the Azure Firewall log. We cannot ignore this value for the Firewall logs.

@harnish-elastic : To add this value to the expected values of DNS header flags, let's override the expected values of this field in ecs.yml, something like this example. Let's see if things work.

@ishleenk17 I have checked by adding expected values like below and it worked!

- name: dns.header_flags
  external: ecs
  expected_values:
    - AA
    - TC
    - RD
    - RA
    - AD
    - CD
    - DO
    - QR

@zmoog @muthu-mps The question here is, is there a chance that we can also get the values out of the above mentioned list from azure firewall_logs?

So far we have not seen any other values other than QR.

@muthu-mps
Copy link
Contributor

@zmoog @muthu-mps The question here is, is there a chance that we can also get the values out of the above mentioned list from azure firewall_logs?

@muthu-mps @zmoog : Are there any concerns here else we will move on with these option overriding the default values for now.

These are the common ones and we can go-ahead with this values.

@ishleenk17
Copy link
Contributor

@zmoog @muthu-mps The question here is, is there a chance that we can also get the values out of the above mentioned list from azure firewall_logs?
@muthu-mps @zmoog : Are there any concerns here else we will move on with these option overriding the default values for now.

These are the common ones and we can go-ahead with this values.

@harnish-elastic : Lets proceed with these then

@muthu-mps
Copy link
Contributor

  • The QR is one of the header values for the Azure Firewall log. We cannot ignore this value for the Firewall logs.

@harnish-elastic : To add this value to the expected values of DNS header flags, let's override the expected values of this field in ecs.yml, something like this example. Let's see if things work.

  • Ideally, we should convert the succeeded and failed values to success and failure respectively in the ingest pipeline.

@zmoog @muthu-mps : The question here is even if you convert succeeded -> success internally, will the user be expecting this status to be succeeded only? IMO, the time it communicates the same meaning, it should not be a big deal. We can change these to expected values in the ingest pipeline. succeeded->success and failed->failure

Another option is overriding the event.outcome in ecs.yml like this example

I don't think user would expect the succeeded value. I would also like the hear from @zmoog based on the past experiences.

@harnish-elastic
Copy link
Contributor Author

@zmoog @muthu-mps The question here is, is there a chance that we can also get the values out of the above mentioned list from azure firewall_logs?
@muthu-mps @zmoog : Are there any concerns here else we will move on with these option overriding the default values for now.

These are the common ones and we can go-ahead with this values.

@harnish-elastic : Lets proceed with these then

Changes are already pushed!

@zmoog
Copy link
Contributor

zmoog commented Jul 3, 2024

@zmoog @muthu-mps The question here is, is there a chance that we can also get the values out of the above mentioned list from azure firewall_logs?
@muthu-mps @zmoog : Are there any concerns here else we will move on with these option overriding the default values for now.

These are the common ones and we can go-ahead with this values.

@harnish-elastic : Lets proceed with these then

Changes are already pushed!

I'm late to the party, but it's +1 from me.

The question here is even if you convert succeeded -> success internally, will the user be expecting this status to be succeeded only? IMO, the time it communicates the same meaning, it should not be a big deal. We can change these to expected values in the ingest pipeline. succeeded->success and failed->failure
Another option is overriding the event.outcome in ecs.yml like this example

I don't think user would expect the succeeded value. I would also like the hear from @zmoog based on the past experiences.

We usually try to maintain as much backward compatibility as possible. However, the current event.outcome field values are not the expected ones per ECS.

Platform logs are a semi-generic data stream. The Azure Native ISV uses platform logs as the default data stream when it doesn't have a specific routing.

Aligning the event.outcome field to the ECS expected values (succeeded->success and failed->failure) feels like a breaking change. Since platform logs are a generic data stream, users may use this field to check for event outcomes on various log categories, in ways that go beyond what the integration offers.

To take this as the opportunity to align the expected values (succeeded->success and failed->failure) for the event.outcome field, we probably need to highlight this release contains breaking changes, like bumping package version from v1 to v2. I would take this opportunity.

@jsoriano, what is the best practice to communicate breaking changes in integration releases?

@jsoriano
Copy link
Member

jsoriano commented Jul 3, 2024

@jsoriano, what is the best practice to communicate breaking changes in integration releases?

As you mention we try to avoid them, but if needed, you could bump the major version, and include a "breaking-change" entry in the changelog.
I think this would be more than enough for this case where we are slightly changing some values for coherence with ECS and other integrations.

@zmoog
Copy link
Contributor

zmoog commented Jul 3, 2024

@jsoriano, what is the best practice to communicate breaking changes in integration releases?

As you mention we try to avoid them, but if needed, you could bump the major version, and include a "breaking-change" entry in the changelog. I think this would be more than enough for this case where we are slightly changing some values for coherence with ECS and other integrations.

Yeah, I try to avoid breaking changes as much as possible. In this case, aligning the event.outcome field value to ECS and other integrations is probably worth the cost.

To properly mark this change as breaking, do we need to set type as in the following example?

- version: "0.4.0"
  changes:
    - description: added anonymous auth config, replaced some RUM config
      type: breaking-change
      link: https:/elastic/apm-server/pull/5623

@jsoriano
Copy link
Member

jsoriano commented Jul 3, 2024

In this case, aligning the event.outcome field value to ECS and other integrations is probably worth the cost.

Agree. And if any user is affected by this change the adaptation would be easy, by modifying queries or dashboards to use the new values.

To properly mark this change as breaking, do we need to set type as in the following example?

Correct.

@ishleenk17
Copy link
Contributor

To properly mark this change as breaking, do we need to set type as in the following example?

@harnish-elastic : Let's take care of this.
Please resolve the CI error

@elasticmachine
Copy link

elasticmachine commented Jul 5, 2024

🚀 Benchmarks report

Package azure 👍(7) 💚(2) 💔(2)

Expand to view
Data stream Previous EPS New EPS Diff (%) Result
firewall_logs 1976.28 1547.99 -428.29 (-21.67%) 💔
platformlogs 4694.84 3508.77 -1186.07 (-25.26%) 💔

To see the full report comment with /test benchmark fullreport

@harnish-elastic harnish-elastic marked this pull request as ready for review July 5, 2024 11:11
@harnish-elastic harnish-elastic requested review from a team as code owners July 5, 2024 11:11
@harnish-elastic
Copy link
Contributor Author

To properly mark this change as breaking, do we need to set type as in the following example?

@harnish-elastic : Let's take care of this. Please resolve the CI error

Updated, thanks!

CI is getting failed due to SonarQube Code Analysis which can be ignored as per this comment!

@muthu-mps muthu-mps changed the title [Azure] Migrate Azure package to ecs@mappings [Azure] Migrate Azure logs package to adopt ecs@mappings Jul 5, 2024
@@ -1,3 +1,11 @@
- version: "1.12.0"
changes:
- description: ECS version updated to 8.11.0. Update the kibana constraint to ^8.13.0. Modified the field definitions to remove ECS fields made redundant by the ecs@mappings component template.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This changelog has multiple updates, Can we have separate descriptions for each.

  • Update ECS version to 8.11.0
  • Update the kibana constraint to ^8.13.0 to adopt ecs@mappings component template.

No need to add this sentence in the changelog entry,
Modified the field definitions to remove ECS fields made redundant by the ecs@mappings component template.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the text that the automation writes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Separate entries sounds like a reasonable enhancement to make to ecs-update. The related code is at https:/andrewkroh/go-examples/blob/7bd2d3beac526b97b7734b2b51433d4be54d0fb6/ecs-update/ecs-update.go#L343 if anyone wants to contribute.

@muthu-mps
Copy link
Contributor

@harnish-elastic - After making the changes, Did we test the data collection for any of the Azure logs integration?

Copy link
Contributor

@efd6 efd6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

application_gateway and firewall_logs LGTM

@@ -1,3 +1,11 @@
- version: "1.12.0"
changes:
- description: ECS version updated to 8.11.0. Update the kibana constraint to ^8.13.0. Modified the field definitions to remove ECS fields made redundant by the ecs@mappings component template.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the text that the automation writes.

- AD
- CD
- DO
- QR
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In coredns this flag is removed.

- script:
lang: painless
ignore_failure: true
description: remove QR header since it is not allowed per ECS spec in 8.4.0
if: ctx.dns?.header_flags != null && ctx.dns.header_flags instanceof List
source: |
for (int i=0; i<ctx.dns.header_flags.length; i++) {
if (ctx.dns.header_flags[i] == 'QR') {
ctx.dns.header_flags.remove(i);
}
}

ISTM that either it should be removed, used to otherwise colour the document or be added to the allowed set in ECS (or a combination of these).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

QR (Query/Rsponse) seems to be a valid DNS header flag.
IMO, we should not remove it here, rather have it added in the allowed list.
@jsoriano

Copy link
Contributor

@ishleenk17 ishleenk17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!
Please address others' comments if pending.

@harnish-elastic
Copy link
Contributor Author

@harnish-elastic - After making the changes, Did we test the data collection for any of the Azure logs integration?

I have done the pipeline tests as Azure package is not having the system tests for platformlogs data stream. Pipeline tests are working as expected!

Copy link
Contributor

@niraj-elastic niraj-elastic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@muthu-mps muthu-mps left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Copy link
Contributor

@zmoog zmoog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, only a minor fix to the repo name in the changelog link.

packages/azure/changelog.yml Outdated Show resolved Hide resolved
@elasticmachine
Copy link

💚 Build Succeeded

History

cc @harnish-elastic

@ishleenk17 ishleenk17 merged commit b99f669 into elastic:main Jul 8, 2024
5 checks passed
@elasticmachine
Copy link

Package azure - 1.12.0 containing this change is available at https://epr.elastic.co/search?package=azure

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants