Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Change Proposal] Support external fields as dimensions #247

Closed
2 tasks done
jsoriano opened this issue Nov 9, 2021 · 7 comments · Fixed by elastic/elastic-package#575
Closed
2 tasks done

[Change Proposal] Support external fields as dimensions #247

jsoriano opened this issue Nov 9, 2021 · 7 comments · Fixed by elastic/elastic-package#575
Assignees
Labels
discuss Issue needs discussion

Comments

@jsoriano
Copy link
Member

jsoriano commented Nov 9, 2021

As discussed in elastic/integrations#2076 (comment), ECS fields could be dimensions too.

One option could be to add the dimension flag directly in ECS, and import it from there. But I think that the same ECS field may be a dimension in some integrations, but not in others. As mentioned in #215, the set of dimensions should be minimal for a given data stream, so an integration including many ECS dimension fields may actually need only a subset of them to identify individual time series. For example host.id can be needed to identify time series in the system integration, but not to identify time series of pods in the kubernetes integration.

So we would need to support something like this:

- external: ecs
  name: container.id
  dimension: true

Current implementation doesn't work because it checks the type and it is not checking now external fields. It fails with this error:

Error: checking package failed: linting package failed: found 2 validation errors:
   1. file "/home/jaime/gocode/src/github.com/elastic/integrations/packages/kubernetes/data_stream/container/fields/ecs.yml" is invalid: field "container.id" of type  can't be a dimension, allowed types for dimensions: constant_keyword, keyword, long, integer, short, byte, double, float, half_float, scaled_float, unsigned_long, ip

Changes required:

@mtojek do you see problems in this plan? ^

@mtojek
Copy link
Contributor

mtojek commented Nov 9, 2021

Thanks for mentioning me.

LGTM, even for temporary mitigation/unblocking.

One option could be to add the dimension flag directly in ECS, and import it from there. But I think that the same ECS field may be a dimension in some integrations, but not in others

Is it possible to have different value types (dimension or not)? We could contact the ECS team and ask for guidance here. It would be better to have a single source of information. In the future the ECS can introduce the property dimension: true and a package field can inherit the property.

@jsoriano
Copy link
Member Author

jsoriano commented Nov 9, 2021

Is it possible to have different value types (dimension or not)?

Metric types are a different story, there I think that for consistency a given field should be always used with the same metric type. Currently some fields that should be used with a metric type have it in the documentation string, but this cannot be enforced in implementations. I have found a related issue here: elastic/ecs#721

But even being a different story, we also need a solution for that, there will be ECS fields that could benefit of having the metric type and unit defined.

It would be better to have a single source of information.

+1, but I think that for integrations there are going to be cases where the same field doesn't need to be always a dimension. Maybe ECS can include something to indicate fields that can be used as dimensions, as a recommendation, but this would be a bit fuzzy, and it also depends on the data types supported by Elasticsearch to be dimensions.

We want to enforce a conscious decision on the selected dimensions per integration. Having too many of them may defeat their purpouse.

@mtojek
Copy link
Contributor

mtojek commented Nov 9, 2021

Thanks for explanation. Do you think that can be implemented in a generic way? Instead of supporting only dimensions, we could introduce local "override" logic for field properties.

@jsoriano
Copy link
Member Author

jsoriano commented Nov 9, 2021

Thanks for explanation. Do you think that can be implemented in a generic way? Instead of supporting only dimensions, we could introduce local "override" logic for field properties.

This can be a good idea, yes. For that, maybe we need a list of fields that cannot be overriden? For example it can make sense to override the description for integration-specific docs, but if you need to override the type you should be probably using other field.

@mtojek
Copy link
Contributor

mtojek commented Nov 9, 2021

AFAIR The basic set includes: name, type, description :) We can start with them.

@jsoriano
Copy link
Member Author

jsoriano commented Nov 9, 2021

AFAIR The basic set includes: name, type, description :) We can start with them.

Ok, I am going to change the following:

  • In the package-spec, by now allowing the definition of external fields as dimensions, without checking their type, so we don't need to resolve external fields in spec validators.
  • In elastic-package, allow overrides on external fields, protecting some of their settings.

@mtojek
Copy link
Contributor

mtojek commented Nov 9, 2021

@jsoriano

I think you might be interested in this issue: elastic/elastic-package#392

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discuss Issue needs discussion
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants