Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Validate user-supplied missing values on unmapped fields #43718

Merged

Conversation

not-napoleon
Copy link
Member

This PR adds the ability for Aggregation factories to have finer grained control over how missing values are mapped to ValuesSources. Technically speaking, this addresses a minor bug in the terms aggregation, but more importantly it resolves a design constraint for Range field aggregations. I opened the PR against master however because it doesn't depend on any of the range field aggregation work and touches shared infrastructure for aggregations, so getting it in master sooner will avoid merge conflicts down the road.

I've added a bunch of tests, and done my best to preserve the existing behavior wherever possible. The only noticeable change should be that if you pass something exceedingly unusual as a missing value for an unmapped field to the terms agg, it will no longer silently convert it to a string and continue. Instead, it will throw an AggregationExecutionException same as if you'd had that value in a mapped field.

@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-analytics-geo

Copy link
Contributor

@polyfractal polyfractal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Core changes look good, left a few minor comments/questions.

The only noticeable change should be that if you pass something exceedingly unusual as a missing value for an unmapped field to the terms agg, it will no longer silently convert it to a string and continue. Instead, it will throw an AggregationExecutionException same as if you'd had that value in a mapped field.

Do we have a good feeling for how "breaking" this is? E.g. is there a case where someone was using this behavior (accidentally or otherwise) and now we'll throw an exception? I'm not really sure what you could pass in the REST API that would be non-string/non-numeric... an object I guess? Or is it only an issue for java users that can send in objects which have a valid toString()?

Copy link
Contributor

@polyfractal polyfractal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, although I am curious about the bwc question in my last comment. E.g. we should probably backport this to 7.x, but if it's a "real" breaking change it might be tricky/impossible.

@not-napoleon
Copy link
Member Author

@polyfractal Ok, as per conversation, I updated the PR so there's zero behavior change, and this just adds hooks and tests.

@polyfractal
Copy link
Contributor

👍 when CI is happy it LGTM!

@not-napoleon
Copy link
Member Author

@elasticmachine update branch

@not-napoleon not-napoleon merged commit e84a79a into elastic:master Jul 3, 2019
@not-napoleon not-napoleon deleted the bugfix/validate-unmapped-missing branch July 3, 2019 20:12
not-napoleon added a commit that referenced this pull request Jul 8, 2019
…3718) (#43940)

Provides a hook for aggregations to introspect the `ValuesSourceType` for a user supplied Missing value on an unmapped field, when the type would otherwise be `ANY`.  Mapped field behavior is unchanged, and still applies the `ValuesSourceType` of the field.  This PR just provides the hook for doing this, no existing aggregations have their behavior changed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants