Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LIR: merge hash attributes of same name to support legacy configurations #8597

Closed
wants to merge 1 commit into from

Conversation

yaauie
Copy link
Member

@yaauie yaauie commented Nov 6, 2017

A build failure of the grok plugin when run through LIR/lscl indicates that
there is an expectation for multiple Attributes of the same name to be merged
together.

This commit ports the failing spec in the plugin to an abstraction that can be
tested within logstash-core, and adds a caveat to the LSCL LIR-builder to
ensure that we merge the hashes in a way that is compatible with legacy
behaviour.

NOTE: when multiple Attributes of the same name are used in a single config,
it's possible to create configurations that circumvent AST::Hash's ability to
report duplicate keys.

Discovered while working on logstash-plugins/logstash-filter-grok#124

@original-brownbear
Copy link
Member

@andrewvc @jordansissel can you guys take a look at this?

I don't know if we actually want to support this kind of settings merge?

I'm not sure why the specs in Grok failed against LS master, currently, they are passing just fine for me locally.

@jordansissel
Copy link
Contributor

jordansissel commented Nov 6, 2017

I don't know if we actually want to support this kind of settings merge?

Historically (years), Logstash has allowed these two things to be equivalent configurations:

# Example A
foo => "bar"
foo => "baz"

# Example B
foo => [ "bar", "baz" ]

The above (array merging) has been supported (on purpose) since at least Logstash 1.0.0 (6+ years)

And similar for hashes. Hash syntax was supported long before, but hash rocket { x => y } was added in Logstash 1.2.0

# Example A
foo => { "bar" => "hello" }
foo => { "baz" => "goodbye" }

# Example B
foo => {
  "bar" => "hello"
  "baz" => "goodbye"
}

Hash merging happens outside the grammar and in the config ast/compiler. My memory is old, but I think this is the code that does it in Logstash 1.2.0

@yaauie
Copy link
Member Author

yaauie commented Nov 6, 2017

Historically (years), Logstash has allowed these two things to be equivalent configurations:

# Example A
foo => "bar"
foo => "baz"

# Example B
foo => [ "bar", "baz" ]

The current LSCL/LIR doesn't implement this either; it falls back to +, which would concatenate the two strings:

foo => "barbaz"

@original-brownbear
Copy link
Member

@jordansissel alright thanks a lot, with that information I'm good to review this one :)

@original-brownbear
Copy link
Member

now the issue does reproduce for me on Travis: https://travis-ci.org/logstash-plugins/logstash-filter-grok/jobs/298174280#L1019 but not locally ... will investigate this a little

elsif existing.kind_of?(::Hash)
# For legacy reasons, a config can contain multiple `AST::Attribute`s with the same name (e.g., "match"
# in the grok filter); when the value of this attribute produces a `::Hash`, an attempt is made to silently
# merge-smash them together (NOTE: this bypasses `AST::Hash`'s ability to detect duplicate keys)
Copy link
Member

@original-brownbear original-brownbear Nov 7, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yaauie patch LGTM :) but can we improve this comment a little? Reading this in isolation I wouldn't understand what's going on here tbh.
Maybe we can just spell it out explicitly instead of that long explanation and say:

# For legacy reasons, a config can contain multiple `AST::Attribute`s with the same name 
# and a hash type value (e.g., "match" in the grok filter)
# which are merged into a single hash value; 
# E.g. `{"match" => {"baz" => "bar"}, "match" => {"foo" => "bulb"}}` is interpreted as
#  `{"match" => {"baz" => "bar", "foo" => "blub"}}`.   

IMO that makes it a lot clearer, wdyt?

@original-brownbear
Copy link
Member

@yaauie @jordansissel

quote @jordansissel from Slack on this:

I am open to simplifying the grammar if we think it'll make it easier for users

I think this is more a case of making things easier for us than for users. As far as I can see this is broken in all versions that have the LIR Pipeline logic in them (all 5.6.x right?) and we didn't get a single complaint about this being broken. Also, at least to me personally merging two hashes like that feels very unnatural.


Unless BwC is a concert here, I'd probably just stop supporting this (especially since it's not even documented, making it super unlikely many people use it).

@andrewvc
Copy link
Contributor

andrewvc commented Nov 7, 2017

This is tricky, 6.0.0 GA is right upon us.

We have two logical paths in front of us:

  1. Simplify hash handling and stop merging duplicate keys
  2. Maintain old behavior

Some thoughts:

  • Ideally, for 1, we would have made that decision earlier in the 6.x timeline.
  • At this juncture we are post RC2. Only critical bugfixes should be going in.
  • As @original-brownbear points out no one has complained yet! But then again, betas don't see a huge amount of usage

I'm leaning toward preserving this behavior. Since the change was unintentional I have a hard time arguing that this is not a bug.

I'm +1 on simplifying this behavior in the 7.x timeframe, and committing that to master.

@jordansissel WDYT?

A [build failure][] of the grok plugin when run through LIR/lscl indicates that
there is an expectation for multiple Attributes of the same name to be merged
together.

This commit ports the failing spec in the plugin to an abstraction that can be
tested within logstash-core, and adds a caveat to the LSCL LIR-builder to
ensure that we merge the hashes in a way that is compatible with legacy
behaviour.

NOTE: when multiple Attributes of the same name are used in a single config,
it's possible to create configurations that circumvent `AST::Hash`'s ability to
report duplicate keys.

[build failure]: https://travis-ci.org/logstash-plugins/logstash-filter-grok/builds/293778268
@yaauie yaauie force-pushed the fix/lir-legacy-merge-hashes branch from 429b3e0 to 451831a Compare November 7, 2017 21:17
@jordansissel
Copy link
Contributor

@andrewvc +1 simplifying if we can for 7.0; gonna be an interesting path towards a deprecation-and-upgrade scenario for users (telling users this feature is changing, helping them migrate, etc).

I talked with @yaauie today, and he has enough information to move forward. We can add more tests as desired for the supported-but-not-well-documented merging behavior to ensure we keep it until we intend to remove it.

@yaauie
Copy link
Member Author

yaauie commented Nov 7, 2017

@original-brownbear I've re-worded the comment; thanks for the critique.

@andrewvc: thoughts on merge targets for this? I know 6.0 is pretty much frozen due to the pending whole-stack release, but would the risk be worth including that in the targets so that we don't introduce this bug to a released version?

@andrewvc
Copy link
Contributor

andrewvc commented Nov 7, 2017 via email

@original-brownbear
Copy link
Member

@yaauie thanks, LGTM :)

@elasticsearch-bot
Copy link

Ry Biesemeyer merged this into the following branches!

Branch Commits
master 148ffd3

@elasticsearch-bot
Copy link

Ry Biesemeyer merged this into the following branches!

Branch Commits
6.x db50857

elasticsearch-bot pushed a commit that referenced this pull request Nov 7, 2017
A [build failure][] of the grok plugin when run through LIR/lscl indicates that
there is an expectation for multiple Attributes of the same name to be merged
together.

This commit ports the failing spec in the plugin to an abstraction that can be
tested within logstash-core, and adds a caveat to the LSCL LIR-builder to
ensure that we merge the hashes in a way that is compatible with legacy
behaviour.

NOTE: when multiple Attributes of the same name are used in a single config,
it's possible to create configurations that circumvent `AST::Hash`'s ability to
report duplicate keys.

[build failure]: https://travis-ci.org/logstash-plugins/logstash-filter-grok/builds/293778268

Fixes #8597
yaauie added a commit to yaauie/logstash that referenced this pull request Nov 8, 2017
A [build failure][] of the grok plugin when run through LIR/lscl indicates that
there is an expectation for multiple Attributes of the same name to be merged
together.

This commit ports the failing spec in the plugin to an abstraction that can be
tested within logstash-core, and adds a caveat to the LSCL LIR-builder to
ensure that we merge the hashes in a way that is compatible with legacy
behaviour.

NOTE: when multiple Attributes of the same name are used in a single config,
it's possible to create configurations that circumvent `AST::Hash`'s ability to
report duplicate keys.

[build failure]: https://travis-ci.org/logstash-plugins/logstash-filter-grok/builds/293778268

Backports elastic#8597
elasticsearch-bot pushed a commit that referenced this pull request Nov 8, 2017
A [build failure][] of the grok plugin when run through LIR/lscl indicates that
there is an expectation for multiple Attributes of the same name to be merged
together.

This commit ports the failing spec in the plugin to an abstraction that can be
tested within logstash-core, and adds a caveat to the LSCL LIR-builder to
ensure that we merge the hashes in a way that is compatible with legacy
behaviour.

NOTE: when multiple Attributes of the same name are used in a single config,
it's possible to create configurations that circumvent `AST::Hash`'s ability to
report duplicate keys.

[build failure]: https://travis-ci.org/logstash-plugins/logstash-filter-grok/builds/293778268

Backports #8597

Fixes #8602
@andrewvc
Copy link
Contributor

andrewvc commented Nov 8, 2017

Did we want to merge this into master? IIRC we discussed removing the extra functionality here and just making the hash work like a plain hash no? In this case I assume we'd raise an error for keys declared twice.

@original-brownbear
Copy link
Member

@andrewvc actually, it's not so bad keeping the GIT history clean that we have this in master and then adjust it there :)

Added #8605 for this.

@jakelandis
Copy link
Contributor

@yaauie to make the 6.0 release, you will need to merge to the 6.0 branch. (6.x is currently targeting 6.1)

@yaauie
Copy link
Member Author

yaauie commented Nov 9, 2017

@jakelandis this was separately backported to 6.0 in #8602

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants