Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] Multiple users in an event, stage 3 PR #1017

Merged
merged 7 commits into from
Nov 11, 2020

Conversation

webmat
Copy link
Contributor

@webmat webmat commented Oct 7, 2020

Stage 3 criteria:

  • Completed field definitions
  • Included multiple real world example source documents.
    • This was done in the stage 2 PR, with the help of many contributors ❤️
    • Please let us know if you think an important usage example is missing
  • Existing or newly raised questions and concerns are addressed.
    • Deprecating host.user fields.

Additional TODOs

Preview of the RFC

@webmat webmat self-assigned this Oct 7, 2020
@webmat webmat added the RFC label Oct 7, 2020
@P1llus
Copy link
Member

P1llus commented Oct 29, 2020

As discussed with @webmat here is an example from a MySQL Audit log point of view, where you can have one user account being used for authentication, while another account is used for the activity after the initial authentication.

Dataset:

"class": "table_access",
	"event": "insert",
	"connection_id": 16,
	"account": {
		"user": "audit_test_user2",
		"host": "hades.home"
	},
	"login": {
		"user": "audit_test_user2",
		"os": "",
		"ip": "192.168.2.5",
		"proxy": ""
	},

The description of the User Object
user
A string representing a user name. The meaning differs depending on the item within which user occurs:
Within account items, user is a string representing the user that the server authenticated the client as. This is the user name that the server uses for privilege checking.
Within login items, user is a string representing the user name sent by the client.
Field documentation for audit logging:
https://dev.mysql.com/doc/refman/5.7/en/audit-log-file-formats.html#audit-log-file-json-format

EDIT: Just to make things more confusing, the login.os is the local user of the client if it is applicable, the local user running the mysql client connection.

@webmat
Copy link
Contributor Author

webmat commented Oct 30, 2020

I agree this is confusing. The non-JSON user names actually look less confusing than the JSON logs 🤦

Whenever you encounter confusing situations like this, you should absolutely capture all of these values with a naming as close as possible to the original, in custom fields. This way, even if we mess up which one is the effective user, at least folks can look up the original user names, figure it out and even override it.

Maybe something like mysql.account.user, mysql.login.user. There's also a "proxy_user" mentioned higher in the docs, although I'm not sure where in the JSON event it appears.

But looking at this, it's a bit more clear that this is similar to privilege escalation. So I would recommend mapping the user used for privilege checking to user.effective.*, and the client login user to user.*.

Note that these new user fields won't be in main ECS 1.7 (stack 7.11), they're still experimental. So if you want to use them, you'll need to explicitly define them for now, and remove your custom definitions in 7.12, where they're expected to be officially part of ECS 1.8.

@webmat
Copy link
Contributor Author

webmat commented Oct 30, 2020

I've updated the body of the issue with the criteria for merging as stage 3.

A few things to note:

  • Having not heard any pushback on eventual removal of the host.user.* fields, I've gone ahead and adjusted the text assuming that this is our plan. Check out the RFC text for the details. Now's a good time to chime in if anybody disagrees. This was the only concern left to address.
  • The criteria for the amount of examples for stage 3 were already met in the stage 2 RFC.

This may be the first stage 3 RFC we merge. With fields moving out of stage 2 (experimental) to stage 3 (beta in mainline ECS), this means close knowledge of the ECS build process is now required. Moreover, the current way of handling experimental changes is manual, and bound to change. It's not expected that RFC authors need to be familiar with the internals of the ECS build.

Therefore I propose that stage 3 RFCs (and stage 4s that come with field changes) be accompanied by a sister PR that effects the changes, authored by an ECS core team member. Both should be merged at the same time, when both are approved and ready. This decouples the simple RFC process of editing a markdown file from the intricacies of the ECS build process.

I'll start preparing the sister PR. When that PR is ready, in my opinion this RFC will be ready to be merged as stage 3. I'm shooting to finalize this next week.

Now's a good time for a last look, @jonathan-buttner @ebeahan @leehinman @willemdh @janniten @rylnd @MikePaquette

cc @epixa on process adjustment.

@webmat
Copy link
Contributor Author

webmat commented Oct 30, 2020

Merging this stage 3 RFC may be blocked by #1051. Not sure if this will handle beta field reuse.

@ebeahan
Copy link
Member

ebeahan commented Oct 30, 2020

Merging this stage 3 RFC may be blocked by #1051. Not sure if this will handle beta field reuse.

Ah yes I hadn't considered that need. Will reassess.

@dainperkins
Copy link
Contributor

This might be a much longer discussion (re: source/destination vs. client/server event scope) but would it make sense to remove source/destination.user in favor of client/server.user, given the current guidelines for source/destination fields?

@webmat
Copy link
Contributor Author

webmat commented Oct 30, 2020

@dainperkins What guidelines are you referring to? The guidance is that source & destination should be populated as much as possible, as a baseline for network stuff. Then if client/server makes sense, those should be populated too.

This means source/destination.user should also be populated as much as possible as well.

@webmat
Copy link
Contributor Author

webmat commented Oct 30, 2020

@ebeahan Let's have that discussion on #1051 however. There's other things I wonder about 🤔

ebeahan
ebeahan previously approved these changes Oct 30, 2020
Copy link
Member

@ebeahan ebeahan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything is looking great here @webmat.

Therefore I propose that stage 3 RFCs (and stage 4s that come with field changes) be accompanied by a sister PR that effects the changes, authored by an ECS core team member. Both should be merged at the same time, when both are approved and ready. This decouples the simple RFC process of editing a markdown file from the intricacies of the ECS build process.

No objections to keeping the proposal doc PR separate from the PR that actually implements the proposed changes. Should the sibling PR # should be captured in the proposal markdown prior to merging?

I expect we'd be open to the highly motivated author who wishes to also author the sibling PR. However, I also understand that case to be the rare exception, and the ECS core team is absolutely willing to translate the proposal into field definitions. 😄

rfcs/text/0007-multiple-users.md Show resolved Hide resolved
@dainperkins
Copy link
Contributor

@webmat

@dainperkins What guidelines are you referring to?
https://www.elastic.co/guide/en/ecs/1.7/ecs-mapping-network-events.html#_source_and_destination_baseline

Because source / destination can be unidirectional I'm not sure it makes sense to add user information in field sets that, anecdotally anyway, are there for a conglomeration of unidirectional and bidirectional basic network layer info.

User info is always going to be protocol related which suggests client/server, bidirectional records

Also effective visualization of combined unidirectional and bidirectional network data requires normalization to one or the other... which suggests it might not be worth the additional overhead for repetition of fields that don't seem to fit the mold

@webmat
Copy link
Contributor Author

webmat commented Oct 30, 2020

@dainperkins Yeah it's a bigger discussion if we want to reframe the meaning of source/destination vs client/server. Right now our recommendation is not to interpret them quite this way. This goes beyond the scope of this RFC. I would love it if captured these thoughts in an issue, however.

@webmat
Copy link
Contributor Author

webmat commented Oct 30, 2020

Further note on the process discussion above. We've discussed this out of band, and we all agree to do the decoupling and having two PRs, ideally quite close to one another. But we're not going to enforce that the second one is opened and approved before merging an RFC PR. We could decide to merge the RFC before the second PR exists.

Also the second PR doesn't have to be authored by an ECS core team member. If an RFC author is willing to work on that pull request, this is welcome. This was meant more as a statement that whoever the author is, they can depend on the ECS team to take care of the plumbing, if the author doesn't want to.

@webmat
Copy link
Contributor Author

webmat commented Oct 30, 2020

PR to apply these changes is in progress at #1066

@webmat
Copy link
Contributor Author

webmat commented Nov 2, 2020

PR to make these changes come to life is now "feature-complete" #1066, feedback welcome. Contributors familiar with this RFC will have a sense of deja-vu 😂 Some new contents in there as well, though.

This will be our first use of the new mechanism that lets us add a supplemental documentation page for a field set.

@willemdh
Copy link
Contributor

willemdh commented Nov 2, 2020

@webmat

feedback welcome

Well, glad to see this going forward. This should be the base on which UEBA functionalities should be added in Elastic Security. What I'm somehow missing in this whole story is how a username is supposed to be indexed and if there is room for:

  • the raw username for the related log
  • the ad logon name
  • the ad sam account name
  • the domain of the related user
  • possible related email addresses for the user

I've seen users such as:

All of the above 'could' be the same actual user. Imho there should be room for these kind of user info to prevent issues like I'm trying to escalate with host.name shortname and fqdn. It should also be clear from the start in what form user.name should be indexed preferably.

@webmat
Copy link
Contributor Author

webmat commented Nov 2, 2020

@willemdh Good point.

As you point out, the same person can have many different identifiers according to different systems, or even different types of event from the same system.

Laying out all of the correlations a UEBA product/feature has to do, in order to help build a precise picture of user identity across data sources is out of scope here.

But I agree it would be helpful to add a section clarifying some subtleties around capturing user identifiers. Here's a few points I could see addressed in there:

  • If the only identifier is an email address, which field(s) do we put it in?
  • If the identifier includes the domain name, should we parse it out, or leave the domain in the user name such as DOMAINX\jansspi?

Do you have other questions you think should be addressed? Other than these questions that come to mind based on your list, I think the other subtleties are better left to a full-fledged UEBA correlation engine. WDYT?

@willemdh
Copy link
Contributor

willemdh commented Nov 3, 2020

@webmat

I think the other subtleties are better left to a full-fledged UEBA correlation engine.

My thought => Multiple NG SIEMs have UEBA functionalities these days, I hope Elastic sees that evolution. There should be a module which allows to index active directory and Elastic should work towards correlating users with the AD index. There should be a Users tab in Kibana Security (Next to Hosts and Network).

But my UEBA dreams do not have to be a show stopper to get this RFC going forward. I think the updated user object has been discussed throughly and is definitely a step in the good direction. If one day more fields are required to get sth UEBA like started, a new object could be created, such as ad_user.* or sth more applicable.

Thanks a lot for all the effort everyone put in this.

ebeahan
ebeahan previously approved these changes Nov 5, 2020
@webmat
Copy link
Contributor Author

webmat commented Nov 10, 2020

@willemdh UEBA is floating in the air, but I can't give specifics there.

I will add a section about user identifiers, and try to nudge implementers to avoid problems like you've raised for hostnames. However I'll do this over on #1066, the PR that makes this proposal take shape in the main ECS documentation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants