missing info in the harmonization file: hash.md5, hash.sha1, etc. #394

Closed
aaronkaplan opened this issue Nov 9, 2015 · 6 comments
Labels: architecture, bug, data-format, needs: discussion

@aaronkaplan
Member

There is a problem with implicitly declaring the hash function by prefixing the value with $1$ etc. in the events table / data harmonization config file:

  • Assume you are given a hash (sha1) of a piece of malware and you want to find it in the events table.
    However, you only stored the md5, even though the sender sent you both fields (sha1 and md5, as the n6 feed does). Then you can never find the right entry again.

Solution: unfortunately, we need to extend the harmonization.conf file to include:

  • malware.hash.sha1
  • malware.hash.md5
  • malware.hash.sha256

Sorry...
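
A minimal sketch of the lookup problem described above, assuming a hypothetical in-memory list of events. The hash values are made-up placeholders, and the `$sha1$` prefix is an illustrative assumption (only `$1$` is mentioned above); the per-function field names follow the proposed list.

```python
# Hypothetical events; all hash values are made-up placeholders.
# Implicit scheme: one field holds a single prefixed hash, so the
# second hash sent by the feed is dropped.
implicit_events = [
    {"malware.hash": "$1$aaaa"},  # only the MD5 was kept
]

# Explicit scheme: one field per hash function, as proposed above.
explicit_events = [
    {"malware.hash.md5": "aaaa", "malware.hash.sha1": "bbbb"},
]


def find_by_hash(events, field, value):
    """Return all events whose `field` equals `value`."""
    return [event for event in events if event.get(field) == value]


# The SHA-1 lookup finds nothing in the implicit scheme
# ("$sha1$" is a hypothetical prefix used only for illustration) ...
implicit_hits = find_by_hash(implicit_events, "malware.hash", "$sha1$bbbb")
# ... and finds the event once each hash has its own field.
explicit_hits = find_by_hash(explicit_events, "malware.hash.sha1", "bbbb")
```

With separate fields, the same event can be found by whichever hash the requester happens to have.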

aaronkaplan added the bug and architecture labels Nov 9, 2015
aaronkaplan self-assigned this Nov 9, 2015
aaronkaplan added this to the Release 2 - v1.1 milestone Nov 9, 2015
@sebix
Member

sebix commented Nov 9, 2015

The next field where we can possibly have multiple values...

@aaronkaplan
Member Author

On 09 Nov 2015, at 22:04, Sebastian wrote:

The next field where we can possibly have multiple values…

True but multiple values might be OK for JSON (-> array of values) but not so much for a SQL DB.
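
To illustrate the point, here is a small sketch that uses SQLite from the Python standard library purely as a stand-in for "a SQL DB". The table name, column name, and hash values are hypothetical, and some databases do offer array column types, so this only shows the plain single-value-column case.

```python
import json
import sqlite3

hashes = ["aaaa", "bbbb"]  # made-up placeholder values

# JSON: a list of values serializes naturally as an array.
as_json = json.dumps({"malware.hash": hashes})

# SQL: a plain column holds one scalar value, so binding a list fails.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (malware_hash TEXT)")
try:
    conn.execute("INSERT INTO events (malware_hash) VALUES (?)", (hashes,))
    sql_error = None
except sqlite3.Error as exc:
    # The exact exception subclass depends on the Python version.
    sql_error = exc
finally:
    conn.close()
```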

@mauroasilva
Contributor

Yeah... this is something we are always fighting with... the problem is that we are trying to have a multivalue world while supporting singlevalue formats (like relational tables). One could argue that you could always split into multiple events whenever you are converting to a singlevalue reality but if you have multiple fields with multiple values, how do you pair them together?
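
A rough sketch of the pairing problem, assuming a hypothetical event with two multi-value fields (the field names and values are only illustrative). Splitting naively produces every combination, and nothing in the data says which pairings actually belong together.

```python
from itertools import product

# Hypothetical multi-value event with made-up values.
event = {
    "malware.hash.md5": ["aaaa", "cccc"],
    "malware.name": ["foo", "bar"],
}


def split_event(multi_event):
    """Naively split a multi-value event into all single-value combinations."""
    keys = list(multi_event)
    value_lists = (multi_event[key] for key in keys)
    return [dict(zip(keys, combo)) for combo in product(*value_lists)]


single_events = split_event(event)
# 2 x 2 values yield 4 single-value events, including pairings such as
# ("aaaa", "bar") that the original data never asserted.
```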

I think in the future we will have to decide whether we want to support multivalue and disregard a bit of the singlevalue reality or if we want to stick with a singlevalue format.

Right now I think most of our sources are using singlevalue formats like CSV, so it isn't a big deal but this is a decision we will have to make sooner rather than later I think.

@SYNchroACK
Contributor

IMHO:

  • v1.0 should follow the current format (singlevalue formats - relational tables)
  • v1.1 should move to a multivalue format because nowadays it is a real requirement. Since the beginning of the project I have tried to stick to singlevalue, but with all the harmonization issues I think we need to move to the next step (multivalue). For me, it is a MUST in v1.1.

@aaronkaplan
Member Author

On 11 Nov 2015, at 14:26, Tomás Lima wrote:

IMHO:

• v1.0 should follow the current format (singlevalue formats - relational tables)
• v1.1 should move to a multivalue format because nowadays it is a real requirement. Since the beginning of the project I have tried to stick to singlevalue, but with all the harmonization issues I think we need to move to the next step (multivalue). For me, it is a MUST in v1.1.

TBD… I think this is a major discussion involving multiple projects (not just intelmq)

@ghost

ghost commented Mar 15, 2017

fixed by #885

ghost closed this as completed Mar 15, 2017
ghost modified the milestones: v1.1 Feature release, v1.0 Stable Release Jul 5, 2017

4 participants