Parsers for Bambenek Consulting and Netlab 360 OSINT feeds #772

Merged
merged 24 commits into from
Nov 21, 2016


jgedeon120
Contributor

I've created the required parsers for the feeds from Bambenek Consulting: the C2 IP Feed, C2 Domain Feed, and DGA Domain Feed. The required test scripts are also included. Please let me know if you have any questions.

@sebix sebix self-assigned this Nov 13, 2016
@sebix sebix added the component: bots and feature labels Nov 13, 2016
@sebix sebix added this to the v1.1 Feature release milestone Nov 13, 2016
@codecov-io

codecov-io commented Nov 14, 2016

Current coverage is 75.23% (diff: 92.50%)

Merging #772 into master will increase coverage by 4.43%

@@             master       #772   diff @@
==========================================
  Files           206        217    +11   
  Lines          7849       8031   +182   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits           5557       6042   +485   
+ Misses         2292       1989   -303   
  Partials          0          0          

Powered by Codecov. Last update 2162d51...16c11ad

@jgedeon120 jgedeon120 changed the title Parsers for Bambenek Consulting OSINT feeds Parsers for Bambenek Consulting and Netlab 360 OSINT feeds Nov 14, 2016
@jgedeon120
Contributor Author

Parsers have now been added to this pull request for Netlab 360 Magnitude EK and DGA feeds.

http://data.netlab.360.com/

@dmth
Contributor

dmth commented Nov 14, 2016

Hi, thanks for your contribution.

I've a question concerning your parsers: Why are all IPs/Domains written as "destination" and not as "source"?

@jgedeon120
Contributor Author

For these, I wrote the values as the destination because they describe
where the connection was headed. If these lists had been of scanners
or something similar, they would have been written as source.


@jgedeon120
Contributor Author

While putting these into production, I found that IntelMQ was filtering out some of the events. I will close this request and submit a new one once the issues have been worked out, so that the proper information is recorded.

@jgedeon120 jgedeon120 closed this Nov 14, 2016
@sebix
Member

sebix commented Nov 14, 2016

First, thanks for your contribution. I added some comments inline, most of them apply to all bots of course. Once you think it's ready, I will try it with real data too.

You can reuse this PR, just push your fixes here (you could even overwrite the history).


event = Event(report)

event.add('destination.ip', row_split[0])
Member

Should be source.ip, see https:/certtools/intelmq/blob/master/docs/Data-Harmonization.md#classification-1 (the second table and the text below).

@@ -0,0 +1,42 @@
# -*- coding: utf-8 -*-
"""
http://osint.bambenekconsulting.com/feeds/c2-dommasterlist.txt
Member

Please also document it (with more details) at docs/Feeds.md


class Bambenekc2dommasterlistParserBot(Bot):

    def process(self):
Member

Please use the ParserBot, see https:/certtools/intelmq/blob/master/docs/Developers-Guide.md#parsers
This provides much better error handling.

Since ParserBot already implements the iteration over rows, we only need lines 25-35 plus a yield in parse_line.

Example:

def parse_line(self, row, report):
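
The suggestion above can be sketched roughly as follows. This is a minimal illustration, not IntelMQ's actual ParserBot: `SketchParserBot`, the dict-based events, and the four-column comma-separated layout are all assumptions made for the example. The point is only that the base class owns the loop over rows and the per-line error handling, while the parser implements `parse_line` as a generator.

```python
class SketchParserBot:
    """Hypothetical stand-in for a ParserBot-style base class (not IntelMQ's API)."""

    def parse(self, report):
        # Hand out the non-empty, non-comment lines of the raw report.
        for line in report["raw"].splitlines():
            line = line.strip()
            if line and not line.startswith("#"):
                yield line

    def parse_line(self, row, report):
        raise NotImplementedError

    def process(self, report):
        # The base class iterates and isolates failures per line, so one
        # malformed row does not abort the whole report.
        events, failed = [], []
        for row in self.parse(report):
            try:
                events.extend(self.parse_line(row, report))
            except Exception as exc:
                failed.append((row, repr(exc)))
        return events, failed


class C2DomainSketchParser(SketchParserBot):
    def parse_line(self, row, report):
        # Assumed layout: domain,description,date,reference URL
        fields = row.split(",")
        if len(fields) != 4:
            raise ValueError("expected 4 columns, got %d" % len(fields))
        yield {
            "source.fqdn": fields[0],
            "event_description.text": fields[1],
            "time.source": fields[2] + " UTC",
            "event_description.url": fields[3],
            "classification.type": "c&c",
            "raw": row,
        }
```

With this split, the parser class contains only the field mapping, and a broken row ends up in the `failed` list instead of raising out of `process`.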

@jgedeon120
Contributor Author

Sebix,

Thanks again for the tips. The requested changes have been made, and the bug that I found this morning has been corrected.

@jgedeon120 jgedeon120 reopened this Nov 14, 2016
if FQDN.is_valid(lvalue[1]):
    event.add('source.fqdn', lvalue[1])
else:
    event.add('source.ip', lvalue[1])
Member

This line is uncovered by the tests.

Contributor Author

Changed, since this feed only has FQDNs and not IP addresses. It may have been carried over from the DGA feed.
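
For feeds that do mix both kinds of value, the branch discussed above could be sketched like this. It is a standard-library illustration under stated assumptions: `source_field_for` and `add_source` are hypothetical helpers, events are plain dicts, and the check is done with `ipaddress` first rather than with the `FQDN.is_valid` call from the snippet.

```python
import ipaddress


def source_field_for(value):
    """Return 'source.ip' if value parses as an IP address, else 'source.fqdn'."""
    try:
        ipaddress.ip_address(value)
    except ValueError:
        return "source.fqdn"
    return "source.ip"


def add_source(event, value):
    # Store the value under whichever source field matches its type.
    event[source_field_for(value)] = value
    return event
```

A test for such a parser would need one sample line per branch, otherwise one of the two assignments stays uncovered, as noted in the review.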

event.add('source.ip', lvalue[1])

event.add('raw', line)
event.add('classification.type', 'malware')
Member

Isn't it a c&c?

Contributor Author

Yes, yes it is, not sure why I typed malware. Corrected.

event.add('time.source', lvalue[2] + " UTC")
event.add('event_description.url', lvalue[3])
event.add('classification.type', 'c&c')
event.add('status', 'online')
Member

Puh, not sure if this field is intended for a status reported by the source. @aaronkaplan can you comment on this?

Member

Yes, according to the DHO that's what it is for. However, we have not clearly defined valid values in the DHO for this field ;-) This means we might have to refactor this later, but it's OK for now IMHO.

Contributor Author

How should we proceed with this now?

Member

@sebix sebix Nov 21, 2016

> How should we proceed with this now?

It's fine.

event.add('event_description.text', lvalue[1])
event.add('time.source', lvalue[2] + " 00:00 UTC")
event.add('event_description.url', lvalue[3])
event.add('classification.type', 'ransomware')
Member

@sebix sebix Nov 15, 2016

The description of ransomware is "This IOC refers to a specific type of compromized machine, where the computer has been hijacked for ransom by the criminals.", but the description of the feed says "known DGA generated domains used by malware".

Please have a look @aaronkaplan

Member

This might even change over time; I have no good solution for this at the moment.
Currently it seems to cover ransomware DGA domains, if I understand it correctly.
But I think we should call it "dga domain", since that is what we are actually addressing with this feed ("source.fqdn").

Contributor Author

Should I change the classification.type to "dga domain" and also make the needed changes to harmonization.py to allow dga domain and also map it to Malicious Code in the Taxonomy?

Member

Yes please :)
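
The change agreed on here amounts to two things: allowing a new classification.type value and mapping it to a taxonomy. A minimal sketch of that idea follows. It does not reproduce the structure of harmonization.py; `TAXONOMY_BY_TYPE` and `classify` are hypothetical names, events are plain dicts, and only the "dga domain" entry is taken from the discussion.

```python
# Hypothetical mapping for illustration; only the "dga domain" entry
# comes from the discussion above.
TAXONOMY_BY_TYPE = {
    "dga domain": "Malicious Code",
}


def classify(event, classification_type):
    """Set classification.type and the taxonomy it maps to on a dict event."""
    if classification_type not in TAXONOMY_BY_TYPE:
        # Unknown types are rejected instead of being stored unmapped.
        raise ValueError("unknown classification.type: %r" % classification_type)
    event["classification.type"] = classification_type
    event["classification.taxonomy"] = TAXONOMY_BY_TYPE[classification_type]
    return event
```

Keeping the allowed values and the taxonomy in one table means a new type cannot be accepted without also being mapped.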


event.add('classification.identifier', lvalue[0].lower())
event.add('time.source',
          datetime.utcfromtimestamp(int(lvalue[1])).strftime('%Y-%m-%dT%H:%M:%S+00:00'))
Member

We already have a function in the libs for this: https:/certtools/intelmq/blob/master/intelmq/lib/harmonization.py#L211 :)

Contributor Author

I'll look into this more. Before changing it to what it is now, I kept getting an error in the tests about a value being out of range or something.

Member

Let me know if the implementation is incomplete or erroneous, I'd like to fix it.

Contributor Author

I must have had something wrong when I first tried it. It is working now.
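
For reference, the conversion discussed in this thread (Unix epoch seconds to a UTC timestamp string) can be sketched with the standard library alone. This is not the helper from intelmq/lib/harmonization.py that the review points to; `epoch_to_utc_iso` is a hypothetical name used only for illustration.

```python
from datetime import datetime, timezone


def epoch_to_utc_iso(tstamp):
    """Convert Unix epoch seconds (int or numeric string) to an ISO 8601 UTC string."""
    # A timezone-aware datetime makes isoformat() emit the +00:00 offset itself,
    # so the offset does not have to be hard-coded in a strftime pattern.
    return datetime.fromtimestamp(int(tstamp), tz=timezone.utc).isoformat()
```

For example, `epoch_to_utc_iso("1479081600")` returns `2016-11-14T00:00:00+00:00`.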


### Magnitude EK Feed

Status: Unknown
Member

AFAIU the upstream description (http://data.netlab.360.com/ek), this feed lists URLs with exploits? Please be more detailed here.

@sebix sebix merged commit 16c11ad into certtools:master Nov 21, 2016
sebix added a commit that referenced this pull request Nov 21, 2016
Parsers for Bambenek Consulting and Netlab 360 OSINT feeds

Signed-off-by: Sebastian Wagner <[email protected]>
@ghost ghost modified the milestones: v1.1 Feature release, v1.0 Stable Release Jul 5, 2017
@ghost ghost unassigned sebix Jul 5, 2017
@ghost ghost mentioned this pull request Sep 29, 2020