Parsers for Bambenek Consulting and Netlab 360 OSINT feeds #772

Merged: 24 commits, Nov 21, 2016
Changes from 14 commits
Commits
3b5a392
Added Bambenek c2-dommasterlist parser
jgedeon120 Nov 12, 2016
5d99003
Updated Bambenek parsers
jgedeon120 Nov 12, 2016
8fc4934
Added Bambenek test files
jgedeon120 Nov 12, 2016
92020df
Added Bambenek corrected test files
jgedeon120 Nov 12, 2016
18b5dc9
Added Bambenek corrected test files
jgedeon120 Nov 12, 2016
8b31b7d
Added Bambenek corrected test files
jgedeon120 Nov 12, 2016
b8e9962
Added Bambenek corrected test files
jgedeon120 Nov 12, 2016
7189ce5
Added Bambenek corrected test files
jgedeon120 Nov 12, 2016
a1be3ca
Added Bambenek corrected test files
jgedeon120 Nov 12, 2016
3f8667f
Added netlab 360 parsers
jgedeon120 Nov 13, 2016
276f57f
Updated Netlab 360 parsers
jgedeon120 Nov 13, 2016
f159cfb
Correction to Netlab 360 test_parser_dga.py
jgedeon120 Nov 13, 2016
472295b
Corrected pep8 issues.
jgedeon120 Nov 14, 2016
154b9bc
Corrected pep8 error.
jgedeon120 Nov 14, 2016
8027959
Updated Bambenek parser_c2dommasterlist.py
jgedeon120 Nov 14, 2016
a50eb86
Updated Bambenek parser_c2dommasterlist.py
jgedeon120 Nov 14, 2016
d0a4bf6
Updated Bambenek parser_c2dommasterlist.py
jgedeon120 Nov 14, 2016
c03198a
Updated Bambenek parser_c2dommasterlist.py
jgedeon120 Nov 14, 2016
18396f0
Updated Netlab 360 parsers
jgedeon120 Nov 14, 2016
b80e4f7
Updated Bambenek and Netlab 360 test parsers.
jgedeon120 Nov 14, 2016
62b14fe
Updated Feeds.md with Bambenek and Netlab 360 information.
jgedeon120 Nov 14, 2016
c6ce6fe
Corrected unittest issue
jgedeon120 Nov 14, 2016
c5a0e2c
Some of the corrections requested
jgedeon120 Nov 15, 2016
16c11ad
Moved Bambenek dga feed to dga domain classification type
jgedeon120 Nov 21, 2016
25 changes: 25 additions & 0 deletions intelmq/bots/BOTS
@@ -180,6 +180,21 @@
        "module": "intelmq.bots.parsers.autoshun.parser",
        "parameters": {}
    },
    "Bambenek C2 Domain Feed": {
        "description": "Bambenek C2 Domain parser is the bot responsible to parse and sanitize the information.",
        "module": "intelmq.bots.parsers.bambenek.parser_c2dommasterlist",
        "parameters": {}
    },
    "Bambenek C2 IP Feed": {
        "description": "Bambenek C2 IP parser is the bot responsible to parse and sanitize the information.",
        "module": "intelmq.bots.parsers.bambenek.parser_c2ipmasterlist",
        "parameters": {}
    },
    "Bambenek DGA Domain Feed": {
        "description": "Bambenek DGA Domain parser is the bot responsible to parse and sanitize the information.",
        "module": "intelmq.bots.parsers.bambenek.parser_dgafeed",
        "parameters": {}
    },
    "BitSight Ciberfeed Stream": {
        "description": "Bitsight Ciberfeed parser is the bot responsible to parse and sanitize the information.",
        "module": "intelmq.bots.parsers.bitsight.parser",
@@ -316,6 +331,16 @@
        "module": "intelmq.bots.parsers.n6.parser_n6stomp",
        "parameters": {}
    },
    "Netlab 360 DGA": {
        "description": "Netlab 360 DGA Feed Parser is the bot responsible to parse the report and sanitize the information.",
        "module": "intelmq.bots.parsers.netlab_360.parser_dga",
        "parameters": {}
    },
    "Netlab 360 Magnitude": {
        "description": "Netlab 360 Magnitude EK Feed Parser is the bot responsible to parse the report and sanitize the information.",
        "module": "intelmq.bots.parsers.netlab_360.parser_magnitude",
        "parameters": {}
    },
    "OpenBL": {
        "description": "OpenBL Parser is the bot responsible to parse the report and sanitize the information.",
        "module": "intelmq.bots.parsers.openbl.parser",
Empty file.
42 changes: 42 additions & 0 deletions intelmq/bots/parsers/bambenek/parser_c2dommasterlist.py
@@ -0,0 +1,42 @@
# -*- coding: utf-8 -*-
"""
http://osint.bambenekconsulting.com/feeds/c2-dommasterlist.txt
Member

Please also document it (with more details) at docs/Feeds.md

format:
ooxnererwyatanb.com,Domain used by banjori,2016-11-10 15:04,http://osint.bambenekconsulting.com/manual/banjori.txt
destination.fqdn,event_description.text,time.source,event_description.url
"""

import sys
from intelmq.lib import utils
from intelmq.lib.bot import Bot
from intelmq.lib.message import Event


class Bambenekc2dommasterlistParserBot(Bot):

    def process(self):
Member

Please use the ParserBot, see https://github.com/certtools/intelmq/blob/master/docs/Developers-Guide.md#parsers
This provides much better error handling.

As the iteration over rows is already implemented, we only need lines 25-35 plus a yield in parse_line.

Example:

def parse_line(self, row, report):
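
A minimal sketch of the shape suggested in this comment, assuming a base class that iterates over the report rows and calls `parse_line` once per row. It uses plain dictionaries instead of IntelMQ's `Event` so that it runs on its own; `parse_line` and `parse_report` as written here are hypothetical stand-ins and not the actual `ParserBot` API.

```python
def parse_line(row):
    """Yield one event dict per data row; comment and empty rows yield nothing."""
    if not row or row.startswith('#'):
        return
    fqdn, text, timestamp, url = row.split(',')
    yield {
        'destination.fqdn': fqdn,
        'event_description.text': text,
        'time.source': timestamp + ' UTC',
        'event_description.url': url,
        'classification.type': 'c&c',
        'status': 'online',
        'raw': row,
    }


def parse_report(raw_report):
    """Stand-in for the row iteration that the base class would provide."""
    for row in raw_report.splitlines():
        yield from parse_line(row)


sample = (
    "## All times are in UTC\n"
    "sfkh21okrrnlu.com,Domain used by shiotob/urlzone/bebloh,"
    "2016-11-12 15:02,http://osint.bambenekconsulting.com/manual/bebloh.txt\n"
)
events = list(parse_report(sample))
```

With this split, the per-row logic no longer needs to receive, send, or acknowledge messages itself, which is where the better error handling mentioned in the comment would come from.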

        report = self.receive_message()
        raw_report = utils.base64_decode(report.get("raw"))

        for row in raw_report.splitlines():
            if row.startswith('#'):
                continue

            row_split = row.split(',')

            event = Event(report)

            event.add('destination.fqdn', row_split[0])
            event.add('event_description.text', row_split[1])
            event.add('time.source', row_split[2] + " UTC")
            event.add('event_description.url', row_split[3])
            event.add('classification.type', 'c&c')
            event.add('status', 'online')
Member

Hmm, not sure whether this field is intended for a status reported by the source. @aaronkaplan, can you comment on this?

Member

Yes, according to the DHO that is what it is for. However, we have not clearly defined valid values for this field in the DHO ;-) This means we might have to refactor this later, but it's OK for now IMHO.

Contributor Author

How should we proceed with this now?

Member @sebix, Nov 21, 2016

> How should we proceed with this now?

It's fine.

            event.add('raw', row)

            self.send_message(event)
        self.acknowledge_message()

if __name__ == "__main__":
    bot = Bambenekc2dommasterlistParserBot(sys.argv[1])
    bot.start()
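
The parser above splits each row on every comma, which works as long as no field contains one. A hedged sketch of an alternative using the standard `csv` module, which also copes with quoted fields; whether the feed ever quotes a field is an assumption, and `iter_rows` is a hypothetical helper that is not part of this PR.

```python
import csv
import io


def iter_rows(raw_report):
    """Yield lists of four fields, skipping comment lines and blank lines."""
    lines = (line for line in io.StringIO(raw_report)
             if line.strip() and not line.startswith('#'))
    for fields in csv.reader(lines):
        if len(fields) != 4:
            # Unexpected column count: skip instead of failing on an index later.
            continue
        yield fields


sample = (
    "## comment\n"
    "example.com,\"Domain used by foo, bar\",2016-11-10 15:04,http://example.com/a.txt\n"
    "too,few,fields\n"
)
rows = list(iter_rows(sample))
```

Skipping malformed rows silently is only one option; a real bot might prefer to log or raise so that feed format changes are noticed.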
42 changes: 42 additions & 0 deletions intelmq/bots/parsers/bambenek/parser_c2ipmasterlist.py
@@ -0,0 +1,42 @@
# -*- coding: utf-8 -*-
"""
http://osint.bambenekconsulting.com/feeds/c2-ipmasterlist.txt
format:
31.170.178.179,IP used by beebone C&C,2016-11-12 14:09,http://osint.bambenekconsulting.com/manual/beebone.txt
destination.ip,event_description.text,time.source,event_description.url
"""

import sys
from intelmq.lib import utils
from intelmq.lib.bot import Bot
from intelmq.lib.message import Event


class Bambenekc2ipmasterlistParserBot(Bot):

    def process(self):
        report = self.receive_message()
        raw_report = utils.base64_decode(report.get("raw"))

        for row in raw_report.splitlines():
            if row.startswith('#'):
                continue

            row_split = row.split(',')

            event = Event(report)

            event.add('destination.ip', row_split[0])
Member

Should be source.ip, see https://github.com/certtools/intelmq/blob/master/docs/Data-Harmonization.md#classification-1 (the second table and the text below).

            event.add('event_description.text', row_split[1])
            event.add('time.source', row_split[2] + " UTC")
            event.add('event_description.url', row_split[3])
            event.add('classification.type', 'c&c')
            event.add('status', 'online')
            event.add('raw', row)

            self.send_message(event)
        self.acknowledge_message()

if __name__ == "__main__":
    bot = Bambenekc2ipmasterlistParserBot(sys.argv[1])
    bot.start()
43 changes: 43 additions & 0 deletions intelmq/bots/parsers/bambenek/parser_dgafeed.py
@@ -0,0 +1,43 @@
# -*- coding: utf-8 -*-
"""
feed:
http://osint.bambenekconsulting.com/feeds/dga-feed.txt
format:
xqmclnusaswvof.com,Domain used by Cryptolocker - Flashback DGA for 10 Nov 2016,2016-11-10,
http://osint.bambenekconsulting.com/manual/cl.txt
destination.fqdn,event_description.text,time.source,event_description.url
"""

import sys
from intelmq.lib import utils
from intelmq.lib.bot import Bot
from intelmq.lib.message import Event


class BambenekDGAfeedParserBot(Bot):

    def process(self):
        report = self.receive_message()
        raw_report = utils.base64_decode(report.get("raw"))

        for row in raw_report.splitlines():
            if row.startswith('#'):
                continue

            row_split = row.split(',')

            event = Event(report)

            event.add('destination.fqdn', row_split[0])
            event.add('event_description.text', row_split[1])
            event.add('time.source', row_split[2] + " 00:00 UTC")
            event.add('event_description.url', row_split[3])
            event.add('classification.type', 'ransomware')
Member @sebix, Nov 15, 2016

The description of ransomware is "This IOC refers to a specific type of compromised machine, where the computer has been hijacked for ransom by the criminals.", but the description of the feed says "known DGA generated domains used by malware".

Please have a look @aaronkaplan

Member

This might even change over time; I have no good solution for this at the moment.
Currently it seems to cover ransomware DGA domains, if I understand it correctly.
But I think we should call it "dga domain", since that is what we are actually addressing with this feed ("source.fqdn").

Contributor Author

Should I change the classification.type to "dga domain" and also make the needed changes to harmonization.py to allow dga domain and also map it to Malicious Code in the Taxonomy?

Member

Yes please :)
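
A rough sketch of the kind of change agreed on in this thread: allow "dga domain" as a classification type and map it to the "Malicious Code" taxonomy. Only that one mapping comes from the discussion; `TYPE_TO_TAXONOMY`, `taxonomy_for`, and the other entries are hypothetical and do not reproduce the contents of harmonization.py.

```python
# Hypothetical excerpt: only 'dga domain' -> 'Malicious Code' is taken from the
# discussion above; the other entries are illustrative placeholders.
TYPE_TO_TAXONOMY = {
    'dga domain': 'Malicious Code',
    'ransomware': 'Malicious Code',
    'c&c': 'Malicious Code',
}


def taxonomy_for(classification_type):
    """Return the taxonomy for a type, rejecting values that are not allowed."""
    try:
        return TYPE_TO_TAXONOMY[classification_type]
    except KeyError:
        raise ValueError('invalid classification.type: %r' % classification_type)


taxonomy = taxonomy_for('dga domain')
```

Keeping the allowed values and their taxonomy in one table means a new type only has to be added in a single place.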

            event.add('raw', row)

            self.send_message(event)
        self.acknowledge_message()

if __name__ == "__main__":
    bot = BambenekDGAfeedParserBot(sys.argv[1])
    bot.start()
Empty file.
43 changes: 43 additions & 0 deletions intelmq/bots/parsers/netlab_360/parser_dga.py
@@ -0,0 +1,43 @@
# -*- coding: utf-8 -*-
"""
feed:
http://data.netlab.360.com/feeds/dga/dga.txt
format:
suppobox difficultdress.net 2016-11-12 11:58:56 2016-11-13 00:04:15
reference:
http://data.netlab.360.com/dga
"""

import sys
from intelmq.lib import utils
from intelmq.lib.bot import Bot
from intelmq.lib.message import Event


class Netlab360DGAParserBot(Bot):

    def process(self):
        report = self.receive_message()
        raw_report = utils.base64_decode(report.get("raw"))

        for row in raw_report.splitlines():
            if row.startswith('#') or len(row) == 0:
                continue

            row_split = row.split('\t')

            event = Event(report)

            event.add('classification.identifier', row_split[0].lower())
            event.add('time.source', row_split[3] + " UTC")
            event.add('destination.fqdn', row_split[1])
            event.add('raw', row)
            event.add('classification.type', 'malware')
Member

Isn't it a c&c?

Contributor Author

Yes, yes it is, not sure why I typed malware. Corrected.

            event.add('event_description.url', 'http://data.netlab.360.com/dga')

            self.send_message(event)
        self.acknowledge_message()

if __name__ == "__main__":
    bot = Netlab360DGAParserBot(sys.argv[1])
    bot.start()
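
The sample row in the docstring is shown with spaces, but the parser splits on tabs, so each timestamp (which itself contains a space) stays in one column. A small sketch of that column layout, assuming four tab-separated fields; the names `DGARow`, `parse_dga_row`, `start`, and `end` are hypothetical labels chosen for illustration, not terms from the feed or this PR.

```python
from collections import namedtuple

# Hypothetical column names for the four tab-separated fields.
DGARow = namedtuple('DGARow', ['family', 'fqdn', 'start', 'end'])


def parse_dga_row(row):
    """Return a DGARow for a data row, or None for comments and empty rows."""
    if not row or row.startswith('#'):
        return None
    fields = row.split('\t')
    if len(fields) != 4:
        raise ValueError('expected 4 tab-separated fields, got %d' % len(fields))
    return DGARow(*fields)


sample = "suppobox\tdifficultdress.net\t2016-11-12 11:58:56\t2016-11-13 00:04:15"
parsed = parse_dga_row(sample)
```

In the parser above, `row_split[3]` corresponds to the last column of this layout.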
47 changes: 47 additions & 0 deletions intelmq/bots/parsers/netlab_360/parser_magnitude.py
@@ -0,0 +1,47 @@
# -*- coding: utf-8 -*-
"""
feed:
http://data.netlab.360.com/feeds/ek/magnitude.txt
format:
Magnitude 1478939365 178.32.227.12 f57idayaa1.notsite.faith http://f57idayaa1.notsite.faith/1c472e11922f6a0a6c20fa5996aa35f4
Magnitude 1478907804 178.32.227.12 3w212q83hay0ao51fi.courtall.party N/A
reference:
http://data.netlab.360.com/ek
"""

import sys
from datetime import datetime
from intelmq.lib import utils
from intelmq.lib.bot import Bot
from intelmq.lib.message import Event


class Netlab360MagnitudeParserBot(Bot):

    def process(self):
        report = self.receive_message()
        raw_report = utils.base64_decode(report.get("raw"))

        for row in raw_report.splitlines():
            if row.startswith('#') or len(row) == 0:
                continue

            row_split = row.split('\t')

            event = Event(report)

            event.add('classification.identifier', row_split[0].lower())
            event.add('time.source', datetime.utcfromtimestamp(int(row_split[1])).strftime('%Y-%m-%dT%H:%M:%S+00:00'))
            event.add('destination.ip', row_split[2])
            event.add('destination.fqdn', row_split[3])
            event.add('destination.url', row_split[4])
            event.add('raw', row)
            event.add('classification.type', 'exploit')
            event.add('event_description.url', 'http://data.netlab.360.com/ek')

            self.send_message(event)
        self.acknowledge_message()

if __name__ == "__main__":
    bot = Netlab360MagnitudeParserBot(sys.argv[1])
    bot.start()
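
Two details of this parser can be illustrated in isolation: the conversion of the Unix timestamp column to an ISO 8601 string, and the `N/A` placeholder that the second sample row carries in the URL column. The sketch below uses a timezone-aware conversion instead of `utcfromtimestamp` plus a hand-written offset; treating `N/A` as a missing value is an assumption about the desired behaviour (the code above passes the column through unchanged), and both helper names are hypothetical.

```python
from datetime import datetime, timezone


def epoch_to_iso(value):
    """Convert a Unix timestamp string to an ISO 8601 string in UTC."""
    return datetime.fromtimestamp(int(value), tz=timezone.utc).isoformat()


def optional_url(value):
    """Return None for the feed's 'N/A' placeholder, else the value unchanged."""
    return None if value == 'N/A' else value


iso = epoch_to_iso('1478939365')  # '2016-11-12T08:29:25+00:00'
missing = optional_url('N/A')     # None
```

A caller could then add `destination.url` only when `optional_url` returns a value.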
Empty file.
17 changes: 17 additions & 0 deletions intelmq/tests/bots/parsers/bambenek/c2-dommasterlist.txt
@@ -0,0 +1,17 @@
#############################################################
## Master Feed of known, active and non-sinkholed C&Cs domain
## names
##
## Feed generated at: 2016-11-12 15:16
##
## Feed Provided By: John Bambenek of Bambenek Consulting
## [email protected] // http://bambenekconsulting.com
## Use of this feed is governed by the license here:
## http://osint.bambenekconsulting.com/license.txt
##
## For more information on this feed go to:
## http://osint.bambenekconsulting.com/manual/c2-dommasterlist.txt
##
## All times are in UTC
#############################################################
sfkh21okrrnlu.com,Domain used by shiotob/urlzone/bebloh,2016-11-12 15:02,http://osint.bambenekconsulting.com/manual/bebloh.txt
17 changes: 17 additions & 0 deletions intelmq/tests/bots/parsers/bambenek/c2-ipmasterlist.txt
@@ -0,0 +1,17 @@
#############################################################
## Master Feed of known, active and non-sinkholed C&Cs IP
## addresses
##
## Feed generated at: 2016-11-12 18:17
##
## Feed Provided By: John Bambenek of Bambenek Consulting
## [email protected] // http://bambenekconsulting.com
## Use of this feed is governed by the license here:
## http://osint.bambenekconsulting.com/license.txt
##
## For more information on this feed go to:
## http://osint.bambenekconsulting.com/manual/c2-ipmasterlist.txt
##
## All times are in UTC
#############################################################
213.247.47.190,IP used by shiotob/urlzone/bebloh C&C,2016-11-12 18:02,http://osint.bambenekconsulting.com/manual/bebloh.txt
15 changes: 15 additions & 0 deletions intelmq/tests/bots/parsers/bambenek/dga-feed.txt
@@ -0,0 +1,15 @@
#############################################################
## Domain feed of known DGA domains from -2 to +3 days
##
## Feed generated at: Sat Nov 12 00:15:01 UTC 2016
##
## Feed Provided By: John Bambenek of Bambenek Consulting
## [email protected] // http://bambenekconsulting.com
##
## Use of this feed is governed by the license here:
## http://osint.bambenekconsulting.com/license.txt
## For more information on this feed go to:
## http://osint.bambenekconsulting.com/manual/dga-feed.txt
##
#############################################################
xqmclnusaswvof.com,Domain used by Cryptolocker - Flashback DGA for 10 Nov 2016,2016-11-10,http://osint.bambenekconsulting.com/manual/cl.txt
48 changes: 48 additions & 0 deletions intelmq/tests/bots/parsers/bambenek/test_parser_c2dommasterlist.py
@@ -0,0 +1,48 @@
# -*- coding: utf-8 -*-

import os
import unittest

import intelmq.lib.test as test
import intelmq.lib.utils as utils
from intelmq.bots.parsers.bambenek.parser_c2dommasterlist import Bambenekc2dommasterlistParserBot

with open(os.path.join(os.path.dirname(__file__), 'c2-dommasterlist.txt')) as handle:
    EXAMPLE_FILE = handle.read()

EXAMPLE_REPORT = {"feed.name": "Bambenek C2 Domain Feed",
                  "feed.url": "http://osint.bambenekconsulting.com/feeds/c2-dommasterlist.txt",
                  "raw": utils.base64_encode(EXAMPLE_FILE),
                  "__type": "Report",
                  "time.observation": "2016-01-01T00:00:00+00:00",
                  }
EVENTS = [{"feed.name": "Bambenek C2 Domain Feed",
           "feed.url": "http://osint.bambenekconsulting.com/feeds/c2-dommasterlist.txt",
           "__type": "Event",
           "time.source": "2016-11-12T15:02:00+00:00",
           "destination.fqdn": "sfkh21okrrnlu.com",
           "classification.type": "c&c",
           "status": "online",
           "time.observation": "2016-01-01T00:00:00+00:00",
           "event_description.text": "Domain used by shiotob/urlzone/bebloh",
           "event_description.url": "http://osint.bambenekconsulting.com/manual/bebloh.txt",
           "raw": "c2ZraDIxb2tycm5sdS5jb20sRG9tYWluIHVzZWQgYnkgc2hpb3RvYi91cmx6b25lL2JlYmxvaCwyMDE2LTExLTEyIDE1OjAyLGh0dHA6Ly9vc2ludC5iYW1iZW5la2NvbnN1bHRpbmcuY29tL21hbnVhbC9iZWJsb2gudHh0",
           }]

class TestBambenekc2dommasterlistParserBot(test.BotTestCase, unittest.TestCase):
    """
    A TestCase for Bambenekc2dommasterlistParserBot
    """

    @classmethod
    def set_bot(cls):
        cls.bot_reference = Bambenekc2dommasterlistParserBot
        cls.default_input_message = EXAMPLE_REPORT

    def test_event(self):
        """ Test if correct Event has been produced. """
        self.run_bot()
        self.assertMessageEqual(0, EVENTS[0])

if __name__ == '__main__':
    unittest.main()
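
The expected `raw` value in `EVENTS` is the Base64 form of the single data row in `c2-dommasterlist.txt`. A short sketch of how such a value can be produced and checked with the standard library, assuming that `utils.base64_encode` amounts to plain Base64 over the UTF-8 bytes of the row (an assumption about the helper, not something shown in this diff).

```python
import base64

# The data row from the test fixture c2-dommasterlist.txt.
row = ("sfkh21okrrnlu.com,Domain used by shiotob/urlzone/bebloh,"
       "2016-11-12 15:02,http://osint.bambenekconsulting.com/manual/bebloh.txt")

# Encode the row the way the expected event's 'raw' field is assumed to be built.
encoded = base64.b64encode(row.encode('utf-8')).decode('ascii')

# Decoding must give back the original row unchanged.
decoded = base64.b64decode(encoded).decode('utf-8')
```

Decoding the `raw` field of a failing test is often the quickest way to see which input row an unexpected event came from.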