By default, all bots are started when you start the whole botnet; however, it is possible to disable a bot. A disabled bot is not started when you start the botnet, but you can still start and stop it by specifying the bot explicitly. To disable a bot, add the following to your `runtime.conf`: `"enabled": false`. Be aware that this is not a normal parameter (like the others described in this file). It is set outside of the `parameters` object in `runtime.conf`. Check the User-Guide for an example.
There are two different types of parameters: the initialization parameters are needed to start the bot, and the runtime parameters are needed by the bot itself during runtime.
The initialization parameters are on the first level; the runtime parameters live in the `parameters` sub-dictionary:
    {
        "bot-id": {
            "parameters": {
                runtime parameters...
            },
            initialization parameters...
        }
    }
For example:
    {
        "abusech-feodo-domains-collector": {
            "parameters": {
                "provider": "Abuse.ch",
                "feed": "Abuse.ch Feodo Domains",
                "http_url": "http://example.org/feodo-domains.txt"
            },
            "name": "Generic URL Fetcher",
            "group": "Collector",
            "module": "intelmq.bots.collectors.http.collector_http",
            "description": "collect report messages from remote hosts using http protocol",
            "enabled": true,
            "run_mode": "scheduled"
        }
    }
This configuration resides in the file `runtime.conf` in your intelmq's configuration directory for each configured bot.

- `name` and `description`: The name and description of the bot as found in the BOTS file; not used by the bot itself.
- `group`: Can be `"Collector"`, `"Parser"`, `"Expert"` or `"Output"`. Only used for visualization by other tools.
- `module`: The executable (should be in `$PATH`) which will be started.
- `enabled`: If the parameter is set to `true` (which is NOT the default value if it is missing, as a protection), the bot will start when the botnet is started (`intelmqctl start`). If the parameter is set to `false`, the bot will not be started by `intelmqctl start`; however, you can run the bot independently using `intelmqctl start <bot_id>`. Check the User-Guide for more details.
- `run_mode`: There are two run modes, `"continuous"` (the default) and `"scheduled"`. In the first case, the bot runs until it is stopped or exits because of errors (depending on configuration). In the latter case, the bot stops after one successful run. This is especially useful when scheduling bots via cron or systemd. Check the User-Guide for more details.
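To make the distinction between the first-level `enabled` flag and the `parameters` object concrete, here is a minimal sketch that lists which bots a plain `intelmqctl start` would start. It only illustrates the description above and is not how `intelmqctl` is implemented; the helper name `bots_started_with_botnet` and the bot ID `example-parser` are hypothetical, and a missing `enabled` key is treated as `false`, as stated above.

```python
import json


def bots_started_with_botnet(runtime_conf):
    """Return the IDs of bots whose first-level "enabled" flag is true."""
    return sorted(
        bot_id
        for bot_id, bot in runtime_conf.items()
        # Assumption taken from the text above: a missing flag means "not enabled".
        if bot.get("enabled", False) is True
    )


# "example-parser" is a made-up bot ID used only for illustration.
example = json.loads("""
{
    "abusech-feodo-domains-collector": {
        "parameters": {"feed": "Abuse.ch Feodo Domains"},
        "enabled": true,
        "run_mode": "scheduled"
    },
    "example-parser": {
        "parameters": {"enabled": true},
        "enabled": false
    }
}
""")

print(bots_started_with_botnet(example))
```

Note that the `"enabled": true` placed inside `parameters` of the second bot has no effect in this sketch, because only the first-level key is inspected.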
Feed parameters: Common configuration options for all collectors
- `feed`: Name for the feed (`feed.name`).
- `accuracy`: Accuracy for the data of the feed (`feed.accuracy`).
- `code`: Code for the feed (`feed.code`).
- `documentation`: Link to documentation for the feed (`feed.documentation`).
- `provider`: Name of the provider of the feed (`feed.provider`).
- `rate_limit`: Time interval (in seconds) between fetching data, if applicable.
HTTP parameters: Common URL fetching parameters used in multiple collectors
- `http_timeout_sec`: A tuple of floats or a single float describing the timeout of the HTTP connection. Can be a tuple of two floats (read and connect timeout) or just one float (applies to both timeouts). The default is 30 seconds in `default.conf`; if not given, no timeout is used. See also https://requests.readthedocs.io/en/master/user/advanced/#timeouts
- `http_timeout_max_tries`: An integer depicting how often a connection is retried when a timeout occurred. Defaults to 3 in `default.conf`.
- `http_username`: Username for basic authentication.
- `http_password`: Password for basic authentication.
- `http_proxy`: Proxy to use for HTTP.
- `https_proxy`: Proxy to use for HTTPS.
- `http_user_agent`: User agent to use for the request.
- `http_verify_cert`: Path to a trusted CA bundle or directory, `false` to ignore verifying SSL certificates, or `true` (default) to verify SSL certificates.
- `ssl_client_certificate`: SSL client certificate to use.
- `http_header`: HTTP request headers.
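The interplay of `http_timeout_sec` and `http_timeout_max_tries` can be sketched as follows. This is a simplified illustration and not the collectors' actual code: `normalize_timeout` and `fetch_with_retries` are hypothetical helpers, and `fetch` stands for any callable that raises `TimeoutError` when the connection times out.

```python
def normalize_timeout(value):
    """Turn the configured value into None, one float, or a 2-tuple of floats."""
    if value is None:
        return None  # no timeout configured
    if isinstance(value, (list, tuple)):
        if len(value) != 2:
            raise ValueError("expected exactly two timeout values")
        return (float(value[0]), float(value[1]))
    return float(value)  # a single value applies to both timeouts


def fetch_with_retries(fetch, max_tries=3):
    """Call fetch() up to max_tries times, retrying only after a timeout."""
    if max_tries < 1:
        raise ValueError("max_tries must be at least 1")
    last_error = None
    for _ in range(max_tries):
        try:
            return fetch()
        except TimeoutError as exc:
            last_error = exc  # remember the failure and try again
    raise last_error


# Example: a fake fetcher that times out twice before succeeding.
attempts = []

def flaky_fetch():
    attempts.append(1)
    if len(attempts) < 3:
        raise TimeoutError("simulated timeout")
    return "data"

result = fetch_with_retries(flaky_fetch, max_tries=3)
```

With `max_tries=3` the third call succeeds; with `max_tries=2` the same fetcher would raise `TimeoutError`.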
- `name:` intelmq.bots.collectors.http.collector_http
- `lookup:` yes
- `public:` yes
- `cache (redis db):` none
- `description:` collect report messages from remote hosts using http protocol

- Feed parameters (see above)
- HTTP parameters (see above)
- `http_url`: location of information resource (e.g. https://feodotracker.abuse.ch/blocklist/?download=domainblocklist)
- `name:` intelmq.bots.collectors.http.collector_http_stream
- `lookup:` yes
- `public:` yes
- `cache (redis db):` none
- `description:` Opens a streaming connection to the URL and sends the received lines.

- Feed parameters (see above)
- HTTP parameters (see above)
- `strip_lines`: boolean, whether single lines should be stripped (removing whitespace from the beginning and the end of the line)

If the stream is interrupted, the connection will be aborted using the timeout parameter. Then, an error will be thrown and `rate_limit` applies if not null.
The parameter `http_timeout_max_tries` is of no use in this collector.
- `name:` intelmq.bots.collectors.mail.collector_mail_url
- `lookup:` yes
- `public:` yes
- `cache (redis db):` none
- `description:` collect messages from mailboxes, extract URLs from those messages and download the report messages from the URLs.

- Feed parameters (see above)
- HTTP parameters (see above)
- `mail_host`: FQDN or IP of mail server
- `mail_user`: user account of the email account
- `mail_password`: password associated with the user account
- `mail_ssl`: whether the mail account uses SSL (default: `true`)
- `folder`: folder in which to look for mails (default: `INBOX`)
- `subject_regex`: regular expression to look for a subject
- `url_regex`: regular expression of the feed URL to search for in the mail body
- `sent_from`: filter messages by sender
- `sent_to`: filter messages by recipient
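How `subject_regex` and `url_regex` work together can be sketched with the standard `re` module. This is only an illustration of the two parameters, not the collector's code; the helper name `extract_feed_urls` and the sample mail are made up.

```python
import re


def extract_feed_urls(subject, body, subject_regex, url_regex):
    """Return the URLs found in the body if the subject matches, else []."""
    if not re.search(subject_regex, subject):
        return []  # mail is not relevant for this feed
    return [match.group(0) for match in re.finditer(url_regex, body)]


urls = extract_feed_urls(
    subject="Daily feed report",
    body="Get it at https://example.org/report.csv now",
    subject_regex=r"feed report",
    url_regex=r"https://example\.org/\S+",
)
```

A mail whose subject does not match `subject_regex` yields no URLs, regardless of its body.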
- `name:` intelmq.bots.collectors.mail.collector_mail_attach
- `lookup:` yes
- `public:` yes
- `cache (redis db):` none
- `description:` collect messages from mailboxes, download the report messages from the attachments.

- Feed parameters (see above)
- `mail_host`: FQDN or IP of mail server
- `mail_user`: user account of the email account
- `mail_password`: password associated with the user account
- `mail_ssl`: whether the mail account uses SSL (default: `true`)
- `folder`: folder in which to look for mails (default: `INBOX`)
- `subject_regex`: regular expression to look for a subject
- `attach_regex`: regular expression of the name of the attachment
- `attach_unzip`: whether to unzip the attachment (default: `true`)
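The effect of `attach_regex` and `attach_unzip` can be illustrated with a small sketch based on the standard `zipfile` module. The helper `read_attachment` is hypothetical, and returning every member of the archive is an assumption made for this example; the collector itself may handle archives differently.

```python
import io
import re
import zipfile


def read_attachment(filename, payload, attach_regex, attach_unzip=True):
    """Return report payloads from one attachment, or [] if the name does not match."""
    if not re.search(attach_regex, filename):
        return []
    if not attach_unzip:
        return [payload]  # pass the attachment on unchanged
    with zipfile.ZipFile(io.BytesIO(payload)) as archive:
        # Assumption: every file inside the archive is a report.
        return [archive.read(name) for name in archive.namelist()]


# Build a small zip archive in memory to exercise the helper.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("report.csv", "192.0.2.1,example.org\n")
zipped = buffer.getvalue()

reports = read_attachment("report.zip", zipped, r"\.zip$")
```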
- `name:` intelmq.bots.collectors.file.collector_file
- `lookup:` yes
- `public:` yes
- `cache (redis db):` none
- `description:` collect messages from a file.

- Feed parameters (see above)
- `path`: path to file
- `postfix`: FIXME
- `delete_file`: whether to delete the file after reading (default: `false`)
- `name:` intelmq.bots.collectors.misp.collector
- `lookup:` yes
- `public:` yes
- `cache (redis db):` none
- `description:` collect messages from a MISP server.

- Feed parameters (see above)
- `misp_url`: URL of MISP server (with trailing '/')
- `misp_key`: MISP Authkey
- `misp_verify`: (default: `true`)
- `misp_tag_to_process`: MISP tag for events to be processed
- `misp_tag_processed`: MISP tag for processed events
- `name:` intelmq.bots.collectors.rt.collector_rt
- `lookup:` yes
- `public:` yes
- `cache (redis db):` none
- `description:` Request Tracker Collector fetches attachments from an RTIR instance.

- Feed parameters (see above)
- HTTP parameters (see above)
- `uri`: URL of the REST interface of the RT
- `user`: RT username
- `password`: RT password
- `search_owner`: owner of the ticket to search for (default: `nobody`)
- `search_queue`: queue of the ticket to search for (default: `Incident Reports`)
- `search_status`: status of the ticket to search for (default: `new`)
- `search_subject_like`: part of the subject of the ticket to search for (default: `Report`)
- `set_status`: status to set the ticket to after processing (default: `open`)
- `take_ticket`: whether to take the ticket (default: `true`)
- `url_regex`: regular expression of a URL to search for in the ticket
- `attachment_regex`: regular expression of an attachment in the ticket
- `unzip_attachment`: whether to unzip a found attachment

The parameter `http_timeout_max_tries` is of no use in this collector.
- `name:` intelmq.bots.collectors.xmpp.collector
- `lookup:` yes
- `public:` yes
- `cache (redis db):` none
- `description:` This bot can connect to an XMPP server and one room in order to receive reports from it. TLS is used by default. `rate_limit` is ineffective here. The bot can either pass the body or the whole event.

- Feed parameters (see above)
- `xmpp_server`: FIXME
- `xmpp_user`: FIXME
- `xmpp_password`: FIXME
- `xmpp_room`: FIXME
- `xmpp_room_nick`: FIXME
- `xmpp_room_password`: FIXME
- `ca_certs`: FIXME (default: `/etc/ssl/certs/ca-certificates.crt`)
- `strip_message`: FIXME (default: `true`)
- `pass_full_xml`: FIXME (default: `false`)
See the README.md
- `name:` intelmq.bots.collectors.alienvault_otx.collector
- `lookup:` yes
- `public:` yes
- `cache (redis db):` none
- `description:` collect report messages from Alien Vault OTX API

- Feed parameters (see above)
- `api_key`: API key
See the README.md
- `name:` intelmq.bots.collectors.blueliv.collector_crimeserver
- `lookup:` yes
- `public:` no
- `cache (redis db):` none
- `description:` collect report messages from Blueliv API

- Feed parameters (see above)
- `api_key`: API key
Iterates over all blobs in all containers in an Azure storage.
- `name:` intelmq.bots.collectors.microsoft.collector_azure
- `lookup:` yes
- `public:` no
- `cache (redis db):` none
- `description:` collect blobs from Microsoft Azure using their library

- Feed parameters (see above)
- `account_name`: account name as given by Microsoft
- `account_key`: account key as given by Microsoft
- `delete`: boolean, delete containers and blobs after fetching
See the README.md
- `name:` intelmq.bots.collectors.n6.collector_stomp
- `lookup:` yes
- `public:` no
- `cache (redis db):` none
- `description:` collect report messages from the n6 stream of CERT.pl via STOMP

- Feed parameters (see above)
- `exchange`: exchange point as given by CERT.pl
- `port`: 61614
- `server`: hostname, e.g. "n6stream.cert.pl"
- `ssl_ca_certificate`: path to CA file
- `ssl_client_certificate`: path to client cert file
- `ssl_client_certificate_key`: path to client cert key file
TODO
Lines starting with `'#'` will be ignored. Headers won't be interpreted.
- `"columns"`: A list of strings or a string of comma-separated values with field names. The names must match the harmonization's field names. Strings starting with `extra.` will be written into the Extra-Object of the DHO. E.g. `[ "", "source.fqdn", "extra.http_host_header" ]`. It is possible to specify multiple columns using the `|` character, e.g. `"columns": "source.url|source.fqdn|source.ip"`. First, the bot will try to parse the value as URL; if that fails, it will try to parse it as FQDN; if that fails, it will try to parse it as IP; if that fails, an error will be raised. Some use cases:
    - mixed data set, e.g. URL/FQDN/IP/NETMASK: `"columns": "source.url|source.fqdn|source.ip|source.network"`
    - parse a value and ignore it if parsing fails: `"columns": "source.url|__IGNORE__"`
- `"column_regex_search"`: Optional. A dictionary mapping field names (as given per the columns parameter) to regular expressions. The field is evaluated using `re.search`. E.g. to get the ASN out of `AS1234` use: `{"source.asn": "[0-9]+"}`.
- `"default_url_protocol"`: For URLs you can give a default protocol which will be prepended to the data.
- `"delimiter"`: separation character of the CSV, e.g. `","`
- `"skip_header"`: Boolean, skip the first line of the file, optional. Lines starting with `#` will be skipped additionally; make sure you do not skip more lines than needed!
- `"time_format"`: Optional. If `"timestamp"`, `"windows_nt"` or `"epoch_millis"`, the time will be converted first. With the default `null`, fuzzy time parsing will be used.
- `"type"`: set the `classification.type` statically, optional
- `"data_type"`: sets the data of a specific type; currently only `"json"` is a supported value. An example: ```{ "columns": [ "source.ip", "source.url", "extra.tags"], "data_type": "{\"extra.tags\":\"json\"}" }``` It will ensure `extra.tags` is treated as `json`.
- `"filter_text"`: only process the lines containing or not containing the specified text, to be used in conjunction with `filter_type`
- `"filter_type"`: value can be `whitelist` or `blacklist`. If `whitelist`, only lines containing the text in `filter_text` will be processed; if `blacklist`, only lines NOT containing the text will be processed. To process ipset format files use `{ "filter_text": "ipset add ", "filter_type": "whitelist", "columns": [ "__IGNORE__", "__IGNORE__", "__IGNORE__", "source.ip"] }`
- `"type_translation"`: See below, optional
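The following sketch shows how `filter_text`/`filter_type`, `columns`, `delimiter` and `column_regex_search` could interact for a single line. It is a simplified illustration, not the parser's implementation: the helpers `keep_line` and `map_columns` are hypothetical, the `|` alternatives and harmonization checks are left out, and the space delimiter for the ipset example is an assumption. Note that with `re.search`, a pattern such as `[0-9]*` also matches the empty string at the start of `AS1234`, so the sketch uses `[0-9]+`.

```python
import csv
import re


def keep_line(line, filter_text=None, filter_type=None):
    """Decide whether a raw line should be processed at all."""
    if line.startswith("#"):
        return False  # comment lines are always ignored
    if filter_text is None:
        return True
    contains = filter_text in line
    return contains if filter_type == "whitelist" else not contains


def map_columns(line, columns, delimiter=",", column_regex_search=None):
    """Map the values of one CSV line to the configured field names."""
    values = next(csv.reader([line], delimiter=delimiter))
    event = {}
    for name, value in zip(columns, values):
        if name in ("", "__IGNORE__"):
            continue  # column is not mapped to any field
        pattern = (column_regex_search or {}).get(name)
        if pattern is not None:
            match = re.search(pattern, value)
            if match is None:
                continue
            value = match.group(0)
        event[name] = value
    return event


ipset_line = "ipset add blocklist 192.0.2.1"
ipset_columns = ["__IGNORE__", "__IGNORE__", "__IGNORE__", "source.ip"]
ipset_event = map_columns(ipset_line, ipset_columns, delimiter=" ")

asn_event = map_columns(
    "AS1234,example.org",
    ["source.asn", "source.fqdn"],
    column_regex_search={"source.asn": "[0-9]+"},
)
```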
If the source does have a field with information for `classification.type`, but it does not correspond to intelmq's types, you can map them to the correct ones. The `type_translation` field can hold a JSON field with a dictionary which maps the feed's values to intelmq's.
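As a small illustration of such a mapping, the sketch below decodes a JSON dictionary and looks up the feed's value. The helper `translate_type` and the sample mapping are hypothetical, and passing unknown values through unchanged is an assumption of this example rather than documented behavior.

```python
import json


def translate_type(feed_value, type_translation):
    """Map a feed-specific type to an intelmq type using a JSON dictionary."""
    mapping = json.loads(type_translation)
    # Assumption: values without a mapping are passed through unchanged.
    return mapping.get(feed_value, feed_value)


# Illustrative mapping only; use the values your feed actually delivers.
type_translation = '{"malware_download": "malware", "phish": "phishing"}'

translated = translate_type("phish", type_translation)
```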
See the README.md
- `name:` abusix
- `lookup:` dns
- `public:` yes
- `cache (redis db):` 5
- `description:` FIXME
- `notes`: https://abusix.com/contactdb.html
FIXME
See the README.md
- `name:` ASN lookup
- `lookup:` local database
- `public:` yes
- `cache (redis db):` none
- `description:` IP to ASN
FIXME
- `name:` cymru-whois
- `lookup:` cymru dns
- `public:` yes
- `cache (redis db):` 5
- `description:` IP to geolocation, ASN, BGP prefix
FIXME
See the README.md
- `name:` deduplicator
- `lookup:` redis cache
- `public:` yes
- `cache (redis db):` 6
- `description:` message deduplicator
Please check this README file.
- `name:` reducer
- `lookup:` none
- `public:` yes
- `cache (redis db):` none
- `description:` The field reducer bot is capable of removing fields from events.

- `type`: either `"whitelist"` or `"blacklist"`
- `keys`: a list of key names (strings)

With `"whitelist"`, only the fields in `keys` will be passed along.
With `"blacklist"`, the fields in `keys` will be removed from events.
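The two modes can be sketched in a few lines. This is an illustration of the described behavior rather than the bot's code; the helper `reduce_fields` is hypothetical, and its `mode` argument corresponds to the `type` parameter (renamed here to avoid shadowing Python's built-in `type`).

```python
def reduce_fields(event, mode, keys):
    """Keep (whitelist) or remove (blacklist) the given keys of an event."""
    if mode == "whitelist":
        return {key: value for key, value in event.items() if key in keys}
    if mode == "blacklist":
        return {key: value for key, value in event.items() if key not in keys}
    raise ValueError("mode must be 'whitelist' or 'blacklist'")


event = {"source.ip": "192.0.2.1", "source.port": 80, "feed.name": "Example"}

kept = reduce_fields(event, "whitelist", ["source.ip"])
removed = reduce_fields(event, "blacklist", ["source.ip"])
```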
See the README.md
- `name:` filter
- `lookup:` none
- `public:` yes
- `cache (redis db):` none
- `description:` filter messages (drop or pass messages) FIXME
FIXME
See the README.md
- `name:` gethostbyname
- `lookup:` dns
- `public:` yes
- `cache (redis db):` none
- `description:` DNS name (FQDN) to IP
none
- `name:` idea
- `lookup:` local config
- `public:` yes
- `cache (redis db):` none
- `description:` The bot does a best effort translation of events into the IDEA format.

- `test_mode`: add `Test` category to mark all outgoing IDEA events as informal (meant to simplify setting up and debugging new IDEA producers) (default: `true`)
See the README.md
- `name:` maxmind-geoip
- `lookup:` local database
- `public:` yes
- `cache (redis db):` none
- `description:` IP to geolocation
FIXME
- `name:` modify
- `lookup:` local config
- `public:` yes
- `cache (redis db):` none
- `description:` modify expert bot allows you to change arbitrary field values of events just using a configuration file

The modify expert bot allows you to change arbitrary field values of events just using a configuration file. Thus it is possible to adapt certain values or add new ones only by changing JSON files, without touching the code of many other bots.
The configuration is called `modify.conf` and looks like this:
    [
        {
            "rulename": "Standard Protocols http",
            "if": {
                "source.port": "^(80|443)$"
            },
            "then": {
                "protocol.application": "http"
            }
        },
        {
            "rule": "Spamhaus Cert conficker",
            "if": {
                "malware.name": "^conficker(ab)?$"
            },
            "then": {
                "classification.identifier": "conficker"
            }
        },
        {
            "rule": "bitdefender",
            "if": {
                "malware.name": "bitdefender-(.*)$"
            },
            "then": {
                "malware.name": "{matches[malware.name][1]}"
            }
        },
        {
            "rule": "urlzone",
            "if": {
                "malware.name": "^urlzone2?$"
            },
            "then": {
                "classification.identifier": "urlzone"
            }
        },
        {
            "rule": "default",
            "if": {
                "feed.name": "^Spamhaus Cert$"
            },
            "then": {
                "classification.identifier": "{msg[malware.name]}"
            }
        }
    ]
In our example above we have five groups labeled `Standard Protocols http`, `Spamhaus Cert conficker`, `bitdefender`, `urlzone` and `default`.
All sections will be considered, in the given order (from top to bottom).
Each rule consists of conditions and actions. Conditions and actions are dictionaries holding the field names of events and regular expressions to match values (selection) or set values (action). All matching rules will be applied in the given order. The actions are only performed if all selections apply.
If the value for a condition is an empty string, the bot checks if the field does not exist. This is useful to apply default values for empty fields.
You can set the value of the field to a string literal or number.
In addition, you can use the standard Python string format syntax to access the values from the processed event as `msg` and the match groups of the conditions as `matches`; see the bitdefender example above.
Note that `matches` will also contain the match groups from the default conditions if there were any.
We have an event with `feed.name = Spamhaus Cert` and `malware.name = confickerab`. The expert loops over all sections in the file and eventually enters section `Spamhaus Cert`. First, the default condition is checked, it matches! OK, going on. Otherwise the expert would have selected a different section that has not yet been considered. Now, go through the rules, until we hit the rule `conficker`. We combine the conditions of this rule with the default conditions, and both rules match! So we can apply the action: `classification.identifier` is set to `conficker`, the trivial name.
Assume we have an event with `feed.name = Spamhaus Cert` and `malware.name = feodo`. The default condition matches, but no others. So the default action is applied. The value for `classification.identifier` will be set to `feodo` by `{msg[malware.name]}`.
If the rule is a string, a regex search is performed, also for numeric values (`str()` is called on them). If the rule is numeric for numeric values, a simple comparison is done. If other types are mixed, a warning will be thrown.
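The rule semantics described above (regex conditions, the empty-string check for missing fields, and actions formatted with `msg` and `matches`) can be sketched as follows. This is a simplified illustration under the stated rules, not the expert's actual code: the helpers `rule_matches` and `apply_rules` and the three sample rules are written for this example, and the warning for mixed types is omitted.

```python
import re


def rule_matches(event, conditions):
    """Return a dict of match objects if all conditions hold, else None."""
    matches = {}
    for field, expected in conditions.items():
        if expected == "":
            if field in event:
                return None  # empty string: the field must not exist
        elif field not in event:
            return None
        elif isinstance(expected, str):
            match = re.search(expected, str(event[field]))
            if match is None:
                return None
            matches[field] = match
        elif event[field] != expected:
            return None  # non-string condition: plain comparison
    return matches


def apply_rules(event, rules):
    """Apply the actions of every matching rule, in the given order."""
    event = dict(event)
    for rule in rules:
        matches = rule_matches(event, rule["if"])
        if matches is None:
            continue
        for field, value in rule["then"].items():
            if isinstance(value, str):
                value = value.format(msg=event, matches=matches)
            event[field] = value
    return event


rules = [
    {"if": {"source.port": "^(80|443)$"},
     "then": {"protocol.application": "http"}},
    {"if": {"malware.name": "bitdefender-(.*)$"},
     "then": {"malware.name": "{matches[malware.name][1]}"}},
    {"if": {"classification.identifier": ""},
     "then": {"classification.identifier": "{msg[malware.name]}"}},
]

result = apply_rules({"source.port": 443, "malware.name": "bitdefender-zeus"}, rules)
```

In this run the second rule rewrites `malware.name` to the first match group, and the third rule then fills the missing `classification.identifier` from the already updated event.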
- `name:` national_cert_contact_certat
- `lookup:` https
- `public:` yes
- `cache (redis db):` none
- `description:` https://contacts.cert.at offers an IP address to national CERT contact (and cc) mapping. See https://contacts.cert.at for more info.

- `filter`: (true/false) act as a filter for AT.
- `overwrite_cc`: set to true if you want to overwrite any potentially existing cc fields in the event.
- `name:` reverse-dns
- `lookup:` dns
- `public:` yes
- `cache (redis db):` 8
- `description:` IP to domain
FIXME
Several RFCs define IP addresses and Hostnames (and TLDs) reserved for documentation:
Sources:
- https://tools.ietf.org/html/rfc1918
- https://tools.ietf.org/html/rfc2606
- https://tools.ietf.org/html/rfc3849
- https://tools.ietf.org/html/rfc4291
- https://tools.ietf.org/html/rfc5737
- https://en.wikipedia.org/wiki/IPv4
- `name:` rfc1918
- `lookup:` none
- `public:` yes
- `cache (redis db):` none
- `description:` removes events or single fields with invalid data

- `fields`: list of fields to look at, e.g. "destination.ip,source.ip,source.url"
- `policy`: list of policies, e.g. "del,drop,drop". `drop` drops the entire event, `del` removes the field.
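How `fields` and `policy` pair up position by position can be sketched with the standard `ipaddress` module. This is a reduced illustration, not the bot's code: it only checks IP addresses against the RFC 1918 and RFC 5737 IPv4 ranges (URLs, hostnames and IPv6 are left out), and the helpers `is_reserved` and `apply_policies` are hypothetical.

```python
import ipaddress

RESERVED_NETWORKS = [
    ipaddress.ip_network(network)
    for network in (
        "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",        # RFC 1918
        "192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24",    # RFC 5737
    )
]


def is_reserved(value):
    """Return True if value is an IP address inside one of the listed ranges."""
    try:
        address = ipaddress.ip_address(value)
    except ValueError:
        return False  # not an IP address; out of scope for this sketch
    return any(address in network for network in RESERVED_NETWORKS)


def apply_policies(event, fields, policy):
    """Return the cleaned event, or None if the whole event is dropped."""
    event = dict(event)
    for field, action in zip(fields.split(","), policy.split(",")):
        field, action = field.strip(), action.strip()
        if field in event and is_reserved(event[field]):
            if action == "drop":
                return None
            if action == "del":
                del event[field]
    return event


cleaned = apply_policies(
    {"source.ip": "8.8.8.8", "destination.ip": "10.0.0.1"},
    fields="destination.ip,source.ip",
    policy="del,drop",
)
dropped = apply_policies(
    {"source.ip": "192.168.1.1"},
    fields="destination.ip,source.ip",
    policy="del,drop",
)
```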
- `name:` ripencc-abuse-contact
- `lookup:` https api
- `public:` yes
- `cache (redis db):` 9
- `description:` IP to abuse contact

- `query_ripe_db_asn`: Query for ASNs at `http://rest.db.ripe.net/abuse-contact/as%s.json`, default `true`
- `query_ripe_db_ip`: Query for IPs at `http://rest.db.ripe.net/abuse-contact/%s.json`, default `true`
- `query_ripe_stat_asn`: Query for ASNs at `https://stat.ripe.net/data/abuse-contact-finder/data.json?resource=%s`, default `true`
- `query_ripe_stat_ip`: Query for IPs at `https://stat.ripe.net/data/abuse-contact-finder/data.json?resource=%s`, default `true`
- `mode`: either `append` (default) or `replace`
- `name:` taxonomy
- `lookup:` local config
- `public:` yes
- `cache (redis db):` none
- `description:` use eCSIRT taxonomy to classify events (classification type to classification taxonomy)
FIXME
See the README.md
- `name:` tor-nodes
- `lookup:` local database
- `public:` yes
- `cache (redis db):` none
- `description:` check if IP is tor node
FIXME
- `name:` url2fqdn
- `lookup:` none
- `public:` yes
- `cache (redis db):` none
- `description:` writes domain name from URL to FQDN

- `overwrite`: boolean, replace existing FQDN?
- `name:` file
- `lookup:` no
- `public:` yes
- `cache (redis db):` none
- `description:` output messages (reports or events) to file

- `file`: file path of output file
- `name:` files
- `lookup:` no
- `public:` yes
- `cache (redis db):` none
- `description:` saving of messages as separate files

- `dir`: output directory (default: `/opt/intelmq/var/lib/bots/files-output/incoming`)
- `tmp`: temporary directory (must reside on the same filesystem as `dir`) (default: `/opt/intelmq/var/lib/bots/files-output/tmp`)
- `suffix`: extension of created files (default: `.json`)
- `hierarchical_output`: if `true`, use nested dictionaries; if `false`, use flat structure with dot separated keys (default)
- `single_key`: if `none`, the whole event is saved (default); otherwise the bot saves only contents of the specified key
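The difference between the flat structure and nested dictionaries can be illustrated with a short conversion function. This is a sketch of the idea only; the helper `to_hierarchical` is hypothetical and does not handle conflicting keys such as `a` together with `a.b`.

```python
def to_hierarchical(flat_event):
    """Convert dot separated keys into nested dictionaries."""
    nested = {}
    for dotted_key, value in flat_event.items():
        *parents, leaf = dotted_key.split(".")
        node = nested
        for part in parents:
            node = node.setdefault(part, {})  # descend, creating levels as needed
        node[leaf] = value
    return nested


flat = {"source.ip": "192.0.2.1", "source.port": 80, "feed.name": "Example"}
nested = to_hierarchical(flat)
```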
- `name:` mongodb
- `lookup:` no
- `public:` yes
- `cache (redis db):` none
- `description:` MongoDB is the bot responsible to send events to a MongoDB database

- `collection`: MongoDB collection
- `database`: MongoDB database
- `db_user`: Database user that should be used if you enabled authentication
- `db_pass`: Password associated to `db_user`
- `host`: MongoDB host (FQDN or IP)
- `port`: MongoDB port
- `hierarchical_output`: Boolean (default true). As MongoDB does not allow saving keys with dots, we split the dictionary into sub-dictionaries.
    pip3 install 'pymongo>=2.7.1'
- `name:` postgresql
- `lookup:` no
- `public:` yes
- `cache (redis db):` none
- `description:` PostgreSQL is the bot responsible to send events to a PostgreSQL database
- `notes`: When activating autocommit, transactions are not used: http://initd.org/psycopg/docs/connection.html#connection.autocommit

The parameters marked with 'PostgreSQL' will be sent to libpq via psycopg2. Check the [libpq parameter documentation](https://www.postgresql.org/docs/current/static/libpq-connect.html#LIBPQ-PARAMKEYWORDS) for the versions you are using.

- `autocommit`: psycopg's autocommit mode, optional, default True
- `connect_timeout`: PostgreSQL connect_timeout, optional, default 5 seconds
- `database`: PostgreSQL database
- `host`: PostgreSQL host
- `jsondict_as_string`: save JSONDict fields as JSON string, boolean. Default: true (like in versions before 1.1)
- `port`: PostgreSQL port
- `user`: PostgreSQL user
- `password`: PostgreSQL password
- `sslmode`: PostgreSQL sslmode, can be `'disable'`, `'allow'`, `'prefer'` (default), `'require'`, `'verify-ca'` or `'verify-full'`. See postgresql docs: https://www.postgresql.org/docs/current/static/libpq-connect.html#libpq-connect-sslmode
- `table`: name of the database table into which events are to be inserted
See REQUIREMENTS.txt from your installation.
See outputs/postgresql/README.md from your installation.
- `name:` restapi
- `lookup:` no
- `public:` yes
- `cache (redis db):` none
- `description:` REST API is the bot responsible to send events to a REST API listener through POST

- `auth_token`: the user name / http header key
- `auth_token_name`: the password / http header value
- `auth_type`: one of: `"http_basic_auth"`, `"http_header"`
- `hierarchical_output`: boolean
- `host`: destination URL
- `use_json`: boolean
Sends a MIME Multipart message containing the text and the event as CSV for every single event.
- `name:` smtp
- `lookup:` no
- `public:` yes
- `cache (redis db):` none
- `description:` Sends events via SMTP

- `fieldnames`: a list of field names to be included in the email, comma separated string or list of strings
- `mail_from`: string. Supports formatting, see below
- `mail_to`: string of email addresses, comma separated. Supports formatting, see below
- `smtp_host`: string
- `smtp_password`: string or null, password for authentication on your SMTP server
- `smtp_port`: port
- `smtp_username`: string or null, username for authentication on your SMTP server
- `ssl`: boolean
- `starttls`: boolean
- `subject`: string. Supports formatting, see below
- `text`: string or null. Supports formatting, see below

For several strings you can use values from the event using the standard Python string format syntax.
Access the event's values with `{ev[source.ip]}` and similar.
Authentication is optional. If both username and password are given, these mechanisms are tried: CRAM-MD5, PLAIN, and LOGIN.
Client certificates are not supported. If `http_verify_cert` is true, TLS certificates are checked.
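The `{ev[...]}` placeholders follow Python's `str.format` syntax, which a short sketch can make concrete. The helper `render` and the sample templates are made up for illustration; only the placeholder syntax is taken from the description above.

```python
def render(template, event):
    """Fill {ev[...]} placeholders with values from the event."""
    return template.format(ev=event)


event = {"source.ip": "192.0.2.1", "feed.name": "Example Feed"}

subject = render("Report for {ev[source.ip]}", event)
text = render("Event from {ev[feed.name]} concerning {ev[source.ip]}.", event)
```

In this sketch, a placeholder that refers to a field missing from the event raises a `KeyError`.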
- `name:` tcp
- `lookup:` no
- `public:` yes
- `cache (redis db):` none
- `description:` TCP is the bot responsible to send events to a TCP port (Splunk, ElasticSearch, etc.)

- `ip`: IP of destination server
- `hierarchical_output`: true for a nested JSON, false for a flat JSON.
- `port`: port of destination server
- `separator`: separator of messages