Python Bindings

MatejKastak edited this page Jan 27, 2023 · 7 revisions

Python bindings provide an easy and comfortable way to use yaramod in your Python scripts. You should be able to do most of the things that are available in C++, but you may run into a few limitations of these bindings.

The yaramod Python bindings follow the same principles as the C++ interface, so you should first read about parsing and construction of YARA rules.

General information

  • Names of classes are the same as in C++.
  • Methods use snake_case instead of camelCase.
  • Methods which begin with get, set and is are usually transformed into class properties.
  • Nested enums are no longer nested but they are still prefixed with the parent name (Rule::Modifier is RuleModifier in Python).
  • Enums which contain the value None have it replaced with the value Empty to prevent a name clash with the Python keyword.
  • Modifying lists or dictionaries returned by methods/properties does not directly modify the contents of the object they come from. Use the dedicated methods instead.
  • Methods which expect std::vector as a parameter can be given a list object in Python.
  • Methods which expect std::istream as a parameter can be given a str object in Python.
  • Methods which expect variadic arguments can be given variadic arguments in Python too.
  • Methods which expect a pointer/reference to an object can be given an instance of that class in Python. However, be advised that there is much more copying going on when using the Python bindings. This is most noticeable when using YaraFileBuilder.with_rule(). In C++ this method does not copy; it moves the given Rule, but in Python the Rule is copied. So even though you still hold a valid Rule instance after calling with_rule(), you should no longer use it, because any modification to it would not be reflected in the YaraFileBuilder.
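
The copy semantics described in the last bullets can be pictured with a plain-Python stand-in. The sketch below does not use yaramod at all: ToyRule, ToyFileBuilder and add_tag() are hypothetical names that only mimic the behaviour described above (properties hand out copies, with_rule() stores a copy), so treat it as an illustration rather than as the bindings' implementation.

```python
import copy


class ToyRule:
    """Hypothetical stand-in for a bound object; not part of yaramod."""

    def __init__(self, name, tags):
        self.name = name
        self._tags = list(tags)

    @property
    def tags(self):
        # Hand out a copy, so callers never see the internal list.
        return list(self._tags)

    def add_tag(self, tag):
        # A dedicated method is the only way to change the internal state.
        self._tags.append(tag)


class ToyFileBuilder:
    """Hypothetical stand-in that copies its argument in with_rule()."""

    def __init__(self):
        self._rules = []

    def with_rule(self, rule):
        self._rules.append(copy.deepcopy(rule))
        return self

    def get(self):
        return list(self._rules)


rule = ToyRule('example', ['tag1'])
rule.tags.append('ignored')   # changes a temporary copy only
rule.add_tag('tag2')          # changes the rule itself

builder = ToyFileBuilder().with_rule(rule)
rule.add_tag('too_late')      # the builder already holds its own copy

built_tags = builder.get()[0].tags
print(rule.tags)
print(built_tags)
```

In this sketch the tag added after with_rule() never reaches the builder's copy, which is the pitfall the last bullet warns about.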

Bindings tables
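
Most entries in the tables below follow the naming rules from the general information. As a rule of thumb only, the translation can be sketched with the helper functions below; they are hypothetical and not part of yaramod. The tables remain authoritative, because some names are abbreviated (for example at_expr) and some getters that take arguments keep their prefix (for example get_meta_with_name()).

```python
import re


def python_member_name(cpp_name):
    """Guess the Python name of a C++ method (heuristic only)."""
    # camelCase -> snake_case: put '_' between a lowercase letter or digit
    # and the uppercase letter that follows it, then lowercase everything.
    snake = re.sub(r'(?<=[a-z0-9])(?=[A-Z])', '_', cpp_name).lower()
    # Getters and setters usually become a property without the prefix;
    # 'is' methods keep their prefix (isPrivate -> is_private).
    for prefix in ('get_', 'set_'):
        if snake.startswith(prefix):
            return snake[len(prefix):]
    return snake


def python_enum_name(cpp_qualified_name):
    """Nested enums are flattened: Rule::Modifier -> RuleModifier."""
    return cpp_qualified_name.replace('::', '')


def python_enum_value(cpp_value):
    """The enum value None is exposed as Empty."""
    return 'Empty' if cpp_value == 'None' else cpp_value


for name in ('getModifiersText', 'isFullword', 'withHexIntMeta', 'readUInt16'):
    print(name, '->', python_member_name(name))
```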

Parser bindings

C++ Python
parseFile() parse_file()
parseStream() parse_string()

Types bindings

YaraFile

C++ Python
YaraFile::getText() YaraFile.text
YaraFile::getRules() YaraFile.rules
YaraFile::getImports() YaraFile.imports
YaraFile::findSymbol() YaraFile.find_symbol()
YaraFile::addRule() YaraFile.add_rule()
YaraFile::insertRule() YaraFile.insert_rule()
YaraFile::removeRules() YaraFile.remove_rules()
YaraFile::removeImports() YaraFile.remove_imports()
YaraFile::getTokenStream() YaraFile.tokenstream()

Rule

C++ Python
Rule::getText() Rule.text
Rule::getName() Rule.name
Rule::getMetas() Rule.metas
Rule::getStrings() Rule.strings
Rule::getTags() Rule.tags
Rule::getModifier() Rule.modifier
Rule::isPrivate() Rule.is_private
Rule::isGlobal() Rule.is_global
Rule::getLocation() Rule.location
Rule::getSymbol() Rule.symbol
Rule::getCondition() Rule.condition
Rule::setCondition() Rule.condition
Rule::removeString() Rule.remove_string()
Rule::getMetaWithName() Rule.get_meta_with_name()

Rule::Location

C++ Python
Rule::Location::filePath RuleLocation.file_path
Rule::Location::lineNumber RuleLocation.line_number

Module

C++ Python
Module::getName() Module.name

String

C++ Python
String::getText() String.text
String::getPureText() String.pure_text
String::getType() String.type
String::getIdentifier() String.identifier
String::isPlain() String.is_plain
String::isHex() String.is_hex
String::isRegexp() String.is_regexp
String::isAscii() String.is_ascii
String::isWide() String.is_wide
String::isFullword() String.is_fullword
String::isNocase() String.is_nocase
String::getModifiersText() String.modifiers_text

Regexp

C++ Python
Regexp::getSuffixModifiers() Regexp.suffix_modifiers

Symbol

C++ Python
Symbol::getName() Symbol.name
Symbol::getDataType() Symbol.data_type
Symbol::isValue() Symbol.is_value
Symbol::isArray() Symbol.is_array
Symbol::isDictionary() Symbol.is_dictionary
Symbol::isFunction() Symbol.is_function
Symbol::isStructure() Symbol.is_structure

StructureSymbol

C++ Python
StructureSymbol::getAttribute() StructureSymbol.get_attribute()

Expression

C++ Python
Expression::accept() Expression.accept()
Expression::getText() Expression.text
Expression::getText() Expression.get_text()

StringExpression

C++ Python
StringExpression::getId() StringExpression.id
StringExpression::setId() StringExpression.id

StringWildcardExpression

C++ Python
StringWildcardExpression::getId() StringWildcardExpression.id
StringWildcardExpression::setId() StringWildcardExpression.id

StringAtExpression

C++ Python
StringAtExpression::getId() StringAtExpression.id
StringAtExpression::setId() StringAtExpression.id
StringAtExpression::getAtExpression() StringAtExpression.at_expr
StringAtExpression::setAtExpression() StringAtExpression.at_expr

StringInRangeExpression

C++ Python
StringInRangeExpression::getId() StringInRangeExpression.id
StringInRangeExpression::setId() StringInRangeExpression.id
StringInRangeExpression::getRangeExpression() StringInRangeExpression.range_expr
StringInRangeExpression::setRangeExpression() StringInRangeExpression.range_expr

StringOffsetExpression

C++ Python
StringOffsetExpression::getId() StringOffsetExpression.id
StringOffsetExpression::setId() StringOffsetExpression.id
StringOffsetExpression::getIndexExpression() StringOffsetExpression.index_expr
StringOffsetExpression::setIndexExpression() StringOffsetExpression.index_expr

StringLengthExpression

C++ Python
StringLengthExpression::getId() StringLengthExpression.id
StringLengthExpression::setId() StringLengthExpression.id
StringLengthExpression::getIndexExpression() StringLengthExpression.index_expr
StringLengthExpression::setIndexExpression() StringLengthExpression.index_expr

UnaryOpExpression

C++ Python
UnaryOpExpression::getOperand() UnaryOpExpression.operand
UnaryOpExpression::setOperand() UnaryOpExpression.operand

BinaryOpExpression

C++ Python
BinaryOpExpression::getLeftOperand() BinaryOpExpression.left_operand
BinaryOpExpression::setLeftOperand() BinaryOpExpression.left_operand
BinaryOpExpression::getRightOperand() BinaryOpExpression.right_operand
BinaryOpExpression::setRightOperand() BinaryOpExpression.right_operand

SetExpression

C++ Python
SetExpression::getElements() SetExpression.elements
SetExpression::setElements() SetExpression.elements

RangeExpression

C++ Python
RangeExpression::getLow() RangeExpression.low
RangeExpression::getHigh() RangeExpression.high

IdExpression

C++ Python
IdExpression::getSymbol() IdExpression.symbol
IdExpression::setSymbol() IdExpression.symbol

StructAccessExpression

C++ Python
StructAccessExpression::getStructure() StructAccessExpression.structure
StructAccessExpression::setStructure() StructAccessExpression.structure

ArrayAccessExpression

C++ Python
ArrayAccessExpression::getArray() ArrayAccessExpression.array
ArrayAccessExpression::setArray() ArrayAccessExpression.array
ArrayAccessExpression::getAccessor() ArrayAccessExpression.accessor
ArrayAccessExpression::setAccessor() ArrayAccessExpression.accessor

FunctionCallExpression

C++ Python
FunctionCallExpression::getFunction() FunctionCallExpression.function
FunctionCallExpression::setFunction() FunctionCallExpression.function
FunctionCallExpression::getArguments() FunctionCallExpression.arguments
FunctionCallExpression::setArguments() FunctionCallExpression.arguments

LiteralExpression

C++ Python
LiteralExpression::getValue() LiteralExpression.value

ParenthesesExpression

C++ Python
ParenthesesExpression::getEnclosedExpression() ParenthesesExpression.enclosed_expr
ParenthesesExpression::setEnclosedExpression() ParenthesesExpression.enclosed_expr

IntFunctionExpression

C++ Python
IntFunctionExpression::getFunction() IntFunctionExpression.function
IntFunctionExpression::setFunction() IntFunctionExpression.function
IntFunctionExpression::getArguments() IntFunctionExpression.arguments
IntFunctionExpression::setArguments() IntFunctionExpression.arguments

RegexpExpression

C++ Python
RegexpExpression::getRegexpString() RegexpExpression.regexp_string
RegexpExpression::setRegexpString() RegexpExpression.regexp_string

Builder bindings

YaraFileBuilder

C++ Python
YaraFileBuilder::get() YaraFileBuilder.get()
YaraFileBuilder::withModule() YaraFileBuilder.with_module()
YaraFileBuilder::withRule() YaraFileBuilder.with_rule()

YaraRuleBuilder

C++ Python
YaraRuleBuilder::get() YaraRuleBuilder.get()
YaraRuleBuilder::withName() YaraRuleBuilder.with_name()
YaraRuleBuilder::withModifier() YaraRuleBuilder.with_modifier()
YaraRuleBuilder::withTag() YaraRuleBuilder.with_tag()
YaraRuleBuilder::withStringMeta() YaraRuleBuilder.with_string_meta()
YaraRuleBuilder::withIntMeta() YaraRuleBuilder.with_int_meta()
YaraRuleBuilder::withUIntMeta() YaraRuleBuilder.with_uint_meta()
YaraRuleBuilder::withHexIntMeta() YaraRuleBuilder.with_hex_int_meta()
YaraRuleBuilder::withBoolMeta() YaraRuleBuilder.with_bool_meta()
YaraRuleBuilder::withPlainString() YaraRuleBuilder.with_plain_string()
YaraRuleBuilder::withHexString() YaraRuleBuilder.with_hex_string()
YaraRuleBuilder::withRegexp() YaraRuleBuilder.with_regexp()
YaraRuleBuilder::withCondition() YaraRuleBuilder.with_condition()

YaraExpressionBuilder

C++ Python
YaraExpressionBuilder::get() YaraExpressionBuilder.get()
YaraExpressionBuilder::operator~() YaraExpressionBuilder.__invert__()
YaraExpressionBuilder::operator-() YaraExpressionBuilder.__neg__()
YaraExpressionBuilder::operator==() YaraExpressionBuilder.__eq__()
YaraExpressionBuilder::operator!=() YaraExpressionBuilder.__ne__()
YaraExpressionBuilder::operator<() YaraExpressionBuilder.__lt__()
YaraExpressionBuilder::operator>() YaraExpressionBuilder.__gt__()
YaraExpressionBuilder::operator<=() YaraExpressionBuilder.__le__()
YaraExpressionBuilder::operator>=() YaraExpressionBuilder.__ge__()
YaraExpressionBuilder::operator+() YaraExpressionBuilder.__add__()
YaraExpressionBuilder::operator-() YaraExpressionBuilder.__sub__()
YaraExpressionBuilder::operator*() YaraExpressionBuilder.__mul__()
YaraExpressionBuilder::operator/() YaraExpressionBuilder.__truediv__()
YaraExpressionBuilder::operator%() YaraExpressionBuilder.__mod__()
YaraExpressionBuilder::operator^() YaraExpressionBuilder.__xor__()
YaraExpressionBuilder::operator&() YaraExpressionBuilder.__and__()
YaraExpressionBuilder::operator|() YaraExpressionBuilder.__or__()
YaraExpressionBuilder::operator<<() YaraExpressionBuilder.__lshift__()
YaraExpressionBuilder::operator>>() YaraExpressionBuilder.__rshift__()
YaraExpressionBuilder::operator()() YaraExpressionBuilder.__call__()
YaraExpressionBuilder::operator[]() YaraExpressionBuilder.__getitem__()
YaraExpressionBuilder::access() YaraExpressionBuilder.access()
YaraExpressionBuilder::contains() YaraExpressionBuilder.contains()
YaraExpressionBuilder::matches() YaraExpressionBuilder.matches()
YaraExpressionBuilder::readInt8() YaraExpressionBuilder.read_int8()
YaraExpressionBuilder::readInt16() YaraExpressionBuilder.read_int16()
YaraExpressionBuilder::readInt32() YaraExpressionBuilder.read_int32()
YaraExpressionBuilder::readUInt8() YaraExpressionBuilder.read_uint8()
YaraExpressionBuilder::readUInt16() YaraExpressionBuilder.read_uint16()
YaraExpressionBuilder::readUInt32() YaraExpressionBuilder.read_uint32()
YaraExpressionBuilder::operator!() not_()
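
The operator rows above rely on Python's special methods, so expressions can be combined with ordinary operator syntax. The sketch below shows the mechanism with a hypothetical, text-only ToyExpr class; it is not the real YaraExpressionBuilder and covers only a few operators. It also shows a likely reason why logical negation is exposed as the function not_(): Python's not operator always produces a plain bool and cannot return a custom object.

```python
class ToyExpr:
    """Hypothetical text-only expression builder; not the yaramod class."""

    def __init__(self, text):
        self.text = text

    def __invert__(self):            # ~a
        return ToyExpr('~' + self.text)

    def __and__(self, other):        # a & b
        return ToyExpr(self.text + ' & ' + other.text)

    def __lt__(self, other):         # a < b
        return ToyExpr(self.text + ' < ' + other.text)

    def __getitem__(self, index):    # a[i]
        return ToyExpr(self.text + '[' + index.text + ']')

    def __call__(self, *args):       # f(x, y)
        return ToyExpr(self.text + '(' + ', '.join(a.text for a in args) + ')')


def not_(expr):
    # 'not expr' would yield a bool, so negation is a plain function here.
    return ToyExpr('not ' + expr.text)


a, b, f = ToyExpr('a'), ToyExpr('b'), ToyExpr('f')

combined = not_((~a & b) < a[b])
call = f(a, b)
plain_not = not a   # a bool, not a ToyExpr

print(combined.text)
print(call.text)
print(type(plain_not).__name__)
```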

Expression builder functions

C++ Python
set() set()
range() range()
conjunction() conjunction()
disjunction() disjunction()
filesize() filesize()
entrypoint() entrypoint()
all() all()
any() any()
them() them()
regexp() regexp()

YaraHexStringBuilder

C++ Python
YaraHexStringBuilder::get() YaraHexStringBuilder.get()
YaraHexStringBuilder::add() YaraHexStringBuilder.add()

Hex string builder functions

C++ Python
wildcard() wildcard()
wildcardLow() wildcard_low()
wildcardHigh() wildcard_high()
jumpVarying() jump_varying()
jumpFixed() jump_fixed()
jumpVaryingRange() jump_varying_range()
jumpRange() jump_range()
alt() alt()

TokenStream

C++ Python
empty() empty
size() size
front() front
back() back
getTokens() tokens
getTokensAsText() tokens_as_text
begin() begin
end() end
find() find()

TokenIt

C++ Python
*self value
++self increment()
--self decrement()
return ++self next()
return --self previous()

Visitor bindings

Since Python does not support function overloading and the pybind11 library which we use has its limitations too, we applied unconventional measures to make the visitor interface work in Python. visit() from C++ becomes visit_XYZ(), where XYZ is the name of the type being visited (for example visit_FunctionCallExpression()). This should keep your Python code as similar as possible to its C++ counterpart.

This is the Python version of the FunctionCallDumper example which we introduced when describing the parsing of YARA files.

import yaramod

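# Visitor that prints the name of every function called in an expression.
class FunctionCallDumper(yaramod.ObservingVisitor):
    def visit_FunctionCallExpression(self, expr):
        print('Function call: {}'.format(expr.function.text))
        # Descend into the arguments so that nested calls are reported too.
        for arg in expr.arguments: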
            arg.accept(self)
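
To see how the visit_XYZ() naming replaces C++ overloading, the dispatch can be imitated in plain Python. The sketch below does not use yaramod: the expression classes are local stand-ins that merely share their names with the real types, and ToyExpression and CallCollector are hypothetical. How the bindings implement accept() internally is not covered here; this only illustrates dispatch by type name.

```python
class ToyExpression:
    """Hypothetical base class standing in for a bound expression type."""

    def accept(self, visitor):
        # Dispatch on the runtime class name: visit_<ClassName>.
        method = getattr(visitor, 'visit_' + type(self).__name__)
        return method(self)


class LiteralExpression(ToyExpression):
    """Local stand-in, not the yaramod class."""

    def __init__(self, value):
        self.value = value


class FunctionCallExpression(ToyExpression):
    """Local stand-in, not the yaramod class."""

    def __init__(self, name, arguments):
        self.name = name
        self.arguments = arguments


class CallCollector:
    """Collects function names; literals are visited but ignored."""

    def __init__(self):
        self.calls = []

    def visit_LiteralExpression(self, expr):
        pass

    def visit_FunctionCallExpression(self, expr):
        self.calls.append(expr.name)
        # Visit the arguments so that nested calls are collected as well.
        for arg in expr.arguments:
            arg.accept(self)


tree = FunctionCallExpression(
    'outer',
    [LiteralExpression(1), FunctionCallExpression('inner', [])],
)
collector = CallCollector()
tree.accept(collector)
print(collector.calls)
```

Each expression type gets its own method name, so a single visitor class can handle every node type without overloaded visit() functions.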