performance degradation for extra fields #1117

navtej · 2017-11-12T16:58:52Z

I have noticed severe performance degradation in bots caused by commit 50eb548 to file intelmq/lib/message.py

Here are two runs with commit 50eb548 and prior commit 6ccaf3b

Logging is set as follows

    "log_processed_messages_count": 10000,
    "log_processed_messages_seconds": 300,

First run is with commit 6ccaf3b and the bot is run for 5 minutes

ubuntu@ubuntu:~/src/intelmq$ git checkout 6ccaf3b82b051749164b0884607b5c161b142056 -- intelmq/lib/message.py
ubuntu@ubuntu:~/src/intelmq$ sudo -H pip3 install -e .
--pip output--
ubuntu@ubuntu:~/src/intelmq$ intelmqctl start taxonomy-expert
ubuntu@ubuntu:~/src/intelmq$ tail -f /opt/intelmq/var/log/taxonomy-expert.log
2017-11-12 13:29:32,022 - taxonomy-expert - INFO - TaxonomyExpertBot initialized with id taxonomy-expert and intelmq 1.1.0.alpha1 and python 3.4.2 (default, Oct  8 2014, 10:45:20) as process 19841.
2017-11-12 13:29:32,022 - taxonomy-expert - INFO - Bot is starting.
2017-11-12 13:29:32,023 - taxonomy-expert - INFO - Pipeline ready.
2017-11-12 13:30:01,141 - taxonomy-expert - INFO - Processed 10000 messages since last logging.
2017-11-12 13:30:30,353 - taxonomy-expert - INFO - Processed 10000 messages since last logging.
2017-11-12 13:30:59,151 - taxonomy-expert - INFO - Processed 10000 messages since last logging.
2017-11-12 13:31:28,417 - taxonomy-expert - INFO - Processed 10000 messages since last logging.
2017-11-12 13:31:56,735 - taxonomy-expert - INFO - Processed 10000 messages since last logging.
2017-11-12 13:32:25,360 - taxonomy-expert - INFO - Processed 10000 messages since last logging.
2017-11-12 13:32:54,706 - taxonomy-expert - INFO - Processed 10000 messages since last logging.
2017-11-12 13:33:23,586 - taxonomy-expert - INFO - Processed 10000 messages since last logging.
2017-11-12 13:33:52,999 - taxonomy-expert - INFO - Processed 10000 messages since last logging.
2017-11-12 13:34:21,959 - taxonomy-expert - INFO - Processed 10000 messages since last logging.
2017-11-12 13:34:53,043 - taxonomy-expert - INFO - Processed 10000 messages since last logging.
2017-11-12 13:35:22,710 - taxonomy-expert - INFO - Processed 10000 messages since last logging.```

The bot processed approx 1,20,000 messages in 5 minutes.

Next run is with commit 50eb548 and the bt is run for 30 minutes

ubuntu@ubuntu:~/src/intelmq$ git checkout 50eb548f57444e87063b456903b38eb9bbb23ed8 -- intelmq/lib/message.py
ubuntu@ubuntu:~/src/intelmq$ sudo -H pip3 install -e .
--pip output--
ubuntu@ubuntu:~/src/intelmq$ intelmqctl start taxonomy-expert
ubuntu@ubuntu:~/src/intelmq$ tail -f /opt/intelmq/var/log/taxonomy-expert.log
2017-11-12 13:40:15,904 - taxonomy-expert - INFO - TaxonomyExpertBot initialized with id taxonomy-expert and intelmq 1.1.0.alpha1 and python 3.4.2 (default, Oct  8 2014, 10:45:20) as process 20422.
2017-11-12 13:40:15,904 - taxonomy-expert - INFO - Bot is starting.
2017-11-12 13:40:15,905 - taxonomy-expert - INFO - Pipeline ready.
2017-11-12 13:45:15,938 - taxonomy-expert - INFO - Processed 9027 messages since last logging.
2017-11-12 13:50:15,980 - taxonomy-expert - INFO - Processed 3406 messages since last logging.
2017-11-12 13:55:16,005 - taxonomy-expert - INFO - Processed 2345 messages since last logging.
2017-11-12 14:00:16,150 - taxonomy-expert - INFO - Processed 1851 messages since last logging.
2017-11-12 14:05:16,216 - taxonomy-expert - INFO - Processed 1778 messages since last logging.
2017-11-12 14:10:16,335 - taxonomy-expert - INFO - Processed 1647 messages since last logging.

This time the bot only processed 20,000 messages in 30 minutes.

I haven't yet dissected deeper, I hope you can reproduce the results and pinpoint the cause of this issue.

The text was updated successfully, but these errors were encountered:

ghost · 2017-11-13T10:45:09Z

Uh. I expected that the speed would be s bit lower but that is really slow. Thanks for identifying the commit.

navtej · 2017-11-13T20:07:22Z

Any possibility you can help me with a quick resolution on this?

ghost · 2017-11-15T15:31:41Z

Unfortunately not yet.

It only affects extra fields and I suspect that the slow part is somewhere in the add-method (actually: some methods used by it):

>>> timeit.timeit('msg.MessageFactory.from_dict({"__type": "Event", "extra.foo": "bar"})', setup='import intelmq.lib.message as msg', number=1000)
7.618276967999918
>>> timeit.timeit('msg.MessageFactory.from_dict({"__type": "Event", "source.abuse_contact": "bar"})', setup='import intelmq.lib.message as msg', number=1000)
0.1765257789993484
>>> timeit.timeit("ev = msg.Event(); ev.add('extra.foo', 'bar')", setup='import intelmq.lib.message as msg', number=100)
0.7326651559997117
>>> timeit.timeit("ev = msg.Event(); ev.add('source.abuse_contact', 'bar')", setup='import intelmq.lib.message as msg', number=100)
0.019149830000969814
>>> timeit.timeit("ev._Message__is_valid_key('extra.foo')", setup='import intelmq.lib.message as msg; ev = msg.Event()', number=100)
0.1854049479989044
>>> timeit.timeit("ev._Message__is_valid_key('source.abuse_contact')", setup='import intelmq.lib.message as msg; ev = msg.Event()', number=100)
0.0003554070008249255
>>> timeit.timeit("ev._Message__get_type_config('extra.foo')", setup='import intelmq.lib.message as msg; ev = msg.Event()', number=100)
0.07898436200048309
>>> timeit.timeit("ev._Message__get_type_config('source.abuse_contact')", setup='import intelmq.lib.message as msg; ev = msg.Event()', number=100)
0.0006972609990043566

navtej · 2017-11-15T16:25:39Z

 ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   251201 1202.798    0.005 2502.155    0.010 message.py:163(add)
   251115  876.961    0.003  901.906    0.004 message.py:291(__is_valid_key)
   251200  309.809    0.001  346.010    0.001 message.py:301(__is_valid_value)
    67165   73.630    0.001   73.630    0.001 {method 'recv' of '_socket.socket' objects}
  1021561   34.987    0.000   95.798    0.000 message.py:350(hash)
 13540536   17.743    0.000   29.494    0.000 utils.py:90(encode)
 27013688   12.840    0.000   12.840    0.000 {method 'update' of '_hashlib.HASH' objects}
13954100/13940189    7.924    0.000    8.112    0.000 {method 'encode' of 'str' objects}
  6753471    6.933    0.000    6.933    0.000 {built-in method builtins.repr}
  1021585    6.751    0.000    6.782    0.000 {built-in method builtins.sorted}
15604264/15604263    5.255    0.000    5.255    0.000 {built-in method builtins.isinstance}
  1021561    3.926    0.000   99.724    0.000 message.py:347(__hash__)
  1021561    2.517    0.000    2.517    0.000 {method 'hexdigest' of '_hashlib.HASH' objects}
    67165    2.129    0.000    4.485    0.000 connection.py:632(pack_command)
  1021562    2.052    0.000    2.052    0.000 {built-in method _hashlib.openssl_sha256}
    67165    1.972    0.000    1.972    0.000 {method 'sendall' of '_socket.socket' objects}
    30768    1.835    0.000    3.610    0.000 _strptime.py:302(_strptime)
    67165    1.200    0.000   75.427    0.001 connection.py:166(_read_from_socket)
    16847    1.194    0.000 2481.536    0.147 message.py:92(__init__)

Here is cProfile run, add, __is_valid_key and __is_valid_value are the culprits

navtej · 2017-11-16T05:54:10Z

I did the timeit test with repeat and it led to interesting result

timeit.repeat("ev = msg.Event(); ev.add('extra.foo', 'bar')", setup='import intelmq.lib.message as msg', number=1000, repeat=10)
[1.447678582975641, 2.464164612116292, 3.438820904120803, 4.1598789119161665, 5.545920182950795, 6.403191218851134, 7.351596172899008, 8.178117901086807, 8.996471178950742, 10.242410328937694]

Notice the linear increase in time. This forced me to look away from the methods listed in above comment. I started to look at functool decorator lru_cache, here are the results with lru_cache decorator commented out

timeit.repeat("ev = msg.Event(); ev.add('extra.foo', 'bar')", setup='import intelmq.lib.message as msg', number=1000, repeat=10)
[0.23544840887188911, 0.2455875410232693, 0.22563187615014613, 0.22631947207264602, 0.2341673190239817, 0.23926178296096623, 0.22105620708316565, 0.22270849510096014, 0.21890562190674245, 0.2204773100093007]

The only explanation I can think of is that lru_cache relies on hash and somehow this doesnt works well in current setup.

I have submitted the PR #1119 for fix, in which I reordered the call to _get_type_config in add, and passed the result to __is_valid_key and __is_valid_value in params, so that these functions dont have to call _get_type_config

ghost · 2017-12-12T15:26:12Z

I reworked JSONDict a bit (without any changes at in these validation methods in the message class yet) and for the timeit-benchmark I am now as fast as on 1.0.x (actually ~5% faster, but that's nitpicking).

ghost · 2018-06-28T12:29:34Z

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

performance degradation for extra fields #1117

performance degradation for extra fields #1117

navtej commented Nov 12, 2017

ghost commented Nov 13, 2017

navtej commented Nov 13, 2017

ghost commented Nov 15, 2017

navtej commented Nov 15, 2017

navtej commented Nov 16, 2017

ghost commented Dec 12, 2017 •

edited by ghost

Loading

ghost commented Jun 28, 2018

ghost commented Jul 19, 2018

performance degradation for extra fields #1117

performance degradation for extra fields #1117

Comments

navtej commented Nov 12, 2017

ghost commented Nov 13, 2017

navtej commented Nov 13, 2017

ghost commented Nov 15, 2017

navtej commented Nov 15, 2017

navtej commented Nov 16, 2017

ghost commented Dec 12, 2017 • edited by ghost Loading

ghost commented Jun 28, 2018

ghost commented Jul 19, 2018

ghost commented Dec 12, 2017 •

edited by ghost

Loading