Improve speed #984

Merged
merged 18 commits into from
Apr 4, 2023

Conversation

hitenkoku
Collaborator

@hitenkoku hitenkoku commented Mar 25, 2023

What Changed

  • Improved speed by removing unnecessary processing

Evidence

Benchmark with 6.1 GB of EVTX files

main json-timeline (Elapsed time: 00:04:41.829)
Results Summary:

First Timestamp: 2009-07-14 13:56:45.074 +09:00
Last Timestamp: 2022-09-18 23:37:13.088 +09:00

Events with hits / Total events: 1,594,356 / 4,817,181 (Data reduction: 3,222,825 events (66.90%))

Total | Unique detections: 1,627,665 | 152
Total | Unique critical detections: 0 (0.00%) | 0 (0.00%)
Total | Unique high detections: 12,043 (0.74%) | 20 (13.16%)
Total | Unique medium detections: 11,015 (0.68%) | 42 (27.63%)
Total | Unique low detections: 1,054,520 (64.79%) | 40 (26.32%)
Total | Unique informational detections: 550,087 (33.80%) | 50 (32.89%)

Dates with most total detections:
critical: n/a, high: 2022-09-18 (4,438), medium: 2022-02-08 (4,802), low: 2022-09-18 (911,835), informational: 2022-03-02 (206,078)

Top 5 computers with most unique detections:
critical: n/a
high: evtx-PC (10), Agamemnon (7), DESKTOP-A8CALR3 (7), DESKTOP-6D0DBMB (7), WIN-FPV0DSIC9O6 (3)
medium: Agamemnon (27), DESKTOP-6D0DBMB (20), DESKTOP-A8CALR3 (19), evtx-PC (13), WIN-FPV0DSIC9O6 (9)
low: DESKTOP-6D0DBMB (28), DESKTOP-A8CALR3 (22), Agamemnon (20), evtx-PC (15), WIN-FPV0DSIC9O6.sigma.fr (12)
informational: DESKTOP-6D0DBMB (36), DESKTOP-A8CALR3 (35), WIN-TKC15D7KHUR (33), Agamemnon (31), WIN-FPV0DSIC9O6.sigma.fr (30)

...
Saved file: main.json (957.4 MB)
Elapsed time: 00:04:41.829
Rule Parse Processing Time: 00:00:02.197
Analysis Processing Time: 00:04:14.859
Output Processing Time: 00:00:24.769

Memory usage stats:
heap stats:    peak      total      freed    current       unit      count
  reserved:    4.1 GiB    4.1 GiB   83.0 MiB    4.0 GiB
 committed:    3.3 GiB   71.4 GiB   68.1 GiB    3.2 GiB
     reset:      0          0          0          0                            ok
   touched:   64.2 KiB   23.9 MiB   55.7 GiB  -55.7 GiB                        ok
  segments:     14        383        374          9                            not all freed!
-abandoned:      0          0          0          0                            ok
   -cached:      0          0          0          0                            ok
     pages:      0          0      755.6 Ki  -755.6 Ki                         ok
-abandoned:      0          0          0          0                            ok
 -extended:      0
 -noretire:      0
     mmaps:      0
   commits:   62.7 Ki
   threads:     32         32          0         32                            not all freed!
  searches:     0.0 avg
numa nodes:       1
   elapsed:     281.843 s
   process: user: 2558.031 s, system: 15.453 s, faults: 5359182, rss: 3.3 GiB, commit: 3.4 GiB

This PR json-timeline (Elapsed time: 00:04:14.211)
Results Summary:

First Timestamp: 2009-07-14 13:56:45.074 +09:00
Last Timestamp: 2022-09-18 23:37:13.088 +09:00

Events with hits / Total events: 1,594,356 / 4,817,181 (Data reduction: 3,222,825 events (66.90%))

Total | Unique detections: 1,627,665 | 152
Total | Unique critical detections: 0 (0.00%) | 0 (0.00%)
Total | Unique high detections: 12,043 (0.74%) | 20 (13.16%)
Total | Unique medium detections: 11,015 (0.68%) | 42 (27.63%)
Total | Unique low detections: 1,054,520 (64.79%) | 40 (26.32%)
Total | Unique informational detections: 550,087 (33.80%) | 50 (32.89%)

Dates with most total detections:
critical: n/a, high: 2022-09-18 (4,438), medium: 2022-02-08 (4,802), low: 2022-09-18 (911,835), informational: 2022-03-02 (206,078)

Top 5 computers with most unique detections:
critical: n/a
high: evtx-PC (10), Agamemnon (7), DESKTOP-A8CALR3 (7), DESKTOP-6D0DBMB (7), WIN-FPV0DSIC9O6 (3)
medium: Agamemnon (27), DESKTOP-6D0DBMB (20), DESKTOP-A8CALR3 (19), evtx-PC (13), WIN-FPV0DSIC9O6 (9)
low: DESKTOP-6D0DBMB (28), DESKTOP-A8CALR3 (22), Agamemnon (20), evtx-PC (15), WIN-FPV0DSIC9O6.sigma.fr (12)
informational: DESKTOP-6D0DBMB (36), DESKTOP-A8CALR3 (35), WIN-TKC15D7KHUR (33), Agamemnon (31), WIN-FPV0DSIC9O6.sigma.fr (30)

...

Saved file: improve.json (957.4 MB)
Elapsed time: 00:04:14.211
Rule Parse Processing Time: 00:00:00.881
Analysis Processing Time: 00:03:48.796
Output Processing Time: 00:00:24.532

Memory usage stats:
heap stats:    peak      total      freed    current       unit      count
  reserved:    4.2 GiB    4.2 GiB  109.0 MiB    4.1 GiB
 committed:    3.3 GiB   70.6 GiB   67.4 GiB    3.1 GiB
     reset:      0          0          0          0                            ok
   touched:   64.2 KiB   24.0 MiB   55.7 GiB  -55.7 GiB                        ok
  segments:     14        385        376          9                            not all freed!
-abandoned:      0          0          0          0                            ok
   -cached:      0          0          0          0                            ok
     pages:      0          0      755.6 Ki  -755.6 Ki                         ok
-abandoned:      0          0          0          0                            ok
 -extended:      0
 -noretire:      0
     mmaps:      0
   commits:   62.4 Ki
   threads:     32         32          0         32                            not all freed!
  searches:     0.0 avg
numa nodes:       1
   elapsed:     254.244 s
   process: user: 2521.625 s, system: 13.343 s, faults: 5290026, rss: 3.3 GiB, commit: 3.4 GiB
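
As a rough check on the two json-timeline runs above, the elapsed times can be converted to seconds and compared. This is a minimal sketch; `parse_elapsed` and `reduction_percent` are illustrative helper names and are not part of Hayabusa.

```rust
/// Parse an "HH:MM:SS.mmm" elapsed-time string into seconds.
/// Returns None if the string does not have exactly three numeric fields.
fn parse_elapsed(s: &str) -> Option<f64> {
    let mut parts = s.trim().split(':');
    let hours: f64 = parts.next()?.parse().ok()?;
    let minutes: f64 = parts.next()?.parse().ok()?;
    let seconds: f64 = parts.next()?.parse().ok()?;
    if parts.next().is_some() {
        return None; // more than three fields
    }
    Some(hours * 3600.0 + minutes * 60.0 + seconds)
}

/// Percentage by which `after` is shorter than `before`
/// (negative when `after` is slower).
fn reduction_percent(before: f64, after: f64) -> f64 {
    (before - after) / before * 100.0
}

fn main() {
    // Elapsed times copied from the two json-timeline runs above.
    let main_run = parse_elapsed("00:04:41.829").expect("valid elapsed time");
    let pr_run = parse_elapsed("00:04:14.211").expect("valid elapsed time");
    println!(
        "main: {main_run:.3} s, PR: {pr_run:.3} s, reduction: {:.2}%",
        reduction_percent(main_run, pr_run)
    );
}
```

For these two runs the elapsed time is roughly 9.8% shorter. That is a single pair of measurements, so it is subject to run-to-run noise.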


  • main csv-timeline
  1. Elapsed time: 00:04:09.352
    • Rule Parse Processing Time: 00:00:01.456
    • Analysis Processing Time: 00:03:55.534
    • Output Processing Time: 00:00:12.360
  2. Elapsed time: 00:04:03.522
    • Rule Parse Processing Time: 00:00:00.889
    • Analysis Processing Time: 00:03:50.359
    • Output Processing Time: 00:00:12.272
  3. Elapsed time: 00:04:02.660
    • Rule Parse Processing Time: 00:00:00.890
    • Analysis Processing Time: 00:03:49.488
    • Output Processing Time: 00:00:12.281
  • This PR csv-timeline
  1. Elapsed time: 00:04:15.633
    • Rule Parse Processing Time: 00:00:01.538
    • Analysis Processing Time: 00:04:01.778
    • Output Processing Time: 00:00:12.316
  2. Elapsed time: 00:04:01.801
    • Rule Parse Processing Time: 00:00:00.904
    • Analysis Processing Time: 00:03:48.721
    • Output Processing Time: 00:00:12.174
  3. Elapsed time: 00:04:04.515
    • Rule Parse Processing Time: 00:00:00.885
    • Analysis Processing Time: 00:03:51.402
    • Output Processing Time: 00:00:12.226
  4. Elapsed time: 00:04:02.902
    • Rule Parse Processing Time: 00:00:00.935
    • Analysis Processing Time: 00:03:49.765
    • Output Processing Time: 00:00:12.200

@hitenkoku hitenkoku added the enhancement New feature or request label Mar 25, 2023
@hitenkoku hitenkoku self-assigned this Mar 25, 2023
Collaborator

@fukusuket fukusuket left a comment

In my environment (6.1 GB of EVTX files), the speed, memory usage, and detection result file all appeared to be the same before and after the fix. There are no regressions, though, so LGTM!! 🚀

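@YamatoSecurity
Collaborator

@hitenkoku Thank you! I have only been able to test CSV output so far, but here is what I got:
2.3.2:
With the -E EID filter enabled: processing time 00:20:42.163, memory usage 12.5 GiB
With it disabled: processing time 00:25:46.253, memory usage 12.5 GiB

PR:
With the -E EID filter enabled: processing time 00:21:07.684, memory usage 12.5 GiB
With it disabled: processing time 00:26:08.211, memory usage 12.2 GiB

Memory usage is down by 300 MB, but processing time has gone up slightly, so I would like to take my time checking this.

@fukusuket Could you check this when you have time?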

@hitenkoku
Collaborator Author

hitenkoku commented Mar 26, 2023

@YamatoSecurity Thank you for checking even though you are busy. Because this changes String to CompactString, it may have become correspondingly slower. (In exchange, memory usage went down.)

@fukusuket
Collaborator

fukusuket commented Mar 26, 2023

Running csv-timeline against the 6.1 GB of EVTX files, using the rules after update-rules, gave the following results! 🤔
(I rebooted the machine before each measurement.)

| Version | Option | Elapsed time | Memory (peak) | Events with hits / Total events | Output file size (bytes) |
|---|---|---|---|---|---|
| 2.3.2 | - | 00:06:46.046 | 3.8 GiB | 1,594,356 / 4,817,181 | 594986069 |
| main | - | 00:06:45.302 | 3.8 GiB | 1,594,356 / 4,817,181 | 594986069 |
| This PR | - | 00:06:45.944 | 3.8 GiB | 1,594,356 / 4,817,181 | 594986069 |
| 2.3.2 | -E | 00:06:38.384 | 3.8 GiB | 1,594,356 / 4,817,181 | 594986069 |
| main | -E | 00:06:37.118 | 3.8 GiB | 1,594,356 / 4,817,181 | 594986069 |
| This PR | -E | 00:06:42.050 | 3.8 GiB | 1,594,356 / 4,817,181 | 594986069 |

(Sorry 🙇 The main -E / This PR -E results appear to have been wrong, so I re-measured and corrected them.)

@hitenkoku
Collaborator Author

@fukusuket Thank you! I will open a separate pull request that reverts the places changed to CompactString (in filter and elsewhere) back to String.

@hitenkoku
Collaborator Author

I have created it as #985.
Separately, I will look into whether there are other fast String-like types.

…eb4f59f3164229a6bae0d49

Revert "perf(filter, yaml): replaced String with ComapctString"
@codecov

codecov bot commented Mar 28, 2023

Codecov Report

Patch coverage: 39.66% and project coverage change: -0.15 ⚠️

Comparison is base (f2f20ef) 75.96% compared to head (904dd04) 75.81%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #984      +/-   ##
==========================================
- Coverage   75.96%   75.81%   -0.15%     
==========================================
  Files          24       24              
  Lines       17221    17178      -43     
==========================================
- Hits        13082    13024      -58     
- Misses       4139     4154      +15     

| Impacted Files | Coverage Δ |
|---|---|
| src/detections/detection.rs | 71.76% <23.24%> (-0.49%) ⬇️ |
| src/options/profile.rs | 86.08% <56.25%> (-0.27%) ⬇️ |
| src/afterfact.rs | 45.44% <70.78%> (-1.50%) ⬇️ |
| src/detections/message.rs | 92.04% <87.50%> (+0.15%) ⬆️ |
| src/detections/configs.rs | 55.97% <100.00%> (ø) |

@YamatoSecurity
Collaborator

I tried once more, and it is still like this:
00:25:27.342 -> 00:26:36.196
12.5 GiB -> 12.2 GiB
Memory usage is down, but processing time is slower.

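@fukusuket
Collaborator

fukusuket commented Mar 28, 2023

@YamatoSecurity
What were the conditions for each of the following? (Just to be sure: were all of them identical across the runs you compared?)

  • Command options
  • Profile
  • Detection rules
  • Number of detections
  • Number of records

I have not been able to find anything in the code diff that looks likely to cause a one-minute difference, but it may be that a speed difference shows up more readily depending on the conditions above... 🤔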

@hitenkoku
Collaborator Author

@fukusuket @YamatoSecurity

In c1ef6ea, replacing the CompactString in Profile with Cow made it a little faster. Could you please check?

The results below were obtained with 6.1 GB of EVTX files on Windows 11.

 >./improve4.exe csv-timeline -d ../all-evtx -o improve4.csv -q -E --debug
...
Results Summary:

First Timestamp: 2009-07-14 13:56:45.074 +09:00
Last Timestamp: 2022-09-18 23:37:13.088 +09:00

Events with hits / Total events: 1,594,356 / 4,817,181 (Data reduction: 3,222,825 events (66.90%))

Total | Unique detections: 1,627,665 | 152
Total | Unique critical detections: 0 (0.00%) | 0 (0.00%)
Total | Unique high detections: 12,043 (0.74%) | 20 (13.16%)
Total | Unique medium detections: 11,015 (0.68%) | 42 (27.63%)
Total | Unique low detections: 1,054,520 (64.79%) | 40 (26.32%)
Total | Unique informational detections: 550,087 (33.80%) | 50 (32.89%)

Dates with most total detections:
critical: n/a, high: 2022-09-18 (4,438), medium: 2022-02-08 (4,802), low: 2022-09-18 (911,835), informational: 2022-03-02 (206,078)

Top 5 computers with most unique detections:
critical: n/a
high: evtx-PC (10), Agamemnon (7), DESKTOP-A8CALR3 (7), DESKTOP-6D0DBMB (7), WIN-FPV0DSIC9O6 (3)
medium: Agamemnon (27), DESKTOP-6D0DBMB (20), DESKTOP-A8CALR3 (19), evtx-PC (13), WIN-FPV0DSIC9O6 (9)
low: DESKTOP-6D0DBMB (28), DESKTOP-A8CALR3 (22), Agamemnon (20), evtx-PC (15), WIN-FPV0DSIC9O6.sigma.fr (12)
informational: DESKTOP-6D0DBMB (36), DESKTOP-A8CALR3 (35), WIN-TKC15D7KHUR (33), Agamemnon (31), WIN-FPV0DSIC9O6.sigma.fr (30)

...

Saved file: improve4.csv (595.0 MB)
Elapsed time: 00:04:02.449
Rule Parse Processing Time: 00:00:01.487
Analysis Processing Time: 00:03:48.471
Output Processing Time: 00:00:12.490

@hitenkoku hitenkoku added this to the v2.4.0 milestone Mar 29, 2023
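@YamatoSecurity
Collaborator

> I have not been able to find anything in the code diff that looks likely to cause a one-minute difference, but it may be that a speed difference shows up more readily depending on the conditions above... 🤔

Sorry for not recording proper evidence!
Lately I have been running with all rules enabled:
./target/release/hayabusa csv-timeline -d ../logs -o test.csv -n -D -u --debug

The output file size and the number of detections are the same.

@hitenkoku
When I tried the latest version, I got
main (00:25:46.253, memory usage: 12.5 GiB) -> PR (00:24:55.799, memory usage: 13.0 GiB)
so it is faster, but memory usage has gone up slightly.

I will look into it once more later. Please wait a little.

@fukusuket It looks like we should settle on the options, profile, and so on to use when benchmarking. I will think about what would work best.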

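@hitenkoku
Collaborator Author

hitenkoku commented Mar 29, 2023

Thank you for checking! With Cow, some of the processing has no choice but to create a temporary String via to_string, and there also seem to be issues with Cow lifetimes, so I am still wondering whether there is a better approach. Also, there are places in the replace-style processing where Aho-Corasick could probably be used, so it would be good to bring that in as well...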
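
The trade-off described in this comment (Cow avoids an allocation until a temporary String becomes unavoidable, and its lifetime is tied to the borrowed input) can be illustrated with a small standard-library sketch. The function name `expand_placeholder`, the `%Computer%` placeholder, and the sample values are hypothetical and are not taken from the Hayabusa code; this only shows the general pattern, not what c1ef6ea actually does.

```rust
use std::borrow::Cow;

/// Hypothetical helper: substitute `placeholder` with `value` in `template`.
/// When the placeholder is absent, the input is returned as a borrow and no
/// String is allocated; otherwise `str::replace` builds a new owned String.
fn expand_placeholder<'a>(template: &'a str, placeholder: &str, value: &str) -> Cow<'a, str> {
    if template.contains(placeholder) {
        Cow::Owned(template.replace(placeholder, value))
    } else {
        Cow::Borrowed(template)
    }
}

fn main() {
    // No placeholder present: the result borrows from the input.
    let unchanged = expand_placeholder("EventID", "%Computer%", "evtx-PC");
    println!("{unchanged} (borrowed: {})", matches!(&unchanged, Cow::Borrowed(_)));

    // Placeholder present: an owned String has to be created.
    let changed = expand_placeholder("host=%Computer%", "%Computer%", "evtx-PC");
    println!("{changed} (borrowed: {})", matches!(&changed, Cow::Borrowed(_)));
}
```

The returned `Cow<'a, str>` cannot outlive `template`, which is one form of the lifetime constraint mentioned above. Any caller that needs to store the value for longer has to convert it with `into_owned()`, paying for the allocation after all.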

Collaborator

@YamatoSecurity YamatoSecurity left a comment

Speed has improved by about 3 to 5%. LGTM!
Thank you!

@fukusuket
Collaborator

Sorry for the late check 🙇 I confirmed the speedup in my environment as well! LGTM! 🚀

| Version | Option | Elapsed time | Memory (peak) | Events with hits / Total events | Output file size (bytes) |
|---|---|---|---|---|---|
| main | csv-timeline -n -D -u --debug | 00:07:33.695 | 9.8 GiB | 1,594,356 / 4,817,181 | 1823073069 |
| This PR | csv-timeline -n -D -u --debug | 00:07:09.212 | 10.4 GiB | 1,594,356 / 4,817,181 | 1823073069 |

@hitenkoku
Collaborator Author

Thank you. As for Cow, it looks difficult unless it is done with static, so for now I will merge this in its current state as the speedup work.

@hitenkoku hitenkoku merged commit 276ff6f into main Apr 4, 2023
@YamatoSecurity YamatoSecurity deleted the improve_speed branch April 12, 2023 00:09
@YamatoSecurity YamatoSecurity restored the improve_speed branch April 13, 2023 04:43
@hitenkoku hitenkoku deleted the improve_speed branch April 27, 2023 14:02