Add search_after to geonames and http_logs #130

mayya-sharipova · 2020-09-07T21:26:55Z

As we are optimizing "search_after" performance, we would like to
measure it. This adds a "search_after" operation to geonames
and http_logs, with a sufficiently large value for "search_after".

1. geonames
   - "population" field. As most documents have a value of 0 for this field,
     search_after was added to the asc sort.
   - "geonameid" field, which has unique values for each document.
     "search_after" was set to 5000000, as an approximate median
     value.
2. http_logs
   - "@timestamp" field. "search_after" was set to "1998-06-10"
     for the desc and asc sorts, which is approximately a median value.
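
For illustration, the request bodies described above can be sketched in Python as plain dictionaries. This is a minimal sketch, not the exact operations added by this PR: the helper name `sort_with_search_after` is hypothetical, and the `match_all` query is an assumption about what the operations search over. The field names and the `search_after` values are the ones quoted in the description.

```python
import json


def sort_with_search_after(field, order, after_value):
    """Build a search body that sorts on one field and resumes after a sort value.

    Hypothetical helper for illustration; the query below is an assumption.
    """
    return {
        "query": {"match_all": {}},  # assumption: the operation matches every document
        "sort": [{field: order}],
        # search_after takes one value per sort key, in the same order as "sort"
        "search_after": [after_value],
    }


# Values taken from the PR description.
geonameid_asc = sort_with_search_after("geonameid", "asc", 5000000)
timestamp_desc = sort_with_search_after("@timestamp", "desc", "1998-06-10")
timestamp_asc = sort_with_search_after("@timestamp", "asc", "1998-06-10")

print(json.dumps(timestamp_desc, indent=2))
```

Because `search_after` must line up with the sort keys, keeping both in one helper makes it harder to change one without the other.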

mayya-sharipova · 2020-09-07T21:32:50Z

Some results obtained on my laptop:

http_logs

|                       Metric |                                                   Task |       Value |   Unit |
| ----------------------------:|-------------------------------------------------------:|------------:|-------:|
| 50th percentile service time |             asc-sort-timestamp-after-force-merge-1-seg |     6.83232 |     ms |
| 90th percentile service time |             asc-sort-timestamp-after-force-merge-1-seg |     7.08107 |     ms |
| 99th percentile service time |             asc-sort-timestamp-after-force-merge-1-seg |     7.71589 |     ms |
|100th percentile service time |             asc-sort-timestamp-after-force-merge-1-seg |     9.33675 |     ms |
|
| 50th percentile service time |  asc-sort-with-after-timestamp-after-force-merge-1-seg |     709.761 |     ms |
| 90th percentile service time |  asc-sort-with-after-timestamp-after-force-merge-1-seg |     731.649 |     ms |
| 99th percentile service time |  asc-sort-with-after-timestamp-after-force-merge-1-seg |     752.732 |     ms |
|100th percentile service time |  asc-sort-with-after-timestamp-after-force-merge-1-seg |     767.543 |     ms |
|
| 50th percentile service time |            desc-sort-timestamp-after-force-merge-1-seg |     54.4112 |     ms |
| 90th percentile service time |            desc-sort-timestamp-after-force-merge-1-seg |      56.511 |     ms |
| 99th percentile service time |            desc-sort-timestamp-after-force-merge-1-seg |     61.0517 |     ms |
|100th percentile service time |            desc-sort-timestamp-after-force-merge-1-seg |     61.6246 |     ms |
|
| 50th percentile service time | desc-sort-with-after-timestamp-after-force-merge-1-seg |     734.612 |     ms |
| 90th percentile service time | desc-sort-with-after-timestamp-after-force-merge-1-seg |     770.386 |     ms |
| 99th percentile service time | desc-sort-with-after-timestamp-after-force-merge-1-seg |     821.791 |     ms |
|100th percentile service time | desc-sort-with-after-timestamp-after-force-merge-1-seg |     834.089 |     ms |

Full results here

geonames

|                       Metric |                           Task |       Value |   Unit |
| ----------------------------:|-------------------------------:|------------:|-------:|
| 50th percentile service time |            asc_sort_population |      49.872 |     ms |
| 90th percentile service time |            asc_sort_population |     53.4872 |     ms |
| 99th percentile service time |            asc_sort_population |     58.1984 |     ms |
|100th percentile service time |            asc_sort_population |     59.2803 |     ms |
|
| 50th percentile service time | asc_sort_with_after_population |      72.391 |     ms |
| 90th percentile service time | asc_sort_with_after_population |     75.4028 |     ms |
| 99th percentile service time | asc_sort_with_after_population |     84.9475 |     ms |
|100th percentile service time | asc_sort_with_after_population |     85.8957 |     ms |
|
| 50th percentile service time |             asc_sort_geonameid |     4.17862 |     ms |
| 90th percentile service time |             asc_sort_geonameid |     4.27179 |     ms |
| 99th percentile service time |             asc_sort_geonameid |     4.34915 |     ms |
|100th percentile service time |             asc_sort_geonameid |     4.41063 |     ms |
|
| 50th percentile service time |  asc_sort_with_after_geonameid |     66.9177 |     ms |
| 90th percentile service time |  asc_sort_with_after_geonameid |     70.9752 |     ms |
| 99th percentile service time |  asc_sort_with_after_geonameid |     73.6624 |     ms |
|100th percentile service time |  asc_sort_with_after_geonameid |     80.3342 |     ms |
|
| 50th percentile service time |            desc_sort_geonameid |     6.11204 |     ms |
| 90th percentile service time |            desc_sort_geonameid |     6.86492 |     ms |
| 99th percentile service time |            desc_sort_geonameid |     7.03959 |     ms |
|100th percentile service time |            desc_sort_geonameid |     7.09217 |     ms |
|
| 50th percentile service time | desc_sort_with_after_geonameid |     41.6359 |     ms |
| 90th percentile service time | desc_sort_with_after_geonameid |     43.7668 |     ms |
| 99th percentile service time | desc_sort_with_after_geonameid |     46.2971 |     ms |
|100th percentile service time | desc_sort_with_after_geonameid |     47.1171 |     ms |

Full results here
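
To make the gap between the plain sorts and their search_after counterparts easier to read, the 50th percentile service times reported above can be turned into slowdown factors. The sketch below only does arithmetic on the numbers from this comment; the dictionary keys and the `slowdown` helper are names chosen here for illustration.

```python
# 50th percentile service times in ms, copied from the results in this comment:
# (plain sort, same sort with search_after)
medians_ms = {
    "http_logs @timestamp asc (1 seg)": (6.83232, 709.761),
    "http_logs @timestamp desc (1 seg)": (54.4112, 734.612),
    "geonames population asc": (49.872, 72.391),
    "geonames geonameid asc": (4.17862, 66.9177),
    "geonames geonameid desc": (6.11204, 41.6359),
}


def slowdown(plain_ms, with_after_ms):
    """Return how many times slower the search_after variant is."""
    return with_after_ms / plain_ms


slowdowns = {name: slowdown(*pair) for name, pair in medians_ms.items()}

for name, factor in slowdowns.items():
    print(f"{name}: {factor:.1f}x")
```

On these laptop numbers the asc @timestamp sort shows the largest gap, roughly 100x, while the population sort shows the smallest.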

mayya-sharipova · 2020-09-10T13:59:46Z

Also would like to ask @jimczi if using a median value for search_after looks good to him?

jimczi

I left some comments. Can you share the entire output for the challenges? It's truncated in your comments, and I don't understand why the query is so costly. The fact that the throughput is too high shouldn't affect the service time too much.
Do you have an idea of how long the http_logs search_after query runs on a single node?

geonames/operations/default.json

jimczi · 2020-09-10T14:03:50Z

http_logs/operations/default.json

+ "sort" : [
+ {"@timestamp" : "desc"}
+ ],
+ "search_after": ["1998-06-10"]


http_logs/operations/default.json

jimczi · 2020-09-10T14:05:22Z

http_logs/challenges/default.json

@@ -136,6 +136,20 @@
 "warmup-iterations": 200,
 "iterations": 100,
 "target-throughput": 2
+ },
+ {
+ "name": "desc-sort-with-after-timestamp-after-force-merge-1-seg",


It would be interesting to have the numbers before force merge too.

jimczi · 2020-09-10T14:06:16Z

http_logs/challenges/default.json

+ {
+ "name": "desc-sort-with-after-timestamp-after-force-merge-1-seg",
+ "operation": "desc_sort_with_after_timestamp",
+ "warmup-iterations": 200,


These warmups and iterations are too big imo. 5 to 10 should be enough to have stable results?

Addressed in b9f095f. But I also wanted to confirm with @ebadyano that we are OK with 10 warmup-iterations.

http_logs/challenges/default.json

jimczi · 2020-09-10T14:11:01Z

if using a median value for search_after looks good to him?

+1 to use a median value

mayya-sharipova · 2020-09-10T21:51:00Z

@jimczi Thank you for the feedback. At first I incorrectly reported latency instead of service_time; latency also includes wait time, which is why the numbers were so big. I have updated the comment to report service_time, and the numbers are not so atrocious now.

Also I've chatted with @jimczi offline, and we have defined the following tasks to be tackled:

- Determine the right target-throughput for http_logs sort with search_after. Running a manual test and checking its performance can help here. Also help from the performance team is appreciated.
- Benchmark desc and asc sort with search_after before merging to a single segment.
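
The difference between latency and service_time, and why the choice of target-throughput matters, can be sketched with a toy model. This is a simplified single-client simulation written for illustration, not Rally's actual scheduler: requests are scheduled at fixed intervals of 1 / target-throughput, and a request whose predecessor is still running has to wait. The function name `simulate` and the 0.75 s service time (close to the ~750 ms measured for http_logs) are assumptions.

```python
def simulate(service_time_s, target_throughput, iterations):
    """Return per-request latencies (seconds) for one client on a fixed schedule.

    Toy model: latency = wait time + service time, measured from the moment
    the request was scheduled to start, not from when it actually started.
    """
    interval = 1.0 / target_throughput
    client_free_at = 0.0
    latencies = []
    for i in range(iterations):
        scheduled = i * interval
        # Wait if the previous request has not finished yet.
        start = max(scheduled, client_free_at)
        end = start + service_time_s
        latencies.append(end - scheduled)
        client_free_at = end
    return latencies


# Target throughput too high for a 0.75 s query: the backlog keeps growing.
too_fast = simulate(service_time_s=0.75, target_throughput=2, iterations=100)
# Target throughput the query can sustain: latency equals service time.
sustainable = simulate(service_time_s=0.75, target_throughput=1, iterations=100)

print(f"target 2 ops/s: first {too_fast[0]:.2f} s, last {too_fast[-1]:.2f} s")
print(f"target 1 ops/s: first {sustainable[0]:.2f} s, last {sustainable[-1]:.2f} s")
```

In this model the service time is the same in both runs; only the latency grows when the schedule is faster than the query can be served, which matches the observation that a too-high throughput should not affect service time much.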

mayya-sharipova · 2020-09-14T18:48:12Z

@jimczi I have addressed your comments and ran the benchmarks once more with the updated operations.
Below is a summary of the results.

Full results here

|                       Metric |                                                   Task |       Value |   Unit |
| ----------------------------:|-------------------------------------------------------:|------------:|-------:|
| 50th percentile service time |                                    desc_sort_timestamp |     10.7055 |     ms |
| 90th percentile service time |                                    desc_sort_timestamp |     11.2209 |     ms |
| 99th percentile service time |                                    desc_sort_timestamp |     13.4729 |     ms |
|100th percentile service time |                                    desc_sort_timestamp |      18.994 |     ms |
|
| 50th percentile service time |                         desc_sort_with_after_timestamp |     813.787 |     ms |
| 90th percentile service time |                         desc_sort_with_after_timestamp |     921.016 |     ms |
| 99th percentile service time |                         desc_sort_with_after_timestamp |     1021.73 |     ms |
|100th percentile service time |                         desc_sort_with_after_timestamp |     1144.55 |     ms |    
|                                             
| 50th percentile service time |            desc-sort-timestamp-after-force-merge-1-seg |     53.8716 |     ms |
| 90th percentile service time |            desc-sort-timestamp-after-force-merge-1-seg |     56.3902 |     ms |
| 99th percentile service time |            desc-sort-timestamp-after-force-merge-1-seg |     59.7357 |     ms |
|100th percentile service time |            desc-sort-timestamp-after-force-merge-1-seg |     59.9086 |     ms |
|                                       
| 50th percentile service time | desc-sort-with-after-timestamp-after-force-merge-1-seg |     811.308 |     ms |
| 90th percentile service time | desc-sort-with-after-timestamp-after-force-merge-1-seg |     893.403 |     ms |
| 99th percentile service time | desc-sort-with-after-timestamp-after-force-merge-1-seg |     950.185 |     ms |
|100th percentile service time | desc-sort-with-after-timestamp-after-force-merge-1-seg |     958.059 |     ms |
|
| 50th percentile service time |                                     asc_sort_timestamp |     8.10474 |     ms |
| 90th percentile service time |                                     asc_sort_timestamp |     12.1846 |     ms |
| 99th percentile service time |                                     asc_sort_timestamp |     13.4501 |     ms |
|100th percentile service time |                                     asc_sort_timestamp |     14.9856 |     ms |                
|                                                   
| 50th percentile service time |                          asc_sort_with_after_timestamp |     754.215 |     ms |
| 90th percentile service time |                          asc_sort_with_after_timestamp |      811.58 |     ms |
| 99th percentile service time |                          asc_sort_with_after_timestamp |     883.414 |     ms |
|100th percentile service time |                          asc_sort_with_after_timestamp |     908.114 |     ms |
|                                                    
| 50th percentile service time |             asc-sort-timestamp-after-force-merge-1-seg |     3.69549 |     ms |
| 90th percentile service time |             asc-sort-timestamp-after-force-merge-1-seg |     3.86637 |     ms |
| 99th percentile service time |             asc-sort-timestamp-after-force-merge-1-seg |     4.08493 |     ms |
|100th percentile service time |             asc-sort-timestamp-after-force-merge-1-seg |     4.42581 |     ms |
|
| 50th percentile service time |  asc-sort-with-after-timestamp-after-force-merge-1-seg |      778.27 |     ms |
| 90th percentile service time |  asc-sort-with-after-timestamp-after-force-merge-1-seg |     872.711 |     ms |
| 99th percentile service time |  asc-sort-with-after-timestamp-after-force-merge-1-seg |     905.239 |     ms |
|100th percentile service time |  asc-sort-with-after-timestamp-after-force-merge-1-seg |     920.384 |     ms |

ebadyano · 2020-09-21T20:28:25Z

@mayya-sharipova Thank you for the changes. I ran on our low-mem env and the target throughput looks okay to me. I will merge the request tomorrow and add the charts to https://elasticsearch-benchmarks.elastic.co later this week. Thanks!

dliappis requested a review from ebadyano September 8, 2020 07:13

dliappis added the enhancement label Sep 8, 2020

ebadyano self-assigned this Sep 8, 2020

jimczi requested changes Sep 10, 2020

Address Jim's feedback

b9f095f

ebadyano merged commit 6bb91d4 into elastic:master Sep 22, 2020

mayya-sharipova deleted the search-after branch September 24, 2020 14:28
