Preserve x-axis ordering in split series vis. #27723

lukeelmers · 2018-12-21T22:48:56Z

Closed in favor of #31533

Resolves part of #17532

Summary

This resolves an issue where the original sort order sent back by ES was
lost for point series / vislib visualizations with split series. This
was due to the way the point series agg response handler generated
series data, only filling in series values as it encountered them
bucket-by-bucket, rather than first looking at all x-values and ordering
them consistently within each series.

With this change, when a series is first created in the agg_response, it
will first look at all results, preserving the x-value sort order. Then
when creating new series, it will instantiate a zero-filled array with
the correctly ordered x axis values, filling it in with the real values
as it encounters them.

This duplicates some of the work done in the vislib zero_injection
component, which can likely be cleaned up further, or possibly removed
entirely.
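
As a rough illustration of the two-pass approach described above (not the actual `_get_series.js` code), the sketch below first records the x values in the order they arrive, then creates each series as a zero-filled array over that order and fills in real values as it encounters them. The `rows` shape, the `buildSeries` name, and the sample values are assumptions made for this example.

```javascript
// Hypothetical sketch, not the real point series handler.
// `rows` stands in for the tabified response: one row per (x, series) pair,
// already in the order Elasticsearch returned them.
function buildSeries(rows) {
  // Pass 1: record each x value once, in first-seen (ES) order.
  const xOrder = [];
  const xIndex = new Map();
  for (const { x } of rows) {
    if (!xIndex.has(x)) {
      xIndex.set(x, xOrder.length);
      xOrder.push(x);
    }
  }

  // Pass 2: create each series zero-filled over the full x order,
  // then overwrite slots with real values as they are encountered.
  const series = new Map();
  for (const { x, label, y } of rows) {
    if (!series.has(label)) {
      series.set(label, {
        label,
        values: xOrder.map(xValue => ({ x: xValue, y: 0 })),
      });
    }
    series.get(label).values[xIndex.get(x)].y = y;
  }
  return [...series.values()];
}

// Illustrative rows only.
const rows = [
  { x: '', label: 'win 7', y: 114 },
  { x: '', label: 'ios', y: 117 },
  { x: 'css', label: 'win 8', y: 43 },
  { x: 'css', label: 'win 7', y: 47 },
  { x: 'deb', label: 'win xp', y: 33 },
];
const result = buildSeries(rows);
```

Every series ends up with the same x order (`''`, `'css'`, `'deb'`), including `win xp`, which only appears in the last bucket.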

To Do

~~- [ ] Determine if vislib/components/zero_injection can be removed~~ (Edit: I think we should look at this in a separate PR to keep things smaller and simpler... plus I want to take additional time for testing should we remove this).

Resolve conflicts for compatibility with [WIP] visualizations field formatting refactoring #26951
Fix failing tests

Checklist

~~- [ ] This was checked for cross-browser compatibility, including a check against IE11~~
~~- [ ] Any text added follows EUI's writing guidelines, uses sentence case text and includes i18n support~~
~~- [ ] Documentation was added for features that require explanation or tutorials~~
~~- [ ] This was checked for keyboard-only and screenreader accessibility~~

Unit or functional tests were updated or added to match the most common scenarios

elasticmachine · 2018-12-21T22:48:58Z

Pinging @elastic/kibana-app

elasticmachine · 2018-12-21T23:18:53Z

💔 Build Failed

continuous-integration/kibana-ci/pull-request

elasticmachine · 2018-12-22T00:46:23Z

💔 Build Failed

continuous-integration/kibana-ci/pull-request

elasticmachine · 2018-12-28T00:25:06Z

💚 Build Succeeded

continuous-integration/kibana-ci/pull-request

elasticmachine · 2019-01-23T06:52:20Z

💚 Build Succeeded

continuous-integration/kibana-ci/pull-request

ppisljar

LGTM, tested it in chrome linux

src/ui/public/agg_response/point_series/_get_series.js

markov00

Let me block this PR so we can discuss this a bit:
Are we fixing only the first level of ordering (fixing only the order of the main columns on the x axis)?

Because issue #17532 is not only about the ordering of the bars on the x axis; it is also about the ordering of the split series within each bar (as some of the linked issues report).

You can easily see that if you use the following:

first x axis by extensions terms
split series by machine os.

Now change the ordering of the split series aggregation (ascending and descending) and check the tooltip values: the series appear to be ordered by series, and they do not respect the ordering coming from ES.
The inspector table shows the right results, but the visualization inserts points based on series order, not on data order.

lukeelmers · 2019-01-24T18:02:29Z

After further investigation with @markov00, we confirmed that this PR does indeed only solve part of the problem: While the x-axis will be ordered correctly, subbuckets will still be sorted based on the results of the first agg.

TL;DR: I recommend we merge this PR as it still solves one use case, and open a new PR for the second.

To reiterate Marco's point, take the following example using kibana_sample_data_logs. Here's the aggregation config:

X-Axis terms agg bucket (field.extension.keyword), sorted alphabetically
split series terms agg on machine.os.keyword

Here is an excerpt of the ES response:

      "aggregations": {
        "2": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 218,
          "buckets": [
            {
              "3": {
                "doc_count_error_upper_bound": 0,
                "sum_other_doc_count": 129,
                "buckets": [
                  {
                    "key": "win 7",
                    "doc_count": 114
                  },
                  {
                    "key": "ios",
                    "doc_count": 117
                  },
                  {
                    "key": "osx",
                    "doc_count": 125
                  },
                  {
                    "key": "win 8",
                    "doc_count": 125
                  }
                ]
              },
              "key": "",
              "doc_count": 610
            },
            {
              "3": {
                "doc_count_error_upper_bound": 0,
                "sum_other_doc_count": 62,
                "buckets": [
                  {
                    "key": "win 8",
                    "doc_count": 43
                  },
                  {
                    "key": "win 7",
                    "doc_count": 47
                  },
                  {
                    "key": "osx",
                    "doc_count": 51
                  },
                  {
                    "key": "ios",
                    "doc_count": 54
                  }
                ]
              },
              "key": "css",
              "doc_count": 257
            },
            {
              "3": {
                "doc_count_error_upper_bound": 0,
                "sum_other_doc_count": 44,
                "buckets": [
                  {
                    "key": "win 7",
                    "doc_count": 29
                  },
                  {
                    "key": "win 8",
                    "doc_count": 32
                  },
                  {
                    "key": "win xp",
                    "doc_count": 33
                  },
                  {
                    "key": "ios",
                    "doc_count": 34
                  }
                ]
              },
              "key": "deb",
              "doc_count": 172
            },
            ...

This PR will ensure the order is correct for the first bucket:

["", "css", "deb", ...]

However, the subbuckets will all be ordered based on the first result only:

["win 7", "ios", "osx", "win 8", ...]

This issue is described in deeper detail in the comments on the original issue, and in some of the duplicate issues. I was focused on solving for the x-axes and missed the second use case.

Solving for the subbucket ordering is more complex as it requires reworking our fundamental structure for passing around series data; currently the data is passed to vislib like this:

[
  {label: "win 7", aggLabel: "Count", aggId: "1", count: 0, values: Array(5)},
  {label: "ios", aggLabel: "Count", aggId: "1", count: 0, values: Array(5)},
  {label: "osx", aggLabel: "Count", aggId: "1", count: 0, values: Array(5)},
  {label: "win 8", aggLabel: "Count", aggId: "1", count: 0, values: Array(5)},
  {label: "win xp", aggLabel: "Count", aggId: "1", count: 0, values: Array(5)}
]

values[] contains the point data for each of the (ordered) items on the x-axis.

As you can see, there is no concept of ordering series items within each x-axis bucket, as they only exist once at the outer level.

Solving for the ordering of subbuckets would require a few things:

We would need to decide what is the "source of truth" for ordering the legend. Is this ordered based on the results for the current window of data you are looking at (as was requested in the original issue)? What if you're using a custom metric to order that data?
We would need to rethink the way we are passing data to the charts, such that we could introduce a mechanism to track both overall ordering of the subbuckets, as well as ordering within each individual point on the x-axis.
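
To make the second point more concrete, here is a minimal sketch of the kind of per-x-value ordering information that the current series structure has no place for. The input is a trimmed copy of the ES response excerpt above; the `seriesOrderByX` helper and its output shape are hypothetical, not an existing vislib API.

```javascript
// Trimmed copy of the ES response excerpt (agg "2" = x-axis, agg "3" = split series).
const xBuckets = [
  { key: '', '3': { buckets: [{ key: 'win 7' }, { key: 'ios' }, { key: 'osx' }, { key: 'win 8' }] } },
  { key: 'css', '3': { buckets: [{ key: 'win 8' }, { key: 'win 7' }, { key: 'osx' }, { key: 'ios' }] } },
  { key: 'deb', '3': { buckets: [{ key: 'win 7' }, { key: 'win 8' }, { key: 'win xp' }, { key: 'ios' }] } },
];

// Hypothetical helper: keep the sub-bucket order ES returned for every x value.
function seriesOrderByX(buckets, subAggId) {
  return buckets.map(bucket => ({
    x: bucket.key,
    seriesOrder: bucket[subAggId].buckets.map(sub => sub.key),
  }));
}

const ordering = seriesOrderByX(xBuckets, '3');
```

Something like `ordering` would have to travel alongside the series array so that a chart could stack each x value in its own order while still keeping one overall order for the legend.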

Since this PR still solves one valid use case and is separate from the subbucket ordering issue, I recommend we merge this and open a new PR for the second use case.

elasticmachine · 2019-01-24T23:24:09Z

💚 Build Succeeded

continuous-integration/kibana-ci/pull-request

ppisljar

This solution (probably) has another issue. We are zero filling all the series, which means we are actually changing the results. It will not have any effect on the bar chart, but area and line charts will look different. I don't think there is an easy way around it, as we have no way to differentiate between zero filled values and actual zero values.

A quick overview of where things go wrong:

we correctly convert table (tabify) to preserve all the orders
we correctly create the series: series are ordered by the order they appeared in the response, and their values are ordered by the order they appeared in the response.

But when the chart draws this, it renders one series at a time, in the order they appeared, and always renders all values for each series. This is what produces the wrong x-axis.

One way to fix this would be to pass information about the x axis, in the form of an ordered array of x-axis values. The chart can then use that to make sure the x-axis order is preserved.
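
A minimal sketch of that idea, assuming a hypothetical `xAxisOrderedValues` array handed to the chart next to the series (neither the name nor the `placePoints` helper exists in vislib; the numbers are illustrative):

```javascript
// Hypothetical input: series in the order they would be drawn, plus an explicit x order.
const xAxisOrderedValues = ['', 'css', 'deb'];
const series = [
  // Drawn first, but only has a value for the last x bucket.
  { label: 'win xp', values: [{ x: 'deb', y: 33 }] },
  { label: 'win 7', values: [{ x: '', y: 114 }, { x: 'css', y: 47 }, { x: 'deb', y: 29 }] },
];

// Axis slot of each x value, taken from the ordered array rather than
// from the order in which points happen to be drawn.
function xPositions(orderedValues) {
  return new Map(orderedValues.map((x, i) => [x, i]));
}

function placePoints(seriesList, orderedValues) {
  const positions = xPositions(orderedValues);
  return seriesList.map(s => ({
    label: s.label,
    points: s.values.map(v => ({ slot: positions.get(v.x), y: v.y })),
  }));
}

const placed = placePoints(series, xAxisOrderedValues);
```

Even though `win xp` is drawn first, its `deb` point lands in slot 2 instead of claiming the first position on the axis.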

However, this still doesn't solve the issue @markov00 is mentioning, but I would argue that it is not really an issue. Our charts were designed to behave that way. In most scenarios it makes sense:

You do any chart with non-stacked split series: when the series are not stacked, you might expect them to always show in the same order. For example charts like this:

The order of series is always the same, no matter their value. It would be confusing if the red bar kept jumping left and right.

You do a stacked area or line chart. You will always want the order of series to stay the same, no matter the values. You don't want zig-zag lines just because the order of series changed between data points.

So the only use case where this order doesn't make that much sense (it still might in some scenarios) is a stacked bar chart.

I suggest leaving this out of this PR, opening a feature request for it and referencing it in original issue.

markov00 · 2019-02-14T11:10:33Z

@ppisljar the zigzag thing only depends on how you order the split series. Since we give the user the ability to change the Order by option, they can decide whether it's better to have the subbuckets ordered by metric (which can produce the zigzag, but can be used to compare behaviours on each bucket) or alphabetically (which preserves the bucket order and doesn't create the zigzag).
The thing is that we are not respecting the split series Order by. It's neither alphabetical nor by value; it's first come, first served. Or rather, we just preserve the order of elements in the first sub-bucket, appending any new bucket value to the end of this list. In Luke's example you can see that the first bucket is

    {
      "key": "win 7",
      "doc_count": 114
    },
    {
      "key": "ios",
      "doc_count": 117
    },
    {
      "key": "osx",
      "doc_count": 125
    },
    {
      "key": "win 8",
      "doc_count": 125
    }

That's the order we maintain throughout the visualization. When we find win xp, we just add it to the end of the ordering list, creating something like:
win 7, ios, osx, win 8, win xp, which doesn't have any predictable ordering: it is not alphabetical and it is not by value.

So in conclusion: yes, mine is an issue. We are not taking the split series Order by into consideration.
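
The first-come, first-served behaviour described above can be reduced to a small sketch. The sub-bucket keys are copied from Luke's ES response excerpt; the `encounterOrder` function is a hypothetical simplification, not the code Kibana actually runs.

```javascript
// Sub-bucket keys per x-axis bucket, in the order ES returned them.
const subBucketKeys = [
  ['win 7', 'ios', 'osx', 'win 8'],    // x = ""
  ['win 8', 'win 7', 'osx', 'ios'],    // x = "css"
  ['win 7', 'win 8', 'win xp', 'ios'], // x = "deb"
];

// A label joins the series list the first time it is seen and never moves afterwards.
function encounterOrder(keysPerBucket) {
  const seen = new Set(); // a Set iterates in insertion order
  for (const keys of keysPerBucket) {
    for (const key of keys) seen.add(key);
  }
  return [...seen];
}

const legendOrder = encounterOrder(subBucketKeys);
console.log(legendOrder); // [ 'win 7', 'ios', 'osx', 'win 8', 'win xp' ]
```

The result matches neither an alphabetical sort nor the per-bucket order for "css" or "deb".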

elasticmachine · 2019-02-14T11:39:44Z

💔 Build Failed

continuous-integration/kibana-ci/pull-request

ppisljar · 2019-02-15T06:21:50Z

@markov00 and we never were. But that's not really relevant: it shouldn't be part of this PR, as it's going to be quite a big undertaking. As discussed yesterday over Zoom, the problem is that the data structure we use to represent chart data (series) doesn't hold the information about the ordering of points.

lukeelmers · 2019-02-19T23:24:23Z

I'm closing this in favor of #31533, which will address the issue as follows:

x value ordering is still determined when the series are generated, and is then passed to vislib
we ensure that the correct order is preserved during the zero injection process, that way we aren't zero filling everything by default.

> we have no way to differentiate between zero filled values and actual zero values.

@ppisljar Just a note that I think we can check for the presence of an xi key in the series value to determine if it is zero-filled (xi: Infinity is set on all zero-filled items and is not present on "real" values). But regardless, I think the plan described above is simpler as it doesn't touch as many things.
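
A minimal sketch of that check, relying only on the assumption stated in the note above (zero-filled items carry `xi: Infinity`, real values have no `xi` key). The `isZeroFilled` helper and the sample values are illustrative, and the real value objects carry more fields.

```javascript
// Assumption from the note above: injected items have `xi: Infinity`,
// values that came from the response have no `xi` key at all.
function isZeroFilled(value) {
  return value.xi === Infinity;
}

// Illustrative values only.
const values = [
  { x: 'css', y: 47 },
  { x: 'deb', y: 0, xi: Infinity }, // injected placeholder
  { x: 'gz', y: 0 },                // a genuine zero from the data
];

const realValues = values.filter(v => !isZeroFilled(v));
```

With a check like this, a line or area chart could skip injected points while still drawing genuine zeros.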

lukeelmers added the review, Feature:Vislib, WIP, Feature:Visualizations, v7.0.0, and Team:Visualizations labels Dec 21, 2018

lukeelmers requested review from ppisljar and markov00 December 21, 2018 22:50

lukeelmers force-pushed the fix/subbucket-sorting branch from 384a7a8 to 37d145a Compare December 27, 2018 23:43

lukeelmers added the v6.7.0 label Jan 9, 2019

lukeelmers mentioned this pull request Jan 15, 2019

[6.x] Preserve x-axis ordering in split series vis. #28733

Closed

lukeelmers added 6 commits January 15, 2019 10:29

Fix typo in function name

3b53c82

Fix some of the failing tests.

edac25e

Fix more failing unit tests.

3f6e0c3

Add TODO

caebdd0

Add unit test & clean up.

9e613b3

lukeelmers force-pushed the fix/subbucket-sorting branch from 4ed8394 to 9e613b3 Compare January 23, 2019 06:11

lukeelmers changed the title ~~[WIP] Preserve x-axis ordering in split series vis.~~ Preserve x-axis ordering in split series vis. Jan 23, 2019

lukeelmers removed the WIP label Jan 23, 2019

ppisljar approved these changes Jan 23, 2019

markov00 requested changes Jan 24, 2019

Merge branch 'master' into fix/subbucket-sorting

368a0e4

Update fake x aspects

661757b

ppisljar requested changes Feb 14, 2019

lukeelmers closed this Feb 19, 2019

lukeelmers deleted the fix/subbucket-sorting branch February 19, 2019 23:24

lukeelmers mentioned this pull request Feb 20, 2019

Bar in a bar charts are getting unsorted when adding a sub-bucket aggregation #17532

Closed
