Bar in a bar charts are getting unsorted when adding a sub-bucket aggregation #17532

melvynator · 2018-04-03T22:07:49Z

Kibana version: 6.2.2

Elasticsearch version: 6.2.2

Server OS version: Mac OS

Browser version: Google chrome Version 65.0.3325.162 (Official Build) (64-bit)

Browser OS version: Chrome 65 on Mac OS X 10

Original install method (e.g. download page, yum, from source, etc.): Download page

Description of the problem including expected versus actual behavior:
The problem appears when adding a sub-bucket to a bar chart. If I have this bar chart:

It's a simple terms aggregation on a specific field.

If I split the series using another terms aggregation, the sorting gets messed up:

This visualisation is not in accordance with the Elasticsearch response:

{
  "3": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [
      {
        "key": "neutral",
        "doc_count": 6091
      }
    ]
  },
  "key": "nicetrybertha",
  "doc_count": 6091
},
{
  "3": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [
      {
        "key": "positive",
        "doc_count": 5325
      }
    ]
  },
  "key": "JennaGuillaume",
  "doc_count": 5325
},
{
  "3": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [
      {
        "key": "positive",
        "doc_count": 4626
      }
    ]
  },
  "key": "malgico",
  "doc_count": 4626
}

It may be because of the polarity of certain buckets, but I have no means to confirm this hypothesis.

Steps to reproduce:

1. Build a bar chart
2. Define a terms aggregation
3. Split the series using another terms aggregation
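
For readers who want to reproduce this outside the UI, here is a minimal sketch of the kind of request body these steps would produce. The sub-aggregation id "3" matches the response excerpt above; the parent id "2", the field names `author.keyword` and `sentiment.keyword`, and the sizes are assumptions for illustration only, not values taken from the report.

```python
import json


def build_split_series_request(x_field, series_field, x_size=5, series_size=3):
    """Build a nested terms aggregation body: one terms agg for the x-axis,
    with a terms sub-aggregation that splits each x bucket into series."""
    return {
        "size": 0,  # only the aggregations are needed, not the hits
        "aggs": {
            "2": {  # assumed id for the x-axis aggregation
                "terms": {
                    "field": x_field,
                    "size": x_size,
                    "order": {"_count": "desc"},
                },
                "aggs": {
                    "3": {  # id seen in the response excerpt
                        "terms": {
                            "field": series_field,
                            "size": series_size,
                            "order": {"_count": "desc"},
                        }
                    }
                },
            }
        },
    }


# Hypothetical field names; substitute the fields used in your own index.
body = build_split_series_request("author.keyword", "sentiment.keyword")
print(json.dumps(body, indent=2))
```

Each x bucket in the response then carries its own sub-bucket list under "3", sorted independently of the other x buckets.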

@thomasneirynck I will provide a dataset ASAP


timroes · 2018-04-11T12:41:55Z

Thanks for the report. I am able to reproduce this with makelogs data as follows:

Terms on extension.raw, size: 5
- Terms on machine.os.raw, size: 2

There could be a possible workaround for you. Can you try switching the "Order By" of the "Split Series" to "Term" (or whatever the third option is) instead of using "metric: Count"? Since you have all 3 possible sentiment values within each of the author buckets, this might work.

melvynator · 2018-04-20T01:23:13Z

@timroes @ppisljar

Thanks for the replication.

I tried to apply the workaround, but it doesn't seem to fix the issue:

ogtool · 2018-05-17T17:22:05Z

I believe we have this issue as well (ES + Kibana both 6.2.3). I lodged it at https://discuss.elastic.co/t/sub-aggregation-graph-ordering-off-first-bucket-not-overall-data-set/132216 and was pointed at this existing request.

What appears to be happening is that the sorting (for the legend and each bar graph) is based on the popularity of the sub-aggs in the first bar graph. The same ordering is then applied to every single bar graph. If your data sample for the first bucket is an outlier (as mine is in the screenshot below), this leads to a pretty illogical UX for the rest of the graph. This assumption seems to be consistent across all the samples other users have provided.
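
A small sketch may make the described behaviour easier to see. It is a model of the symptom only, not Kibana's actual code: the series order is taken from the order in which series keys are first encountered, and that single order is then reused for every bar. The data and the function names are hypothetical.

```python
# Hypothetical data: each x bucket lists its sub-buckets as (series key, count),
# already sorted by count within that bucket, as a terms sub-aggregation returns them.
X_BUCKETS = [
    ("mon", [("error", 90), ("ok", 10)]),               # outlier first bucket
    ("tue", [("ok", 80), ("warn", 15), ("error", 5)]),
    ("wed", [("ok", 70), ("warn", 20), ("error", 10)]),
]


def first_seen_series_order(x_buckets):
    """Collect series keys in the order they are first encountered."""
    order = []
    for _, subs in x_buckets:
        for key, _ in subs:
            if key not in order:
                order.append(key)
    return order


def stack_bars(x_buckets, series_order):
    """Lay out every bar using the same series order, filling gaps with 0."""
    bars = {}
    for x_key, subs in x_buckets:
        counts = dict(subs)
        bars[x_key] = [(s, counts.get(s, 0)) for s in series_order]
    return bars


order = first_seen_series_order(X_BUCKETS)
bars = stack_bars(X_BUCKETS, order)
print(order)
for x_key, segments in bars.items():
    print(x_key, segments)
```

In this sample, "error" leads the legend and every stack because it dominates the first bucket, even though it is the smallest series in the other two bars.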

The expected output (in my opinion) is that all sub-aggregations should be considered across the whole displayed time period, and the sorting of the legend and within each bar graph should then reflect all data points, not just the first bucket's sub-aggs.

Some may expect that each bar in the graph is ordered based off its own sub-aggs. Maybe this should be a user-selectable option, as I feel that if you ask 100 people for their opinion, you'd get 150 different answers.

CCM

willphillips-armedia · 2018-11-19T18:46:43Z

Hey, this is still an open issue in 6.4.2. Anyone know if any progress has been made on this? This is seriously impacting our users. #25687

lukeelmers · 2019-02-20T07:06:16Z

An exploratory PR (#27723) was opened to investigate this, and here are our findings after much discussion (some of which is captured in the PR comments).

There are really two things that are causing confusion:

1. x-axis values are not ordered consistently when dealing with split series visualizations. I think this was the original intent of the issue @willphillips-armedia created in Sorting of X-Axis is incorrect when "Split Series" is also enabled #25687. This affects all point series charts.
2. The ordering of subbuckets within stacked bar charts is determined based on the order of the first agg result that comes back, and that order is applied to all subsequent buckets. This may lead to surprising behavior when using stacked bars, for example if your first result is an outlier and affects the sorting of all other bars (and the legend). This is essentially the problem @ogtool is describing above.

I'm not sure if @melvynator intended for this issue to address item 1, item 2, or both, but for the time being I opened #31534 to track item 1 separately. (A solution for that issue is already in progress)

As for item 2, there are a lot of things to consider:

- Kibana has, as far as we are able to tell, always behaved this way. It's a result of the underlying data schema we are using to pass around series data: Preserve x-axis ordering in split series vis. #27723 (comment)
- The sorting is primarily an issue when you are using stacked split series bar charts. When using non-stacked charts, I think the current sorting behavior (where subbucket order is consistent across each X value) is probably what most users would expect.
- In order to introduce this requested functionality to Kibana, we would need to:
  1. Redesign the schema we are using for vislib / point series data and roll it out across vislib.
  2. Determine how we should actually sort subbuckets. (As @ogtool rightly points out, there could be a lot of opinions on how this should be handled.) e.g. Is it based on all of the data that is currently visible? Is it sorted individually for each x value? How should the legend be ordered?
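
To make the open questions in the last point concrete, here is a rough sketch of two candidate strategies: ordering series by their total across all visible data, and ordering them individually per x value. This is an illustration under assumed data shapes, not a proposal for the vislib schema; the function names and sample data are hypothetical.

```python
from collections import Counter

# Hypothetical sample: (x key, [(series key, count), ...])
SAMPLE = [
    ("a", [("x", 5), ("y", 1)]),
    ("b", [("y", 9), ("z", 4), ("x", 2)]),
]


def order_by_global_total(x_buckets):
    """One order for the whole chart: series sorted by summed count, descending.
    Ties are broken by key so the result is deterministic."""
    totals = Counter()
    for _, subs in x_buckets:
        for key, count in subs:
            totals[key] += count
    return sorted(totals, key=lambda k: (-totals[k], k))


def order_per_x_value(x_buckets):
    """A separate order for each x value, sorted by that bucket's own counts."""
    return {
        x_key: [key for key, _ in sorted(subs, key=lambda kc: (-kc[1], kc[0]))]
        for x_key, subs in x_buckets
    }


print(order_by_global_total(SAMPLE))
print(order_per_x_value(SAMPLE))
```

The global variant yields a single order that can also drive the legend. The per-x variant yields a different order for each bar, so the legend order would still have to be chosen some other way.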

Since this change would be a team effort that requires overhauling large portions of vislib, I'm going to keep this issue open for now in order to keep tabs on item 2 until we can set aside some time to tackle it properly.

The good news is that item 1 should be fixed any day now, and we have a much clearer picture of the effort that would be involved to make item 2 happen.

larrywongl · 2019-07-17T08:43:04Z

Any update on this issue? I have met the same issue.

Knksumanth · 2019-12-09T18:23:37Z

Any update on when this issue will be addressed, and in which version we can expect it to be fixed?

formiaczek · 2020-05-15T08:30:27Z

Hello,

This is to also 👍 and follow up on another discussion and its conclusions on this subject, made here.

To summarize: the issue can be reproduced with the Kibana demo here

As @lukeelmers pointed out:

The ordering of subbuckets (...) is determined based on the order of the first agg result that comes back, and that order is applied to all subsequent buckets.

Nested buckets allow for 'grouping' the data by some criteria, the innermost bucket being a 'sort-of' result meant to present / make sense of the data.

Currently it is possible to:

- re-arrange the order of buckets (this is useful and affects the way data is processed in subsequent buckets)
- for each bucket (for most of the aggregations) there is 'OrderBy' and 'Size'. Because it is 'for a given bucket', it really seems that this should be applied within the bucket (and not in some outer/parent bucket).
- for each bucket the pair 'OrderBy' and 'Size X' seems like it should produce the 'Top' (or 'Bottom', depending on 'Ascending' vs 'Descending') X items from the resulting bucket (and not necessarily from an outer bucket that likely contains other, unrelated items). There is an issue with this too (see linked discussion), whereby items expected to fall within the 'Top/Bottom X' range might disappear from the bucket if they don't fall into this range in the root bucket (and this is very likely if Size is small enough). Given this unpredictability, the only safe way to currently use 'Size' is to set it to a 'big enough' value so that expected results are not discarded.

I think that if the 'OrderBy' set against a particular bucket is not applied within that bucket, and because in some use cases the original 'order' of the first aggregation (current behaviour) might actually be desired, then to preserve the 'existing' behaviour and address the issues that have arisen, perhaps either of the following could be considered:

- 'OrderBy' SHOULD NOT be available and accessible for sub-buckets at all (perhaps it could appear in the 'root/parent' bucket upon adding sub-buckets instead). This would make it unambiguously clear where and when it is applied in the processing.
- Don't 'hard-code' it like now and allow for more control, by giving 'OrderBy' an additional option about WHERE it is applied, e.g.:
  - 'Apply in current bucket' (this is what I and others contributing to this and related discussions have expected and really need), AND
  - 'Apply to outermost (root) bucket' (this is the current 'default', and could stay the 'default' to avoid breaking other things).
  - A more 'generic' version of the above: 'Apply to 'XX' bucket', where XX is 'current', 'root' or any of the buckets on the path from 'current' to 'root' (again, 'root' being the default if nothing is selected, to preserve current behaviour).

When it is possible to control and apply the 'OrderBy' to a selected bucket, this 'Size'-related issue will also get addressed.

When someone really depends on deterministic behaviour, being able to control aggregation results with regard to sorting is really needed.
Unfortunately there is currently NO way to achieve this. There is an impression that there is, because it 'sometimes' works like that, but 'sometimes' is not really good enough. ELK is and should be a solid stack; it is so powerful and useful that it seems surprising that it doesn't already cope with this, especially when it has been raised for many years now. I think it is really time to fix / update this now instead of 'bouncing' it back again: every reply made to comments on this now gets more and more links to other discussions (spanning many years), and every reply from the ELK Team seems to say 'yeah, maybe someday, not now.. you see? this has been like that forever and we can't change it now'.

Sorry if I sound sarcastic - this is not the intention, really - I'm just trying to point out that I really care, and know it could be done better. I can even spare my time to help if possible / needed - just let me know!

Thanks!
Lukasz.

markov00 · 2024-10-10T08:19:44Z

Closing this because the 6.x version is no longer under maintenance. Please upgrade to the latest 8.x version.

thomasneirynck added bug Fixes for quality problems that affect the customer experience Feature:Visualizations Generic visualization features (in case no more specific feature label is available) triage_needed labels Apr 3, 2018

timroes added Feature:XYAxis XY-Axis charts (bar, area, line) and removed triage_needed labels Apr 11, 2018

timroes assigned ppisljar Apr 11, 2018

TomonoriSoejima mentioned this issue Apr 25, 2018

Bar in a bar chart gets incorrect sorted order when adding a sub-bucket aggregation #18551

Closed

This was referenced Sep 10, 2018

Incorrect bar ordering for sub-bucket term aggregation and custom metric as "Order by" #22207

Closed

X-axis order in Line chart gets messed up after adding sub-bucket aggregation #22231

Closed

Inconsistent ascending/descending ordering #22863

Closed

timroes added the Team:Visualizations Visualization editors, elastic-charts and infrastructure label Sep 16, 2018

timroes removed Feature:Visualizations Generic visualization features (in case no more specific feature label is available) labels Oct 1, 2018

timroes mentioned this issue Dec 7, 2018

Sorting of X-Axis is incorrect when "Split Series" is also enabled #25687

Closed

timroes unassigned ppisljar Dec 7, 2018

lukeelmers self-assigned this Dec 7, 2018

leamingrad mentioned this issue Dec 12, 2018

Empty X-Axis buckets disappear when series is split in horizontal bar chart #27018

Closed

lukeelmers mentioned this issue Dec 21, 2018

Preserve x-axis ordering in split series vis. #27723

Closed

3 tasks

timroes added the app-stabilizing label Feb 6, 2019

lukeelmers mentioned this issue Feb 19, 2019

x-axis order not preserved in split series vislib charts #31534

Closed

lukeelmers added enhancement New value added to drive a business result Feature:Vislib Vislib chart implementation and removed app-stabilizing bug Fixes for quality problems that affect the customer experience labels Feb 20, 2019

timroes mentioned this issue Mar 19, 2019

Split series alphabetical order is not respected #32135

Closed

lukeelmers mentioned this issue Apr 5, 2019

Better legend ordering when using date histograms #34567

Closed

timroes mentioned this issue Apr 23, 2019

Sub-bucket terms agg isn't ordered correctly in series chart legend #35485

Closed

lukeelmers removed their assignment May 14, 2020

This was referenced Mar 29, 2022

[Lens] Allow client-side sorting of dimensions and legends at datasource level for all chart types #86184

Open

[Meta][Lens] Data Modelling #57708

Closed

adrien-cahoreau mentioned this issue Apr 29, 2022

[BUG] Heatmap Y-axis does not sort correctly opensearch-project/OpenSearch-Dashboards#882

Open

markov00 closed this as not planned Won't fix, can't repro, duplicate, stale Oct 10, 2024
