Improve handling of multi-fields in Discover #7419

Closed
djschny opened this issue Jun 10, 2016 · 29 comments
Labels
enhancement (New value added to drive a business result), Feature:Discover (Discover Application), Team:Visualizations (Visualization editors, elastic-charts and infrastructure)

Comments

@djschny
Contributor

djschny commented Jun 10, 2016

In Elasticsearch multi-fields can be defined. These fields are present in the index and can be queried against (just like _all) but are not present in the _source.
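To make the "in the index but not in _source" point concrete, here is a minimal sketch that walks a mapping's `properties` and lists every multi-field, then checks which of them are absent from a document's `_source`. The mapping, the document, and the helper name `list_multi_fields` are all hypothetical, loosely modelled on a Logstash-style template; this is an illustration, not Kibana's code.

```python
from typing import Any, Dict, List


def list_multi_fields(properties: Dict[str, Any], prefix: str = "") -> List[str]:
    """Return the dotted path of every sub-field declared under "fields"."""
    paths: List[str] = []
    for name, spec in properties.items():
        path = prefix + name
        # Sub-fields declared under "fields" are indexed, but they are
        # derived from the parent value and never appear in _source.
        for sub_name in spec.get("fields", {}):
            paths.append(f"{path}.{sub_name}")
        # Object fields nest their children under "properties".
        if "properties" in spec:
            paths.extend(list_multi_fields(spec["properties"], path + "."))
    return paths


# Hypothetical mapping and document (assumptions for illustration only).
properties = {
    "application_name": {
        "type": "string",
        "fields": {"raw": {"type": "string", "index": "not_analyzed"}},
    },
    "message": {"type": "string"},
}
source = {"application_name": "billing service", "message": "started"}

multi_fields = list_multi_fields(properties)
missing_from_source = [f for f in multi_fields if f not in source]
print(multi_fields)
print(missing_from_source)
```

Both lists contain only `application_name.raw`: the field is queryable, but there is no value for it in `_source` to display.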

However, Kibana's Discover interface does not appear to handle these fields properly, even though you can query against them in the search box at the top and they show up in the list of fields if you uncheck "Hide missing fields".

As a user, since I can query on them, I would expect to see them. See #1829 for the original reference.

[Screenshot: 2016-06-10, 10:44:07 AM]

@Bargs
Contributor

Bargs commented Jun 10, 2016

When you click on multi-fields in the sidebar, would you expect the dropdown contents ("Quick Count") to be the same as the base field, so that the user can click the magnifying glass to create filters against the raw field?

Any older members around that might be able to provide context for why multi-fields are hidden in the first place? @w33ble maybe?

@djschny
Contributor Author

djschny commented Jun 10, 2016

When you click on multi-fields in the sidebar, would you expect the dropdown contents ("Quick Count") to be the same as the base field, so that the user can click the magnifying glass to create filters against the raw field?

From an end-user perspective, yes; it's something I can query against if I type it in manually.

@w33ble
Contributor

w33ble commented Jun 14, 2016

Any older members around that might be able to provide context for why multi-fields are hidden in the first place?

Before my time. I'm not sure why we hide the multi-fields, possibly to save space and/or cut down on the output?

Kibana in general doesn't handle multi-fields well. As a tangential issue, the requirement to use non-analyzed fields (usually via the .raw field) in 5.0 is also somewhat confusing. Showing the .raw multi-field clutters up the dropdown, and it's not an obvious solution the first time you run into the error either.

It's odd that unchecking that box shows multi-fields too - those fields seem to be hidden on purpose, they shouldn't just show up like that.

@ppf2
Member

ppf2 commented Aug 9, 2016

Multi-fields are such a common use case, though. Logstash generates them automatically for all string fields other than the message field, and it disables fielddata for all analyzed strings by default; see https://github.com/logstash-plugins/logstash-output-elasticsearch/blob/master/lib/logstash/outputs/elasticsearch/elasticsearch-template-es2x.json.

For LS users, if someone wants to sort on application_name in the Discover view, Kibana just throws an error saying that application_name (an analyzed string field) cannot be sorted, even though the index has an application_name.raw field that is not_analyzed and uses doc_values. With this limitation, Logstash users by default cannot sort the Discover table on any string field.

@lukasolson changed the title from "Multi-fields not showing up in Discovery interface field list" to "Improve handling of multi-fields in Discover" on Sep 30, 2016
@lukasolson
Member

I would really consider this a bug. If we show the multi-fields in the list of selectable fields (even if it's only accessible by un-checking "Hide Missing Fields"), then I would expect it to actually display the value of the field in the table. However, it currently doesn't, and only displays "-". Either we should not show them in the list at all, or we should actually show the value of the field in the table.

@dav3860

dav3860 commented Nov 17, 2016

I don't think that the multi-fields should be displayed as different fields (AFAIK it's just the same information, but analyzed differently).
Kibana should display the best type for the context. Users get really confused by the mapping produced by Logstash, for example. They don't understand which type suits what they want to do.
If the goal is to create a visualization, it makes more sense to select the "keyword" type of a string by default, not the "text" type. Currently, if I click a string field in the left panel of the Discover tab to create a quick viz, the text type is chosen. As Logstash disables fielddata, this produces an error. When creating a visualization from scratch, it would be better to display only the keyword types for a terms aggregation, leaving the user the option to explicitly select the text type, with a checkbox for example. Or display the keywords at the top of the list.
In the Discover tab, the column headers should be bound to the keyword fields, not the text fields, because the default LS mapping prevents sorting on text fields. But only the short name should be displayed, not "field.keyword", which would be confusing.

In general, it would be good to hide the complexity of multi-fields, text/keyword types, etc. by choosing a default type depending on the context. Of course, Kibana should let advanced users select another type if needed, by checking a box or something similar.

@Bargs
Contributor

Bargs commented Nov 17, 2016

I chatted with @rashidkpc recently and I think we can improve both usability and performance here by querying the field mapping API and checking for the existence of a keyword version of any text field the user wants to sort/visualize on.
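As a rough sketch of that idea (not how Kibana implements it), the lookup could work like this once the mapping for an index has been fetched. The helper name `pick_sortable_field` and the sample mapping are hypothetical, and the sketch only handles top-level fields.

```python
from typing import Any, Dict, Optional


def pick_sortable_field(properties: Dict[str, Any], field: str) -> Optional[str]:
    """Pick the field name to sort or aggregate on for a user-selected field.

    Returns the field itself when it is not a text field, the first keyword
    sub-field when it is, and None when no suitable field exists.
    """
    spec = properties.get(field)
    if spec is None:
        return None
    if spec.get("type") != "text":
        # Assumption: any non-text field can be sorted on directly.
        return field
    # Dicts keep insertion order, so "first" follows the mapping order.
    for sub_name, sub_spec in spec.get("fields", {}).items():
        if sub_spec.get("type") == "keyword":
            return f"{field}.{sub_name}"
    return None


# Hypothetical mapping, as it might come back from the field mapping API.
properties = {
    "application_name": {
        "type": "text",
        "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
    },
    "message": {"type": "text"},
    "bytes": {"type": "long"},
}

print(pick_sortable_field(properties, "application_name"))
print(pick_sortable_field(properties, "message"))
print(pick_sortable_field(properties, "bytes"))
```

The user keeps selecting `application_name` in the UI; only the request sent to Elasticsearch would use `application_name.keyword`.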

@epixa removed the discuss label on Nov 21, 2016
@ycombinator
Contributor

... querying the field mapping API and checking for the existence of a keyword version of any text field the user wants to sort/visualize on

@Bargs A user on discuss recently asked about sorting on .keyword fields in Discover. Your comment seems relevant to what this user is asking for. Are you suggesting something like checking for the existence of a .keyword version of a text field and then using that for sorting?

@luis-silva

For reference, this is the topic I created:
https://discuss.elastic.co/t/kibana-not-using-sub-keyword-field-for-sorting/66983

I have to agree with Bargs on using .keyword if available as it improves performance while keeping things simple.

@Bargs
Contributor

Bargs commented Nov 28, 2016

Yup, ideally we would just use the correct version of the field depending on how you're trying to use it.

@blfrantz
Contributor

+1 I'm also running into this limitation. I assume the resolution would also make it possible to sort on the hidden field. I'd like to be able to sort a column on a multi-field which has both an analyzed text index and a not_analyzed string index. The UI only lets me select/display/sort the analyzed text index which isn't what I want. If there was some way to select the .raw index, I expect it'd work as I want.

@bwalsh

bwalsh commented Mar 23, 2017

+1 same here. I'd like to be able to create a search and choose fields users can sort on.

@Bargs
Contributor

Bargs commented Mar 27, 2017

@gamercubed it's not ideal at the moment, but you can select and sort on a multi-field if you un-hide them in the sidebar. You won't see any values in the raw field's column since it has no value in _source, but the sorting will still work correctly.
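A small sketch of why this happens, assuming a simplified doc table that reads cell values only from `_source`. The helper names, the sample hit, and the rendering logic are hypothetical and much simpler than Kibana's actual code.

```python
from typing import Any, Dict


def build_sorted_search(sort_field: str, order: str = "asc") -> Dict[str, Any]:
    """Build a search body that sorts on the given (possibly multi-) field."""
    return {"query": {"match_all": {}}, "sort": [{sort_field: {"order": order}}]}


def cell_value(hit: Dict[str, Any], column: str) -> str:
    """Render a table cell from _source only, falling back to "-"."""
    value = hit.get("_source", {}).get(column)
    return "-" if value is None else str(value)


body = build_sorted_search("application_name.raw")

# Hypothetical hit: Elasticsearch sorted on the .raw sub-field and reports
# the sort value, but _source only holds the parent field.
hit = {
    "_source": {"application_name": "billing service"},
    "sort": ["billing service"],
}

print(body["sort"])
print(cell_value(hit, "application_name"))
print(cell_value(hit, "application_name.raw"))
```

The sort is applied by Elasticsearch from the indexed sub-field, so the order is correct even though the `.raw` column has nothing to show.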

[Screenshot: 2017-03-27, 2:25:30 PM]

@fbaligand
Contributor

@Bargs
As I mentioned in issue #10996, it would be great if sorting on a "visible" field used the first sub-field whose type is "keyword". This is clearly better than throwing an error (as it does today).

This is particularly relevant because, out of the box, Logstash maps each string field as "text" type, with a "keyword" sub-field of "keyword" type.

@sansbonsang

Hi,
I am a novice to Elasticsearch/Kibana version 4.5.1. I am not sure that my question is 100% relevant to this issue; in any event, here it is. In Discover, I would like to alphabetically sort records according to an indexed, analyzed text field. This field is actually author(s) from scientific publications. Each author appears as "Last Name First Name", and for those publications that have more than one author, a comma is used to separate the authors. So essentially I would like to sort a list of publications according to the family name of the first author in alphabetical order. It does not work. I think it has to do with the length of the field, and the way that Elasticsearch handles long strings of text. Sorting using simple fields like PubMed's PMID (8-digit numbers; "DataProviderKey" in attached screen capture), or patent numbers (e.g., US9000000) works exactly as expected. Thanks in advance for helping me.

[Screenshots: sorted by individual; sorted by dataproviderkey]

@fbaligand
Contributor

Sorting is disabled on text fields unless you set fielddata=true in the field's mapping.
Warning: this is very memory-expensive, and therefore dangerous.
Alternatively, you can add a keyword sub-field in the mapping and sort on that.
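For illustration, the two options could look like the mapping fragments below, written as Python dicts so they can be dumped to JSON. The field name `author` is a hypothetical stand-in for the field in the question, and the text/keyword type names are those of Elasticsearch 5.x and later; on 2.x the equivalent would be an analyzed string with a not_analyzed sub-field.

```python
import json

# Option 1: enable fielddata on the text field itself (memory-heavy).
fielddata_mapping = {
    "author": {"type": "text", "fielddata": True},
}

# Option 2: keep the text field for search and add a keyword sub-field
# that is used only for sorting and aggregations.
subfield_mapping = {
    "author": {
        "type": "text",
        "fields": {"keyword": {"type": "keyword"}},
    },
}

# With option 2, the sort targets the sub-field rather than the text field.
sort_clause = [{"author.keyword": {"order": "asc"}}]

print(json.dumps(fielddata_mapping, indent=2))
print(json.dumps(subfield_mapping, indent=2))
print(json.dumps(sort_clause))
```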

@sansbonsang

@fbaligand Thanks for your quick reply, and for the clear explanation. We will think about whether we really need to be able to sort according to these long text string fields.

@fbaligand
Contributor

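When you define a keyword-typed field, you can set the "ignore_above": 256 option. Note that this does not truncate: values longer than 256 characters are skipped entirely rather than indexed in the keyword field, so documents with longer values have nothing to sort on. For most fields, 256 characters is more than enough for sorting.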

@ArcticSnowman

Having the ability to sort on 'string' fields that you might also want to filter with partial matches is an extremely important feature, in my opinion.

@strawgate
Contributor

Question from a customer, "Why can't I sort strings? Wouldn't this be the most basic feature of an analytics tool?"

Seems like this hasn't gotten much attention, but it's a huge pain point for us.

@Bargs
Contributor

Bargs commented Jul 19, 2018

@strawgate you can sort by string if you add the .keyword version of the field as a column in the doc table. I agree though, it's not obvious and not a good experience.

@strawgate
Contributor

strawgate commented Jul 20, 2018

@Bargs -- I started a post here with another workaround I found: https://discuss.elastic.co/t/index-mapping-type-text-and-keyword-vs-type-keyword-and-text/140805

Default Mapping generated by ES:

"Windows": {
  "type": "text",
  "fields": {
    "keyword": {
      "type": "keyword",
      "ignore_above": 256
    }
  }
}

Custom Mapping:

"Windows": {
  "type": "keyword",
  "ignore_above": 256,
  "fields": {
    "text": {
      "type": "text"
    }
  }
}

The custom mapping causes Kibana to allow sorting on the field without the ".keyword" workaround. I am however having a hard time figuring out what this might break or otherwise change. Do you have any idea? Might this be a viable workaround and we can just apply this custom mapping to text fields instead of the default mapping?

@Bargs
Contributor

Bargs commented Jul 20, 2018

Yep, that's a valid workaround. Ultimately you're just changing the names of the fields. I know of at least one member of our team who arranges their mappings in this way because it makes more sense to him that way.
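If you want to apply the swap to many fields, a sketch of the transformation could look like this. The helper name `keyword_first` is hypothetical, and the sketch assumes any extra settings on the original text field (an analyzer, for example) should move to the new text sub-field. Since the type of an existing field cannot be changed in place, this is meant for templates or new indices.

```python
from typing import Any, Dict


def keyword_first(field_mapping: Dict[str, Any], text_name: str = "text") -> Dict[str, Any]:
    """Swap a text field with a keyword sub-field into keyword with a text sub-field."""
    if field_mapping.get("type") != "text":
        return field_mapping
    subs = field_mapping.get("fields", {})
    keyword_name = next(
        (name for name, spec in subs.items() if spec.get("type") == "keyword"),
        None,
    )
    if keyword_name is None:
        # Nothing to promote; leave the mapping untouched.
        return field_mapping
    # The keyword sub-field's settings become the top-level field.
    keyword_spec = dict(subs[keyword_name])
    # The old top-level settings (minus "fields") become the text sub-field.
    text_spec = {k: v for k, v in field_mapping.items() if k != "fields"}
    other_subs = {n: s for n, s in subs.items() if n != keyword_name}
    return {**keyword_spec, "fields": {text_name: text_spec, **other_subs}}


default_mapping = {
    "type": "text",
    "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
}
custom_mapping = keyword_first(default_mapping)
print(custom_mapping)
```

Applied to the default mapping above, this produces the same structure as the custom mapping: a keyword field with ignore_above 256 and a "text" sub-field.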

@fbaligand
Contributor

Personally, this is the first thing I do when I customize the Logstash Elasticsearch template.

@timroes added the Feature:Discover and Team:Visualizations labels and removed the :Discovery label on Sep 16, 2018
@timroes added the enhancement label and removed the bug label on Apr 3, 2019
@wylieconlon
Contributor

The latest update on this is that the Elasticsearch team is working on a new way to fetch field information, including multi-fields: elastic/elasticsearch#49028

By using that new API once it's ready, we'll be able to display in Discover the most accurate representation of each document, combining _source and docvalues.
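A rough sketch of what that combination could look like on the client side, assuming each hit carries a `fields` object of value arrays next to `_source`. The response shape is an assumption here, since the API was still in progress at this point, and the helper name `flatten_hit` and the sample hit are hypothetical.

```python
from typing import Any, Dict


def flatten_hit(hit: Dict[str, Any]) -> Dict[str, Any]:
    """Merge _source with values that only exist in the per-hit "fields" object."""
    row = dict(hit.get("_source", {}))
    for name, values in hit.get("fields", {}).items():
        if name not in row:
            # Assumed shape: every entry is a list; unwrap single values.
            row[name] = values[0] if len(values) == 1 else values
    return row


# Hypothetical hit: the multi-field has no value in _source, only in "fields".
hit = {
    "_source": {"application_name": "billing service"},
    "fields": {
        "application_name": ["billing service"],
        "application_name.raw": ["billing service"],
    },
}

row = flatten_hit(hit)
print(row)
```

With a merged row like this, the `application_name.raw` column would have a value to display instead of "-".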

cc @kertal

@fbaligand
Contributor

Great! 👍

@legrego
Member

legrego commented Dec 16, 2020

Another user came across this problem today in the discussion boards: https://discuss.elastic.co/t/string-field-in-discover-is-not-sortable-and-keyword-field-and-alias-field-are-both-empty

It looks like the work on the Elasticsearch side has completed (elastic/elasticsearch#49028). What are the next steps for us to take advantage of the new API?

@kertal
Member

kertal commented Dec 16, 2020

Will be fixed by #83891.

@kertal
Member

kertal commented Mar 16, 2021

Closing this because #83891 was merged and multi-field handling in Discover works now.

@kertal closed this as completed on Mar 16, 2021