Searching for saved objects with dashes shows no results #5734

Closed
gena01 opened this issue Dec 18, 2015 · 17 comments · Fixed by #82693
Labels
bug Fixes for quality problems that affect the customer experience Feature:Kibana Management Feature label for Data Views, Advanced Setting, Saved Object management pages Feature:Saved Objects Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc

Comments

@gena01

gena01 commented Dec 18, 2015

Searching for saved queries, visualizations, or dashboards whose names contain dashes comes up blank: start typing a name, and the results disappear right after you type a dash. I tried escaping the dash with a backslash and putting the search in double quotes; neither seems to make a difference.

I am using Kibana 4.1.3

Note: when we change the analyzer for any fields, we should be sure migrations are triggered.

@rashidkpc rashidkpc added bug Fixes for quality problems that affect the customer experience v5.0.0 labels Dec 21, 2015
@wwsean08
Contributor

I was able to work around this while trying to figure out what was going on (though the workaround didn't make sense). I created a visualization called foo-bar-baz and was able to find it a couple of ways:

  1. The search in double quotes worked for me (though I was using the development version)
  2. I could more or less escape it, though it made no sense: searching for foo-\bar-\ worked (the \ seems to be what makes it work)

This likely happens because - is a reserved character in the Elasticsearch query syntax. One possible solution would be to wrap the query in quotes under the hood if it contains any keywords, but I could imagine that causing other issues for searches.

@chrisronline
Contributor

This is because the saved objects' title field is analyzed using the standard analyzer, which breaks on the - character. Searching for - returns nothing because that character is not stored in the ES index.

POST _analyze
{
  "analyzer": "standard",
  "text": "Foo-Bar"
}

{
  "tokens": [
    {
      "token": "foo",
      "start_offset": 0,
      "end_offset": 3,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "bar",
      "start_offset": 4,
      "end_offset": 7,
      "type": "<ALPHANUM>",
      "position": 1
    }
  ]
}

This feels related to #4563.
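
To make the effect concrete, here is a rough TypeScript sketch of what happens to a title at index time. It is only an approximation under stated assumptions: `approximateStandardAnalyzer` is a hypothetical helper, not Elasticsearch code, and it handles ASCII letters, digits and underscores only, whereas the real standard analyzer follows Unicode text segmentation rules and differs in edge cases.

```typescript
// Hypothetical, ASCII-only approximation of the standard analyzer:
// split on anything that is not a letter, digit or underscore, then lowercase.
// The real analyzer implements Unicode text segmentation and differs in edge cases.
function approximateStandardAnalyzer(text: string): string[] {
  const words = text.match(/[A-Za-z0-9_]+/g);
  return words === null ? [] : words.map((word) => word.toLowerCase());
}

// The dash never reaches the index, so a "-" in a query has nothing to match.
console.log(approximateStandardAnalyzer('Foo-Bar').join(', ')); // foo, bar
console.log(approximateStandardAnalyzer('foo-bar-baz').join(', ')); // foo, bar, baz
```

The first call reproduces the two tokens returned by the `_analyze` request above.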

@rhoboat

rhoboat commented Nov 10, 2017

I reindexed .kibana to use keyword for dashboard.title. It seems to work fine.

image

@timroes timroes added Team:Visualizations Visualization editors, elastic-charts and infrastructure Feature:Kibana Management Feature label for Data Views, Advanced Setting, Saved Object management pages Feature:Saved Objects Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc and removed :Management DO NOT USE labels Nov 27, 2018
@elasticmachine
Contributor

Pinging @elastic/kibana-platform

@timroes timroes removed the Team:Visualizations Visualization editors, elastic-charts and infrastructure label Apr 3, 2019
@robin13
Contributor

robin13 commented Sep 24, 2019

I have just confirmed that this is still an issue with 7.2.3:

screenshot-2019-09-24_14-32-39
screenshot-2019-09-24_14-32-49

@spalger
Contributor

spalger commented Sep 24, 2019

additionally, this causes issues when searching for items with * in their name, when you specify the name including the *.

@timroes
Contributor

timroes commented Oct 11, 2019

This topic comes up regularly. Most often (as far as I can tell) it is because people have _ in their titles and expect to find a visualization named flight_traffic_map by searching for traffic. That search finds nothing, since the standard analyzer treats _ as a word character and so does not split terms on it.

After some discussion, we think it might make sense to tweak the standard analyzer for those fields a bit, at least so that _ is no longer a word character. We could do this in any upgrade for just the titles of our saved objects (vis, dashboard, discover, saved search). The problem is that if different saved objects use different analyzers (e.g. because other teams use different analyzers), we might create a bad UX, with users having different search experiences depending on which Kibana application they are in.
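
As a rough illustration of the underscore behaviour, here is a minimal TypeScript sketch. `tokenizeTitle` and `titleMatchesTerm` are hypothetical helpers that only approximate the analyzer for ASCII input; they are not Kibana or Elasticsearch code.

```typescript
// Hypothetical, ASCII-only tokenizer sketch; not the real Elasticsearch analyzer.
// `underscoreIsWordChar` toggles whether "_" keeps neighbouring words glued together.
function tokenizeTitle(title: string, underscoreIsWordChar: boolean): string[] {
  const pattern = underscoreIsWordChar ? /[A-Za-z0-9_]+/g : /[A-Za-z0-9]+/g;
  const words = title.match(pattern);
  return words === null ? [] : words.map((word) => word.toLowerCase());
}

// A plain term search only matches when the term equals one of the indexed tokens.
function titleMatchesTerm(title: string, term: string, underscoreIsWordChar: boolean): boolean {
  return tokenizeTitle(title, underscoreIsWordChar).indexOf(term.toLowerCase()) !== -1;
}

// Current behaviour: the whole title is one token, so "traffic" is not found.
console.log(titleMatchesTerm('flight_traffic_map', 'traffic', true)); // false
// Proposed tweak: "_" splits tokens, so "traffic" is found.
console.log(titleMatchesTerm('flight_traffic_map', 'traffic', false)); // true
```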

Thus we think we should discuss whether it makes more sense to have some kind of mechanism for this in the central saved object services. I am not sure yet exactly what this would look like, but a couple of ideas would be:

  • Either we assume every saved object field named title is a title and use a different analyzer for it.
  • We give developers a "shorthand" flag in the mappings their plugin provides, so they can just say something like:
    properties: {
      title: {
        type: "text",
        @humanSearchable: true
      }
    }

I personally would really prefer if we could not introduce bad UX by introducing different analyzed title fields. @elastic/kibana-platform what are your thoughts on that?

@spalger
Contributor

spalger commented Oct 11, 2019

I personally would really prefer if we could not introduce bad UX by introducing different analyzed title fields.

👍

The search box already searches a set of field names (title and description I think) so we kind of implicitly require that people use those fields, right?

@nerddelphi

nerddelphi commented Jan 14, 2020

I've just confirmed that this is still an issue with 7.5.0.

@ic-jsu

ic-jsu commented Jan 15, 2020

Still the same on 7.5.1.
Workaround is simply to use a space in the search (as one would expect given the output of the standard analyzer), instead of using double quotes or escaping the dashes.

@inqueue
Member

inqueue commented Aug 4, 2020

I received a report for this against 7.8.1 today. Appending a space at the end of the search is the simplest workaround. Appending a \ also works.

  • filebeat-
  • filebeat-\

While the reason for the behavior makes sense, I think we could better serve the many users running into this limitation by either finding a way to fix it or documenting the workaround. Let's do something with it and close the issue.

Edit: The report I received was against searching for index patterns containing the common - character when creating a new visualization.

@ned-si

ned-si commented Aug 5, 2020

a way to fix it

Would it be possible to apply the standard_analyzer to the queries themselves? So no change would be required on the analyzers to store saved objects.
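
One way to picture this idea is a small pre-processing step on the search input, sketched below in TypeScript. Everything here is an assumption for illustration: `normalizeSearchInput` is a hypothetical helper, it only approximates the standard analyzer for ASCII input, and it drops every query operator except a trailing `*`. Whether this plays well with the rest of the query syntax is untested.

```typescript
// Hypothetical helper: rewrite the user's input into the tokens that an
// ASCII-only approximation of the standard analyzer would have indexed.
// A trailing "*" (prefix operator) is kept so prefix searches still work.
function normalizeSearchInput(input: string): string {
  const trimmed = input.trim();
  const hasPrefixOperator = trimmed.charAt(trimmed.length - 1) === '*';
  const words = trimmed.match(/[A-Za-z0-9_]+/g);
  if (words === null) {
    return '';
  }
  const joined = words.join(' ');
  return hasPrefixOperator ? joined + '*' : joined;
}

console.log(normalizeSearchInput('foo-bar-baz')); // foo bar baz
console.log(normalizeSearchInput('filebeat-*')); // filebeat*
```

This mirrors the manual workaround mentioned earlier in the thread, where typing a space instead of the dash returns results.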

@gimmic

gimmic commented Aug 6, 2020

Regardless of the technical fix it seems very counter-intuitive to give a free form search modal and not have the results be free form. I understand there are technical limitations around the leading/trailing wild carding against index pattern titles, but if the back end ES isn't capable of loose matching, then at least the full index pattern result set could be passed to the local browser client and searched client-side with js or something. Relying on local client side js would obviously be sub-optimal in a product, you know, for search.

It is a poor UX limitation and will be experienced frequently by any customer with many nested index patterns/clusters.

@gimmic

gimmic commented Aug 13, 2020

Just want to point something out:
image

@pgayvallet
Contributor

pgayvallet commented Oct 20, 2020

So, after discussing with the ES team, it seems we have two options to solve this:

(#20242 shares the same root issue and will also be solved by this)

Changing the analyzer / tokenizer we are using for SO's

To properly support searching for special chars such as - and *, we can change the analyzer to one that uses a whitespace tokenizer, which avoids splitting tokens on special chars. This could be done either for the whole Kibana indices, or only for 'searchable' fields (for example, only the defaultSearchField of our SO types).

The analyzer would look like

PUT /.kibana_1/_settings
{
    "analysis": {
          "filter": {
              "ascii_folding_preserve": {
                  "type" : "asciifolding",
                  "preserve_original": "true"
              }
          },
          "analyzer": {
              "include_special_character": {
                  "type":  "custom",
                  "filter": [
                      "lowercase",
                      "ascii_folding_preserve"
                  ],
                  "tokenizer": "whitespace"
              }
          }
      }
}

Pros:

More powerful for 'fulltext' search

We can really search for terms with special chars. Searching for 'it-*' would retrieve all documents with [fields] starting with the exact it- term.

Cons:

require changes on the SO migration mechanism

We can't update the settings of the index without closing it, meaning that we will need to add this new analyzer during a 'migration'. Atm, the migration triggers only if there is a mismatch in migrationVersion. We would need to also check if the index's settings are 'up to date' and if not, create a new index with correct settings and copy the documents to it as we are doing for a 'normal' migration.

async function requiresMigration(context: Context): Promise<boolean> {
  const { client, alias, documentMigrator, dest, log } = context;
  // Have all of our known migrations been run against the index?
  const hasMigrations = await Index.migrationsUpToDate(
    client,
    alias,
    documentMigrator.migrationVersion
  );
  // ...
}
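
The extra settings check described above could be pictured roughly as follows. This is a sketch under assumptions, not Kibana code: `AnalysisSettings` and `analyzersUpToDate` are hypothetical names, the settings object is assumed to have already been fetched from the index, and the comparison relies on `JSON.stringify`, which is sensitive to key order.

```typescript
// Shape of the analysis section of an index's settings, reduced to what this sketch needs.
interface AnalysisSettings {
  analyzer?: { [analyzerName: string]: { [option: string]: unknown } };
}

// Hypothetical check (not part of Kibana): is every analyzer we expect
// present in the index settings with the same options?
// Note: JSON.stringify comparison is key-order sensitive, which is fine for a sketch only.
function analyzersUpToDate(actual: AnalysisSettings, expected: AnalysisSettings): boolean {
  const expectedAnalyzers = expected.analyzer || {};
  const actualAnalyzers = actual.analyzer || {};
  return Object.keys(expectedAnalyzers).every((analyzerName) => {
    const current = actualAnalyzers[analyzerName];
    return (
      current !== undefined &&
      JSON.stringify(current) === JSON.stringify(expectedAnalyzers[analyzerName])
    );
  });
}

// The analyzer proposed in this comment, used as the expected state.
const expectedAnalysis: AnalysisSettings = {
  analyzer: {
    include_special_character: {
      type: 'custom',
      filter: ['lowercase', 'ascii_folding_preserve'],
      tokenizer: 'whitespace',
    },
  },
};

console.log(analyzersUpToDate({}, expectedAnalysis)); // false
console.log(analyzersUpToDate(expectedAnalysis, expectedAnalysis)); // true
```

A migration would then be required when either the migration versions or this settings check report a mismatch.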

need to decide if we put this analyzer at the index level, or on a per-field basis

Index level seems dangerous. A per-field basis requires finding acceptable logic for that. Setting the analyzer as the default for the defaultSearchField registered for the type could be a start.

we may be breaking some usages of the find API

As we are changing the analyzer / tokenizer, we might be breaking some usages of the find API. For instance a dashboard with title: 'fox-eco' would have been returned with a search: 'eco' previously, but not with the whitespace tokenizer, as the only token is fox-eco instead of ['fox', 'eco']
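
The difference can be sketched in TypeScript as below. Both functions are hypothetical approximations for ASCII input: `whitespaceAnalyze` stands in for the proposed whitespace tokenizer plus lowercase filter (ASCII folding is left out), and `standardAnalyze` stands in for the current standard analyzer.

```typescript
// Hypothetical approximation of the proposed custom analyzer:
// whitespace tokenizer + lowercase filter (ASCII folding is left out for brevity).
function whitespaceAnalyze(text: string): string[] {
  return text
    .split(/\s+/)
    .filter((part) => part.length > 0)
    .map((part) => part.toLowerCase());
}

// ASCII-only approximation of the current standard analyzer, for comparison.
function standardAnalyze(text: string): string[] {
  const words = text.match(/[A-Za-z0-9_]+/g);
  return words === null ? [] : words.map((word) => word.toLowerCase());
}

const dashboardTitle = 'fox-eco';
// Today: "eco" is one of the indexed tokens, so the dashboard is found.
console.log(standardAnalyze(dashboardTitle).indexOf('eco') !== -1); // true
// With the whitespace tokenizer: the only token is "fox-eco", so "eco" no longer matches.
console.log(whitespaceAnalyze(dashboardTitle).indexOf('eco') !== -1); // false
console.log(whitespaceAnalyze(dashboardTitle).join(', ')); // fox-eco
```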

Modifying our SO search dsl to include match_phrase_prefix

We can, instead, decide to go for a more lightweight solution by adding additional search criteria for find.

  if (search) {
    bool.must = [
      {
        simple_query_string: {
          query: search,
          ...getFieldsForTypes(types, searchFields, rootSearchFields),
          ...(defaultSearchOperator ? { default_operator: defaultSearchOperator } : {}),
        },
      },
    ];
  }
  return { query: { bool } };

would become

  if (search) {
    const fields = getFieldsForTypes(types, searchFields, rootSearchFields);
    bool.must = [
      {
        bool: {
          should: [
            {
              simple_query_string: {
                query: search,
                ...fields,
                ...(defaultSearchOperator ? { default_operator: defaultSearchOperator } : {}),
              },
            },
            // TODO: check if '*'
            ...fields.fields.map((field) => ({
              match_phrase_prefix: {
                [field]: search.replace(/[*]$/, ''), // need to remove the prefix search operator if present
              },
            })),
          ],
          minimum_should_match: 1,
        },
      },
    ];
  }

That way we are able to, more or less, simulate a search for the exact term by searching with phrase prefix for all of its tokens.

search for my-dash will search for phrase prefix ['my', 'dash']
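
For readers who want to see the resulting clause in isolation, here is a self-contained TypeScript sketch of the same idea. `buildSearchClause`, `Operator` and `SearchClause` are hypothetical names introduced for illustration, and the field list is passed in directly instead of being resolved through getFieldsForTypes.

```typescript
type Operator = 'AND' | 'OR';

interface SearchClause {
  bool: {
    should: Array<{ [queryType: string]: unknown }>;
    minimum_should_match: number;
  };
}

// Hypothetical, self-contained variant of the snippet above: keep the existing
// simple_query_string clause and add one match_phrase_prefix clause per field.
function buildSearchClause(
  search: string,
  fields: string[],
  defaultSearchOperator?: Operator
): SearchClause {
  const simpleQueryString: { [option: string]: unknown } = { query: search, fields: fields };
  if (defaultSearchOperator) {
    simpleQueryString['default_operator'] = defaultSearchOperator;
  }
  // Strip a trailing "*" so the prefix operator is not treated as part of the phrase.
  const phrase = search.replace(/[*]$/, '');
  const should: Array<{ [queryType: string]: unknown }> = [
    { simple_query_string: simpleQueryString },
  ];
  fields.forEach((field) => {
    const matchPhrasePrefix: { [fieldName: string]: string } = {};
    matchPhrasePrefix[field] = phrase;
    should.push({ match_phrase_prefix: matchPhrasePrefix });
  });
  return { bool: { should: should, minimum_should_match: 1 } };
}

const clause = buildSearchClause('my-dash*', ['dashboard.title']);
console.log(JSON.stringify(clause.bool.should[1]));
// {"match_phrase_prefix":{"dashboard.title":"my-dash"}}
```

Because the new clauses sit in a should list with minimum_should_match set to 1, anything the original simple_query_string clause matched is still returned.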

Pros

Way simpler to implement

We are not touching the migration nor our fields analyzer. Just one file, src/core/server/saved_objects/service/lib/search_dsl/query_params.ts

Less risks to break existing usages of the find API

We are not altering the current simple_query_string and are using an inclusive should clause, meaning that all results that were previously returned still will be. Worst case, more results will be returned, which seems more acceptable than fewer.

Cons

We are not really searching for the special characters

We are just 'cheating' with match_phrase_prefix, so it's a little less powerful.

For example, searching for test*dash is actually searching for match_phrase_prefix: ['test', 'dash*']. Any special character used as token separator by the tokenizer can match.

As an example:

Screenshot 2020-10-20 at 20 59 39

In the same logic, searching for it- is exactly the same as searching for it (well, except that it will work, which is not the case atm). We still can't 'really' search for special chars as the last character of our search term; it will just be ignored.
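
The limitation can be modelled with a small TypeScript sketch. It is a simplified model built on assumptions: `analyze` is a hypothetical ASCII-only stand-in for the standard analyzer, and `phrasePrefixMatches` ignores real match_phrase_prefix details such as slop and the cap on prefix expansions.

```typescript
// ASCII-only approximation of the standard analyzer (hypothetical helper).
function analyze(text: string): string[] {
  const words = text.match(/[A-Za-z0-9_]+/g);
  return words === null ? [] : words.map((word) => word.toLowerCase());
}

// Simplified model of match_phrase_prefix: all query tokens must appear
// consecutively in the document tokens, and the last query token only
// needs to be a prefix of the corresponding document token.
function phrasePrefixMatches(documentText: string, query: string): boolean {
  const docTokens = analyze(documentText);
  const queryTokens = analyze(query);
  if (queryTokens.length === 0) {
    return false;
  }
  for (let start = 0; start + queryTokens.length <= docTokens.length; start++) {
    let matched = true;
    for (let i = 0; i < queryTokens.length; i++) {
      const docToken = docTokens[start + i];
      const queryToken = queryTokens[i];
      const isLast = i === queryTokens.length - 1;
      const tokenMatches = isLast
        ? docToken.slice(0, queryToken.length) === queryToken
        : docToken === queryToken;
      if (!tokenMatches) {
        matched = false;
        break;
      }
    }
    if (matched) {
      return true;
    }
  }
  return false;
}

// The "*" is just another separator: any separator in the title matches.
console.log(phrasePrefixMatches('test-dashboard', 'test*dash')); // true
console.log(phrasePrefixMatches('test dashboard', 'test*dash')); // true
// A trailing "-" is dropped, so "it-" behaves exactly like "it".
console.log(phrasePrefixMatches('it-dashboard', 'it-') === phrasePrefixMatches('it-dashboard', 'it')); // true
```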

Overall

I think solution 2 is 'good enough', as it solves most of the problems our end users are encountering while being the safest option, so I would go with it.

@elastic/kibana-platform what do you think?

@rudolf
Contributor

rudolf commented Oct 29, 2020

We discussed this as a team and agreed that solution 2 would be a good stop-gap in the short term.

Longer term, we feel like Kibana should have powerful full-text search which requires option 1, but it's worth taking the time to properly flesh this out. We would probably want to change defaultSearchField to an array so that a plugin can enable "full text search" on several fields of a saved object.

@rudolf
Contributor

rudolf commented Mar 15, 2021

Related #14729
