Not including date-format in sorting leads to problems with missing values in search_after parameters for numeric_type date #73772

Closed
EmilBode opened this issue Jun 4, 2021 · 1 comment
Labels: >bug, needs:triage (Requires assignment of a team area label)

EmilBode commented Jun 4, 2021

Note
There's a different issue at play here (elastic/kibana#101391), so I believe this issue could be closed without any fix.

Elasticsearch version: 7.13.1 (newest)

Plugins installed: None

JVM version: ES built-in:
openjdk 16 2021-03-16
OpenJDK Runtime Environment AdoptOpenJDK (build 16+36)
OpenJDK 64-Bit Server VM AdoptOpenJDK (build 16+36, mixed mode, sharing)

OS version: Windows 10, 64bit, version 2004, build 19041.985

Setup to reproduce:

POST _bulk
{"index":{"_index":"test","_id":"1"}}
{"field1":"value1", "timestamp": "2021-06-04T13:50:00.000000001"}
{"index":{"_index":"test","_id":"2"}}
{"field1":"value2"}
{"index":{"_index":"test","_id":"3"}}
{"field1":"value3"}

POST test/_refresh
POST test/_pit?keep_alive=10m

Desired behaviour and attempted query
I'd like to paginate over a query sorted on timestamp. (Pagination isn't much use in this minimal reprex, but it would be with a thousand documents.)

I could do this with the following query:

GET /_search
{
  "pit": {
    "id": "15izAwEEdGVzdBZkOHNKLTlvV1JYbU9MWE5aaHRCQTVnABZHZFlxQXNXcFNmU0xFUm9kR3hpR1JBAAAAAAAAAAx6FkluaW8yNEVBUkFHVXlZWlNfb2dRNHcAARZkOHNKLTlvV1JYbU9MWE5aaHRCQTVnAAA="
  },
  "query": {
    "match_all": {}
  },
  "size": 1,
  "sort": {
    "timestamp": {
      "order": "desc",
      "numeric_type": "date"
    }
  }
}

For the next result, I include a search_after parameter, copied from the sort values of the last result.
Getting the 2nd result (with a search_after from the first) works.
However, getting the 3rd result gives an error:

GET /_search
{
  "pit": {
    "id": "15izAwEEdGVzdBZkOHNKLTlvV1JYbU9MWE5aaHRCQTVnABZHZFlxQXNXcFNmU0xFUm9kR3hpR1JBAAAAAAAAAAx6FkluaW8yNEVBUkFHVXlZWlNfb2dRNHcAARZkOHNKLTlvV1JYbU9MWE5aaHRCQTVnAAA="
  },
  "query": {
    "match_all": {}
  },
  "size": 1,
  "sort": {
    "timestamp": {
      "order": "desc",
      "numeric_type": "date"
    }
  },
  "search_after": [
    -9223372036854776000,
    1
  ]
}


{
  "error" : {
    "root_cause" : [
      {
        "type" : "parse_exception",
        "reason" : "failed to parse date field [-9223372036854776000] with format [strict_date_optional_time||epoch_millis]: [failed to parse date field [-9223372036854776000] with format [strict_date_optional_time||epoch_millis]]"
      }
    ],
    "type" : "search_phase_execution_exception",
    "reason" : "all shards failed",
    "phase" : "query",
    "grouped" : true,
    "failed_shards" : [
      {
        "shard" : 0,
        "index" : "test",
        "node" : "GdYqAsWpSfSLERodGxiGRA",
        "reason" : {
          "type" : "parse_exception",
          "reason" : "failed to parse date field [-9223372036854776000] with format [strict_date_optional_time||epoch_millis]: [failed to parse date field [-9223372036854776000] with format [strict_date_optional_time||epoch_millis]]",
          "caused_by" : {
            "type" : "illegal_argument_exception",
            "reason" : "failed to parse date field [-9223372036854776000] with format [strict_date_optional_time||epoch_millis]",
            "caused_by" : {
              "type" : "date_time_parse_exception",
              "reason" : "Failed to parse with all enclosed parsers"
            }
          }
        }
      }
    ]
  },
  "status" : 400
}
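
For reference, the pagination step described above (copying the last hit's sort values into search_after) can be sketched in Python as follows. This is only an illustration: the helper name `next_page_body` and the sample hit are hypothetical, and no Elasticsearch client is involved. Python's `json` module keeps 64-bit integers exact, so the sort values survive the copy unchanged.

```python
import copy
import json


def next_page_body(body, last_hit):
    """Return a copy of `body` whose search_after holds the last hit's sort values.

    Hypothetical helper for illustration; not part of any Elasticsearch client.
    """
    nxt = copy.deepcopy(body)
    nxt["search_after"] = list(last_hit["sort"])
    return nxt


first_page = {
    "query": {"match_all": {}},
    "size": 1,
    "sort": {"timestamp": {"order": "desc", "numeric_type": "date"}},
}

# Hypothetical hit for a document without a timestamp (the _id is assumed).
raw_hit = '{"_id": "2", "sort": [-9223372036854775808, 1]}'
last_hit = json.loads(raw_hit)

second_page = next_page_body(first_page, last_hit)
print(json.dumps(second_page["search_after"]))
```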

Notes

My guess at the root cause: sorting missing values first or last is apparently done by assigning them a "lower than the lowest possible value" or "higher than the highest possible value" (when sorting in ascending order) value. When this value is sent back in a search_after parameter, the date parser can't handle it.
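
A minimal Python sketch of the sentinel idea guessed at above, assuming missing values are replaced by the smallest or largest signed 64-bit integer. It only illustrates the guess and is not Elasticsearch's actual implementation; `sort_key` and its parameters are hypothetical names.

```python
LONG_MIN = -2**63      # -9223372036854775808
LONG_MAX = 2**63 - 1   #  9223372036854775807


def sort_key(doc, field, order="asc", missing="_last"):
    """Return the field value, or a sentinel that places missing values first or last."""
    value = doc.get(field)
    if value is not None:
        return value
    # Missing-last needs the highest key when ascending and the lowest when descending.
    wants_high = (missing == "_last") == (order == "asc")
    return LONG_MAX if wants_high else LONG_MIN


docs = [
    # 2021-06-04T13:50:00Z as epoch milliseconds (nanoseconds truncated).
    {"_id": "1", "timestamp": 1622814600000},
    {"_id": "2"},
    {"_id": "3"},
]

ordered = sorted(docs, key=lambda d: sort_key(d, "timestamp", "desc"), reverse=True)
for d in ordered:
    print(d["_id"], sort_key(d, "timestamp", "desc"))
```

With a descending sort, the two documents without a timestamp get the key -9223372036854775808, which matches the value that shows up in the sort array.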

EmilBode added the >bug and needs:triage labels Jun 4, 2021

EmilBode commented Jun 4, 2021

Wait, there's a completely different issue at play here.

Copying the value from the search_after does work after all....

However, what does not work is copying the value into Kibana and then hitting Ctrl+I to fix any alignment issues.
This action fixes the alignment, but also rounds large numbers.
So the value I copied into my newer query, [ -9223372036854776000, 1 ]?
That's wrong. It should be -9223372036854775808, which does work.
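
The rounding can be reproduced outside Kibana. The Python sketch below assumes the auto-indent passes the number through a 64-bit double (as JavaScript numbers are) and prints it with the shortest digits that round-trip, padded with zeros. `through_double` is a hypothetical helper that emulates that behaviour; it is not Kibana code.

```python
from decimal import Decimal

LONG_MIN = -2**63  # -9223372036854775808, the smallest signed 64-bit integer


def through_double(value: int) -> int:
    """Emulate reading an integer as a double and printing its shortest round-trip digits."""
    # repr(float) yields the shortest decimal string that maps back to the same double.
    return int(Decimal(repr(float(value))))


rounded = through_double(LONG_MIN)
print(rounded)             # -9223372036854776000
print(rounded < LONG_MIN)  # True
print(LONG_MIN - rounded)  # 192
```

The rounded value is 192 below the smallest signed 64-bit integer, so it no longer fits in a long, which would be consistent with the parse_exception above.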
