Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add parameters to flat_object type #13853

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

kkewwei
Copy link
Contributor

@kkewwei kkewwei commented May 28, 2024

Signed-off-by: kkewwei [email protected]

Description

The following open parameters setting are useful, including normalizer, docValues, ignoreAbove, nullValue, similarity, and depthlimit.

normalizer: allow lowercase, uppercase
doc_values: allow to set docValues to be false
ignore_above: allow to set if the length of a field is go above certain limit then ignore the document.
null_value: allow to set nullValue to others
similarity: allow to set similarity setting
depth_limit: allow to set maximum depth limitation to the JSON document

If there are other useful parameters, you are also welcome to give feedback or continue to add.

Related Issues

Resolves #7137

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • API changes companion pull request created.
  • Failing checks are inspected and point to the corresponding known issue(s) (See: Troubleshooting Failing Builds)
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)
  • Public documentation issue/PR created

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions github-actions bot added enhancement Enhancement or improvement to existing feature or request help wanted Extra attention is needed Search Search query, autocomplete ...etc labels May 28, 2024
@kkewwei kkewwei force-pushed the flat_object_para branch 2 times, most recently from 25c607c to 5bde254 Compare May 28, 2024 10:55
Copy link
Contributor

❌ Gradle check result for ec4d256: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❌ Gradle check result for 25c607c: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❌ Gradle check result for 819ada8: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❌ Gradle check result for 0f517b4: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❌ Gradle check result for ceca186: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❌ Gradle check result for 3d8225b: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❕ Gradle check result for 16ef9b9: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

Copy link
Contributor

✅ Gradle check result for ea2bae8: SUCCESS

@kkewwei
Copy link
Contributor Author

kkewwei commented Jun 25, 2024

@andrross PTAL.

@andrross
Copy link
Member

@mingshl Mind taking a look here?


public void removeKeyIfNeed(boolean hasValidValue) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when a key has null value, we also remove the key? is that the expected behavior?
should it check the nullValue setting? this.nullValue?
this will impact the exist query result when checking on keys

Copy link
Contributor Author

@kkewwei kkewwei Jun 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I' m not sure whether we should remove the sub key, when the the sub value is null. it is also discussed in here: #13880 (comment)

It will indeed impact the exist query when making change.

this.indexAnalyzer = this.normalizer;
this.searchAnalyzer = this.normalizer;
if (this.normalizer == null) {
throw new MapperParsingException("normalizer [" + normalizer + "] is not supported in flat_object");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here sets the indexAnalyzer = this.normalizer; and searchAnalyzer = this.normalizer; but the exception mentions normalizer + "] is not supported in flat_object"? should the message be "normalizer cannot be null"?


public void setNormalizer(String normalizer) {
this.normalizer = this.indexAnalyzers.getNormalizer(normalizer);
this.indexAnalyzer = this.normalizer;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need indexAnalyzer in flat object? words are not tokenized in flat object, what is this indexAnalyzer used for?

Copy link
Contributor Author

@kkewwei kkewwei Jun 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When normalizer is set, it will be used in parseValueAddFields

Example in keyword type:

Did I understand wrong?

iterator.remove();
break;
case NULL_VALUE:// allow to set nullValue to others
builder.nullValue = XContentMapValues.nodeStringValue(propNode);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you test the null value when set to 'NA'? I might miss the logic in the JsonToStringContentParse, I didn't see the logic for custom null_value

Copy link
Contributor Author

@kkewwei kkewwei Jun 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'NA' is just a normal string, can you describe more details? very thank you.

true,
new TextSearchInfo(fieldType, null, Lucene.KEYWORD_ANALYZER, Lucene.KEYWORD_ANALYZER),
builder.hasDocValues,
new TextSearchInfo(fieldType, builder.similarity, getNamedAnalyzer(builder), getNamedAnalyzer(builder)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we allow custom index analyzer, then flat object can be tokenized?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mingshl, as you see above, words are not tokenized in flat object, why do we customize index analyzer? can you describe it more clearly? thank you.

String mappedFieldTypeName,
NamedAnalyzer normalizer,
boolean hasDocValues,
int ignoreAbove,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a reason that the depthLimit is not in the constructor of FlatObjectFieldType?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

depthLimit is used in JsonToStringXContentParser, JsonToStringXContentParser is created in FlatObjectFieldMapper.parseCreateField, it seems no need in the constructor of FlatObjectFieldType.

@mingshl
Copy link
Contributor

mingshl commented Jun 26, 2024

@kkewwei thanks for adding the new parameters and some yaml tests are added. I have some questions above to understand more about the logic.

In overall, UT needs to be added to validate the behavior of the new parameters, or when the new parameters go out, users would ask how to use them. It would be nice to have UTs when similarity are set, normalizer are set, doc_values are set, does that impact the search capabilities? for example, exist query and prefix query. Also some Exception cases when the parameters are null, invalid parameters are set.

Thanks!

@kkewwei
Copy link
Contributor Author

kkewwei commented Jun 30, 2024

@mingshl,it seems that only the null_value may affect the exist query, if here is not clear, I can temporarily rewind the logic: we should remove the sub key, when the the sub value is null.

Those unit tests add in FlatObjectFieldMapperTests.java and [106_flat_object_with_parameter.yml, including exception cases, if you have any cases, I'd be happy to add them.

@kkewwei
Copy link
Contributor Author

kkewwei commented Jun 30, 2024

@mingshl maybe we should first consider the case of null_value? #14069.

@opensearch-trigger-bot
Copy link
Contributor

This PR is stalled because it has been open for 30 days with no activity.

@opensearch-trigger-bot opensearch-trigger-bot bot added stalled Issues that have stalled and removed stalled Issues that have stalled labels Jul 30, 2024
@kkewwei kkewwei mentioned this pull request Aug 27, 2024
8 tasks
@opensearch-trigger-bot
Copy link
Contributor

This PR is stalled because it has been open for 30 days with no activity.

@opensearch-trigger-bot opensearch-trigger-bot bot added stalled Issues that have stalled and removed stalled Issues that have stalled labels Sep 8, 2024
@kkewwei
Copy link
Contributor Author

kkewwei commented Sep 22, 2024

@msfroh I have a little doubt, please help confirm when you are free:

  • If we just support object value in flat_object field, do we need to support null_value?

  • If we do, null_value must be an object field?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Backport to 2.x branch enhancement Enhancement or improvement to existing feature or request help wanted Extra attention is needed Search Search query, autocomplete ...etc
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Enhancement] Add Open Parameters to Flat_object Field Type
3 participants