-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add parameters to flat_object type #13853
base: main
Are you sure you want to change the base?
Conversation
25c607c
to
5bde254
Compare
❌ Gradle check result for ec4d256: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for 25c607c: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for 819ada8: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for 0f517b4: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
0f517b4
to
ceca186
Compare
❌ Gradle check result for ceca186: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
ceca186
to
3d8225b
Compare
❌ Gradle check result for 3d8225b: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
3d8225b
to
16ef9b9
Compare
❕ Gradle check result for 16ef9b9: UNSTABLE Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure. |
Signed-off-by: kkewwei <[email protected]>
16ef9b9
to
ea2bae8
Compare
@andrross PTAL. |
@mingshl Mind taking a look here? |
|
||
public void removeKeyIfNeed(boolean hasValidValue) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when a key has null value, we also remove the key? is that the expected behavior?
should it check the nullValue setting? this.nullValue?
this will impact the exist query result when checking on keys
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I' m not sure whether we should remove the sub key, when the the sub value is null. it is also discussed in here: #13880 (comment)
It will indeed impact the exist query when making change.
this.indexAnalyzer = this.normalizer; | ||
this.searchAnalyzer = this.normalizer; | ||
if (this.normalizer == null) { | ||
throw new MapperParsingException("normalizer [" + normalizer + "] is not supported in flat_object"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here sets the indexAnalyzer = this.normalizer; and searchAnalyzer = this.normalizer; but the exception mentions normalizer + "] is not supported in flat_object"? should the message be "normalizer cannot be null"?
|
||
public void setNormalizer(String normalizer) { | ||
this.normalizer = this.indexAnalyzers.getNormalizer(normalizer); | ||
this.indexAnalyzer = this.normalizer; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need indexAnalyzer in flat object? words are not tokenized in flat object, what is this indexAnalyzer used for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When normalizer is set, it will be used in parseValueAddFields
OpenSearch/server/src/main/java/org/opensearch/index/mapper/FlatObjectFieldMapper.java
Line 631 in c71fd4a
if (normalizer != null) { |
Example in keyword type:
OpenSearch/server/src/main/java/org/opensearch/index/mapper/KeywordFieldMapper.java
Line 281 in c71fd4a
setIndexAnalyzer(normalizer); |
Did I understand wrong?
iterator.remove(); | ||
break; | ||
case NULL_VALUE:// allow to set nullValue to others | ||
builder.nullValue = XContentMapValues.nodeStringValue(propNode); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you test the null value when set to 'NA'? I might miss the logic in the JsonToStringContentParse, I didn't see the logic for custom null_value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'NA' is just a normal string, can you describe more details? very thank you.
true, | ||
new TextSearchInfo(fieldType, null, Lucene.KEYWORD_ANALYZER, Lucene.KEYWORD_ANALYZER), | ||
builder.hasDocValues, | ||
new TextSearchInfo(fieldType, builder.similarity, getNamedAnalyzer(builder), getNamedAnalyzer(builder)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we allow custom index analyzer, then flat object can be tokenized?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mingshl, as you see above, words are not tokenized in flat object
, why do we customize index analyzer? can you describe it more clearly? thank you.
String mappedFieldTypeName, | ||
NamedAnalyzer normalizer, | ||
boolean hasDocValues, | ||
int ignoreAbove, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a reason that the depthLimit is not in the constructor of FlatObjectFieldType?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
depthLimit
is used in JsonToStringXContentParser
, JsonToStringXContentParser
is created in FlatObjectFieldMapper.parseCreateField
, it seems no need in the constructor of FlatObjectFieldType.
@kkewwei thanks for adding the new parameters and some yaml tests are added. I have some questions above to understand more about the logic. In overall, UT needs to be added to validate the behavior of the new parameters, or when the new parameters go out, users would ask how to use them. It would be nice to have UTs when similarity are set, normalizer are set, doc_values are set, does that impact the search capabilities? for example, exist query and prefix query. Also some Exception cases when the parameters are null, invalid parameters are set. Thanks! |
@mingshl,it seems that only the null_value may affect the exist query, if here is not clear, I can temporarily rewind the logic: we should remove the sub key, when the the sub value is null. Those unit tests add in FlatObjectFieldMapperTests.java and [106_flat_object_with_parameter.yml, including exception cases, if you have any cases, I'd be happy to add them. |
This PR is stalled because it has been open for 30 days with no activity. |
This PR is stalled because it has been open for 30 days with no activity. |
@msfroh I have a little doubt, please help confirm when you are free:
|
Signed-off-by: kkewwei [email protected]
Description
The following open parameters setting are useful, including normalizer, docValues, ignoreAbove, nullValue, similarity, and depthlimit.
normalizer: allow lowercase, uppercase
doc_values: allow to set docValues to be false
ignore_above: allow to set if the length of a field is go above certain limit then ignore the document.
null_value: allow to set nullValue to others
similarity: allow to set similarity setting
depth_limit: allow to set maximum depth limitation to the JSON document
If there are other useful parameters, you are also welcome to give feedback or continue to add.
Related Issues
Resolves #7137
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.