Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce runtime section in mappings #62906

Merged
merged 22 commits into from
Nov 12, 2020

Conversation

javanna
Copy link
Member

@javanna javanna commented Sep 24, 2020

The runtime section is at the same level as the existing properties section. Its purpose is to hold runtime fields only. With the introduction of the runtime section, a runtime field can be defined by specifying its type (previously called runtime_type) and script.

PUT /my-index/_mappings
{
    "runtime" : {
        "day_of_week" : {
            "type" : "keyword",
            "script" : {
                "source" : "emit(doc['timestamp'].value.dayOfWeekEnum.getDisplayName(TextStyle.FULL, Locale.ROOT))"
            }
        }
    },
    "properties" : {
        "timestamp" : {
            "type" : "date"
        }
    }
}

Fields defined in the runtime section can be updated at any time as they are not present in the lucene index. They get replaced entirely when they get updated.

Thanks to the introduction of the runtime section, runtime fields override existing mapped fields defined with the same name, similarly to runtime fields defined in the search request.

Note that the get field mappings API does not return runtime fields as it iterates through the mappers directly, which have no visibility over runtime fields.

Relates to #59332

Copy link
Member

@nik9000 nik9000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 If we want to go this way I think this is how we should do it.

@nik9000
Copy link
Member

nik9000 commented Sep 27, 2020

+1 If we want to go this way I think this is how we should do it.

And by that I mean, I don't see any problems with the code. I'm not 100% sure we want the new section, but if we want it then this implements it how I expect.

The runtime section is at the same level as the existing properties section. Its purpose is to hold runtime fields only. With the introduction of the runtime section, a runtime field can be defined by specifying its type (previously called runtime_type) and script.

Fields defined in the runtime section can be updated at any time as they are not present in the lucene index. They get replaced entirely when they get updated.

Thanks to the introduction of the runtime section, runtime fields can override existing mapped fields defined with the same name, similarly to runtime fields defined in the search request.
@javanna javanna force-pushed the enhancement/mapping_runtime_section branch from d7cbcce to dd5f648 Compare November 10, 2020 16:47
@javanna javanna changed the title WIP: introduce runtime section in mappings Introduce runtime section in mappings Nov 10, 2020
@javanna javanna added :Search Foundations/Mapping Index mappings, including merging and defining field types v7.11.0 v8.0.0 >enhancement labels Nov 10, 2020
throws MapperParsingException;
}

public static void parseRuntimeFields(Map<String, Object> node,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not a big fan of this public static method here, but I think it's important to share it between QueryShardContext and RootObjectMapper

);
}

public final void testMinimalMappingToMaximal() throws IOException {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these three tests I borrowed from MapperTestCase , they were previously run as part of RuntimeFieldMapperTests

assertEquals("Failed to parse mapping: stored scripts are not supported for runtime field [field]", exception.getMessage());
}

public void testFieldCaps() throws Exception {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these were moved from RuntimeFieldMapperTests

MappedFieldType fieldType = mapperService.fieldType("field");
assertThat(fieldType, instanceOf(DateScriptFieldType.class));
assertEquals(Strings.toString(mapping.get()), Strings.toString(mapperService.documentMapper()));
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these were moved from RuntimeFieldMapperTests

Copy link
Member

@nik9000 nik9000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some formatting stuff but nothing big at all. Let me know if you'd like a consult on the sneaky tests.

* but {@link Builder#init(FieldMapper)} and {@link Builder#build(ContentPath)} are never called as
* {@link RuntimeFieldTypeParser#parse(String, Map, Mapper.TypeParser.ParserContext)} calls
* {@link Builder#parse(String, Mapper.TypeParser.ParserContext, Map)} and returns the corresponding
* {@link org.elasticsearch.index.mapper.MappedFieldType}.
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this something we should keep doing forever or just a way to get this in without making 123213214 changes? Like, should we stop doing this in a follow up change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ehm, doing what?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the Builder at all. It feels like it makes sense because we have them now, but maybe its overkill?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like using Parameters to define what gets parsed and to reuse that part of the existing mappings infra. I think that not using them would mean going back to manual parsing and printing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Me too! I just wonder if we can avoid building the FieldMapper then. Like, in a follow up? Or not. I dunno. It just fields weird to build a dummy field mapper to make parsing easier. No big deal.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we do not build the field mapper at all!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pressed enter too quickly: we skip calling build on the builder. we make the builder create the right field type and return that, without creating the field mapper.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OH! I see you even said it. I should read more closely!

Copy link
Member Author

@javanna javanna Nov 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one follow-up we could think of to clean things up is to extract the parsing bits from FieldMapper.Builder as well as the Parameter class so that parsing can be done outside of a Builder.

script:
source: |
for (String node : params._fields.node.values) {
emit(params.prefix + node);
}
params:
prefix: node_
properties:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe keep the traditional properties at the top? I think the runtime properties will tend to consume the traditional ones so I'd try to read the traditional ones first.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually never considered runtime fields could not be on top :) I would expect them to appear first because they can override ordinary fields, I guess.

Copy link
Contributor

@romseygeek romseygeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of queries, but LGTM as a whole.

{
Set<String> sourcePaths = fieldTypeLookup.sourcePaths("field2");
assertEquals(1, sourcePaths.size());
assertTrue(sourcePaths.contains("field2"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an interesting one. I'm not really sure what sourcePaths should return for runtime fields - I feel like maybe it should be an empty set? The ValueFetcher isn't using sourcePaths to load data for these fields, so maybe it doesn't matter.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you mean, though the logic in the method is very simple and I am reluctant to special case runtime fields, under the assumption that they are never in source. Also runtime fields can be plugged in and maybe we should not make assumptions about how they will fetch their values?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I figure there is a chance runtime fields will use the source path, yeah. But at least for now they don't so it doesn't really matter. I guess emptySet is more correct, but, yeah, it doesn't really matter.

* {@code null} because most fields don't support formats.
*/
protected String format() {
return null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much nicer to only have this on the DateFieldType, thanks!

@@ -116,20 +111,22 @@ protected boolean modifySearch(ApiCallSection search) {
return mergeMappings(new String[] { "*" });
}
String[] patterns = Arrays.stream(index.split(",")).map(m -> m.equals("_all") ? "*" : m).toArray(String[]::new);
// TODO this is always false?
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nik9000 this is for you I think :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ooops. I think I mean length == 1. I can take a look in a follow up.

* but {@link Builder#init(FieldMapper)} and {@link Builder#build(ContentPath)} are never called as
* {@link RuntimeFieldTypeParser#parse(String, Map, Mapper.TypeParser.ParserContext)} calls
* {@link Builder#parse(String, Mapper.TypeParser.ParserContext, Map)} and returns the corresponding
* {@link org.elasticsearch.index.mapper.MappedFieldType}.
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OH! I see you even said it. I should read more closely!

{
Set<String> sourcePaths = fieldTypeLookup.sourcePaths("field2");
assertEquals(1, sourcePaths.size());
assertTrue(sourcePaths.contains("field2"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I figure there is a chance runtime fields will use the source path, yeah. But at least for now they don't so it doesn't really matter. I guess emptySet is more correct, but, yeah, it doesn't really matter.

.endObject()
.endObject()
.endObject()
.startObject()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one got harder to read!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will fix

@@ -1,4 +1,4 @@
// Shared infratructure
// Shared infrastructure
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@@ -116,20 +111,22 @@ protected boolean modifySearch(ApiCallSection search) {
return mergeMappings(new String[] { "*" });
}
String[] patterns = Arrays.stream(index.split(",")).map(m -> m.equals("_all") ? "*" : m).toArray(String[]::new);
// TODO this is always false?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ooops. I think I mean length == 1. I can take a look in a follow up.

public Map<String, Mapper.TypeParser> getMappers() {
return Collections.singletonMap(RuntimeFieldMapper.CONTENT_TYPE, RuntimeFieldMapper.PARSER);
public Map<String, RuntimeFieldType.Parser> getRuntimeFieldTypes() {
return Map.ofEntries(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️

@javanna javanna merged commit 3101293 into elastic:master Nov 12, 2020
javanna added a commit to javanna/elasticsearch that referenced this pull request Nov 12, 2020
The runtime section is at the same level as the existing properties section. Its purpose is to hold runtime fields only. With the introduction of the runtime section, a runtime field can be defined by specifying its type (previously called runtime_type) and script.

```
PUT /my-index/_mappings
{
    "runtime" : {
        "day_of_week" : {
            "type" : "keyword",
            "script" : {
                "source" : "emit(doc['timestamp'].value.dayOfWeekEnum.getDisplayName(TextStyle.FULL, Locale.ROOT))"
            }
        }
    },
    "properties" : {
        "timestamp" : {
            "type" : "date"
        }
    }
}
```

Fields defined in the runtime section can be updated at any time as they are not present in the lucene index. They get replaced entirely when they get updated.

Thanks to the introduction of the runtime section, runtime fields override existing mapped fields defined with the same name, similarly to runtime fields defined in the search request.

Relates to elastic#59332
javanna added a commit that referenced this pull request Nov 13, 2020
The runtime section is at the same level as the existing properties section. Its purpose is to hold runtime fields only. With the introduction of the runtime section, a runtime field can be defined by specifying its type (previously called runtime_type) and script.

```
PUT /my-index/_mappings
{
    "runtime" : {
        "day_of_week" : {
            "type" : "keyword",
            "script" : {
                "source" : "emit(doc['timestamp'].value.dayOfWeekEnum.getDisplayName(TextStyle.FULL, Locale.ROOT))"
            }
        }
    },
    "properties" : {
        "timestamp" : {
            "type" : "date"
        }
    }
}
```

Fields defined in the runtime section can be updated at any time as they are not present in the lucene index. They get replaced entirely when they get updated.

Thanks to the introduction of the runtime section, runtime fields override existing mapped fields defined with the same name, similarly to runtime fields defined in the search request.

Relates to #59332
javanna added a commit to javanna/elasticsearch that referenced this pull request Nov 18, 2020
Runtime fields are defined in a separate runtime section in the mappings. Since the runtime section was introduced, runtime fields are not taken into account when parsing documents. That means that if a document gets indexed that holds a field that's already defined as a runtime field, the field gets dynamically mapped as a concrete field although it will always be shadowed by the runtime field defined with the same name.

A more sensible default would be to instead consider runtime fields like ordinary mapped fields, so a dynamic update is not necessary whenever a field is defined as part of the runtime section. As a consequence, the field does not get indexed. If users prefer to keep indexing the field although it is shadowed, we consider this an exception, and they can do so by mapping the field under properties explicitly.

Relates to elastic#62906
javanna added a commit that referenced this pull request Nov 19, 2020
Runtime fields are defined in a separate runtime section in the mappings. Since the runtime section was introduced, runtime fields are not taken into account when parsing documents. That means that if a document gets indexed that holds a field that's already defined as a runtime field, the field gets dynamically mapped as a concrete field although it will always be shadowed by the runtime field defined with the same name.

A more sensible default would be to instead consider runtime fields like ordinary mapped fields, so a dynamic update is not necessary whenever a field is defined as part of the runtime section. As a consequence, the field does not get indexed. If users prefer to keep indexing the field although it is shadowed, we consider this an exception, and they can do so by mapping the field under properties explicitly.

Relates to #62906
javanna added a commit to javanna/elasticsearch that referenced this pull request Nov 19, 2020
Runtime fields are defined in a separate runtime section in the mappings. Since the runtime section was introduced, runtime fields are not taken into account when parsing documents. That means that if a document gets indexed that holds a field that's already defined as a runtime field, the field gets dynamically mapped as a concrete field although it will always be shadowed by the runtime field defined with the same name.

A more sensible default would be to instead consider runtime fields like ordinary mapped fields, so a dynamic update is not necessary whenever a field is defined as part of the runtime section. As a consequence, the field does not get indexed. If users prefer to keep indexing the field although it is shadowed, we consider this an exception, and they can do so by mapping the field under properties explicitly.

Relates to elastic#62906
javanna added a commit that referenced this pull request Nov 20, 2020
Runtime fields are defined in a separate runtime section in the mappings. Since the runtime section was introduced, runtime fields are not taken into account when parsing documents. That means that if a document gets indexed that holds a field that's already defined as a runtime field, the field gets dynamically mapped as a concrete field although it will always be shadowed by the runtime field defined with the same name.

A more sensible default would be to instead consider runtime fields like ordinary mapped fields, so a dynamic update is not necessary whenever a field is defined as part of the runtime section. As a consequence, the field does not get indexed. If users prefer to keep indexing the field although it is shadowed, we consider this an exception, and they can do so by mapping the field under properties explicitly.

Relates to #62906
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>enhancement :Search Foundations/Mapping Index mappings, including merging and defining field types Team:Search Meta label for search team v7.11.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants