Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add hybridfs store type #36668

Merged
merged 4 commits into from
Jan 2, 2019
Merged

Conversation

danielmitterdorfer
Copy link
Member

With this commit we introduce a new store type hybridfs that is a
hybrid between mmapfs and niofs. This store type chooses different
strategies to read Lucene files based on the read access pattern (random
or linear) in order to optimize performance.

This store type has been available in earlier versions of Elasticsearch
as default_fs. We have chosen a different name now in order to convey
the intent of the store type instead of tying it to the fact whether it
is the default choice.

With this commit we introduce a new store type `hybridfs` that is a
hybrid between `mmapfs` and `niofs`. This store type chooses different
strategies to read Lucene files based on the read access pattern (random
or linear) in order to optimize performance.

This store type has been available in earlier versions of Elasticsearch
as `default_fs`. We have chosen a different name now in order to convey
the intent of the store type instead of tying it to the fact whether it
is the default choice.
@danielmitterdorfer danielmitterdorfer added >enhancement WIP v7.0.0 :Distributed/Engine Anything around managing Lucene and the Translog in an open shard. labels Dec 15, 2018
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed

memory mapped. All other files are opened using Lucene `NIOFSDirectory`.
Similarly to `mmapfs` be sure you have allowed plenty of
<<vm-max-map-count,virtual address space>>.

[[allow-mmapfs]]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO we should rename this setting in order to decouple it from the store type, e.g. node.store.allow_mmap. The only complication is when we backport this to 6.x because changing a property name would be a breaking change (and that's the reason I did not rename it yet).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jpountz and me discussed this and we'll open a separate PR to deprecate node.store.allow_mmapfs and rename it to node.store.allow_mmap.

Copy link
Member Author

@danielmitterdorfer danielmitterdorfer Jan 2, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have opened #37070.

@danielmitterdorfer
Copy link
Member Author

I'm marking this PR as WIP for the time being putting it up for discussion and also to get CI expose early on.

/*
* We are mmapping norms, docvalues as well as term dictionaries, all other files are served through NIOFS
* this provides good random access performance while not creating unnecessary mmaps for files like stored
* fields etc.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's mention FS cache usage rather than the number of mmaps as a reason for this hybrid store type?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just restored the original comment but that makes sense to me. I'll update the comment accordingly.

@@ -75,7 +79,7 @@ private void doTestStoreDirectory(Index index, Path tempDir, String typeSettingV
break;
case FS:
if (Constants.JRE_IS_64BIT && MMapDirectory.UNMAP_SUPPORTED) {
assertTrue(directory.toString(), directory instanceof MMapDirectory);
assertTrue(directory.toString(), directory instanceof FileSwitchDirectory);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the two above cases, maybe also check that directory.getPrimaryDir() is a MMapDirectory?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added an additional assertion.

@danielmitterdorfer
Copy link
Member Author

@elasticmachine retest this please

1 similar comment
@danielmitterdorfer
Copy link
Member Author

@elasticmachine retest this please

@danielmitterdorfer
Copy link
Member Author

Thanks for the review! I'll open a separate PR for the backport to 6.x in which hybridfs will be included as an option but will not be the default choice.

@danielmitterdorfer
Copy link
Member Author

danielmitterdorfer commented Jan 2, 2019

I've raised a separate PR for the backport to 6.x in #37067.

@danielmitterdorfer danielmitterdorfer deleted the hybridfs branch January 2, 2019 10:30
danielmitterdorfer added a commit to danielmitterdorfer/elasticsearch that referenced this pull request Jan 2, 2019
With this commit we rename `node.store.allow_mmapfs` to
`node.store.allow_mmap`. Previously this setting has controlled whether
`mmapfs` could be used as a store type. With the introduction of
`hybridfs` which also relies on memory-mapping,
`node.store.allow_mmapfs` also applies to `hybridfs` and thus we rename
it in order to convey that it is actually used to allow memory-mapping
but not a specific store type.

Relates elastic#36668
danielmitterdorfer added a commit that referenced this pull request Jan 3, 2019
With this commit we introduce a new store type `hybridfs` that is a
hybrid between `mmapfs` and `niofs`. This store type chooses different
strategies to read Lucene files based on the read access pattern (random
or linear) in order to optimize performance.

This store type has been available in earlier versions of Elasticsearch
as `default_fs`. We have chosen a different name now in order to convey
the intent of the store type instead of tying it to the fact whether it
is the default choice.

Relates #36668 
Relates #37067
danielmitterdorfer added a commit that referenced this pull request Jan 3, 2019
With this commit we rename `node.store.allow_mmapfs` to
`node.store.allow_mmap`. Previously this setting has controlled whether
`mmapfs` could be used as a store type. With the introduction of
`hybridfs` which also relies on memory-mapping,
`node.store.allow_mmapfs` also applies to `hybridfs` and thus we rename
it in order to convey that it is actually used to allow memory-mapping
but not a specific store type.

Relates #36668
Relates #37070
@s1monw
Copy link
Contributor

s1monw commented Jan 4, 2019

@danielmitterdorfer I think this implementation is broken with respect to managing pending deletes. I do have a fix for it. Just a heads-up for the backport. It will also fi #37111

@danielmitterdorfer
Copy link
Member Author

@s1monw sorry to hear and thank you for the heads-up.

s1monw added a commit to s1monw/elasticsearch that referenced this pull request Jan 4, 2019
We don't want two FSDirectories manage pending deletes separately
and optimze file listing. This confuses IndexWriter and causes exceptions
when files are deleted twice but are pending for deletion. This change
move to using a NIOFS subclass that only delegates to MMAP for opening files
all metadata and pending deletes are managed on top.

Closes elastic#37111
Relates to elastic#36668
s1monw added a commit that referenced this pull request Jan 5, 2019
We don't want two FSDirectories manage pending deletes separately
and optimize file listing. This confuses IndexWriter and causes exceptions
when files are deleted twice but are pending for deletion. This change
move to using a NIOFS subclass that only delegates to MMAP for opening files
all metadata and pending deletes are managed on top.

Closes #37111
Relates to #36668
s1monw added a commit that referenced this pull request Jan 5, 2019
We don't want two FSDirectories manage pending deletes separately
and optimize file listing. This confuses IndexWriter and causes exceptions
when files are deleted twice but are pending for deletion. This change
move to using a NIOFS subclass that only delegates to MMAP for opening files
all metadata and pending deletes are managed on top.

Closes #37111
Relates to #36668
danielmitterdorfer added a commit to danielmitterdorfer/elasticsearch that referenced this pull request Feb 1, 2019
With this commit we switch the default store type from `hybridfs` to `mmapfs`.
While `hybridfs` is beneficial for random access workloads (think: updates and
queries) when the index size is much larger than the available page cache, it
incurs a performance penalty on smaller indices that fit into the page cache (or
are not much larger than that).

This performance penalty shows not only for bulk updates or queries but also for
bulk indexing (without *any* conflicts) when an external document id is provided
by the client. For example, in the `geonames` benchmark this results in a
throughput reduction of roughly 17% compared to `mmapfs`. This reduction is
caused by document id lookups that show up as the top contributor in the profile
when enabling `hybridfs`. Below is such an example stack trace as captured by
async-profiler during a benchmarking trial where we can see that the overhead is
caused by additional `read` system calls for document id lookups:

```
__GI_pread64
sun.nio.ch.FileDispatcherImpl.pread0
sun.nio.ch.FileDispatcherImpl.pread
sun.nio.ch.IOUtil.readIntoNativeBuffer
sun.nio.ch.IOUtil.read sun.nio.ch.FileChannelImpl.readInternal
sun.nio.ch.FileChannelImpl.read
org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal
org.apache.lucene.store.BufferedIndexInput.refill
org.apache.lucene.store.BufferedIndexInput.readByte
org.apache.lucene.store.DataInput.readVInt
org.apache.lucene.store.BufferedIndexInput.readVInt
org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.loadBlock
org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekExact
org.elasticsearch.common.lucene.uid.PerThreadIDVersionAndSeqNoLookup.getDocID
org.elasticsearch.common.lucene.uid.PerThreadIDVersionAndSeqNoLookup.
lookupVersion
org.elasticsearch.common.lucene.uid.VersionsAndSeqNoResolver.loadDocIdAndVersion
org.elasticsearch.index.engine.InternalEngine.resolveDocVersion
org.elasticsearch.index.engine.InternalEngine.planIndexingAsPrimary
org.elasticsearch.index.engine.InternalEngine.indexingStrategyForOperation
org.elasticsearch.index.engine.InternalEngine.index
org.elasticsearch.index.shard.IndexShard.index
org.elasticsearch.index.shard.IndexShard.applyIndexOperation
org.elasticsearch.index.shard.IndexShard.applyIndexOperationOnPrimary
[...]
```

For these reasons we are restoring `mmapfs` as the default store type.

Relates elastic#36668
danielmitterdorfer added a commit to danielmitterdorfer/elasticsearch that referenced this pull request Feb 15, 2019
With this commit we add the `.cfs` file extension to the list of file
types that are memory-mapped by hybridfs. `.cfs` files combine all files
of a Lucene segment into a single file in order to save file handles. As
this strategy is only used for "small" segments (less than 10% of the
shard size), it is benefical to memory-map them instead of accessing
them via NIO.

Relates elastic#36668
danielmitterdorfer added a commit that referenced this pull request Feb 15, 2019
With this commit we add the `.cfs` file extension to the list of file
types that are memory-mapped by hybridfs. `.cfs` files combine all files
of a Lucene segment into a single file in order to save file handles. As
this strategy is only used for "small" segments (less than 10% of the
shard size), it is benefical to memory-map them instead of accessing
them via NIO.

Relates #36668
danielmitterdorfer added a commit to danielmitterdorfer/elasticsearch that referenced this pull request Feb 15, 2019
With this commit we add the `.cfs` file extension to the list of file
types that are memory-mapped by hybridfs. `.cfs` files combine all files
of a Lucene segment into a single file in order to save file handles. As
this strategy is only used for "small" segments (less than 10% of the
shard size), it is benefical to memory-map them instead of accessing
them via NIO.

Relates elastic#36668
danielmitterdorfer added a commit to danielmitterdorfer/elasticsearch that referenced this pull request Feb 15, 2019
With this commit we add the `.cfs` file extension to the list of file
types that are memory-mapped by hybridfs. `.cfs` files combine all files
of a Lucene segment into a single file in order to save file handles. As
this strategy is only used for "small" segments (less than 10% of the
shard size), it is benefical to memory-map them instead of accessing
them via NIO.

Relates elastic#36668
danielmitterdorfer added a commit that referenced this pull request Feb 15, 2019
With this commit we add the `.cfs` file extension to the list of file
types that are memory-mapped by hybridfs. `.cfs` files combine all files
of a Lucene segment into a single file in order to save file handles. As
this strategy is only used for "small" segments (less than 10% of the
shard size), it is benefical to memory-map them instead of accessing
them via NIO.

Relates #36668
danielmitterdorfer added a commit that referenced this pull request Feb 15, 2019
With this commit we add the `.cfs` file extension to the list of file
types that are memory-mapped by hybridfs. `.cfs` files combine all files
of a Lucene segment into a single file in order to save file handles. As
this strategy is only used for "small" segments (less than 10% of the
shard size), it is benefical to memory-map them instead of accessing
them via NIO.

Relates #36668
danielmitterdorfer added a commit that referenced this pull request Mar 19, 2019
With this commit we add the `.cfs` file extension to the list of file
types that are memory-mapped by hybridfs. `.cfs` files combine all files
of a Lucene segment into a single file in order to save file handles. As
this strategy is only used for "small" segments (less than 10% of the
shard size), it is benefical to memory-map them instead of accessing
them via NIO.

Relates #36668
kovrus added a commit to crate/crate that referenced this pull request Sep 6, 2019
Backport of elastic/elasticsearch#36668

This commit introduces a new store type hybridfs that is a
hybrid between mmapfs and niofs.
kovrus added a commit to crate/crate that referenced this pull request Sep 6, 2019
Backport of elastic/elasticsearch#36668

This commit introduces a new store type hybridfs that is a
hybrid between mmapfs and niofs.
kovrus added a commit to crate/crate that referenced this pull request Sep 9, 2019
Backport of elastic/elasticsearch#36668

This commit introduces a new store type hybridfs that is a
hybrid between mmapfs and niofs.
kovrus added a commit to crate/crate that referenced this pull request Sep 9, 2019
Backport of elastic/elasticsearch#36668

This commit introduces a new store type hybridfs that is a
hybrid between mmapfs and niofs.
mergify bot pushed a commit to crate/crate that referenced this pull request Sep 10, 2019
Backport of elastic/elasticsearch#36668

This commit introduces a new store type hybridfs that is a
hybrid between mmapfs and niofs.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed/Engine Anything around managing Lucene and the Translog in an open shard. >enhancement v7.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants