Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Trim translog for closed indices #43156

Merged
merged 6 commits into from
Jun 28, 2019

Conversation

tlrx
Copy link
Member

@tlrx tlrx commented Jun 12, 2019

Today when an index is closed all its shards are forced flushed but the translog files are left around. As explained in #42445 we'd like to trim the translog for closed indices in order to consume less disk space.

Instead of trimming the translog at closing time (which can be challenging as explained in #42445) or during the initialization of the noop engine, this pull request proposes to reuse the existing AsyncTrimTranslogTask task and to reenable it for closed indices.

At the time the task is executed, we should have the guarantee that nothing holds the translog files that are going to be removed. It also leaves a short period of time (10 min) during which translog files of a recently closed index are still present on disk. This could also help in some cases where the closed index is reopened shortly after being closed (in order to update an index setting for example).

@tlrx tlrx added >enhancement :Distributed/Engine Anything around managing Lucene and the Translog in an open shard. v8.0.0 v7.3.0 labels Jun 12, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed

@tlrx tlrx requested review from dnhatn and ywelsch June 12, 2019 12:42
@dnhatn
Copy link
Member

dnhatn commented Jun 13, 2019

I think we should not trim translog in NoOpEngine because it extends from ReadOnlyEngine which should not modify translog (and Lucene index). How about trimming translog in InternalEngine when we execute the verify-before-close step (after we flush)?

@tlrx
Copy link
Member Author

tlrx commented Jun 19, 2019

We talked about this today and we agreed that we can move forward with the proposed solution, as long as the translog trimming is correctly documented in NoOpEngine. This solution will help to curate translog files of indices closed in 7.2.

We also agreed on trimming translogs files more aggressively when indices are verified before being closed, as suggested by @dnhatn and others, but that would require peer recoveries to not use translog at all anymore. This will become possible in a short future once peer recovery retention leases will be implemented (see #41536).

Copy link
Member

@dnhatn dnhatn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good. Thanks @tlrx. I left a comment.

@tlrx tlrx requested a review from dnhatn June 25, 2019 14:38
Copy link
Member

@dnhatn dnhatn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a smaller comment but LGTM. Thanks @tlrx.

Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you follow-up with a change that exposes correct TranslogStats for NoOpEngine, and reloads them after the trimming?

try (ReleasableLock lock = readLock.acquire()) {
ensureOpen();
final List<IndexCommit> commits = DirectoryReader.listCommits(store.directory());
if (commits.size() == 1) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are we guaranteed to have this on a successful close? How do we know that previous commits have been cleaned up on verify-before-close?

@tlrx tlrx merged commit c900795 into elastic:master Jun 28, 2019
@tlrx tlrx deleted the trim-translog-for-closed-indices branch June 28, 2019 14:18
tlrx added a commit that referenced this pull request Jun 28, 2019
Today when an index is closed all its shards are forced flushed
but the translog files are left around. As explained in #42445
we'd like to trim the translog for closed indices in order to
consume less disk space. This commit reuses the existing
AsyncTrimTranslogTask task and reenables it for closed indices.

At the time the task is executed, we should have the guarantee
that nothing holds the translog files that are going to be removed.
It also leaves a short period of time (10 min) during which translog
files of a recently closed index are still present on disk. This could
 also help in some cases where the closed index is reopened
shortly after being closed (in order to update an index setting
for example).

Relates to #42445
@tlrx
Copy link
Member Author

tlrx commented Jun 28, 2019

Thanks @dnhatn and @ywelsch

Can you follow-up with a change that exposes correct TranslogStats for NoOpEngine, and reloads them after the trimming?

I opened #43752 to expose translog stats for read only engines. Once this one is merged I'll open a follow to reload stats after the trimming.

original-brownbear pushed a commit to original-brownbear/elasticsearch that referenced this pull request Jul 3, 2019
…#43825)

This commit changes NoOpEngine so that it refreshes its translog 
stats once translog is trimmed.

Relates elastic#43156
tlrx added a commit that referenced this pull request Jul 3, 2019
This commit changes NoOpEngine so that it refreshes its translog 
stats once translog is trimmed.

Relates #43156
@tlrx tlrx mentioned this pull request Jul 3, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed/Engine Anything around managing Lucene and the Translog in an open shard. >enhancement v7.3.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants