[Segment Replication] Update replica recovery to perform replication from source #5743

Closed
dreamer-89 opened this issue Jan 7, 2023 · 3 comments
Assignees
Labels
distributed framework enhancement Enhancement or improvement to existing feature or request

Comments

@dreamer-89
Member

Coming from #5722 (comment): the existing replica recovery performs a round of segment replication when recovery is marked complete on the target. This also introduced a dependency on SegmentReplicationTargetService inside IndicesClusterStateService. That dependency is not needed, as the same result can be achieved by invoking the ForceSegmentReplication handler inside SegmentReplicationTargetService.

@dreamer-89
Member Author

Opened a PR, but I am observing unit test failures that need a deeper dive. Will look into it after #5848.

@anasalkouz anasalkouz added enhancement Enhancement or improvement to existing feature or request and removed bug Something isn't working labels Jan 27, 2023
@dreamer-89
Member Author

dreamer-89 commented Jan 31, 2023

I looked into the failure in testNRTReplicaPromotedAsPrimary. It seems to be caused by a stale translog generation, which results in all operations being available in the translog.
The test passed previously because replicas were started with -1 for lastReceivedGen, while the primary, after indexing, had a higher generation for SegmentInfos. That resulted in a new TranslogWriter on the replica (wiping out the previous operations), so the assertions below did not trip. My change performs SegRep during recovery, which brings the replica’s generation up to the primary’s. During indexing and the following round of SegRep, the replica no longer receives a higher SegmentInfos.version, so writes go to the existing TranslogWriter. The end result is that all documents are present in the current TranslogWriter, rather than only the ones ingested after SegRep.

assertEquals(additonalDocs, nextPrimary.translogStats().estimatedNumberOfOperations());
assertEquals(additonalDocs, replica.translogStats().estimatedNumberOfOperations());
assertEquals(additonalDocs, nextPrimary.translogStats().getUncommittedOperations());
assertEquals(additonalDocs, replica.translogStats().getUncommittedOperations());
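
To make the reasoning above concrete, here is a rough toy model of it. This is only a sketch under stated assumptions, not OpenSearch code: ToyReplica, onSegmentReplication, countAfterScenario and the generation values are all hypothetical names and numbers, and the model assumes the replica rolls to a fresh writer only when the generation it receives is higher than the last one it saw.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these types are OpenSearch classes.
class TranslogRolloverSketch {

    /** Hypothetical stand-in for a replica shard and its current translog writer. */
    static class ToyReplica {
        private long lastReceivedGen;
        private final List<String> currentWriterOps = new ArrayList<>();

        ToyReplica(long startingGen) {
            this.lastReceivedGen = startingGen;
        }

        /** Every indexed document is appended to the current writer. */
        void index(String doc) {
            currentWriterOps.add(doc);
        }

        /** Assumed rule: roll to a fresh (empty) writer only if the generation advanced. */
        void onSegmentReplication(long primaryGen) {
            if (primaryGen > lastReceivedGen) {
                currentWriterOps.clear();
                lastReceivedGen = primaryGen;
            }
        }

        int operationsInCurrentWriter() {
            return currentWriterOps.size();
        }
    }

    /** Index some docs, run one replication round, index more, then count operations. */
    static int countAfterScenario(long replicaStartGen, long primaryGen,
                                  int initialDocs, int additionalDocs) {
        ToyReplica replica = new ToyReplica(replicaStartGen);
        for (int i = 0; i < initialDocs; i++) {
            replica.index("doc-" + i);
        }
        replica.onSegmentReplication(primaryGen);
        for (int i = 0; i < additionalDocs; i++) {
            replica.index("extra-" + i);
        }
        return replica.operationsInCurrentWriter();
    }

    public static void main(String[] args) {
        // Replica starts at -1: the replication round rolls the writer.
        System.out.println("old recovery: " + countAfterScenario(-1L, 3L, 5, 2));
        // Replica already at the primary's generation: no roll happens.
        System.out.println("new recovery: " + countAfterScenario(3L, 3L, 5, 2));
    }
}
```

With these assumed inputs the first scenario leaves only the 2 additional operations in the current writer, while the second leaves all 7, which is the kind of mismatch the assertions above would trip on.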

@dreamer-89
Member Author

Closing this issue, as the change was merged into main in #5746 and backported in #6149.

Projects
Status: Done

3 participants