-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Segment Replication] Prioritize replica shard movement during shard relocation #8875
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Poojita-Raj
requested review from
reta,
anasalkouz,
andrross,
Bukhtawar,
CEHENKLE,
dblock,
gbbafna,
setiah,
kartg,
kotwanikunal,
mch2,
nknize,
owaiskazi19,
Rishikesh1159,
ryanbogan,
saratvemulapalli,
shwetathareja,
dreamer-89,
tlfeng,
VachaShah,
dbwiddis,
sachinpkale and
sohami
as code owners
July 25, 2023 17:29
Gradle Check (Jenkins) Run Completed with:
|
Codecov Report
@@ Coverage Diff @@
## main #8875 +/- ##
============================================
- Coverage 71.02% 70.94% -0.08%
+ Complexity 57286 57212 -74
============================================
Files 4765 4766 +1
Lines 270398 270454 +56
Branches 39546 39555 +9
============================================
- Hits 192045 191883 -162
- Misses 62191 62391 +200
- Partials 16162 16180 +18
|
Gradle Check (Jenkins) Run Completed with:
|
Bukhtawar
reviewed
Jul 26, 2023
server/src/main/java/org/opensearch/cluster/routing/RoutingNodes.java
Outdated
Show resolved
Hide resolved
Poojita-Raj
force-pushed
the
shard_movement
branch
from
August 1, 2023 20:48
7972ad3
to
cf4b3cc
Compare
Gradle Check (Jenkins) Run Completed with:
|
mch2
reviewed
Aug 1, 2023
server/src/main/java/org/opensearch/cluster/routing/RoutingNodes.java
Outdated
Show resolved
Hide resolved
...c/main/java/org/opensearch/cluster/routing/allocation/allocator/BalancedShardsAllocator.java
Outdated
Show resolved
Hide resolved
...c/main/java/org/opensearch/cluster/routing/allocation/allocator/BalancedShardsAllocator.java
Outdated
Show resolved
Hide resolved
Compatibility status:
|
Gradle Check (Jenkins) Run Completed with:
|
mch2
reviewed
Aug 4, 2023
server/src/main/java/org/opensearch/cluster/routing/ShardMovementStrategy.java
Show resolved
Hide resolved
server/src/test/java/org/opensearch/cluster/routing/RoutingNodesTests.java
Outdated
Show resolved
Hide resolved
Signed-off-by: Poojita Raj <[email protected]>
Signed-off-by: Poojita Raj <[email protected]>
Signed-off-by: Poojita Raj <[email protected]>
Signed-off-by: Poojita Raj <[email protected]>
Signed-off-by: Poojita Raj <[email protected]>
Signed-off-by: Poojita Raj <[email protected]>
Poojita-Raj
force-pushed
the
shard_movement
branch
from
August 4, 2023 23:33
1d8821b
to
342b6ed
Compare
Compatibility status:
|
Compatibility status:
|
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
|
mch2
approved these changes
Aug 5, 2023
Gradle Check (Jenkins) Run Completed with:
|
The backport to
To backport manually, run these commands in your terminal: # Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-8875-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 c6e4bcd097969f00956c0f4c152540cb610e4f93
# Push it to GitHub
git push --set-upstream origin backport/backport-8875-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/backport-2.x Then, create a pull request where the |
Poojita-Raj
added a commit
to Poojita-Raj/OpenSearch
that referenced
this pull request
Aug 7, 2023
…relocation (opensearch-project#8875) * add shard movement strategy setting Signed-off-by: Poojita Raj <[email protected]> * add tests Signed-off-by: Poojita Raj <[email protected]> * add changelog Signed-off-by: Poojita Raj <[email protected]> * Add NodeVersionAllocationDecider check Signed-off-by: Poojita Raj <[email protected]> * refactoring Signed-off-by: Poojita Raj <[email protected]> * add annotation + refactor Signed-off-by: Poojita Raj <[email protected]> --------- Signed-off-by: Poojita Raj <[email protected]> (cherry picked from commit c6e4bcd)
6 tasks
tlfeng
pushed a commit
that referenced
this pull request
Aug 7, 2023
…relocation (#8875) (#9153) When some node or set of nodes is excluded, the shards are moved away in random order. When segment replication is enabled for a cluster, we might end up in a mixed version state where replicas will be on lower version and unable to read segments sent from higher version primaries and fail. To avoid this, we could prioritize replica shard movement to avoid entering this situation. Adding a new setting called shard movement strategy - `SHARD_MOVEMENT_STRATEGY_SETTING` - that will allow us to specify in which order we want to move our shards: `NO_PREFERENCE` (default), `PRIMARY_FIRST` or `REPLICA_FIRST`. The `PRIMARY_FIRST` option will perform the same behavior as the previous setting `SHARD_MOVE_PRIMARY_FIRST_SETTING` which will be now deprecated in favor of the shard movement strategy setting. Expected behavior: If `SHARD_MOVEMENT_STRATEGY_SETTING` is changed from its default behavior to be either `PRIMARY_FIRST` or `REPLICA_FIRST` then we perform this behavior whether or not `SHARD_MOVE_PRIMARY_FIRST_SETTING` is enabled. If `SHARD_MOVEMENT_STRATEGY_SETTING` is still at its default setting of `NO_PREFERENCE` and `SHARD_MOVE_PRIMARY_FIRST_SETTING` is enabled we move the primary shards first. This ensures that users still using this setting will not see any changes in behavior. Reference: #1445 Parent issue: #3881 --------- Signed-off-by: Poojita Raj <[email protected]> (cherry picked from commit c6e4bcd)
4 tasks
kaushalmahi12
pushed a commit
to kaushalmahi12/OpenSearch
that referenced
this pull request
Sep 12, 2023
…relocation (opensearch-project#8875) * add shard movement strategy setting Signed-off-by: Poojita Raj <[email protected]> * add tests Signed-off-by: Poojita Raj <[email protected]> * add changelog Signed-off-by: Poojita Raj <[email protected]> * Add NodeVersionAllocationDecider check Signed-off-by: Poojita Raj <[email protected]> * refactoring Signed-off-by: Poojita Raj <[email protected]> * add annotation + refactor Signed-off-by: Poojita Raj <[email protected]> --------- Signed-off-by: Poojita Raj <[email protected]> Signed-off-by: Kaushal Kumar <[email protected]>
brusic
pushed a commit
to brusic/OpenSearch
that referenced
this pull request
Sep 25, 2023
…relocation (opensearch-project#8875) * add shard movement strategy setting Signed-off-by: Poojita Raj <[email protected]> * add tests Signed-off-by: Poojita Raj <[email protected]> * add changelog Signed-off-by: Poojita Raj <[email protected]> * Add NodeVersionAllocationDecider check Signed-off-by: Poojita Raj <[email protected]> * refactoring Signed-off-by: Poojita Raj <[email protected]> * add annotation + refactor Signed-off-by: Poojita Raj <[email protected]> --------- Signed-off-by: Poojita Raj <[email protected]> Signed-off-by: Ivan Brusic <[email protected]>
shiv0408
pushed a commit
to Gaurav614/OpenSearch
that referenced
this pull request
Apr 25, 2024
…relocation (opensearch-project#8875) * add shard movement strategy setting Signed-off-by: Poojita Raj <[email protected]> * add tests Signed-off-by: Poojita Raj <[email protected]> * add changelog Signed-off-by: Poojita Raj <[email protected]> * Add NodeVersionAllocationDecider check Signed-off-by: Poojita Raj <[email protected]> * refactoring Signed-off-by: Poojita Raj <[email protected]> * add annotation + refactor Signed-off-by: Poojita Raj <[email protected]> --------- Signed-off-by: Poojita Raj <[email protected]> Signed-off-by: Shivansh Arora <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Description
When some node or set of nodes is excluded, the shards are moved away in random order. When segment replication is enabled for a cluster, we might end up in a mixed version state where replicas will be on lower version and unable to read segments sent from higher version primaries and fail.
To avoid this, we could prioritize replica shard movement to avoid entering this situation.
Adding a new setting called shard movement strategy -
SHARD_MOVEMENT_STRATEGY_SETTING
- that will allow us to specify in which order we want to move our shards:NO_PREFERENCE
(default),PRIMARY_FIRST
orREPLICA_FIRST
.The
PRIMARY_FIRST
option will perform the same behavior as the previous settingSHARD_MOVE_PRIMARY_FIRST_SETTING
which will be now deprecated in favor of the shard movement strategy setting.Expected behavior:
If
SHARD_MOVEMENT_STRATEGY_SETTING
is changed from its default behavior to be eitherPRIMARY_FIRST
orREPLICA_FIRST
then we perform this behavior whether or notSHARD_MOVE_PRIMARY_FIRST_SETTING
is enabled.If
SHARD_MOVEMENT_STRATEGY_SETTING
is still at its default setting ofNO_PREFERENCE
andSHARD_MOVE_PRIMARY_FIRST_SETTING
is enabled we move the primary shards first. This ensures that users still using this setting will not see any changes in behavior.Reference: #1445
Parent issue: #3881
Related Issues
Resolves #8265
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.