HDFS-17401. EC: Excess internal block may not be able to be deleted correctly when it's stored in fallback storage #6597
base: trunk
Conversation
💔 -1 overall
This message was automatically generated.
@jojochuang @tomscut Could you please take a look? The failed UT seems unrelated.
@zhangshuyan0 @haiyang1987 Could you please take a look?
Great catch!
import org.apache.hadoop.hdfs.protocol.LocatedBlocks;
import org.apache.hadoop.hdfs.protocol.LocatedStripedBlock;
import org.apache.hadoop.hdfs.protocol.SystemErasureCodingPolicies;
import org.apache.hadoop.hdfs.protocol.*;
Please do not use wildcard (*) imports.
fs.setStoragePolicy(dirPath, HdfsConstants.ALLSSD_STORAGE_POLICY_NAME);
DFSTestUtil.createFile(fs, filePath,
    cellSize * dataBlocks * 2, (short) 1, 0L);
FSNamesystem fsn3 = cluster.getNamesystem();
Can remove this unused code.
DFSTestUtil.waitForReplication(fs, filePath, groupSize, 15 * 1000);

DatanodeInfo dnToStop2 = block.getLocations()[1];
MiniDFSCluster.DataNodeProperties dnProp2 =
here ~
    cellSize * dataBlocks * 2, (short) 1, 0L);
FSNamesystem fsn3 = cluster.getNamesystem();
BlockManager bm3 = fsn3.getBlockManager();
// stop a dn
The first letter should be uppercase~
Please fix checkstyle, thanks~
@@ -4359,9 +4348,15 @@ private void chooseExcessRedundancyStriped(BlockCollection bc,
  }
}
if (candidates.size() > 1) {
  List<StorageType> internalExcessTypes = storagePolicy.chooseExcess(
      (short) 1, DatanodeStorageInfo.toStorageTypes(candidates));
  if (internalExcessTypes.isEmpty()) {
About internalExcessTypes, could you explain in which case it will be empty? Thanks~
// Wait for reconstruction to happen
DFSTestUtil.waitForReplication(fs, filePath, groupSize, 15 * 1000);

DatanodeInfo dnToStop2 = block.getLocations()[1];
Maybe add cluster.stopDataNode(dnToStop2.getXferAddr()); here.
This parameter was added in https://github.com/apache/hadoop/pull/5474 (also a patch I contributed). After that patch's modification, I don't think the parameter is necessary any more, so we can remove it. The "empty excessTypes" log can be removed too: after the modification of this patch, excessTypes can never be empty, because candidates.size() > 1 and the "replication" (first parameter of storagePolicy.chooseExcess) is constantly 1.
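That invariant is easy to check with a tiny stand-in (a hypothetical simplification, not the actual Hadoop implementation; the class and method names here are illustrative): with replication fixed at 1 and at least two candidate types, removing one expected slot always leaves something behind.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ExcessNeverEmpty {
    // Hypothetical stand-in for storagePolicy.chooseExcess((short) 1, types):
    // drop one type per expected replica and return the remainder.
    static List<String> chooseExcess(short replication, List<String> types) {
        List<String> excess = new ArrayList<>(types);
        for (int i = 0; i < replication && !excess.isEmpty(); i++) {
            excess.remove(0);
        }
        return excess;
    }

    public static void main(String[] args) {
        // As in the patched code path: candidates.size() > 1, replication == 1.
        // Removing 1 slot from >= 2 candidate types can never leave an empty list.
        System.out.println(chooseExcess((short) 1, Arrays.asList("SSD", "SSD")));
    }
}
```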
MiniDFSCluster.DataNodeProperties dnProp =
    cluster.stopDataNode(dnToStop.getXferAddr());
cluster.stopDataNode(dnToStop.getXferAddr());
This line is repeated, need to remove it.
From my side this looks good, except for some small issues in the test UT.
cluster.waitActive();
DFSTestUtil.verifyClientStats(conf, cluster);

// Currently namenode is able to track the missing block. And restart NN
Why do we need to restart the namenode here?
For Line[629-639]
for (DataNode dn : cluster.getDataNodes()) {
DataNodeTestUtils.triggerBlockReport(dn);
}
BlockManager bm = cluster.getNamesystem().getBlockManager();
GenericTestUtils.waitFor(()
-> bm.getPendingDeletionBlocksCount() == 0,
10, 2000);
for (DataNode dn : cluster.getDataNodes()) {
DataNodeTestUtils.triggerHeartbeat(dn);
}
for (DataNode dn : cluster.getDataNodes()) {
DataNodeTestUtils.triggerDeletionReport(dn);
}
Maybe update this logic to avoid using Thread.sleep(3000), what do you think?
🎊 +1 overall
This message was automatically generated.
Please fix checkstyle and the failed UT issue, thanks~
@haiyang1987 Hi, I just found out the failed UT is caused by the removal of Thread.sleep(3000). I think there is a delay between the triggering of the block report and the deletion of the block, so I will keep Thread.sleep(3000) in the UT.
@haiyang1987 Could you please take a look?
Description of PR
jira: https://issues.apache.org/jira/browse/HDFS-17401
Excess internal block can't be deleted correctly when it's stored in fallback storage.
Simple case:
An EC RS-6-3-1024k file is stored using the ALL_SSD storage policy (SSD is the default storage type and DISK is the fallback storage type). If the block group is as follows:
[0(SSD), 0(SSD), 1(SSD), 2(SSD), 3(SSD), 4(SSD), 5(SSD), 6(SSD), 7(SSD), 8(DISK)]
there are two index-0 internal blocks and one of them should be chosen for deletion. But the current implementation chooses the index-0 internal blocks as candidates while computing DISK as the excess storage type. As a result, the excess storage type (DISK) does not correspond to the excess internal blocks' storage type (SSD), and the excess internal block cannot be deleted correctly.
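To make the mismatch concrete, here is a rough, self-contained sketch (a simplification for illustration, not the actual Hadoop code; the chooseExcess helper and the type strings are assumptions) showing how computing the excess type over the whole block group yields DISK even though both duplicate index-0 blocks live on SSD:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ExcessTypeMismatch {
    // Hypothetical simplification of excess-type selection: start from the
    // stored types, remove one slot per expected type; the leftover is "excess".
    static List<String> chooseExcess(List<String> expected, List<String> stored) {
        List<String> excess = new ArrayList<>(stored);
        for (String t : expected) {
            excess.remove(t); // removes the first matching occurrence, if any
        }
        return excess;
    }

    public static void main(String[] args) {
        // RS-6-3 group under ALL_SSD: nine expected SSD slots, but the group
        // holds a duplicated index-0 block on SSD plus one fallback DISK block.
        List<String> stored = Arrays.asList(
            "SSD", "SSD", "SSD", "SSD", "SSD", "SSD", "SSD", "SSD", "SSD", "DISK");
        List<String> expected = Collections.nCopies(9, "SSD");

        // Buggy view: the excess computed over the whole group is [DISK] ...
        List<String> excess = chooseExcess(expected, stored);
        System.out.println(excess); // [DISK]

        // ... but the deletable candidates (the two index-0 replicas) are on
        // SSD, so neither matches the excess type and nothing gets deleted.
        List<String> candidates = Arrays.asList("SSD", "SSD");
        System.out.println(excess.containsAll(candidates) ? "deletable" : "stuck");

        // Fixed view: compute the excess over the candidates themselves with an
        // expected count of 1; the result is [SSD], which matches a candidate.
        System.out.println(chooseExcess(Collections.nCopies(1, "SSD"), candidates));
    }
}
```

Under this toy model the patched path always produces an excess type that matches one of the candidate replicas, which is the behavior the fix restores.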