Ambari cluster + when do we need to set Block replication to 1

We are getting the following in the Spark logs:

java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage DatanodeInfoWithStorage\
The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:1036)
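
The message says the client can change the policy through 'dfs.client.block.write.replace-datanode-on-failure.policy'. As a minimal sketch of how that could be passed to a Spark job, assuming Spark forwards `spark.hadoop.`-prefixed properties to the Hadoop client configuration: the helper name `spark_conf_args` is hypothetical, and the values DEFAULT, ALWAYS and NEVER are the ones I understand hdfs-default.xml to list. Whether changing the policy is the right fix for this cluster is a separate question.

```python
# Hypothetical helper: builds spark-submit arguments that override the
# HDFS client policy named in the error message.
POLICY_KEY = "dfs.client.block.write.replace-datanode-on-failure.policy"

# Policy values as listed in hdfs-default.xml (assumption: unchanged in this HDFS version).
VALID_POLICIES = ("DEFAULT", "ALWAYS", "NEVER")


def spark_conf_args(policy):
    """Return spark-submit arguments that set the datanode replacement policy."""
    if policy not in VALID_POLICIES:
        raise ValueError(f"unknown policy: {policy!r}")
    # Properties prefixed with spark.hadoop. are copied into the Hadoop Configuration.
    return ["--conf", f"spark.hadoop.{POLICY_KEY}={policy}"]


if __name__ == "__main__":
    print(" ".join(spark_conf_args("NEVER")))
```

The resulting arguments would be appended to the usual `spark-submit` command line.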

My Ambari cluster has only 3 worker machines, and each worker has only one data disk.

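I searched on Google and found that the solution may be related to block replication. Block replication in HDFS is set to 3 by default, and I found a recommendation to set "Block replication" to 1 instead of 3.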

Question: does this make sense?

Also, each of my worker machines has only one data disk. Could that be part of the problem?

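Block replication: the total number of copies of each file in the file system is the number specified by the dfs.replication factor. Setting dfs.replication=1 means that there is only one copy of each file in the file system.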
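
To see how the replication factor interacts with the DEFAULT policy named in the error, here is a rough sketch of that policy as I read its description in hdfs-default.xml. It is a model, not the actual HDFS code; the function name `should_add_datanode` is hypothetical, and whether the Spark job hflushes or appends to the file is an assumption.

```python
def should_add_datanode(replication, existing, hflushed_or_appended):
    """Model of the DEFAULT replace-datanode-on-failure policy.

    replication: the file's replication factor (r)
    existing: datanodes still alive in the write pipeline (n)
    hflushed_or_appended: True if the block was hflushed or the file appended
    """
    if replication < 3:
        # With r < 3 the client never asks for a replacement datanode.
        return False
    if replication // 2 >= existing:
        # Too few nodes left in the pipeline: always ask for a replacement.
        return True
    # Otherwise ask only when the data was hflushed or appended.
    return replication > existing and hflushed_or_appended


if __name__ == "__main__":
    # Replication 3, one of three pipeline nodes failed, block was hflushed.
    print(should_add_datanode(3, 2, True))
    # Replication 1: no replacement is ever requested.
    print(should_add_datanode(1, 0, True))
```

Under this model, with replication 3 on a 3-worker cluster, a single failed datanode leaves no spare node to add to the pipeline, which would match the "no more good datanodes being available" message. With replication 1 the replacement is never requested, but there is also only one copy of the data.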

Full log:

java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[34.2.31.31:50010,DS-8234bb39-0fd4-49be-98ba-32080bc24fa9,DISK], DatanodeInfoWithStorage[34.2.31.33:50010,DS-b4758979-52a2-4238-99f0-1b5ec45a7e25,DISK]], original=[DatanodeInfoWithStorage[34.2.31.31:50010,DS-8234bb39-0fd4-49be-98ba-32080bc24fa9,DISK], DatanodeInfoWithStorage[34.2.31.33:50010,DS-b4758979-52a2-4238-99f0-1b5ec45a7e25,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:1036)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:1110)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1268)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:993)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:500)
---2018-01-30T15:15:15.015 INFO  [][][] [dal.locations.LocationsDataFramesHandler] 
