We are running Hadoop version 2.6.4.
On the DataNode machines we can see that the HDFS data is not balanced across disks.
On some disks the used space differs significantly, e.g. sdb has 11G used while sdd has 17G free:
/dev/sdd 20G 3.0G 17G 15% /grid/sdd
/dev/sdb 20G 11G 9.3G 53% /grid/sdb <-- why is this disk not balanced like sdd? why do the disks have different used sizes?
After searching Google, I found the following CLI (from https://community.hortonworks.com/questions/19694/help-with-exception-from-hdfs-balancer.html):
hdfs balancer -Ddfs.balancer.movedWinWidth=5400000 -Ddfs.balancer.moverThreads=1000 -Ddfs.balancer.dispatcherThreads=200 -Ddfs.datanode.balance.bandwidthPerSec=100000000 -Ddfs.balancer.max-size-to-move=10737418240 -threshold 20 1>/tmp/balancer-out.log 2>/tmp/balancer-debug.log
After running it, the HDFS disk usage was unchanged:
/dev/sdd 20G 3.0G 17G 15% /grid/sdd
/dev/sdb 20G 11G 9.3G 53% /grid/sdb
more /tmp/balancer-out.log
Time Stamp  Iteration#  Bytes Already Moved  Bytes Left To Move  Bytes Being Moved
The cluster is balanced. Exiting...
Mar 7, 2019 5:02:34 PM  0  0 B  0 B  0 B
Mar 7, 2019 5:02:34 PM  Balancing took 1.453 seconds
So the balancer did not actually rebalance any HDFS data.
Please advise: how can we balance the HDFS data so that all disks have roughly the same used size?
Answer 1
The NameNode considers various parameters before choosing the DataNodes that will receive the blocks. Some of these considerations are:
1. A policy of keeping one replica of a block on the same node as the node that is writing the block.
2. The need to spread the different replicas of a block across racks, so that the cluster can survive the loss of an entire rack.
3. One of the replicas is usually placed on the same rack as the node writing the file, so that cross-rack network I/O is reduced.
4. Spreading HDFS data uniformly across the DataNodes in the cluster.
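To see how these placement policies played out for a particular file, you can ask HDFS to report where each block replica actually landed. This is a sketch using the standard fsck tool; the path /user/test/data.csv is a hypothetical example, substitute one of your own files:

```shell
# List every block of the file, and for each block the DataNodes
# (and racks, if rack awareness is configured) holding its replicas.
hdfs fsck /user/test/data.csv -files -blocks -locations
```

The per-block location list makes it easy to verify whether replicas are concentrated on the writing node or spread across racks as described above.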
So the situation you are seeing is likely a result of the considerations above.
The Apache Balancer command:
hdfs balancer [-threshold <threshold>] [-policy <policy>]

-threshold <threshold>  Percentage of disk capacity. This overwrites the default threshold.
-policy <policy>        datanode (default): Cluster is balanced if each DataNode is balanced.
                        blockpool: Cluster is balanced if each block pool in each DataNode is balanced.
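Note that in Hadoop 2.x the Balancer only moves blocks between DataNodes; it does not rebalance data across the disks within a single DataNode, which is why it reports "The cluster is balanced" even though your sdb and sdd disks differ (an intra-node tool, hdfs diskbalancer, was only added in Hadoop 3.0). If the imbalance is between DataNodes, a tighter threshold makes the Balancer act on smaller differences. A minimal sketch, the 5% threshold is an example value:

```shell
# Act on DataNodes whose utilization deviates more than 5% from the
# cluster average (default threshold is 10%), using the default
# per-DataNode policy.
hdfs balancer -threshold 5 -policy datanode
```

Lowering the threshold trades longer balancer runtime and more block movement for a more even distribution across DataNodes.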