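We recently added new nodes to our cluster, and I'd like to run the rebalancer to distribute data across them. We're on CDH3 and the whole system is managed with Chef ("cheffed"); I'm not even sure Cloudera Manager is installed on these boxes.
Can I run the balancer without Cloudera Manager? What is the right way to go about redistributing the data?
Apologies if this is a really obvious question. I have googled a lot and searched the questions here without success, hence the question.
P.S. I had no choice but to tag this CDH4, because the site wouldn't let me create a CDH3 tag.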
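Answer 1
Yes, you can run the balancer without CM. You will need to do something like the following, preferably from your namenode, though it should work from any node or client that has access.
First, start screen or tmux. This process can take a while. Nothing bad will happen if you don't run it inside screen/tmux; it is just a safeguard in case your connection to the remote system drops.
If you are not running Kerberos, you can do the following: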
sudo su - hdfs
hadoop balancer -threshold <somevalue> > balance.out 2>&1
If you are running Kerberos, you will need to do something like this:
sudo su - hdfs
kinit -k -t /path/to/your/hdfs.keytab hdfs/fully.qualified.hostname
hadoop balancer -threshold <somevalue> > balance.out 2>&1
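For choosing a -threshold value, it may help to recall how the Hadoop balancer documentation describes it: a percentage of disk capacity (default 10), where a datanode counts as balanced when its utilization is within that many percentage points of the cluster-wide utilization. Below is a minimal sketch of that check in shell. The function name within_threshold and the sample numbers are made up for illustration and are not part of Hadoop.

```shell
#!/bin/sh
# Illustration only: within_threshold is a hypothetical helper, not a Hadoop command.
# Arguments: node utilization (%), cluster utilization (%), threshold (%).
# Returns 0 if the node is within the threshold of the cluster utilization.
within_threshold() {
    awk -v n="$1" -v c="$2" -v t="$3" 'BEGIN {
        d = n - c
        if (d < 0) d = -d          # absolute difference in percentage points
        exit ((d <= t) ? 0 : 1)
    }'
}

# Made-up sample values: a node at 45% in a cluster at 60%, threshold 10.
if within_threshold 45 60 10; then
    echo "node is within the threshold"
else
    echo "node is outside the threshold"
fi
```

A smaller threshold asks for a more evenly balanced cluster, at the cost of a longer balancer run.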
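Then, in another shell, you can tail the output, piping it through a suitable grep to filter out the large amount of unhelpful information the balancer produces.
I use something like this: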
tail -f balance.out | grep -v Moving
This means I'll see messages like the following:
13/11/25 05:53:38 INFO balancer.Balancer: 0 over utilized nodes:
13/11/25 05:53:38 INFO balancer.Balancer: 1 under utilized nodes: 192.168.1.151:50010
13/11/25 05:53:38 INFO balancer.Balancer: Need to move 181.53 MB bytes to make the cluster balanced.
13/11/25 05:53:38 INFO balancer.Balancer: Decided to move 10 GB bytes from 192.168.1.131:50010 to 192.168.1.151:50010
13/11/25 05:53:38 INFO balancer.Balancer: Will move 10 GBbytes in this iteration
Nov 25, 2013 5:53:38 AM 2203 47.76 TB 181.53 MB 10 GB
13/11/25 05:54:23 INFO net.NetworkTopology: Adding a new node: /c1/hadoop-a6/192.168.1.147:50010
13/11/25 05:54:23 INFO net.NetworkTopology: Adding a new node: /c1/hadoop-a8-top/192.168.1.122:50010
13/11/25 05:54:23 INFO net.NetworkTopology: Adding a new node: /c1/hadoop-a8-bottom/192.168.1.137:50010
13/11/25 05:54:23 INFO net.NetworkTopology: Adding a new node: /c1/hadoop-a8-top/192.168.1.128:50010
13/11/25 05:54:23 INFO net.NetworkTopology: Adding a new node: /c1/hadoop-a6/192.168.1.145:50010
13/11/25 05:54:23 INFO net.NetworkTopology: Adding a new node: /c1/hadoop-a8-top/192.168.1.126:50010
13/11/25 05:54:23 INFO net.NetworkTopology: Adding a new node: /c1/hadoop-a6/192.168.1.149:50010
13/11/25 05:54:23 INFO net.NetworkTopology: Adding a new node: /c1/hadoop-a6/192.168.1.146:50010
13/11/25 05:54:23 INFO net.NetworkTopology: Adding a new node: /c1/hadoop-a6/192.168.1.153:50010
13/11/25 05:54:23 INFO net.NetworkTopology: Adding a new node: /c1/hadoop-a6/192.168.1.156:50010
13/11/25 05:54:23 INFO net.NetworkTopology: Adding a new node: /c1/hadoop-a6/192.168.1.151:50010
13/11/25 05:54:23 INFO net.NetworkTopology: Adding a new node: /c1/hadoop-a8-bottom/192.168.1.134:50010
13/11/25 05:54:23 INFO net.NetworkTopology: Adding a new node: /c1/hadoop-a8-bottom/192.168.1.135:50010
13/11/25 05:54:23 INFO net.NetworkTopology: Adding a new node: /c1/hadoop-a6/192.168.1.154:50010
13/11/25 05:54:23 INFO net.NetworkTopology: Adding a new node: /c1/hadoop-a6/192.168.1.144:50010
13/11/25 05:54:23 INFO net.NetworkTopology: Adding a new node: /c1/hadoop-a8-top/192.168.1.125:50010
13/11/25 05:54:23 INFO net.NetworkTopology: Adding a new node: /c1/hadoop-a6/192.168.1.148:50010
13/11/25 05:54:23 INFO net.NetworkTopology: Adding a new node: /c1/hadoop-a8-bottom/192.168.1.139:50010
13/11/25 05:54:23 INFO net.NetworkTopology: Adding a new node: /c1/hadoop-a6/192.168.1.152:50010
13/11/25 05:54:23 INFO net.NetworkTopology: Adding a new node: /c1/hadoop-a8-bottom/192.168.1.133:50010
13/11/25 05:54:23 INFO net.NetworkTopology: Adding a new node: /c1/hadoop-a8-bottom/192.168.1.132:50010
13/11/25 05:54:23 INFO net.NetworkTopology: Adding a new node: /c1/hadoop-a8-bottom/192.168.1.136:50010
13/11/25 05:54:23 INFO net.NetworkTopology: Adding a new node: /c1/hadoop-a6/192.168.1.150:50010
13/11/25 05:54:23 INFO net.NetworkTopology: Adding a new node: /c1/hadoop-a8-top/192.168.1.129:50010
13/11/25 05:54:23 INFO net.NetworkTopology: Adding a new node: /c1/hadoop-a8-top/192.168.1.130:50010
13/11/25 05:54:23 INFO net.NetworkTopology: Adding a new node: /c1/hadoop-a6/192.168.1.142:50010
13/11/25 05:54:23 INFO net.NetworkTopology: Adding a new node: /c1/hadoop-a8-top/192.168.1.123:50010
13/11/25 05:54:23 INFO net.NetworkTopology: Adding a new node: /c1/hadoop-a8-top/192.168.1.127:50010
13/11/25 05:54:23 INFO net.NetworkTopology: Adding a new node: /c1/hadoop-a6/192.168.1.160:50010
13/11/25 05:54:23 INFO net.NetworkTopology: Adding a new node: /c1/hadoop-a6/192.168.1.158:50010
13/11/25 05:54:23 INFO net.NetworkTopology: Adding a new node: /c1/hadoop-a8-bottom/192.168.1.131:50010
13/11/25 05:54:23 INFO net.NetworkTopology: Adding a new node: /c1/hadoop-a8-bottom/192.168.1.138:50010
13/11/25 05:54:23 INFO net.NetworkTopology: Adding a new node: /c1/hadoop-a8-top/192.168.1.124:50010
13/11/25 05:54:23 INFO net.NetworkTopology: Adding a new node: /c1/hadoop-a8-bottom/192.168.1.140:50010
13/11/25 05:54:23 INFO net.NetworkTopology: Adding a new node: /c1/hadoop-a6/192.168.1.159:50010
13/11/25 05:54:23 INFO net.NetworkTopology: Adding a new node: /c1/hadoop-a8-top/192.168.1.121:50010
13/11/25 05:54:23 INFO net.NetworkTopology: Adding a new node: /c1/hadoop-a6/192.168.1.155:50010
13/11/25 05:54:23 INFO net.NetworkTopology: Adding a new node: /c1/hadoop-a6/192.168.1.141:50010
13/11/25 05:54:23 INFO net.NetworkTopology: Adding a new node: /c1/hadoop-a6/192.168.1.157:50010
13/11/25 05:54:23 INFO net.NetworkTopology: Adding a new node: /c1/hadoop-a6/192.168.1.143:50010
13/11/25 05:54:23 INFO balancer.Balancer: 0 over utilized nodes:
13/11/25 05:54:23 INFO balancer.Balancer: 0 under utilized nodes:
The cluster is balanced. Exiting...
Balancing took 90.16988833333333 hours
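If the filtered tail is still too noisy, the same file can be reduced to just the progress lines. This is only a sketch: it assumes the output file is named balance.out as in the commands above and that the log lines look like the sample shown. The function name balancer_progress is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: summarize a balancer output file.
# Argument: path to the output file (defaults to balance.out).
balancer_progress() {
    log="${1:-balance.out}"
    # Most recent estimate of how much data still has to be moved.
    grep 'Need to move' "$log" | tail -n 1
    # Completion messages, present only once the balancer has finished.
    grep -E 'The cluster is balanced|Balancing took' "$log"
    # Do not treat "no completion message yet" as a failure.
    return 0
}
```

Running balancer_progress balance.out from the directory where the balancer was started prints the latest "Need to move" line, plus the final two lines once the run has completed.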
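Finally, if you feel the rebalance is taking too long, you can tune dfs.balance.bandwidthPerSec. It needs to be set on every datanode, and the hadoop-0.20-datanode process must then be restarted for it to take effect. As I recall, the default is 1 MB/s. The value is specified in bytes per second.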
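Because the value is given in bytes per second, a little arithmetic is needed to raise it. The sketch below converts a made-up target of 10 MB/s into bytes and prints a property element of the kind that would go into hdfs-site.xml. The 10 MB/s figure is only an example, not a recommendation, and where that file lives depends on your installation.

```shell
#!/bin/sh
# Example target rate; 10 is an arbitrary illustration, not a recommendation.
mb_per_sec=10
# The property expects bytes per second, so convert from MB/s.
bytes_per_sec=$((mb_per_sec * 1024 * 1024))

# Print a property block suitable for hdfs-site.xml on each datanode.
cat <<EOF
<property>
  <name>dfs.balance.bandwidthPerSec</name>
  <value>${bytes_per_sec}</value>
</property>
EOF
```

Keep in mind that this is a per-datanode limit, so a higher value also means more network and disk load on each node while the balancer runs.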