We are running HBase 0.94.4 on Hadoop 1.0.4. One of our HBase regions is stuck in transition, and running /opt/hbase/bin/hbase hbck prints the following:
ERROR: Region { meta => dev1_sliceagg_location_file,,1369128923119.21accc8b27bbd501ed4d3575d6ee725e., hdfs => hdfs://192.168.3.100:8020/hbase/dev1_sliceagg_location_file/21accc8b27bbd501ed4d3575d6ee725e, deployed => } not deployed on any region server.
ERROR: Region { meta => crash_experiment_sliceagg_client_file,,1369316587953.46e475f415d83f0d5caebccf67acc696., hdfs => hdfs://192.168.3.100:8020/hbase/crash_experiment_sliceagg_client_file/46e475f415d83f0d5caebccf67acc696, deployed => } not deployed on any region server.
ERROR: Region { meta => dev1_sliceagg_client_file,\x94\xDC\x97\x85\x94\x15\xAFO\xFEv\xE5}2\xBA\xE6\xC5\x8E\x87'\x0CG\x04\xCF)Q\xE1\xE7\x82\x0Dl\x8A+\x90\x18\xF8{2?\xD2]~6oO\x0F\\x97\x96\xBF\xE5Fc6|\xE8x\xF6+\x09s\xAF\xC9\xC3\xC8\x00<\x11\x00\x00\x00\x00\x00,1369315360949.92fc7ad4623318547cf7f4cb13e3afdc., hdfs => hdfs://192.168.3.100:8020/hbase/dev1_sliceagg_client_file/92fc7ad4623318547cf7f4cb13e3afdc, deployed => } not deployed on any region server.
13/05/23 18:54:16 DEBUG util.HBaseFsck: There are 64 region info entries
ERROR: There is a hole in the region chain between \x94\xDC\x97\x85\x94\x15\xAFO\xFEv\xE5}2\xBA\xE6\xC5\x8E\x87'\x0CG\x04\xCF)Q\xE1\xE7\x82\x0Dl\x8A+\x90\x18\xF8{2?\xD2]~6oO\x0F\\x97\x96\xBF\xE5Fc6|\xE8x\xF6+\x09s\xAF\xC9\xC3\xC8\x00<\x11\x00\x00\x00\x00\x00 and \xC80\xCD\x96\xBF-\xB0\xB6hm\x80\xE5\xD7\xDE\xAF\xB0\x0ANWW\xAE\x09\xFA\x96"\xE3\x15\x8C\xC1\xAE\xF1\x14\xEDWNB\x0EW7N2\x8C|Re\x04\xEC\xA5i\xC1d(yf\xF0`\x19\xEC |\xB1\x7F,T@6\x00\x00\x00\x00\x00\x00. You need to create a new .regioninfo and region dir in hdfs to plug the hole.
ERROR: Found inconsistency in table dev1_sliceagg_client_file
ERROR: (region dev1_sliceagg_location_file,\x80+\x02)\xD9\x04\xE2\x8C\x1E\xA9\xA5'J\xB4W\xFC\xD4\x8C\x86Kgx\x87"\x0C\x14\x8F\xCD\x00p\x11\xEB\xB7;\x98\x9B02J[\x07\xF0\xE8\xAE\xC1m\xFF\xA4\x00$\x01\x00\x00\x00\x00\x00\x00\x00\x03\xEE\x00\x00\x00\x00\x00\x00?\xB2\x00\x00\x00\x00\x00\x00\x0A\xB5,1369128923119.f7b1c0288f9fcc36ebceca091103ac18.) First region should start with an empty key. You need to create a new region and regioninfo in HDFS to plug the hole.
ERROR: Found inconsistency in table dev1_sliceagg_location_file
13/05/23 18:54:17 WARN regionserver.StoreFile: Failed match of store file name hdfs://192.168.3.100:8020/hbase/crash_experiment_sliceagg_file_stat/06f163c5f5e79b02e260f3b2752c9cb8/.oldlogs/hlog.1369315359473
13/05/23 18:54:17 WARN regionserver.StoreFile: Failed match of store file name hdfs://192.168.3.100:8020/hbase/-ROOT-/70236052/.oldlogs/hlog.1358951260249
13/05/23 18:54:17 WARN regionserver.StoreFile: Failed match of store file name hdfs://192.168.3.100:8020/hbase/dev1_sliceagg_client_file/92fc7ad4623318547cf7f4cb13e3afdc/.oldlogs/hlog.1369315360956
13/05/23 18:54:17 WARN regionserver.StoreFile: Failed match of store file name hdfs://192.168.3.100:8020/hbase/crash_experiment_sliceagg_client_file/46e475f415d83f0d5caebccf67acc696/.oldlogs/hlog.1369316587995
13/05/23 18:54:17 WARN regionserver.StoreFile: Failed match of store file name hdfs://192.168.3.100:8020/hbase/.META./1028785192/.oldlogs/hlog.1358951260483
Running /opt/hbase/bin/hbase hbck -fix does not fix anything: it hangs and keeps printing the message "Region still in transition, waiting for it to become assigned".
Running /opt/hbase/bin/hbase hbck -repairHoles does not help either. What should we do to resolve this?
Answer 1
We had to stop HBase and delete the recovered.edits folder of each faulty region. After that, hbck succeeded.
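For anyone who wants to script this, here is a rough sketch of those steps. It rests on several assumptions: the install path /opt/hbase/bin and the region directories are copied from the hbck output in the question, the list of faulty regions may differ on your cluster, and the helper names run and recovered_edits_path are made up for this example. It only prints the commands unless DRY_RUN=0 is set. Note that recovered.edits holds WAL edits waiting to be replayed, so deleting it discards those writes.

```shell
#!/bin/sh
# Sketch only: by default (DRY_RUN=1) the commands are printed, not executed.
HBASE_BIN="${HBASE_BIN:-/opt/hbase/bin}"   # install path taken from the question
DRY_RUN="${DRY_RUN:-1}"

# Region directories reported by hbck as "not deployed on any region server".
# Assumption: these are the faulty regions; adjust the list for your cluster.
REGION_DIRS="
hdfs://192.168.3.100:8020/hbase/dev1_sliceagg_location_file/21accc8b27bbd501ed4d3575d6ee725e
hdfs://192.168.3.100:8020/hbase/crash_experiment_sliceagg_client_file/46e475f415d83f0d5caebccf67acc696
hdfs://192.168.3.100:8020/hbase/dev1_sliceagg_client_file/92fc7ad4623318547cf7f4cb13e3afdc
"

# Hypothetical helper: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: build the recovered.edits path for a region directory,
# dropping one trailing slash if present.
recovered_edits_path() {
    echo "${1%/}/recovered.edits"
}

# 1. Stop HBase so nothing tries to replay the edits while they are removed.
run "$HBASE_BIN/stop-hbase.sh"

# 2. Remove recovered.edits from each faulty region (-rmr is the Hadoop 1.x form).
for dir in $REGION_DIRS; do
    run hadoop fs -rmr "$(recovered_edits_path "$dir")"
done

# 3. Start HBase again and re-run the consistency check.
run "$HBASE_BIN/start-hbase.sh"
run "$HBASE_BIN/hbase" hbck
```

Review the dry-run output first, and consider copying each recovered.edits folder somewhere safe before re-running with DRY_RUN=0.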