I have set up a small Hadoop cluster with 1 NameNode and 1 DataNode for hands-on practice.
Here are my configuration files:
core-site.xml
<property>
<name>fs.defaultFS</name>
<value>hdfs://n1:9000</value>
</property>
hdfs-site.xml
<property>
<name>dfs.namenode.checkpoint.dir</name>
<value>/home/admin/hdfs_store/data/secondarynamenode</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/admin/hdfs_store/data/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/admin/hdfs_store/data/datanode</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/admin/hdfs_store/data/tmp</value>
</property>
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
mapred-site.xml
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
yarn-site.xml
<property>
<name>yarn.resourcemanager.hostname</name>
<value>n1</value>
</property>
I am trying to start a BackupNode on the node that runs the DataNode, using the following command:
hdfs namenode -backup
Below is the trace I get:
19/03/06 05:04:36 INFO namenode.FSImage: Start loading edits file /home/admin/hdfs_store/data/namenode/current/edits_0000000000000063693-0000000000000063694
19/03/06 05:04:36 INFO namenode.FSImage: Edits file /home/admin/hdfs_store/data/namenode/current/edits_0000000000000063693-0000000000000063694 of size 42 edits # 2 loaded in 0 seconds
19/03/06 05:04:36 INFO namenode.NameCache: initialized with 79 entries 1482 lookups
19/03/06 05:04:36 INFO namenode.LeaseManager: Number of blocks under construction: 0
19/03/06 05:04:36 INFO namenode.FSImageFormatProtobuf: Saving image file /home/admin/hdfs_store/data/namenode/current/fsimage.ckpt_0000000000000063694 using no compression
19/03/06 05:04:36 INFO namenode.FSImageFormatProtobuf: Image file /home/admin/hdfs_store/data/namenode/current/fsimage.ckpt_0000000000000063694 of size 5536 bytes saved in 0 seconds.
19/03/06 05:04:36 INFO namenode.NNStorageRetentionManager: Going to retain 2 images with txid >= 63692
19/03/06 05:04:36 INFO namenode.NNStorageRetentionManager: Purging old image FSImageFile(file=/home/admin/hdfs_store/data/namenode/current/fsimage_0000000000000063689, cpktTxId=0000000000000063689)
19/03/06 05:04:36 INFO namenode.TransferFsImage: Sending fileName: /home/admin/hdfs_store/data/namenode/current/fsimage_0000000000000063694, fileSize: 5536. Sent total: 5536 bytes. Size of last segment intended to send: -1 bytes.
19/03/06 05:04:36 INFO namenode.TransferFsImage: Uploaded image with txid 63694 to namenode at http://n1:50070 in 0.021 seconds
19/03/06 05:04:36 ERROR namenode.Checkpointer: Throwable Exception in doCheckpoint:
java.lang.IllegalStateException: bad state: DROP_UNTIL_NEXT_ROLL
at com.google.common.base.Preconditions.checkState(Preconditions.java:172)
at org.apache.hadoop.hdfs.server.namenode.BackupImage.convergeJournalSpool(BackupImage.java:246)
at org.apache.hadoop.hdfs.server.namenode.Checkpointer.doCheckpoint(Checkpointer.java:278)
at org.apache.hadoop.hdfs.server.namenode.Checkpointer.run(Checkpointer.java:152)
19/03/06 05:04:36 INFO namenode.FSNamesystem: Stopping services started for active state
19/03/06 05:04:36 WARN namenode.LeaseManager: Encountered exception
java.lang.InterruptedException
at java.lang.Object.wait(Native Method)
at java.lang.Thread.join(Thread.java:1260)
at org.apache.hadoop.hdfs.server.namenode.LeaseManager.stopMonitor(LeaseManager.java:641)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.stopActiveServices(FSNamesystem.java:1278)
at org.apache.hadoop.hdfs.server.namenode.BackupNode$BNHAContext.stopActiveServices(BackupNode.java:483)
at org.apache.hadoop.hdfs.server.namenode.BackupState.exitState(BackupState.java:60)
at org.apache.hadoop.hdfs.server.namenode.NameNode.stop(NameNode.java:1016)
at org.apache.hadoop.hdfs.server.namenode.BackupNode.stop(BackupNode.java:221)
at org.apache.hadoop.hdfs.server.namenode.Checkpointer.shutdown(Checkpointer.java:121)
at org.apache.hadoop.hdfs.server.namenode.Checkpointer.run(Checkpointer.java:159)
19/03/06 05:04:36 INFO namenode.FSNamesystem: LazyPersistFileScrubber was interrupted, exiting
19/03/06 05:04:36 INFO namenode.FSNamesystem: NameNodeEditLogRoller was interrupted, exiting
19/03/06 05:04:36 INFO blockmanagement.CacheReplicationMonitor: Shutting down CacheReplicationMonitor
19/03/06 05:04:36 INFO ipc.Server: Stopping server on 50100
19/03/06 05:04:36 INFO ipc.Server: Stopping IPC Server listener on 50100
19/03/06 05:04:36 INFO ipc.Server: Stopping IPC Server Responder
19/03/06 05:04:36 INFO blockmanagement.BlockManager: Stopping ReplicationMonitor.
19/03/06 05:04:36 INFO namenode.FSNamesystem: Stopping services started for active state
19/03/06 05:04:36 INFO namenode.FSNamesystem: Stopping services started for standby state
19/03/06 05:04:36 INFO mortbay.log: Stopped [email protected]:50105
19/03/06 05:04:36 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at d1/10.46.60.8
************************************************************/
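I can't figure out what is going wrong.
Answer 1
I ran into the same problem before. I had to specify the BackupNode address in the configuration on both the NameNode and the BackupNode.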
<property>
<name>dfs.namenode.backup.address</name>
<value>backupnode:50100</value>
</property>
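Since the setting has to be present on both machines, it can help to check each node's hdfs-site.xml before restarting anything. The following is only a minimal sketch: it assumes the file has the usual `<configuration>` root element, the helper names `read_properties` and `backup_address` are made up for illustration, and `backupnode` in the sample is a placeholder hostname. It inspects the file contents only, not what a running daemon has actually loaded.

```python
import xml.etree.ElementTree as ET

# Property named in the answer above.
BACKUP_KEY = "dfs.namenode.backup.address"


def read_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop *-site.xml document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def backup_address(xml_text):
    """Return the configured BackupNode address, or None if it is not set."""
    return read_properties(xml_text).get(BACKUP_KEY)


# Hypothetical sample; in practice, read the real hdfs-site.xml from each node.
SAMPLE = """<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.backup.address</name>
    <value>backupnode:50100</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    address = backup_address(SAMPLE)
    if address is None:
        print(f"{BACKUP_KEY} is not set")
    else:
        print(f"{BACKUP_KEY} = {address}")
```

Running the same check against the file on the NameNode and on the BackupNode should give the same address on both; a `None` on either side would match the situation described in the answer.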
Br,Tomasz Ziss