Data recovery test after a NameNode crash

Posted by abloz on September 26, 2012

By Zhou Haihan, http://abloz.com, 2012.9.9

Preface: This article tests data recovery using the secondary namenode. Since datanodes keep 2-3 replicas of every block, even if one machine is lost, the cluster can re-replicate automatically and all data can be recovered. In hadoop 1.0.3 and the releases before 0.20, however, the namenode is a single point of failure: if the namenode is destroyed, all data in the filesystem is effectively lost. That makes the secondary namenode especially important. This article walks through recovering from a destroyed namenode in practice, covering the configuration files, deployment, a namenode crash, loss of the namenode's data, and recovery of the namenode metadata.

The hadoop version is hadoop-1.0.3. Three machines take part in the test, with the following roles:
Hadoop48: NameNode
Hadoop47: SecondaryNameNode, DataNode
Hadoop46: DataNode

1. Edit core-site.xml and add the checkpoint-related settings.
fs.checkpoint.dir is the directory where the recovery (checkpoint) files are stored.
fs.checkpoint.period is the checkpoint interval; the default is 3600 seconds (1 hour). For this test it is set to 20 seconds.
fs.checkpoint.size: when the edits log grows beyond this many bytes, a checkpoint is triggered even if the period has not yet elapsed.

[zhouhh@Hadoop48 conf]$ vi core-site.xml

<property>
  <name>hadoop.mydata.dir</name>
  <value>/data/zhouhh/myhadoop</value>
  <description>A base for other directories.</description>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/tmp/hadoop-${user.name}</value>
  <description>A base for other temporary directories.</description>
</property>
<property>
  <name>fs.checkpoint.dir</name>
  <value>${hadoop.mydata.dir}/dfs/namesecondary</value>
  <description>Determines where on the local filesystem the DFS secondary
  name node should store the temporary images to merge. If this is a
  comma-delimited list of directories then the image is replicated in all
  of the directories for redundancy.</description>
</property>
<property>
  <name>fs.checkpoint.edits.dir</name>
  <value>${fs.checkpoint.dir}</value>
  <description>Determines where on the local filesystem the DFS secondary
  name node should store the temporary edits to merge. If this is a
  comma-delimited list of directories then the edits are replicated in all
  of the directories for redundancy. Default value is the same as
  fs.checkpoint.dir.</description>
</property>
<property>
  <name>fs.checkpoint.period</name>
  <value>20</value>
  <description>The number of seconds between two periodic checkpoints.
  Default is 3600 seconds.</description>
</property>
<property>
  <name>fs.checkpoint.size</name>
  <value>67108864</value>
  <description>The size of the current edit log (in bytes) that triggers a
  periodic checkpoint even if the fs.checkpoint.period hasn't expired.</description>
</property>
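The two triggers combine as "period elapsed OR edits log too large". This is not Hadoop code, just a local sketch of that decision rule, using the same placeholder numbers as the config above:

```shell
# Sketch of the checkpoint trigger rule; values mirror the config above.
period=20            # fs.checkpoint.period, in seconds
max_size=67108864    # fs.checkpoint.size, in bytes

# $1 = seconds since the last checkpoint, $2 = current edits log size in bytes
should_checkpoint() {
  [ "$1" -ge "$period" ] || [ "$2" -ge "$max_size" ]
}

should_checkpoint 25 1024     && echo "checkpoint: period elapsed"
should_checkpoint 5 70000000  && echo "checkpoint: edits log too large"
should_checkpoint 5 1024      || echo "no checkpoint yet"
```

With the short 20-second period used in this test, the period trigger fires long before the 64 MB size trigger ever matters.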

2. Put the secondary namenode on a different machine. Edit the masters file, which names the machine where the secondary namenode is started.
[zhouhh@Hadoop48 conf]$ cat masters
Hadoop47

Set dfs.secondary.http.address so that the secondary namenode's HTTP web UI hostname or IP points at Hadoop47, a machine other than the namenode Hadoop48, instead of the default 0.0.0.0.

[zhouhh@Hadoop48 conf]$ vi hdfs-site.xml

<property>
  <name>dfs.name.dir</name>
  <value>${hadoop.mydata.dir}/dfs/name</value>
  <description>Determines where on the local filesystem the DFS name node
  should store the name table (fsimage). If this is a comma-delimited list
  of directories then the name table is replicated in all of the
  directories, for redundancy. Default value is ${hadoop.tmp.dir}/dfs/name.</description>
</property>
<property>
  <name>dfs.secondary.http.address</name>
  <value>Hadoop47:55090</value>
  <description>The secondary namenode http server address and port.
  If the port is 0 then the server will start on a free port.</description>
</property>

3. If the directory the namenode points at has not been initialized, format it first:
[zhouhh@Hadoop48 logs]$ hadoop namenode -format

4. Sync the conf directory to Hadoop47 and Hadoop46 (omitted), then start hadoop:
[zhouhh@Hadoop48 conf]$ start-all.sh

[zhouhh@Hadoop48 conf]$ jps
9633 Bootstrap
10746 JobTracker
10572 NameNode
10840 Jps

[zhouhh@Hadoop47 ~]$ jps
23157 DataNode
23362 TaskTracker
23460 Jps
23250 SecondaryNameNode

The error reported in the namenode log:
2012-09-25 19:27:54,816 ERROR security.UserGroupInformation - PriviledgedActionException as:zhouhh cause:org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot delete /data/zhouhh/myhadoop/mapred/system. Name node is in safe mode.
Don't panic: the NameNode leaves safe mode automatically during the startup phase and then comes up successfully. If you don't want to wait, you can run:

bin/hadoop dfsadmin -safemode leave
to force it out. On startup, the NameNode loads the filesystem state from the fsimage and edits log files, then waits for each DataNode to report the state of its blocks, so that the NameNode does not start replicating blocks prematurely even when enough replicas already exist. During this phase the NameNode is in safe mode. Safe mode is essentially a read-only mode for the HDFS cluster: no modification of the filesystem or of blocks is allowed. Normally the NameNode exits safe mode automatically at the end of startup. If needed, you can also put HDFS into safe mode explicitly with the 'bin/hadoop dfsadmin -safemode' command. The NameNode front page shows whether safe mode is currently on.
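Besides leave, the Hadoop 1.x dfsadmin -safemode command takes a few other subcommands. These run against a live cluster, so they are shown here only as a quick reference:

```
hadoop dfsadmin -safemode get    # report whether safe mode is on or off
hadoop dfsadmin -safemode wait   # block until the NameNode leaves safe mode
hadoop dfsadmin -safemode enter  # put HDFS into safe mode by hand
hadoop dfsadmin -safemode leave  # force safe mode off (use with care)
```

In scripts, -safemode wait is usually safer than -safemode leave, since it lets the NameNode finish collecting block reports instead of overriding it.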

5. Create test files.
[zhouhh@Hadoop48 hadoop-1.0.3]$ hadoop fs -put README.txt /user/zhouhh/README.txt
[zhouhh@Hadoop48 hadoop-1.0.3]$ hadoop fs -ls .
Found 1 items
-rw-r--r--   2 zhouhh supergroup       1381 2012-09-26 14:03 /user/zhouhh/README.txt
[zhouhh@Hadoop48 hadoop-1.0.3]$ cat test中文.txt
这是测试文件
test001
by zhouhh
http://abloz.com
2012.9.26

6. Put it into HDFS:
[zhouhh@Hadoop48 hadoop-1.0.3]$ hadoop fs -put test中文.txt .

[zhouhh@Hadoop48 hadoop-1.0.3]$ hadoop fs -ls .
Found 2 items
-rw-r--r--   2 zhouhh supergroup       1381 2012-09-26 14:03 /user/zhouhh/README.txt
-rw-r--r--   2 zhouhh supergroup         65 2012-09-26 14:10 /user/zhouhh/test中文.txt
[zhouhh@Hadoop48 ~]$ hadoop fs -cat test中文.txt
这是测试文件
test001
by zhouhh
http://abloz.com
2012.9.26

7. Kill the namenode to simulate a crash.
[zhouhh@Hadoop48 ~]$ jps
9633 Bootstrap
23006 Jps
19691 NameNode
19867 JobTracker
[zhouhh@Hadoop48 ~]$ kill -9 19691
[zhouhh@Hadoop48 ~]$ jps
9633 Bootstrap
23019 Jps
19867 JobTracker

[zhouhh@Hadoop47 hadoop-1.0.3]$ jps
1716 DataNode
3825 Jps
1935 TaskTracker
1824 SecondaryNameNode

8. Empty the contents of dfs.name.dir to simulate a disk failure.
[zhouhh@Hadoop48 ~]$ cd /data/zhouhh/myhadoop/dfs/name/
[zhouhh@Hadoop48 name]$ ls
current  image  in_use.lock  previous.checkpoint
[zhouhh@Hadoop48 name]$ cd ..
The test uses a rename instead of an actual deletion:
[zhouhh@Hadoop48 dfs]$ mv name name1
Now the name directory no longer exists, so the namenode will fail to start.

9. Recover the data by copying it from the secondary namenode.

Look at the secondary namenode's files, then pack them up and copy the archive to the namenode's fs.checkpoint.dir:
[zhouhh@Hadoop47 hadoop-1.0.3]$ cd /data/zhouhh/myhadoop/dfs/
[zhouhh@Hadoop47 dfs]$ ls
data  namesecondary
[zhouhh@Hadoop47 dfs]$ cd namesecondary/
[zhouhh@Hadoop47 namesecondary]$ ls
current  image  in_use.lock
[zhouhh@Hadoop47 namesecondary]$ cd ..
[zhouhh@Hadoop47 dfs]$ tar czvf sec.tar.gz namesecondary
[zhouhh@Hadoop47 dfs]$ scp sec.tar.gz Hadoop48:/data/zhouhh/myhadoop/dfs/
sec.tar.gz

[zhouhh@Hadoop48 dfs]$ ls
name1  sec.tar.gz
[zhouhh@Hadoop48 dfs]$ tar zxvf sec.tar.gz
namesecondary/
namesecondary/current/
namesecondary/current/VERSION
namesecondary/current/fsimage
namesecondary/current/edits
namesecondary/current/fstime
namesecondary/image/
namesecondary/image/fsimage
namesecondary/in_use.lock
[zhouhh@Hadoop48 dfs]$ ls
name1  namesecondary  sec.tar.gz

If the name directory configured in dfs.name.dir does not exist, create it (in this test it was renamed away; alternatively, keep the directory and run rm * -f inside it):
[zhouhh@Hadoop48 dfs]$ mkdir name

[zhouhh@Hadoop48 dfs]$ hadoop namenode -importCheckpoint
Once the name directory has been populated with data, press Ctrl+C to stop it.
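The pack, copy, and unpack sequence in steps 8 and 9 can be rehearsed locally before touching a real cluster. The sketch below uses throwaway placeholder directories (hadoop47_dfs and hadoop48_dfs are made-up names, cp stands in for scp, and nothing here runs Hadoop itself):

```shell
# Local dry run of the recovery file flow above; no cluster involved.
WORK=$(mktemp -d)
SNN="$WORK/hadoop47_dfs"   # plays Hadoop47:/data/zhouhh/myhadoop/dfs
NN="$WORK/hadoop48_dfs"    # plays Hadoop48:/data/zhouhh/myhadoop/dfs
mkdir -p "$SNN/namesecondary/current" "$NN"

# Fake checkpoint files in place of the secondary's real fsimage/edits
touch "$SNN/namesecondary/current/fsimage" \
      "$SNN/namesecondary/current/edits" \
      "$SNN/namesecondary/current/fstime" \
      "$SNN/namesecondary/current/VERSION"

# 1) Pack on the secondary
tar -C "$SNN" -czf "$SNN/sec.tar.gz" namesecondary

# 2) Copy to the namenode host (cp stands in for the scp in the post)
cp "$SNN/sec.tar.gz" "$NN/"

# 3) Unpack beside where dfs.name.dir will be recreated, then recreate name/
tar -C "$NN" -xzf "$NN/sec.tar.gz"
mkdir -p "$NN/name"

ls "$NN"   # name  namesecondary  sec.tar.gz
```

On the real cluster, hadoop namenode -importCheckpoint then does the last step: it reads the checkpoint out of fs.checkpoint.dir and writes a fresh name table into the empty dfs.name.dir.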

10. The recovery succeeded; check that the data is intact.
[zhouhh@Hadoop48 dfs]$ start-all.sh
[zhouhh@Hadoop48 dfs]$ jps
23940 Jps
9633 Bootstrap
19867 JobTracker
23791 NameNode
[zhouhh@Hadoop48 dfs]$ hadoop fs -ls .
Found 2 items
-rw-r--r--   2 zhouhh supergroup       1381 2012-09-26 14:03 /user/zhouhh/README.txt
-rw-r--r--   2 zhouhh supergroup         65 2012-09-26 14:10 /user/zhouhh/test中文.txt
[zhouhh@Hadoop48 dfs]$ hadoop fs -cat test中文.txt
这是测试文件
test001
by zhouhh
http://abloz.com
2012.9.26

[zhouhh@Hadoop48 dfs]$ hadoop fsck /user/zhouhh
FSCK started by zhouhh from /192.168.10.48 for path /user/zhouhh at Wed Sep 26 14:42:31 CST 2012
..Status: HEALTHY
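The summary status is enough here, but after a recovery like this it can be worth checking block placement file by file. The Hadoop 1.x fsck tool accepts flags for that; a sketch (needs the live cluster, so no output shown):

```
hadoop fsck /user/zhouhh -files -blocks -locations
```

This lists each file under the path, its blocks, and which datanodes hold each replica, which confirms that replication recovered along with the namespace.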

The recovery was successful.