Reading a File from HDFS

abloz 2013-02-01

Zhou Haihan (周海汉) 2013.2.1

This code reads a file twice, from either the local file system or HDFS, and prints its contents to the terminal each time.
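The read, rewind, read-again pattern can be tried without a cluster. The following is only a minimal sketch using the JDK: `RandomAccessFile` stands in for Hadoop's `FSDataInputStream`, and the class name `ReadTwiceDemo` and its helper methods are hypothetical names chosen for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class ReadTwiceDemo {

    // Copy everything from the current file position to the end of the file into out.
    static void copyBytes(RandomAccessFile in, OutputStream out, int bufSize) throws IOException {
        byte[] buf = new byte[bufSize];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
    }

    // Hypothetical helper: write content to a temp file, read it twice, return what was read.
    static String readTwiceFromTempFile(String content) {
        try {
            Path tmp = Files.createTempFile("readtwice", ".txt");
            try {
                Files.write(tmp, content.getBytes(StandardCharsets.UTF_8));
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                try (RandomAccessFile in = new RandomAccessFile(tmp.toFile(), "r")) {
                    copyBytes(in, out, 4096);
                    in.seek(0); // go back to the start of the file
                    copyBytes(in, out, 4096);
                }
                return new String(out.toByteArray(), StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(readTwiceFromTempFile("hello\n"));
    }
}
```

Because the stream is seekable, the second copy starts again from offset 0 and the content appears twice in the output.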

/**
 * test read file from hdfs
 */
package my.test;

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
/**
 * @author zhouhh
 * @date 2013.2.1
 *
 */
public class TestHdfs {

	public static void main(String[] args) throws Exception {
		String uri = "";
		if (args.length < 1)
		{
			//uri="test.txt";
			uri="hdfs://hadoop48:54310/user/zhouhh/test.txt";
		}
		else
		{
			uri = args[0];
		}
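		Configuration conf = new Configuration();
		// The URI scheme (hdfs://, or none for the default file system) selects the FileSystem implementation.
		FileSystem fs = FileSystem.get(URI.create(uri), conf);
		FSDataInputStream in = null;
		try {
			in = fs.open(new Path(uri));
			// Passing false leaves both streams open, so the input can be rewound and System.out stays usable.
			IOUtils.copyBytes(in, System.out, 4096, false);
			in.seek(0); // go back to the start of the file
			IOUtils.copyBytes(in, System.out, 4096, false);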
		} finally {
			IOUtils.closeStream(in);
		}
	}

}
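
The listing above passes `URI.create(uri)` to `FileSystem.get`, which uses the URI's scheme and authority to pick a file system. A URI with no scheme, such as the commented-out `test.txt`, is resolved against the default file system in the configuration. The following sketch uses only the JDK to show what each of the two URI strings contains; the class name `UriSchemeDemo` and the method `describe` are hypothetical names for illustration.

```java
import java.net.URI;

public class UriSchemeDemo {

    // Summarize the parts of a URI string that matter when choosing a file system.
    static String describe(String uri) {
        URI u = URI.create(uri);
        String scheme = u.getScheme();
        if (scheme == null) {
            // A relative URI has no scheme, so the configured default file system would apply.
            return "no scheme, path=" + u.getPath();
        }
        return "scheme=" + scheme
                + ", host=" + u.getHost()
                + ", port=" + u.getPort()
                + ", path=" + u.getPath();
    }

    public static void main(String[] args) {
        // The two URI strings that appear in the listing above.
        System.out.println(describe("test.txt"));
        System.out.println(describe("hdfs://hadoop48:54310/user/zhouhh/test.txt"));
    }
}
```

If the program fails with a connection error, comparing the printed host and port against the NameNode address in the cluster configuration is a reasonable first check.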

Reference: Tom White, "Hadoop: The Definitive Guide, 3rd Edition", Chapter 3, "The Hadoop Distributed Filesystem"


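Unless marked as a repost, all content is original. This site follows the Creative Commons (CC) license; please credit the source when reposting.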