HDFS client

This library allows you to connect to the Hadoop datalab cluster without installing anything on your system (except Java).

  • A Maven dependency you can import into your Java application to use HDFS
  • A command-line interface providing hdfs dfs on your local machine

HDFS in Java application

<dependency>
	<groupId>com.tony.hdfs</groupId>
	<artifactId>HdfsClient</artifactId>
	<version>1.0</version>
</dependency>
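If your project uses Gradle instead of Maven, the equivalent declaration would be (assuming the artifact is published to a repository your build can reach):

```groovy
dependencies {
    implementation 'com.tony.hdfs:HdfsClient:1.0'
}
```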

Configuration

Define your hadoop.properties in your project

hadoop.cluster=clustername
hadoop.failoverProxy=org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
hadoop.namenodes=nn1,nn2
hadoop.rpcAddress=[DNS_NAMENODE1]:[PORT_RPC],[DNS_NAMENODE2]:[PORT_RPC]
hadoop.httpAddress=[DNS_NAMENODE1]:[PORT_HTTP],[DNS_NAMENODE2]:[PORT_HTTP]
hadoop.krb5Url=hadoop/krb5.conf
hadoop.jaasConfUrl=hadoop/jaas.conf

Note: example krb5.conf and jaas.conf files are embedded in the JAR and must be overridden with your own.

Usage

Properties prop = new Properties();

ClassLoader classLoader = getClass().getClassLoader();

InputStream input = new FileInputStream("./hadoop.properties");
prop.load(input);

HadoopClient client = new HadoopClient();

client.setHadoopCluster(prop.getProperty("hadoop.cluster"));
client.setNamenodes(prop.getProperty("hadoop.namenodes"));
client.setHttpAaddress(prop.getProperty("hadoop.httpAddress"));
client.setRpcAddress(prop.getProperty("hadoop.rpcAddress"));
client.setHadoopProxy(prop.getProperty("hadoop.failoverProxy"));

// To use the internal krb5 and jaas files bundled on the classpath
URL jaas = classLoader.getResource(prop.getProperty("hadoop.jaasConfUrl"));
URL krb5 = classLoader.getResource(prop.getProperty("hadoop.krb5Url"));

// To use external krb5 and jaas files instead:
// URL jaas = new File(prop.getProperty("hadoop.jaasConfUrl")).toURI().toURL();
// URL krb5 = new File(prop.getProperty("hadoop.krb5Url")).toURI().toURL();

client.setJaasConfUrl(jaas);
client.setKrbConfUrl(krb5);

String keytabPath = new File("xxx.keytab").getPath();

FileSystem fs = client.hadoopConnectionWithKeytab(keytabPath, "xxx@xxx.CORP");

// or with user/password
// FileSystem fs = client.hadoopConnectionWithUserPassword("xxx@xxx.CORP", "xxx");
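Once connected, the returned object is a standard org.apache.hadoop.fs.FileSystem, so the usual Hadoop file operations apply. A minimal sketch in the same fragment style as above (paths are placeholders, and a reachable cluster is assumed):

```
// List the contents of the HDFS root directory
for (FileStatus status : fs.listStatus(new Path("/"))) {
    System.out.println(status.getPath() + "\t" + status.getLen());
}

// Upload a local file (both paths are placeholders)
fs.copyFromLocalFile(new Path("./local.txt"), new Path("/tmp/local.txt"));

// Close the connection when finished
fs.close();
```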

Command line Interface

The project provides a fat JAR exposing the original Hadoop hdfs dfs command-line interface, usable on your local machine.

Configuration

Copy the following files next to hadoop-client-cli.jar:

  • your hadoop.properties
  • the krb5.conf
  • the jaas.conf
  • your keytab xxx.keytab (if using keytab authentication)

Example hadoop.properties for CLI with keytab

hadoop.cluster=clustername
hadoop.failoverProxy=org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
hadoop.namenodes=nn1,nn2
hadoop.rpcAddress=[DNS_NAMENODE1]:[PORT_RPC],[DNS_NAMENODE2]:[PORT_RPC]
hadoop.httpAddress=[DNS_NAMENODE1]:[PORT_HTTP],[DNS_NAMENODE2]:[PORT_HTTP]
hadoop.krb5Url=krb5.conf
hadoop.jaasConfUrl=jaas.conf

#Keytab auth
hadoop.keytab=xxx.keytab
hadoop.principal=xxx@xxx.CORP
#hadoop.password=XXX

Example hadoop.properties for CLI with user/password authentication

#User/pass auth
#hadoop.keytab=xxx.keytab
hadoop.principal=xx@xxx.CORP
hadoop.password=XXX
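Judging from the two examples, the CLI presumably selects keytab authentication when hadoop.keytab is set and falls back to user/password otherwise. That selection can be sketched in plain Java (the property names come from the examples above; the helper itself is hypothetical, not part of the library's API):

```java
import java.util.Properties;

public class AuthMode {
    // Returns "keytab" when a keytab is configured, otherwise "password".
    // Illustrative only: mirrors the two hadoop.properties examples above.
    static String authMode(Properties prop) {
        String keytab = prop.getProperty("hadoop.keytab");
        if (keytab != null && !keytab.trim().isEmpty()) {
            return "keytab";
        }
        if (prop.getProperty("hadoop.password") != null) {
            return "password";
        }
        throw new IllegalStateException(
            "Set hadoop.keytab or hadoop.password in hadoop.properties");
    }
}
```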

jaas.conf

HdfsHaSample {
  com.sun.security.auth.module.Krb5LoginModule required client=TRUE debug=true;
};

Usage

java -jar hadoop-client-cli.jar -ls /
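Since the fat JAR wraps the standard hdfs dfs interface, the usual dfs options should work as well, for example:

```
java -jar hadoop-client-cli.jar -ls /user
java -jar hadoop-client-cli.jar -put local.txt /tmp/local.txt
java -jar hadoop-client-cli.jar -cat /tmp/local.txt
```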

Deploy CLI

Deploy the JAR and its companion files to %userprofile%\hdfs and add that directory to the Windows PATH.

Add hdfs.bat

@ECHO OFF

setlocal
cd /d %~dp0
java -jar %userprofile%/hdfs/hadoop-client-cli.jar %*

Usage in cmd:

hdfs -ls /

About

A Java HDFS client and full Kerberos example for calling Hadoop commands directly from Java code or from your local machine.
