aboutsummaryrefslogtreecommitdiff
path: root/lib/hadoop-0.20.0/contrib/hdfsproxy/README
blob: 2c33988926ddc46b26a542f3f4c48b05cb5b6c9e (plain) (blame)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
HDFSPROXY is an HTTPS proxy server that exposes the same HSFTP interface as a 
real cluster. It authenticates users via user certificates and enforce access 
control based on configuration files.

Starting up an HDFSPROXY server is similar to starting up an HDFS cluster. 
Simply run "hdfsproxy" shell command. The main configuration file is 
hdfsproxy-default.xml, which should be on the classpath. hdfsproxy-env.sh 
can be used to set up environmental variables. In particular, JAVA_HOME should 
be set. Additional configuration files include user-certs.xml, 
user-permissions.xml and ssl-server.xml, which are used to specify allowed user
certs, allowed directories/files, and ssl keystore information for the proxy, 
respectively. The location of these files can be specified in 
hdfsproxy-default.xml. Environmental variable HDFSPROXY_CONF_DIR can be used to
point to the directory where these configuration files are located. The 
configuration files of the proxied HDFS cluster should also be available on the
classpath (hdfs-default.xml and hdfs-site.xml).

Mirroring those used in HDFS, a few shell scripts are provided to start and 
stop a group of proxy servers. The hosts to run hdfsproxy on are specified in 
hdfsproxy-hosts file, one host per line. All hdfsproxy servers are stateless 
and run independently from each other. Simple load balancing can be set up by 
mapping all hdfsproxy server IP addresses to a single hostname. Users should 
use that hostname to access the proxy. If an IP address look up for that 
hostname returns more than one IP addresses, an HFTP/HSFTP client will randomly
pick one to use.

Command "hdfsproxy -reloadPermFiles" can be used to trigger reloading of 
user-certs.xml and user-permissions.xml files on all proxy servers listed in 
the hdfsproxy-hosts file. Similarly, "hdfsproxy -clearUgiCache" command can be 
used to clear the UGI caches on all proxy servers.