| author | Sean Owen <sowen@cloudera.com> | 2015-11-01 12:25:49 +0000 |
|---|---|---|
| committer | Sean Owen <sowen@cloudera.com> | 2015-11-01 12:25:49 +0000 |
| commit | 643c49c75ee95243fd19ae73b5170e6e6e212b8d (patch) | |
| tree | ff52206281101054824fa1152b2cb8cff53e196d /docs/configuration.md | |
| parent | aa494a9c2ebd59baec47beb434cd09bf3f188218 (diff) | |
[SPARK-11305][DOCS] Remove Third-Party Hadoop Distributions Doc Page
Remove Hadoop third party distro page, and move Hadoop cluster config info to configuration page
CC pwendell
Author: Sean Owen <sowen@cloudera.com>
Closes #9298 from srowen/SPARK-11305.
Diffstat (limited to 'docs/configuration.md')
-rw-r--r-- | docs/configuration.md | 15 |
1 file changed, 15 insertions, 0 deletions
diff --git a/docs/configuration.md b/docs/configuration.md
index 682384d424..c276e8e90d 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -1674,3 +1674,18 @@ Spark uses [log4j](http://logging.apache.org/log4j/) for logging. You can config
 To specify a different configuration directory other than the default "SPARK_HOME/conf",
 you can set SPARK_CONF_DIR. Spark will use the configuration files
 (spark-defaults.conf, spark-env.sh, log4j.properties, etc) from this directory.
+
+# Inheriting Hadoop Cluster Configuration
+
+If you plan to read and write from HDFS using Spark, there are two Hadoop configuration files that
+should be included on Spark's classpath:
+
+* `hdfs-site.xml`, which provides default behaviors for the HDFS client.
+* `core-site.xml`, which sets the default filesystem name.
+
+The location of these configuration files varies across CDH and HDP versions, but
+a common location is inside of `/etc/hadoop/conf`. Some tools, such as Cloudera Manager, create
+configurations on-the-fly, but offer a mechanism to download copies of them.
+
+To make these files visible to Spark, set `HADOOP_CONF_DIR` in `$SPARK_HOME/spark-env.sh`
+to a location containing the configuration files.
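The final step in the added section, pointing Spark at the Hadoop config directory, can be sketched as a `spark-env.sh` fragment. Note this is only a sketch: `/etc/hadoop/conf` is the common location the text mentions, not a guarantee, and the sanity check for the two XML files is an illustrative addition, not part of the documented procedure.

```shell
# Sketch of a spark-env.sh fragment (assumption: Hadoop client configs live
# in /etc/hadoop/conf, the common CDH/HDP location mentioned above).
HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-/etc/hadoop/conf}
export HADOOP_CONF_DIR

# Warn if either file the section names is missing from that directory.
for f in hdfs-site.xml core-site.xml; do
  [ -f "$HADOOP_CONF_DIR/$f" ] || echo "warning: $HADOOP_CONF_DIR/$f not found" >&2
done
```

With `HADOOP_CONF_DIR` exported this way, anything launched via `spark-env.sh` inherits the Hadoop client configuration on its classpath.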