author Brennon York <brennon.york@capitalone.com> 2014-12-27 13:25:18 -0800
committer Patrick Wendell <pwendell@gmail.com> 2014-12-27 13:26:38 -0800
commit a3e51cc990812c8099dcaf1f3bd6d5bae45cf8e6
tree b44d82a54f89d9a976a0ebfd3f59657538ddadb8 /docs/hadoop-third-party-distributions.md
parent 080ceb771a1e6b9f844cfd4f1baa01133c106888
[SPARK-4501][Core] - Create build/mvn to automatically download maven/zinc/scalac
Creates a top level directory script (as `build/mvn`) to automatically download zinc and the specific version of scala used to easily build spark. This will also download and install maven if the user doesn't already have it and all packages are hosted under the `build/` directory. Tested on both Linux and OSX OS's and both work. All commands pass through to the maven binary so it acts exactly as a traditional maven call would.

Author: Brennon York <brennon.york@capitalone.com>

Closes #3707 from brennonyork/SPARK-4501 and squashes the following commits:

0e5a0e4 [Brennon York] minor incorrect doc verbage (with -> this)
9b79e38 [Brennon York] fixed merge conflicts with dev/run-tests, properly quoted args in sbt/sbt, fixed bug where relative paths would fail if passed in from build/mvn
d2d41b6 [Brennon York] added blurb about leverging zinc with build/mvn
b979c58 [Brennon York] updated the merge conflict
c5634de [Brennon York] updated documentation to overview build/mvn, updated all points where sbt/sbt was referenced with build/sbt
b8437ba [Brennon York] set progress bars for curl and wget when not run on jenkins, no progress bar when run on jenkins, moved sbt script to build/sbt, wrote stub and warning under sbt/sbt which calls build/sbt, modified build/sbt to use the correct directory, fixed bug in build/sbt-launch-lib.bash to correctly pull the sbt version
be11317 [Brennon York] added switch to silence download progress only if AMPLAB_JENKINS is set
28d0a99 [Brennon York] updated to remove the python dependency, uses grep instead
7e785a6 [Brennon York] added silent and quiet flags to curl and wget respectively, added single echo output to denote start of a download if download is needed
14a5da0 [Brennon York] removed unnecessary zinc output on startup
1af4a94 [Brennon York] fixed bug with uppercase vs lowercase variable
3e8b9b3 [Brennon York] updated to properly only restart zinc if it was freshly installed
a680d12 [Brennon York] Added comments to functions and tested various mvn calls
bb8cc9d [Brennon York] removed package files
ef017e6 [Brennon York] removed OS complexities, setup generic install_app call, removed extra file complexities, removed help, removed forced install (defaults now), removed double-dash from cli
07bf018 [Brennon York] Updated to specifically handle pulling down the correct scala version
f914dea [Brennon York] Beginning final portions of localized scala home
69c4e44 [Brennon York] working linux and osx installers for purely local mvn build
4a1609c [Brennon York] finalizing working linux install for maven to local ./build/apache-maven folder
cbfcc68 [Brennon York] Changed the default sbt/sbt to build/sbt and added a build/mvn which will automatically download, install, and execute maven with zinc for easier build capability
Diffstat (limited to 'docs/hadoop-third-party-distributions.md')
-rw-r--r-- docs/hadoop-third-party-distributions.md | 10
1 files changed, 5 insertions, 5 deletions
diff --git a/docs/hadoop-third-party-distributions.md b/docs/hadoop-third-party-distributions.md
index dd73e9dc54..87dcc58feb 100644
--- a/docs/hadoop-third-party-distributions.md
+++ b/docs/hadoop-third-party-distributions.md
@@ -18,7 +18,7 @@ see the guide on [building with maven](building-spark.html#specifying-the-hadoop
The table below lists the corresponding `hadoop.version` code for each CDH/HDP release. Note that
some Hadoop releases are binary compatible across client versions. This means the pre-built Spark
-distribution may "just work" without you needing to compile. That said, we recommend compiling with
+distribution may "just work" without you needing to compile. That said, we recommend compiling with
the _exact_ Hadoop version you are running to avoid any compatibility errors.
<table>
@@ -50,7 +50,7 @@ the _exact_ Hadoop version you are running to avoid any compatibility errors.
In SBT, the equivalent can be achieved by setting the the `hadoop.version` property:
- sbt/sbt -Dhadoop.version=1.0.4 assembly
+ build/sbt -Dhadoop.version=1.0.4 assembly
# Linking Applications to the Hadoop Version
@@ -98,11 +98,11 @@ Spark can run in a variety of deployment modes:
* Using dedicated set of Spark nodes in your cluster. These nodes should be co-located with your
Hadoop installation.
-* Running on the same nodes as an existing Hadoop installation, with a fixed amount memory and
+* Running on the same nodes as an existing Hadoop installation, with a fixed amount memory and
cores dedicated to Spark on each node.
* Run Spark alongside Hadoop using a cluster resource manager, such as YARN or Mesos.
-These options are identical for those using CDH and HDP.
+These options are identical for those using CDH and HDP.
# Inheriting Cluster Configuration
@@ -116,5 +116,5 @@ The location of these configuration files varies across CDH and HDP versions, bu
a common location is inside of `/etc/hadoop/conf`. Some tools, such as Cloudera Manager, create
configurations on-the-fly, but offer a mechanisms to download copies of them.
-To make these files visible to Spark, set `HADOOP_CONF_DIR` in `$SPARK_HOME/spark-env.sh`
+To make these files visible to Spark, set `HADOOP_CONF_DIR` in `$SPARK_HOME/spark-env.sh`
to a location containing the configuration files.
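
As a rough illustration of the instruction in the last hunk, a `spark-env.sh` fragment might look like the sketch below. The `/etc/hadoop/conf` default is only the common location mentioned in the doc text and may differ on a given cluster; the existence check is an added convenience, not something the Spark docs prescribe.

```shell
# Sketch of a spark-env.sh fragment (assumption: /etc/hadoop/conf is where
# the cluster's Hadoop configuration files live; adjust for your distribution).
# An already-set HADOOP_CONF_DIR is kept; otherwise the default is used.
HADOOP_CONF_DIR="${HADOOP_CONF_DIR:-/etc/hadoop/conf}"
export HADOOP_CONF_DIR

# Warn early if the directory is missing.
if [ ! -d "$HADOOP_CONF_DIR" ]; then
  echo "warning: HADOOP_CONF_DIR=$HADOOP_CONF_DIR does not exist" >&2
fi
```

Using the `${VAR:-default}` form keeps any value already exported by the environment, so a copy of the configuration downloaded from a management tool can be pointed to without editing the file.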