[SPARK-12814][DOCUMENT] Add deploy instructions for Python in flume integration doc

This PR adds instructions to the Flume integration page on how Python users can get the Flume assembly JAR, as the Kafka doc already does.

Author: Shixiong Zhu <shixiong@databricks.com>

Closes #10746 from zsxwing/flume-doc.
author: Shixiong Zhu <shixiong@databricks.com> 2016-01-18 15:38:03 -0800
committer: Tathagata Das <tathagata.das1565@gmail.com> 2016-01-18 15:38:03 -0800
commit: a973f483f6b819ed4ecac27ff5c064ea13a8dd71 (patch)
tree: e55be6fce5841d0dd1d197f3276bf3bf3ae2398f /docs/streaming-flume-integration.md
parent: 404190221a788ebc3a0cbf5cb47cf532436ce965 (diff)
1 files changed, 11 insertions, 2 deletions
diff --git a/docs/streaming-flume-integration.md b/docs/streaming-flume-integration.md
index 383d954409..e2d589b843 100644
--- a/docs/streaming-flume-integration.md
+++ b/docs/streaming-flume-integration.md
@@ -71,7 +71,16 @@ configuring Flume agents.
     cluster (Mesos, YARN or Spark Standalone), so that resource allocation can match the names and launch
     the receiver in the right machine.
 
-3. **Deploying:** Package `spark-streaming-flume_{{site.SCALA_BINARY_VERSION}}` and its dependencies (except `spark-core_{{site.SCALA_BINARY_VERSION}}` and `spark-streaming_{{site.SCALA_BINARY_VERSION}}` which are provided by `spark-submit`) into the application JAR. Then use `spark-submit` to launch your application (see [Deploying section](streaming-programming-guide.html#deploying-applications) in the main programming guide).
+3. **Deploying:** As with any Spark applications, `spark-submit` is used to launch your application. However, the details are slightly different for Scala/Java applications and Python applications.
+
+	For Scala and Java applications, if you are using SBT or Maven for project management, then package `spark-streaming-flume_{{site.SCALA_BINARY_VERSION}}` and its dependencies into the application JAR. Make sure `spark-core_{{site.SCALA_BINARY_VERSION}}` and `spark-streaming_{{site.SCALA_BINARY_VERSION}}` are marked as `provided` dependencies as those are already present in a Spark installation. Then use `spark-submit` to launch your application (see [Deploying section](streaming-programming-guide.html#deploying-applications) in the main programming guide).
+
+	For Python applications which lack SBT/Maven project management, `spark-streaming-flume_{{site.SCALA_BINARY_VERSION}}` and its dependencies can be directly added to `spark-submit` using `--packages` (see [Application Submission Guide](submitting-applications.html)). That is,
+
+	    ./bin/spark-submit --packages org.apache.spark:spark-streaming-flume_{{site.SCALA_BINARY_VERSION}}:{{site.SPARK_VERSION_SHORT}} ...
+
+	Alternatively, you can also download the JAR of the Maven artifact `spark-streaming-flume-assembly` from the
+	[Maven repository](http://search.maven.org/#search|ga|1|a%3A%22spark-streaming-flume-assembly_{{site.SCALA_BINARY_VERSION}}%22%20AND%20v%3A%22{{site.SPARK_VERSION_SHORT}}%22) and add it to `spark-submit` with `--jars`.
 
 ## Approach 2: Pull-based Approach using a Custom Sink
 Instead of Flume pushing data directly to Spark Streaming, this approach runs a custom Flume sink that allows the following.
@@ -157,7 +166,7 @@ configuring Flume agents.
 
 	Note that each input DStream can be configured to receive data from multiple sinks.
 
-3. **Deploying:** Package `spark-streaming-flume_{{site.SCALA_BINARY_VERSION}}` and its dependencies (except `spark-core_{{site.SCALA_BINARY_VERSION}}` and `spark-streaming_{{site.SCALA_BINARY_VERSION}}` which are provided by `spark-submit`) into the application JAR. Then use `spark-submit` to launch your application (see [Deploying section](streaming-programming-guide.html#deploying-applications) in the main programming guide).
+3. **Deploying:** This is same as the first approach.
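
The two Python deployment options that the patch documents (`--packages` with the Maven coordinate, or `--jars` with a downloaded assembly JAR) can be sketched as a small helper that builds the `spark-submit` command line. This is an illustration only, not part of the commit: the function name `flume_submit_args`, the script name `flume_wordcount.py`, and the default versions `2.10` and `1.6.0` are assumptions that stand in for `{{site.SCALA_BINARY_VERSION}}` and `{{site.SPARK_VERSION_SHORT}}`.

```python
# Sketch only: builds the spark-submit argument list for a Python Flume job.
# The version defaults and the script name are placeholder assumptions,
# not values taken from the commit.


def flume_submit_args(app_script, scala_binary="2.10", spark_version="1.6.0",
                      assembly_jar=None):
    """Return the argv that launches app_script with the Flume integration available."""
    args = ["./bin/spark-submit"]
    if assembly_jar is not None:
        # Alternative from the patch: pass a downloaded
        # spark-streaming-flume-assembly JAR with --jars.
        args += ["--jars", assembly_jar]
    else:
        # Default from the patch: let spark-submit resolve the Maven
        # coordinate and its dependencies with --packages.
        coordinate = "org.apache.spark:spark-streaming-flume_{0}:{1}".format(
            scala_binary, spark_version)
        args += ["--packages", coordinate]
    args.append(app_script)
    return args


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(flume_submit_args("flume_wordcount.py")))
```

Passing `assembly_jar="/path/to/assembly.jar"` switches the helper to the `--jars` form. The resulting list can be handed to `subprocess.call` on a machine with a Spark installation.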