author    Dan Osipov <daniil.osipov@shazam.com>  2014-09-16 13:40:16 -0700
committer Patrick Wendell <pwendell@gmail.com>  2014-09-16 13:40:16 -0700
commit    b20171267d610715d5b0a86b474c903e9bc3a1a3 (patch)
tree      cc5b31cb62a4764412a4aa8569fa05a9875b49f3 /docs/ec2-scripts.md
parent    ec1adecbb72d291d7ef122fb0505bae53116e0e6 (diff)
[SPARK-787] Add S3 configuration parameters to the EC2 deploy scripts
When deploying to AWS, additional configuration is required to read S3 files. EMR creates it automatically; there is no reason the Spark EC2 script shouldn't do the same. This PR requires a corresponding PR to mesos/spark-ec2 to be merged first, since that repository is cloned while the machines are set up: https://github.com/mesos/spark-ec2/pull/58

Author: Dan Osipov <daniil.osipov@shazam.com>

Closes #1120 from danosipov/s3_credentials and squashes the following commits:

758da8b [Dan Osipov] Modify documentation to include the new parameter
71fab14 [Dan Osipov] Use a parameter --copy-aws-credentials to enable S3 credential deployment
7e0da26 [Dan Osipov] Get AWS credentials out of boto connection instance
39bdf30 [Dan Osipov] Add S3 configuration parameters to the EC2 deploy scripts
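The "Get AWS credentials out of boto connection instance" step can be illustrated with a minimal sketch. This is not the actual spark_ec2.py code: `boto.ec2.connect_to_region` and the credential properties are standard boto 2 API, but the `deploy_credentials` helper and the returned dictionary are hypothetical.

```python
# Sketch only: shows how credentials could be read from a boto EC2 connection,
# as the commit message describes. Helper name and return shape are hypothetical.
import boto.ec2

def deploy_credentials(region, copy_aws_credentials):
    conn = boto.ec2.connect_to_region(region)
    env = {}
    if copy_aws_credentials:
        # boto exposes the resolved credentials on the connection object
        env["AWS_ACCESS_KEY_ID"] = conn.aws_access_key_id
        env["AWS_SECRET_ACCESS_KEY"] = conn.aws_secret_access_key
    return env
```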
Diffstat (limited to 'docs/ec2-scripts.md')
-rw-r--r--  docs/ec2-scripts.md  2
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/docs/ec2-scripts.md b/docs/ec2-scripts.md
index f5ac6d894e..b2ca6a9b48 100644
--- a/docs/ec2-scripts.md
+++ b/docs/ec2-scripts.md
@@ -156,6 +156,6 @@ If you have a patch or suggestion for one of these limitations, feel free to
# Accessing Data in S3
-Spark's file interface allows it to process data in Amazon S3 using the same URI formats that are supported for Hadoop. You can specify a path in S3 as input through a URI of the form `s3n://<bucket>/path`. You will also need to set your Amazon security credentials, either by setting the environment variables `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` before your program or through `SparkContext.hadoopConfiguration`. Full instructions on S3 access using the Hadoop input libraries can be found on the [Hadoop S3 page](http://wiki.apache.org/hadoop/AmazonS3).
+Spark's file interface allows it to process data in Amazon S3 using the same URI formats that are supported for Hadoop. You can specify a path in S3 as input through a URI of the form `s3n://<bucket>/path`. To provide AWS credentials for S3 access, launch the Spark cluster with the option `--copy-aws-credentials`. Full instructions on S3 access using the Hadoop input libraries can be found on the [Hadoop S3 page](http://wiki.apache.org/hadoop/AmazonS3).
In addition to using a single input file, you can also use a directory of files as input by simply giving the path to the directory.
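As a quick illustration of the documented behavior (not part of the diff above), here is a minimal PySpark sketch that reads both a single S3 object and a directory prefix via the `s3n://` scheme, assuming the cluster was launched with `--copy-aws-credentials` so the Hadoop S3 credentials are already in place. The bucket and path names are placeholders.

```python
# Minimal sketch, not part of the commit: reading S3 data on a cluster launched
# with --copy-aws-credentials. Bucket and path names below are placeholders.
from pyspark import SparkContext

sc = SparkContext(appName="S3Example")

# Single file as input, using the s3n:// URI form described in the docs.
lines = sc.textFile("s3n://my-bucket/path/to/file.txt")
print(lines.count())

# A directory prefix also works: every file under it becomes part of the input.
all_lines = sc.textFile("s3n://my-bucket/path/to/dir/")
print(all_lines.count())
```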