path: root/ec2
Commit log (newest first). Each entry shows the commit message, author, date, and files/lines changed.
* updated ec2 instance types (Brendan Collins, 2015-05-08; 1 file changed, -23/+47)
I needed to run some d2 instances, so I updated spark_ec2.py accordingly.
Author: Brendan Collins <bcollins@blueraster.com>
Closes #6014 from brendancol/ec2-instance-types-update and squashes the following commits: d7b4191 [Brendan Collins] Merge branch 'ec2-instance-types-update' of github.com:brendancol/spark into ec2-instance-types-update 6366c45 [Brendan Collins] added back cc1.4xlarge fc2931f [Brendan Collins] updated ec2 instance types 80c2aa6 [Brendan Collins] vertically aligned whitespace 85c6236 [Brendan Collins] vertically aligned whitespace 1657c26 [Brendan Collins] updated ec2 instance types
* [SPARK-4897] [PySpark] Python 3 support (Davies Liu, 2015-04-16; 1 file changed, -131/+131)
This PR updates PySpark to support Python 3 (tested with 3.4). Known issue: unpickling arrays from Pyrolite is broken in Python 3, so those tests are skipped. TODO: ec2/spark-ec2.py is not fully tested with python3.
Author: Davies Liu <davies@databricks.com>
Author: twneale <twneale@gmail.com>
Author: Josh Rosen <joshrosen@databricks.com>
Closes #5173 from davies/python3 and squashes the following commits: d7d6323 [Davies Liu] fix tests 6c52a98 [Davies Liu] fix mllib test 99e334f [Davies Liu] update timeout b716610 [Davies Liu] Merge branch 'master' of github.com:apache/spark into python3 cafd5ec [Davies Liu] adddress comments from @mengxr bf225d7 [Davies Liu] Merge branch 'master' of github.com:apache/spark into python3 179fc8d [Davies Liu] tuning flaky tests 8c8b957 [Davies Liu] fix ResourceWarning in Python 3 5c57c95 [Davies Liu] Merge branch 'master' of github.com:apache/spark into python3 4006829 [Davies Liu] fix test 2fc0066 [Davies Liu] add python3 path 71535e9 [Davies Liu] fix xrange and divide 5a55ab4 [Davies Liu] Merge branch 'master' of github.com:apache/spark into python3 125f12c [Davies Liu] Merge branch 'master' of github.com:apache/spark into python3 ed498c8 [Davies Liu] fix compatibility with python 3 820e649 [Davies Liu] Merge branch 'master' of github.com:apache/spark into python3 e8ce8c9 [Davies Liu] Merge branch 'master' of github.com:apache/spark into python3 ad7c374 [Davies Liu] fix mllib test and warning ef1fc2f [Davies Liu] fix tests 4eee14a [Davies Liu] Merge branch 'master' of github.com:apache/spark into python3 20112ff [Davies Liu] Merge branch 'master' of github.com:apache/spark into python3 59bb492 [Davies Liu] fix tests 1da268c [Davies Liu] Merge branch 'master' of github.com:apache/spark into python3 ca0fdd3 [Davies Liu] fix code style 9563a15 [Davies Liu] add imap back for python 2 0b1ec04 [Davies Liu] make python examples work with Python 3 d2fd566 [Davies Liu] Merge branch 'master' of github.com:apache/spark into python3 a716d34 [Davies Liu] test with python 3.4 f1700e8 [Davies Liu] fix test in python3 671b1db [Davies Liu] fix test in python3 692ff47 [Davies Liu] fix flaky test 7b9699f [Davies Liu] invalidate import cache for Python 3.3+ 9c58497 [Davies Liu] fix kill worker 309bfbf [Davies Liu] keep compatibility 5707476 [Davies Liu] cleanup, fix hash of string in 3.3+ 8662d5b [Davies Liu] Merge branch 'master' of github.com:apache/spark into python3 f53e1f0 [Davies Liu] fix tests 70b6b73 [Davies Liu] compile ec2/spark_ec2.py in python 3 a39167e [Davies Liu] support customize class in __main__ 814c77b [Davies Liu] run unittests with python 3 7f4476e [Davies Liu] mllib tests passed d737924 [Davies Liu] pass ml tests 375ea17 [Davies Liu] SQL tests pass 6cc42a9 [Davies Liu] rename 431a8de [Davies Liu] streaming tests pass 78901a7 [Davies Liu] fix hash of serializer in Python 3 24b2f2e [Davies Liu] pass all RDD tests 35f48fe [Davies Liu] run future again 1eebac2 [Davies Liu] fix conflict in ec2/spark_ec2.py 6e3c21d [Davies Liu] make cloudpickle work with Python3 2fb2db3 [Josh Rosen] Guard more changes behind sys.version; still doesn't run 1aa5e8f [twneale] Turned out `pickle.DictionaryType is dict` == True, so swapped it out 7354371 [twneale] buffer --> memoryview I'm not super sure if this a valid change, but the 2.7 docs recommend using memoryview over buffer where possible, so hoping it'll work. b69ccdf [twneale] Uses the pure python pickle._Pickler instead of c-extension _pickle.Pickler. It appears pyspark 2.7 uses the pure python pickler as well, so this shouldn't degrade pickling performance (?). f40d925 [twneale] xrange --> range e104215 [twneale] Replaces 2.7 types.InstsanceType with 3.4 `object`....could be horribly wrong depending on how types.InstanceType is used elsewhere in the package--see http://bugs.python.org/issue8206 79de9d0 [twneale] Replaces python2.7 `file` with 3.4 _io.TextIOWrapper 2adb42d [Josh Rosen] Fix up some import differences between Python 2 and 3 854be27 [Josh Rosen] Run `futurize` on Python code: 7c5b4ce [Josh Rosen] Remove Python 3 check in shell.py.
* [SPARK-5242]: Add --private-ips flag to EC2 script (Michelangelo D'Agostino, 2015-04-08; 1 file changed, -17/+47)
The `spark_ec2.py` script currently references the `ip_address` and `public_dns_name` attributes of an instance. On private networks, these fields aren't set, so we have problems. This PR introduces a `--private-ips` flag that instead refers to the `private_ip_address` attribute in both cases.
Author: Michelangelo D'Agostino <mdagostino@civisanalytics.com>
Closes #5244 from mdagost/ec2_private_nets and squashes the following commits: b684c67 [Michelangelo D'Agostino] STY: A few python lint changes. a4a2eac [Michelangelo D'Agostino] ENH: Fix IP's typo and refactor conditional logic into functions. c004604 [Michelangelo D'Agostino] ENH: Add --private-ips flag.
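A minimal sketch of the mechanism (hypothetical helper name, not the patch's actual code): the flag simply selects which boto instance attribute is used as the node's address.

```python
def get_dns_name(instance, private_ips=False):
    """Pick the address to use for an instance, honoring a --private-ips flag."""
    # On a private-network-only VPC the public fields are empty, so callers
    # pass private_ips=True to fall back to the private address.
    dns = instance.private_ip_address if private_ips else instance.public_dns_name
    if not dns:
        raise Exception("Failed to determine address of instance %s" % instance.id)
    return dns
```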
* [SPARK-6636] Use public DNS hostname everywhere in spark_ec2.py (Matt Aasted, 2015-04-06; 1 file changed, -1/+1)
The spark_ec2.py script uses public_dns_name everywhere in the script except for testing ssh availability, which is done using the public ip address of the instances. This breaks the script for users who are deploying the cluster with a private-network-only security group. The fix is to use public_dns_name in the remaining place.
Author: Matt Aasted <aasted@twitch.tv>
Closes #5302 from aasted/master and squashes the following commits: 60cf6ee [Matt Aasted] [SPARK-6636] Use public DNS hostname everywhere in spark_ec2.py
* [EC2] [SPARK-6600] Open ports in ec2/spark_ec2.py to allow HDFS NFS gateway (Florian Verhein, 2015-04-01; 1 file changed, -0/+7)
Authorizes incoming access to the master on the ports required to use the Hadoop HDFS NFS gateway from outside the cluster.
Author: Florian Verhein <florian.verhein@gmail.com>
Closes #5257 from florianverhein/master and squashes the following commits: 72a586a [Florian Verhein] [EC2] [SPARK-6600] initial impl
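A rough sketch of what opening such ports looks like with boto 2; the port numbers here are the standard Hadoop NFS gateway defaults (portmapper 111, NFS 2049, mountd 4242) and are an assumption, not a quote of the patch.

```python
def authorize_nfs_gateway(master_group, cidr="0.0.0.0/0"):
    # Assumed ports for the Hadoop HDFS NFS gateway; adjust to your deployment.
    for port in [111, 2049, 4242]:
        for proto in ["tcp", "udp"]:
            master_group.authorize(ip_protocol=proto, from_port=port,
                                   to_port=port, cidr_ip=cidr)
```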
* [SPARK-6219] [Build] Check that Python code compiles (Nicholas Chammas, 2015-03-19; 1 file changed, -2/+2)
This PR expands the Python lint checks so that they check for obvious compilation errors in our Python code. For example:

```
$ ./dev/lint-python
Python lint checks failed.
Compiling ./ec2/spark_ec2.py ...
  File "./ec2/spark_ec2.py", line 618
    return (master_nodes,, slave_nodes)
                          ^
SyntaxError: invalid syntax

./ec2/spark_ec2.py:618:25: E231 missing whitespace after ','
./ec2/spark_ec2.py:1117:101: E501 line too long (102 > 100 characters)
```

This PR also bumps up the version of `pep8`. It ignores new types of checks introduced by that version bump while fixing problems missed by the older version of `pep8` we were using.
Author: Nicholas Chammas <nicholas.chammas@gmail.com>
Closes #4941 from nchammas/compile-spark-ec2 and squashes the following commits: 75e31d8 [Nicholas Chammas] upgrade pep8 + check compile b33651c [Nicholas Chammas] PEP8 line length
* [SPARK-6402] [DOC] Remove some references to Shark in docs and ec2 (Pierre Borckmans, 2015-03-19; 1 file changed, -1/+0)
The EC2 script and the job scheduling documentation still referred to Shark. I removed these references. I also removed a remaining `SHARK_VERSION` variable from `ec2-variables.sh`.
Author: Pierre Borckmans <pierre.borckmans@realimpactanalytics.com>
Closes #5083 from pierre-borckmans/remove_refererences_to_shark_in_docs and squashes the following commits: 4e90ffc [Pierre Borckmans] Removed deprecated SHARK_VERSION caea407 [Pierre Borckmans] Remove shark reference from ec2 script doc 196c744 [Pierre Borckmans] Removed references to Shark
* [SPARK-6186] [EC2] Make Tachyon version configurable in EC2 deployment script (cheng chang, 2015-03-10; 2 files changed, -1/+21)
This PR comes from the Tachyon community to solve the issue https://tachyon.atlassian.net/browse/TACHYON-11. An accompanying PR is in mesos/spark-ec2: https://github.com/mesos/spark-ec2/pull/101
Author: cheng chang <myairia@gmail.com>
Closes #4901 from uronce-cc/master and squashes the following commits: 313aa36 [cheng chang] minor re-wording fd2a48e [cheng chang] Remove Tachyon when deploying through git hash 1d53c5c [cheng chang] add default value to --tachyon-version 6f8887e [cheng chang] make tachyon version configurable
* [SPARK-6191] [EC2] Generalize ability to download libs (Nicholas Chammas, 2015-03-10; 1 file changed, -28/+54)
Right now we have a method to specifically download boto. This PR generalizes it so it's easy to download additional libraries if we want. For example, adding new external libraries for spark-ec2 is now as simple as:

```python
external_libs = [
    {
        "name": "boto",
        "version": "2.34.0",
        "md5": "5556223d2d0cc4d06dd4829e671dcecd"
    },
    {
        "name": "PyYAML",
        "version": "3.11",
        "md5": "f50e08ef0fe55178479d3a618efe21db"
    },
    {
        "name": "argparse",
        "version": "1.3.0",
        "md5": "9bcf7f612190885c8c85e30ba41db3c7"
    }
]
```

Likely use cases:
* Downloading PyYAML to allow spark-ec2 configs to be persisted as a YAML file. ([SPARK-925](https://issues.apache.org/jira/browse/SPARK-925))
* Downloading argparse to clean up / modernize our option parsing.

First run output, with PyYAML and argparse added just for demonstration purposes:

```shell
$ ./spark-ec2 --version
Downloading external libraries that spark-ec2 needs from PyPI to /path/to/spark/ec2/lib...
This should be a one-time operation.
 - Downloading boto...
 - Finished downloading boto.
 - Downloading PyYAML...
 - Finished downloading PyYAML.
 - Downloading argparse...
 - Finished downloading argparse.
spark-ec2 1.2.1
```

Output thereafter:

```shell
$ ./spark-ec2 --version
spark-ec2 1.2.1
```

Author: Nicholas Chammas <nicholas.chammas@gmail.com>
Closes #4919 from nchammas/setup-ec2-libs and squashes the following commits: a077955 [Nicholas Chammas] print default region c95fb7d [Nicholas Chammas] to docstring 5448845 [Nicholas Chammas] remove libs added for demo purposes 60d8c23 [Nicholas Chammas] generalize ability to download libs
* [EC2] [SPARK-6188] Instance types can be mislabeled when re-starting cluster with default arguments (Theodore Vasiloudis, 2015-03-09; 1 file changed, -0/+11)
As described in https://issues.apache.org/jira/browse/SPARK-6188 and discovered in https://issues.apache.org/jira/browse/SPARK-5838. When re-starting a cluster, if the user does not provide the instance types, which is the recommended behavior in the docs currently, the instances will be assigned the default type m1.large. This then affects the setup of the machines. This solves this by getting the instance types from the existing instances, and overwriting the default options.
EDIT: Further clarification of the issue: In short, while the instances themselves are the same as launched, their setup is done assuming the default instance type, m1.large. This means that the machines are assumed to have 2 disks, and that leads to the problems described in issue [5838](https://issues.apache.org/jira/browse/SPARK-5838), where machines that have one disk end up having shuffle spills in the small (8GB) snapshot partitions that quickly fill up and result in failing jobs due to "No space left on device" errors. Other instance-specific settings that are set in the spark_ec2.py script are likely to be wrong as well.
Author: Theodore Vasiloudis <thvasilo@users.noreply.github.com>
Author: Theodore Vasiloudis <tvas@sics.se>
Closes #4916 from thvasilo/SPARK-6188]-Instance-types-can-be-mislabeled-when-re-starting-cluster-with-default-arguments and squashes the following commits: 6705b98 [Theodore Vasiloudis] Added comment to clarify setting master instance type to the empty string. a3d29fe [Theodore Vasiloudis] More trailing whitespace 7b32429 [Theodore Vasiloudis] Removed trailing whitespace 3ebd52a [Theodore Vasiloudis] Make sure that the instance type is correct when relaunching a cluster.
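A rough sketch of the approach (function and option names assumed, not the patch's exact code): when relaunching, read the instance type back from the running instances instead of trusting the command-line defaults.

```python
def infer_instance_types(opts, master_nodes, slave_nodes):
    # Overwrite the default --instance-type / --master-instance-type options
    # with what the existing instances actually are, so per-type setup
    # (disk layout, memory settings) matches the real hardware.
    if slave_nodes:
        opts.instance_type = slave_nodes[0].instance_type
    if master_nodes:
        opts.master_instance_type = master_nodes[0].instance_type
```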
* [SPARK-6193] [EC2] Push group filter up to EC2 (Nicholas Chammas, 2015-03-08; 1 file changed, -37/+41)
When looking for a cluster, spark-ec2 currently pulls down [info for all instances](https://github.com/apache/spark/blob/eb48fd6e9d55fb034c00e61374bb9c2a86a82fb8/ec2/spark_ec2.py#L620) and filters locally. When working on an AWS account with hundreds of active instances, this step alone can take over 10 seconds. This PR improves how spark-ec2 searches for clusters by pushing the filter up to EC2. Basically, the problem (and solution) look like this:

```python
>>> timeit.timeit('blah = conn.get_all_reservations()', setup='from __main__ import conn', number=10)
116.96390509605408
>>> timeit.timeit('blah = conn.get_all_reservations(filters={"instance.group-name": ["my-cluster-master"]})', setup='from __main__ import conn', number=10)
4.629754066467285
```

Translated to a user-visible action, this looks like (against an AWS account with ~200 active instances):

```shell
# master
$ python -m timeit -n 3 --setup 'import subprocess' 'subprocess.call("./spark-ec2 get-master my-cluster --region us-west-2", shell=True)'
...
3 loops, best of 3: 9.83 sec per loop

# this PR
$ python -m timeit -n 3 --setup 'import subprocess' 'subprocess.call("./spark-ec2 get-master my-cluster --region us-west-2", shell=True)'
...
3 loops, best of 3: 1.47 sec per loop
```

This PR also refactors `get_existing_cluster()` to make it, I hope, simpler. Finally, this PR fixes some minor grammar issues related to printing status to the user. :tophat: :clap:
Author: Nicholas Chammas <nicholas.chammas@gmail.com>
Closes #4922 from nchammas/get-existing-cluster-faster and squashes the following commits: 18802f1 [Nicholas Chammas] ignore shutting-down f2a5b9f [Nicholas Chammas] fix grammar d96a489 [Nicholas Chammas] push group filter up to EC2
* [SPARK-5641] [EC2] Allow spark_ec2.py to copy arbitrary files to cluster (Florian Verhein, 2015-03-07; 1 file changed, -0/+42)
Give users an easy way to rcp a directory structure to the master's / as part of the cluster launch, at a useful point in the workflow (before setup.sh is called on the master). This is an alternative approach to meeting requirements discussed in https://github.com/apache/spark/pull/4487
Author: Florian Verhein <florian.verhein@gmail.com>
Closes #4583 from florianverhein/master and squashes the following commits: 49dee88 [Florian Verhein] removed addition of trailing / in rsync to give user this option, added documentation in help 7b8e3d8 [Florian Verhein] remove unused args 87d922c [Florian Verhein] [SPARK-5641] [EC2] implement --deploy-root-dir
* [EC2] Reorder print statements on termination (Nicholas Chammas, 2015-03-07; 1 file changed, -6/+8)
The PR reorders some print statements slightly on cluster termination so that they read better. For example, from this:

```
Are you sure you want to destroy the cluster spark-cluster-test?
The following instances will be terminated:
Searching for existing cluster spark-cluster-test in region us-west-2...
Found 1 master(s), 2 slaves
> ...
ALL DATA ON ALL NODES WILL BE LOST!!
Destroy cluster spark-cluster-test (y/N):
```

To this:

```
Searching for existing cluster spark-cluster-test in region us-west-2...
Found 1 master(s), 2 slaves
The following instances will be terminated:
> ...
ALL DATA ON ALL NODES WILL BE LOST!!
Are you sure you want to destroy the cluster spark-cluster-test? (y/N)
```

Author: Nicholas Chammas <nicholas.chammas@gmail.com>
Closes #4932 from nchammas/termination-print-order and squashes the following commits: c23711d [Nicholas Chammas] reorder prints on termination
* [SPARK-5335] Fix deletion of security groups within a VPC (Vladimir Grigor, 2015-02-12; 1 file changed, -3/+4)
Please see https://issues.apache.org/jira/browse/SPARK-5335. The fix itself is in commit e58a8b01a8bedcbfbbc6d04b1c1489255865cf87. The two earlier commits are fixes for another VPC-related bug that is still waiting to be merged; I should have created that fix in its own branch, then this fix would not carry those commits. :( This code is released under the project's license.
Author: Vladimir Grigor <vladimir@kiosked.com>
Author: Vladimir Grigor <vladimir@voukka.com>
Closes #4122 from voukka/SPARK-5335_delete_sg_vpc and squashes the following commits: 090dca9 [Vladimir Grigor] fixes as per review: removed printing of group_id and added comment 730ec05 [Vladimir Grigor] fix for SPARK-5335: Destroying cluster in VPC with "--delete-groups" fails to remove security groups
* [EC2] Update default Spark version to 1.2.1 (Katsunori Kanda, 2015-02-12; 1 file changed, -1/+2)
Author: Katsunori Kanda <potix2@gmail.com>
Closes #4566 from potix2/ec2-update-version-1-2-1 and squashes the following commits: 77e7840 [Katsunori Kanda] [EC2] Update default Spark version to 1.2.1
* [SPARK-5668] Display region in spark_ec2.py get_existing_cluster() (Miguel Peralvo, 2015-02-10; 1 file changed, -4/+7)
Show the region for the different messages displayed by get_existing_cluster(): the search, found and error messages.
Author: Miguel Peralvo <miguel.peralvo@gmail.com>
Closes #4457 from MiguelPeralvo/patch-2 and squashes the following commits: a5514c8 [Miguel Peralvo] Update spark_ec2.py 0a837b0 [Miguel Peralvo] Update spark_ec2.py 3923f36 [Miguel Peralvo] Update spark_ec2.py 4ecd9f9 [Miguel Peralvo] [SPARK-5668] Display region in spark_ec2.py get_existing_cluster()
* [SPARK-1805] [EC2] Validate instance types (Nicholas Chammas, 2015-02-10; 1 file changed, -51/+81)
Addresses [SPARK-1805](https://issues.apache.org/jira/browse/SPARK-1805), though doesn't resolve it completely. Error out quickly if the user asks for the master and slaves to have different AMI virtualization types, since we don't currently support that. In addition to that, we print warnings if the inputted instance types are not recognized, though I would prefer if we errored out. Elsewhere in the script it seems [we allow unrecognized instance types](https://github.com/apache/spark/blob/5de14cc2763a8211f77eeb55940dec025822eb78/ec2/spark_ec2.py#L331), though I think we should remove that. It's messy, but it should serve us until we enhance spark-ec2 to support clusters with mixed virtualization types.
Author: Nicholas Chammas <nicholas.chammas@gmail.com>
Closes #4455 from nchammas/ec2-master-slave-different-virtualization and squashes the following commits: ce28609 [Nicholas Chammas] fix style b0adba0 [Nicholas Chammas] validate input instance types
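An illustrative sketch of that validation; the mapping below is an abbreviated assumption (spark_ec2.py keeps a much larger table) and the helper name is hypothetical.

```python
from __future__ import print_function
import sys

# Assumed, abbreviated instance-type -> AMI virtualization map.
EC2_INSTANCE_VIRT = {
    "m1.large": "pvm",
    "m3.medium": "hvm",
    "d2.xlarge": "hvm",
}

def check_virtualization(master_type, slave_type):
    master_virt = EC2_INSTANCE_VIRT.get(master_type)
    slave_virt = EC2_INSTANCE_VIRT.get(slave_type)
    if master_virt is None or slave_virt is None:
        print("Warning: unrecognized instance type; cannot validate virtualization.")
    elif master_virt != slave_virt:
        print("Error: master and slave instance types use different AMI "
              "virtualization types; this is not supported.", file=sys.stderr)
        sys.exit(1)
```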
* [SPARK-5611] [EC2] Allow spark-ec2 repo and branch to be set on CLI of spark_ec2.py and, by extension, the ami-list (Florian Verhein, 2015-02-09; 1 file changed, -5/+32)
Useful for using alternate spark-ec2 repos or branches.
Author: Florian Verhein <florian.verhein@gmail.com>
Closes #4385 from florianverhein/master and squashes the following commits: 7e2b4be [Florian Verhein] [SPARK-5611] [EC2] typo 8b653dc [Florian Verhein] [SPARK-5611] [EC2] Enforce only supporting spark-ec2 forks from github, log improvement bc4b0ed [Florian Verhein] [SPARK-5611] allow spark-ec2 repos with different names 8b5c551 [Florian Verhein] improve option naming, fix logging, fix lint failing, add guard to enforce spark-ec2 7724308 [Florian Verhein] [SPARK-5611] [EC2] fixes b42b68c [Florian Verhein] [SPARK-5611] [EC2] Allow spark-ec2 repo and branch to be set on CLI of spark_ec2.py
* [SPARK-5473] [EC2] Expose SSH failures after status checks pass (Nicholas Chammas, 2015-02-09; 1 file changed, -12/+24)
If there is some fatal problem with launching a cluster, `spark-ec2` just hangs without giving the user useful feedback on what the problem is. This PR exposes the output of the SSH calls to the user if the SSH test fails during cluster launch for any reason but the instance status checks are all green. It also removes the growing trail of dots while waiting in favor of a fixed 3 dots. For example:

```
$ ./ec2/spark-ec2 -k key -i /incorrect/path/identity.pem --instance-type m3.medium --slaves 1 --zone us-east-1c launch "spark-test"
Setting up security groups...
Searching for existing cluster spark-test...
Spark AMI: ami-35b1885c
Launching instances...
Launched 1 slaves in us-east-1c, regid = r-7dadd096
Launched master in us-east-1c, regid = r-fcadd017
Waiting for cluster to enter 'ssh-ready' state...

Warning: SSH connection error. (This could be temporary.)
Host: 127.0.0.1
SSH return code: 255
SSH output: Warning: Identity file /incorrect/path/identity.pem not accessible: No such file or directory.
Warning: Permanently added '127.0.0.1' (RSA) to the list of known hosts.
Permission denied (publickey).
```

This should give users enough information when some unrecoverable error occurs during launch so they can know to abort the launch. This will help avoid situations like the ones reported [here on Stack Overflow](http://stackoverflow.com/q/28002443/) and [here on the user list](http://mail-archives.apache.org/mod_mbox/spark-user/201501.mbox/%3C1422323829398-21381.postn3.nabble.com%3E), where the users couldn't tell what the problem was because it was being hidden by `spark-ec2`. This is a usability improvement that should be backported to 1.2. Resolves [SPARK-5473](https://issues.apache.org/jira/browse/SPARK-5473).
Author: Nicholas Chammas <nicholas.chammas@gmail.com>
Closes #4262 from nchammas/expose-ssh-failure and squashes the following commits: 8bda6ed [Nicholas Chammas] default to print SSH output 2b92534 [Nicholas Chammas] show SSH output after status check pass
* [SPARK-5366] [EC2] Check the mode of private key (liuchang0812, 2015-02-08; 1 file changed, -0/+15)
Check the mode of the private key file before using it.
Author: liuchang0812 <liuchang0812@gmail.com>
Closes #4162 from Liuchang0812/ec2-script and squashes the following commits: fc37355 [liuchang0812] quota file name 01ed464 [liuchang0812] more output ce2a207 [liuchang0812] pep8 f44efd2 [liuchang0812] move code to real_main 8475a54 [liuchang0812] fix bug cd61a1a [liuchang0812] import stat c106cb2 [liuchang0812] fix trivis bug 89c9953 [liuchang0812] more output about checking private key 1177a90 [liuchang0812] remove commet 41188ab [liuchang0812] check the mode of private key
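A minimal sketch of such a check (function name assumed): refuse to continue if the identity file is readable by group or others, since ssh itself will reject it.

```python
import os
import stat
import sys

def check_private_key_mode(path):
    # ssh refuses keys that are accessible to group/others, so fail fast here
    # with a clear message instead of letting the launch hang later.
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        print("Error: private key file %s is too open (mode %o); "
              "run 'chmod 600 %s' first." % (path, mode, path))
        sys.exit(1)
```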
* SPARK-5403: Ignore UserKnownHostsFile in SSH calls (Grzegorz Dubicki, 2015-02-06; 1 file changed, -0/+1)
See https://issues.apache.org/jira/browse/SPARK-5403
Author: Grzegorz Dubicki <grzegorz.dubicki@gmail.com>
Closes #4196 from grzegorz-dubicki/SPARK-5403 and squashes the following commits: a7d863f [Grzegorz Dubicki] Resolve start command hanging issue
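The usual way to get this behavior (shown here as an assumption about the script's ssh argument builder, not a quote of the patch) is to pass `-o UserKnownHostsFile=/dev/null` alongside `-o StrictHostKeyChecking=no`, so stale host keys from earlier clusters at the same address cannot make ssh prompt or fail.

```python
def ssh_args(identity_file=None):
    # Illustrative: never record or consult known_hosts entries for cluster nodes.
    parts = ["-o", "StrictHostKeyChecking=no",
             "-o", "UserKnownHostsFile=/dev/null"]
    if identity_file is not None:
        parts += ["-i", identity_file]
    return parts
```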
* [SPARK-4983] Insert waiting time before tagging EC2 instances (GenTang, 2015-02-06; 1 file changed, -0/+3)
The boto API doesn't support tagging EC2 instances in the same call that launches them. We add a five-second wait so EC2 has enough time to propagate the information and the tagging can succeed.
Author: GenTang <gen.tang86@gmail.com>
Author: Gen TANG <gen.tang86@gmail.com>
Closes #3986 from GenTang/spark-4983 and squashes the following commits: 13e257d [Gen TANG] modification of comments 47f06755 [GenTang] print the information ab7a931 [GenTang] solve the issus spark-4983 by inserting waiting time 3179737 [GenTang] Revert "handling exceptions about adding tags to ec2" 6a8b53b [GenTang] Revert "the improvement of exception handling" 13e97a6 [GenTang] Revert "typo" 63fd360 [GenTang] typo 692fc2b [GenTang] the improvement of exception handling 6adcf6d [GenTang] handling exceptions about adding tags to ec2
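A minimal sketch of the idea (helper and tag names assumed): pause briefly after run_instances() returns so the new instance IDs are visible to the tagging API, then tag each node.

```python
import time

def tag_instances(instances, cluster_name, role):
    # Newly launched instance IDs are not immediately visible to the tagging
    # API, so wait a few seconds before calling add_tag().
    time.sleep(5)
    for n, instance in enumerate(instances):
        instance.add_tag(key="Name",
                         value="{c}-{r}-{n:02d}".format(c=cluster_name, r=role, n=n))
```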
* [SPARK-5628] Add version option to spark-ec2 (Nicholas Chammas, 2015-02-06; 1 file changed, -8/+8)
Every proper command line tool should include a `--version` option or something similar. This PR adds this to `spark-ec2` using the standard functionality provided by `optparse`. One thing we don't do here is follow the Python convention of setting `__version__`, since it seems awkward given how `spark-ec2` is laid out.
Author: Nicholas Chammas <nicholas.chammas@gmail.com>
Closes #4414 from nchammas/spark-ec2-show-version and squashes the following commits: 914cab5 [Nicholas Chammas] add version info
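For reference, optparse generates the `--version` flag automatically once a version string is supplied; a minimal sketch (the version string and usage text below are examples, not the script's exact values):

```python
from optparse import OptionParser

parser = OptionParser(
    prog="spark-ec2",
    version="%prog 1.2.1",  # example value; optparse adds --version for us
    usage="%prog [options] <action> <cluster_name>")
```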
* [SPARK-5434] [EC2] Preserve spaces in EC2 path (Nicholas Chammas, 2015-01-28; 1 file changed, -1/+1)
Fixes [SPARK-5434](https://issues.apache.org/jira/browse/SPARK-5434). Simple demonstration of the problem and the fix:

```
$ spacey_path="/path/with some/spaces"
$ dirname $spacey_path
usage: dirname path
$ echo $?
1
$ dirname "$spacey_path"
/path/with some
$ echo $?
0
```

Author: Nicholas Chammas <nicholas.chammas@gmail.com>
Closes #4224 from nchammas/patch-1 and squashes the following commits: 960711a [Nicholas Chammas] [EC2] Preserve spaces in EC2 path
* [SPARK-5122] Remove Shark from spark-ec2 (Nicholas Chammas, 2015-01-08; 1 file changed, -34/+44)
I moved the Spark-Shark version map [to the wiki](https://cwiki.apache.org/confluence/display/SPARK/Spark-Shark+version+mapping). This PR has a [matching PR in mesos/spark-ec2](https://github.com/mesos/spark-ec2/pull/89).
Author: Nicholas Chammas <nicholas.chammas@gmail.com>
Closes #3939 from nchammas/remove-shark and squashes the following commits: 66e0841 [Nicholas Chammas] fix style ceeab85 [Nicholas Chammas] show default Spark GitHub repo 7270126 [Nicholas Chammas] validate Spark hashes db4935d [Nicholas Chammas] validate spark version upfront fc0d5b9 [Nicholas Chammas] remove Shark
* [EC2] Update mesos/spark-ec2 branch to branch-1.3 (Nicholas Chammas, 2014-12-25; 1 file changed, -1/+1)
Going forward, we'll use matching branch names across the mesos/spark-ec2 and apache/spark repositories, per [the discussion here](https://github.com/mesos/spark-ec2/pull/85#issuecomment-68069589).
Author: Nicholas Chammas <nicholas.chammas@gmail.com>
Closes #3804 from nchammas/patch-2 and squashes the following commits: cd2c0d4 [Nicholas Chammas] [EC2] Update mesos/spark-ec2 branch to branch-1.3
* [EC2] Update default Spark version to 1.2.0 (Nicholas Chammas, 2014-12-25; 1 file changed, -1/+4)
Now that 1.2.0 is out, let's update the default Spark version.
Author: Nicholas Chammas <nicholas.chammas@gmail.com>
Closes #3793 from nchammas/patch-1 and squashes the following commits: 3255832 [Nicholas Chammas] add 1.2.0 version to Spark-Shark map ec0e904 [Nicholas Chammas] [EC2] Update default Spark version to 1.2.0
* [SPARK-4890] Upgrade Boto to 2.34.0; automatically download Boto from PyPI instead of packaging it (Josh Rosen, 2014-12-19; 3 files changed, -13/+38)
This patch upgrades `spark-ec2`'s Boto version to 2.34.0, since this is blocking several features. Newer versions of Boto don't work properly when they're loaded from a zipfile since they try to read a JSON file from a path relative to the Boto library sources. Therefore, this patch also changes spark-ec2 to automatically download Boto from PyPI if it's not present in `SPARK_EC2_DIR/lib`, similar to what we do in the `sbt/sbt` script. This shouldn't be an issue for users since they already need to have an internet connection to launch an EC2 cluster. By performing the downloading in spark_ec2.py instead of the Bash script, this should also work for Windows users. I've tested this with Python 2.6, too.
Author: Josh Rosen <joshrosen@databricks.com>
Closes #3737 from JoshRosen/update-boto and squashes the following commits: 0aa43cc [Josh Rosen] Remove unused setup_standalone_cluster() method. f02935d [Josh Rosen] Enable Python deprecation warnings and fix one Boto warning: 587ae89 [Josh Rosen] [SPARK-4890] Upgrade Boto to 2.34.0; automatically download Boto from PyPi instead of packaging it
* SPARK-4767: Add support for launching in a specified placement group to spark_ec2 (Holden Karau, 2014-12-16; 1 file changed, -0/+8)
Placement groups are cool and all the cool kids are using them. Let's add support for them to spark_ec2.py because I'm lazy.
Author: Holden Karau <holden@pigscanfly.ca>
Closes #3623 from holdenk/SPARK-4767-add-support-for-launching-in-a-specified-placement-group-to-spark-ec2-scripts and squashes the following commits: 111a5fd [Holden Karau] merge in master 70ace25 [Holden Karau] Placement groups are cool and all the cool kids are using them. Lets add support for them to spark_ec2.py because I'm lazy
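A minimal sketch of how this threads through to boto, which already accepts a placement group when launching instances (option and helper names assumed):

```python
def launch_slaves(conn, opts, ami):
    # boto's run_instances() takes a placement_group argument directly, so the
    # new command-line option just needs to be passed along.
    return conn.run_instances(
        ami,
        min_count=opts.slaves,
        max_count=opts.slaves,
        key_name=opts.key_pair,
        instance_type=opts.instance_type,
        placement=opts.zone,
        placement_group=opts.placement_group)
```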
* [SPARK-3405] add subnet-id and vpc-id options to spark_ec2.py (Mike Jennings, 2014-12-16; 1 file changed, -15/+51)
Based on this gist: https://gist.github.com/amar-analytx/0b62543621e1f246c0a2 We use security group ids instead of security group names to get around this issue: https://github.com/boto/boto/issues/350
Author: Mike Jennings <mvj101@gmail.com>
Author: Mike Jennings <mvj@google.com>
Closes #2872 from mvj101/SPARK-3405 and squashes the following commits: be9cb43 [Mike Jennings] `pep8 spark_ec2.py` runs cleanly. 4dc6756 [Mike Jennings] Remove duplicate comment 731d94c [Mike Jennings] Update for code review. ad90a36 [Mike Jennings] Merge branch 'master' of https://github.com/apache/spark into SPARK-3405 1ebffa1 [Mike Jennings] Merge branch 'master' into SPARK-3405 52aaeec [Mike Jennings] [SPARK-3405] add subnet-id and vpc-id options to spark_ec2.py
* [SPARK-4745] Fix get_existing_cluster() function with multiple security groups (alexdebrie, 2014-12-04; 1 file changed, -2/+2)
The current get_existing_cluster() function would only find an instance belonging to a cluster if the instance's security groups == cluster_name + "-master" (or "-slaves"). This fix allows for multiple security groups by checking whether the cluster_name + "-master" security group is in the list of groups for a particular instance.
Author: alexdebrie <alexdebrie1@gmail.com>
Closes #3596 from alexdebrie/master and squashes the following commits: 9d51232 [alexdebrie] Fix get_existing_cluster() function with multiple security groups
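A small sketch of the membership test described above (helper name assumed): an instance belongs to the cluster if the role group is among its groups, not necessarily its only group.

```python
def is_cluster_member(instance, cluster_name, role):
    # role is "master" or "slaves"; instance.groups comes from boto.
    group_names = [g.name for g in instance.groups]
    return (cluster_name + "-" + role) in group_names
```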
* [SPARK-3398] [SPARK-4325] [EC2] Use EC2 status checks (Nicholas Chammas, 2014-11-29; 1 file changed, -12/+36)
This PR re-introduces [0e648bc](https://github.com/apache/spark/commit/0e648bc2bedcbeb55fce5efac04f6dbad9f063b4) from PR #2339, which somehow never made it into the codebase. Additionally, it removes a now-unnecessary linear backoff on the SSH checks since we are blocking on EC2 status checks before testing SSH.
Author: Nicholas Chammas <nicholas.chammas@gmail.com>
Closes #3195 from nchammas/remove-ec2-ssh-backoff and squashes the following commits: efb29e1 [Nicholas Chammas] Revert "Remove linear backoff." ef3ca99 [Nicholas Chammas] reuse conn adb4eaa [Nicholas Chammas] Remove linear backoff. 55caa24 [Nicholas Chammas] Check EC2 status checks before SSH.
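A rough sketch of blocking on EC2 status checks with boto 2 (poll interval and helper name are assumptions): only start SSH probing once both the system and instance status checks report 'ok'.

```python
import time

def wait_for_status_checks(conn, instance_ids, poll_secs=15):
    while True:
        statuses = conn.get_all_instance_status(instance_ids=instance_ids)
        if len(statuses) == len(instance_ids) and all(
                s.system_status.status == 'ok' and s.instance_status.status == 'ok'
                for s in statuses):
            return
        time.sleep(poll_secs)
```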
* SPARK-1450 [EC2] Specify the default zone in the EC2 script help (Sean Owen, 2014-11-28; 1 file changed, -1/+1)
This looks like a one-liner, so I took a shot at it. There can be no fixed default availability zone since the names are different per region. But the default behavior can be documented:

```
if opts.zone == "":
    opts.zone = random.choice(conn.get_all_zones()).name
```

Author: Sean Owen <sowen@cloudera.com>
Closes #3454 from srowen/SPARK-1450 and squashes the following commits: 9193cf3 [Sean Owen] Document that --zone defaults to a single random zone
* [SPARK-4509] Revert EC2 tag-based cluster membership patch (Xiangrui Meng, 2014-11-25; 1 file changed, -61/+22)
This PR reverts changes related to tag-based cluster membership. As discussed in SPARK-3332, we didn't figure out a safe strategy to use tags to determine cluster membership, because tagging is not atomic. The following changes are reverted: SPARK-2333: 94053a7b766788bb62e2dbbf352ccbcc75f71fc0 SPARK-3213: 7faf755ae4f0cf510048e432340260a6e609066d SPARK-3608: 78d4220fa0bf2f9ee663e34bbf3544a5313b02f0. I tested launch, login, and destroy. It is easy to check the diff by comparing it to Josh's patch for branch-1.1: https://github.com/apache/spark/pull/2225/files JoshRosen I sent the PR to master. It might be easier for us to keep master and branch-1.2 the same at this time. We can always re-apply the patch once we figure out a stable solution.
Author: Xiangrui Meng <meng@databricks.com>
Closes #3453 from mengxr/SPARK-4509 and squashes the following commits: f0b708b [Xiangrui Meng] revert 94053a7b766788bb62e2dbbf352ccbcc75f71fc0 4298ea5 [Xiangrui Meng] revert 7faf755ae4f0cf510048e432340260a6e609066d 35963a1 [Xiangrui Meng] Revert "SPARK-3608 Break if the instance tag naming succeeds"
* [SPARK-4137] [EC2] Don't change working dir on user (Nicholas Chammas, 2014-11-05; 2 files changed, -3/+17)
This issue was uncovered after [this discussion](https://issues.apache.org/jira/browse/SPARK-3398?focusedCommentId=14187471&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14187471). Don't change the working directory on the user. This breaks relative paths the user may pass in, e.g., for the SSH identity file.

```
./ec2/spark-ec2 -i ../my.pem
```

This patch will preserve the user's current working directory and allow calls like the one above to work.
Author: Nicholas Chammas <nicholas.chammas@gmail.com>
Closes #2988 from nchammas/spark-ec2-cwd and squashes the following commits: f3850b5 [Nicholas Chammas] pep8 fix fbc20c7 [Nicholas Chammas] revert to old commenting style 752f958 [Nicholas Chammas] specify deploy.generic path absolutely bcdf6a5 [Nicholas Chammas] fix typo 77871a2 [Nicholas Chammas] add clarifying comment ce071fc [Nicholas Chammas] don't change working dir
* [EC2] Factor out Mesos spark-ec2 branch (Nicholas Chammas, 2014-11-03; 1 file changed, -2/+9)
We reference a specific branch in two places. This patch makes it one place.
Author: Nicholas Chammas <nicholas.chammas@gmail.com>
Closes #3008 from nchammas/mesos-spark-ec2-branch and squashes the following commits: 10a6089 [Nicholas Chammas] factor out mess spark-ec2 branch
* Fetch from branch v4 in Spark EC2 script. (Josh Rosen, 2014-10-08; 1 file changed, -1/+1)
* [SPARK-3398] [EC2] Have spark-ec2 intelligently wait for specific cluster states (Nicholas Chammas, 2014-10-07; 1 file changed, -25/+86)
Instead of waiting arbitrary amounts of time for the cluster to reach a specific state, this patch lets `spark-ec2` explicitly wait for a cluster to reach a desired state. This is useful in a couple of situations:
* The cluster is launching and you want to wait until SSH is available before installing stuff.
* The cluster is being terminated and you want to wait until all the instances are terminated before trying to delete security groups.
This patch removes the need for the `--wait` option and removes some of the time-based retry logic that was being used.
Author: Nicholas Chammas <nicholas.chammas@gmail.com>
Closes #2339 from nchammas/spark-ec2-wait-properly and squashes the following commits: 43a69f0 [Nicholas Chammas] short-circuit SSH check; linear backoff 9a9e035 [Nicholas Chammas] remove extraneous comment 26c5ed0 [Nicholas Chammas] replace print with write() bb67c06 [Nicholas Chammas] deprecate wait option; remove dead code 7969265 [Nicholas Chammas] fix long line (PEP 8) 126e4cf [Nicholas Chammas] wait for specific cluster states
* [EC2] Sort long, manually-inputted dictionaries (Nicholas Chammas, 2014-09-29; 1 file changed, -31/+38)
Similar to the work done in #2571, this PR just sorts the remaining manually-inputted dicts in the EC2 script so they are easier to maintain.
Author: Nicholas Chammas <nicholas.chammas@gmail.com>
Closes #2578 from nchammas/ec2-dict-sort and squashes the following commits: f55c692 [Nicholas Chammas] sort long dictionaries
* [EC2] Cleanup Python parens and disk dict (Nicholas Chammas, 2014-09-28; 1 file changed, -30/+30)
Minor fixes:
* Remove unnecessary parens (Python style)
* Sort `disks_by_instance` dict and remove duplicate `t1.micro` key
Author: Nicholas Chammas <nicholas.chammas@gmail.com>
Closes #2571 from nchammas/ec2-polish and squashes the following commits: 9d203d5 [Nicholas Chammas] paren and dict cleanup
* [SPARK-3659] Set EC2 version to 1.1.0 and update version map (Shivaram Venkataraman, 2014-09-24; 1 file changed, -2/+2)
This brings the master branch in sync with branch-1.1.
Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu>
Closes #2510 from shivaram/spark-ec2-version and squashes the following commits: bb0dd16 [Shivaram Venkataraman] Set EC2 version to 1.1.0 and update version map
* SPARK-3608 Break if the instance tag naming succeeds (Vida Ha, 2014-09-20; 1 file changed, -0/+1)
Author: Vida Ha <vida@databricks.com>
Closes #2466 from vidaha/vida/spark-3608 and squashes the following commits: 9509776 [Vida Ha] Break if the instance tag naming succeeds
* [SPARK-787] Add S3 configuration parameters to the EC2 deploy scripts (Dan Osipov, 2014-09-16; 2 files changed, -0/+12)
When deploying to AWS, there is additional configuration that is required to read S3 files. EMR creates it automatically; there is no reason that the Spark EC2 script shouldn't. This PR requires a corresponding PR to mesos/spark-ec2 to be merged, as it gets cloned in the process of setting up machines: https://github.com/mesos/spark-ec2/pull/58
Author: Dan Osipov <daniil.osipov@shazam.com>
Closes #1120 from danosipov/s3_credentials and squashes the following commits: 758da8b [Dan Osipov] Modify documentation to include the new parameter 71fab14 [Dan Osipov] Use a parameter --copy-aws-credentials to enable S3 credential deployment 7e0da26 [Dan Osipov] Get AWS credentials out of boto connection instance 39bdf30 [Dan Osipov] Add S3 configuration parameters to the EC2 deploy scripts
* [SPARK-3540] Add reboot-slaves functionality to the ec2 script (Reynold Xin, 2014-09-15; 1 file changed, -1/+15)
Tested on a real cluster.
Author: Reynold Xin <rxin@apache.org>
Closes #2404 from rxin/ec2-reboot-slaves and squashes the following commits: 00a2dbd [Reynold Xin] Allow rebooting slaves.
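A minimal sketch of what a reboot-slaves action can look like (names assumed; get_existing_cluster() is the lookup helper mentioned elsewhere in this log):

```python
def reboot_slaves(conn, opts, cluster_name):
    (master_nodes, slave_nodes) = get_existing_cluster(conn, opts, cluster_name)
    for inst in slave_nodes:
        # Skip instances that are already going away; reboot the rest in place.
        if inst.state not in ("shutting-down", "terminated"):
            print("Rebooting " + inst.id)
            inst.reboot()
```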
* [EC2] don't duplicate default values (Nicholas Chammas, 2014-09-06; 1 file changed, -11/+13)
This PR makes two minor changes to the `spark-ec2` script:
1. The script's input parameter default values are duplicated into the help text. This is unnecessary. This PR replaces the duplicated info with the appropriate `optparse` placeholder.
2. The default Spark version currently needs to be updated by hand during each release, which is known to be a faulty process. This PR places that default value in an easy-to-spot place.
Author: Nicholas Chammas <nicholas.chammas@gmail.com>
Closes #2290 from nchammas/spark-ec2-default-version and squashes the following commits: 0c6d3bb [Nicholas Chammas] don't duplicate default values
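For the first point, optparse expands `%default` in help strings automatically, so the default never has to be repeated by hand; a small sketch (option name and default value are examples, not the script's exact text):

```python
from optparse import OptionParser

DEFAULT_SPARK_VERSION = "1.2.1"  # example value, kept in one easy-to-spot place

parser = OptionParser()
parser.add_option(
    "-v", "--spark-version", default=DEFAULT_SPARK_VERSION,
    help="Version of Spark to use: 'X.Y.Z' or a specific git hash (default: %default)")
```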
* [SPARK-3361] Expand PEP 8 checks to include EC2 script and Python examples (Nicholas Chammas, 2014-09-05; 1 file changed, -6/+14)
This PR resolves [SPARK-3361](https://issues.apache.org/jira/browse/SPARK-3361) by expanding the PEP 8 checks to cover the remaining Python code base:
* The EC2 script
* All Python / PySpark examples
Author: Nicholas Chammas <nicholas.chammas@gmail.com>
Closes #2297 from nchammas/pep8-rulez and squashes the following commits: 1e5ac9a [Nicholas Chammas] PEP 8 fixes to Python examples c3dbeff [Nicholas Chammas] PEP 8 fixes to EC2 script 65ef6e8 [Nicholas Chammas] expand PEP 8 checks
* [SPARK-3391] [EC2] Support attaching up to 8 EBS volumes (Reynold Xin, 2014-09-04; 1 file changed, -8/+25)
Please merge this at the same time as https://github.com/mesos/spark-ec2/pull/66
Author: Reynold Xin <rxin@apache.org>
Closes #2260 from rxin/ec2-ebs-vol and squashes the following commits: b9527d9 [Reynold Xin] Removed io1 ebs type. bf9c403 [Reynold Xin] Made EBS volume type configurable. c8e25ea [Reynold Xin] Support up to 8 EBS volumes. adf4f2e [Reynold Xin] Revert git repo change. 020c542 [Reynold Xin] [SPARK-3391] Support attaching more than 1 EBS volumes.
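A rough sketch of building such a mapping with boto 2 (device letters, parameter names, and the helper are assumptions, not the patch's exact code):

```python
from boto.ec2.blockdevicemapping import BlockDeviceMapping, EBSBlockDeviceType

def ebs_block_device_map(num_volumes, size_gb, volume_type="standard"):
    # Attach up to 8 EBS volumes, one device name per volume.
    block_map = BlockDeviceMapping()
    for i in range(num_volumes):
        device = EBSBlockDeviceType()
        device.size = size_gb
        device.volume_type = volume_type
        device.delete_on_termination = True
        block_map["/dev/sd" + chr(ord('s') + i)] = device
    return block_map
```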
* SPARK-3358: [EC2] Switch back to HVM instances for m3.X (Patrick Wendell, 2014-09-02; 1 file changed, -4/+4)
During regression tests of Spark 1.1 we discovered perf issues with PVM instances when running PySpark. This reverts a change added in #1156 which changed the default type for m3 instances to PVM.
Author: Patrick Wendell <pwendell@gmail.com>
Closes #2244 from pwendell/ec2-hvm and squashes the following commits: 1342d7e [Patrick Wendell] SPARK-3358: [EC2] Switch back to HVM instances for m3.X.
* [SPARK-3342] Add SSDs to block device mapping (Daniel Darabos, 2014-09-01; 1 file changed, -1/+11)
On `m3.2xlarge` instances the 2x80GB SSDs are inaccessible if not added to the block device mapping when the instance is created. They work when added with this patch. I have not tested this with other instance types, and I do not know much about this script and EC2 deployment in general. Maybe this code needs to depend on the instance type. The requirement for this mapping is described in the AWS docs at http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/InstanceStorage.html#InstanceStore_UsageScenarios: "For M3 instances, you must specify instance store volumes in the block device mapping for the instance. When you launch an M3 instance, we ignore any instance store volumes specified in the block device mapping for the AMI."
Author: Daniel Darabos <darabos.daniel@gmail.com>
Closes #2081 from darabos/patch-1 and squashes the following commits: 1ceb2c8 [Daniel Darabos] Use %d string interpolation instead of {}. a1854d7 [Daniel Darabos] Only specify ephemeral device mapping for M3. e0d9e37 [Daniel Darabos] Create ephemeral device mapping based on get_num_disks(). 6b116a6 [Daniel Darabos] Add SSDs to block device mapping
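A sketch of the ephemeral (instance-store) side of the mapping with boto 2, complementing the EBS sketch above; device names and the helper are assumptions:

```python
from boto.ec2.blockdevicemapping import BlockDeviceMapping, BlockDeviceType

def ephemeral_block_device_map(num_disks):
    # M3 instance-store SSDs are only visible if they are named explicitly
    # in the block device mapping at launch time.
    block_map = BlockDeviceMapping()
    for i in range(num_disks):
        dev = BlockDeviceType()
        dev.ephemeral_name = "ephemeral%d" % i
        block_map["/dev/sd" + chr(ord('b') + i)] = dev
    return block_map
```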
* SPARK-3213 Fix issue with spark-ec2 not detecting slaves created with "Launch More like this" (Vida Ha, 2014-08-27; 1 file changed, -20/+25)
Copy the spark_cluster_tag from spot instance requests over to the instances.
Author: Vida Ha <vida@databricks.com>
Closes #2163 from vidaha/vida/spark-3213 and squashes the following commits: 5070a70 [Vida Ha] Spark-3214 Fix issue with spark-ec2 not detecting slaves created with 'Launch More Like This' and using Spot Requests