path: root/repl
Commit message | Author | Age | Files | Lines
* [SQL] Update SparkSQL and ScalaTest in branch-1.0 to match master. | Michael Armbrust | 2014-06-13 | 1 | -2/+4
    #511 and #863 got left out of branch-1.0 since we were really close to the release. Now that they have been tested a little I see no reason to leave them out.

    Author: Michael Armbrust <michael@databricks.com>
    Author: witgo <witgo@qq.com>

    Closes #1078 from marmbrus/branch-1.0 and squashes the following commits:

    22be674 [witgo] [SPARK-1841]: update scalatest to version 2.1.5
    fc8fc79 [Michael Armbrust] Include #1071 as well.
    c5d0adf [Michael Armbrust] Update SparkSQL in branch-1.0 to match master.
* Improve maven plugin configuration | witgo | 2014-06-01 | 1 | -30/+0
    Author: witgo <witgo@qq.com>

    Closes #786 from witgo/maven_plugin and squashes the following commits:

    5de86a2 [witgo] Merge branch 'master' of https://github.com/apache/spark into maven_plugin
    c35ef73 [witgo] Improve maven plugin configuration

    Conflicts:
        pom.xml
* [maven-release-plugin] prepare for next development iteration | Tathagata Das | 2014-05-26 | 1 | -1/+1
* [maven-release-plugin] prepare release v1.0.0-rc11 (tag: v1.0.0) | Tathagata Das | 2014-05-26 | 1 | -1/+1
* Revert "[maven-release-plugin] prepare release v1.0.0-rc11" | Tathagata Das | 2014-05-26 | 1 | -1/+1
    This reverts commit 2f1dc868e5714882cf40d2633fb66772baf34789.
* Revert "[maven-release-plugin] prepare for next development iteration" | Tathagata Das | 2014-05-26 | 1 | -1/+1
    This reverts commit 832dc594e7666f1d402334f8015ce29917d9c888.
* [maven-release-plugin] prepare for next development iteration | Tathagata Das | 2014-05-25 | 1 | -1/+1
* [maven-release-plugin] prepare release v1.0.0-rc11 | Tathagata Das | 2014-05-25 | 1 | -1/+1
* Revert "[maven-release-plugin] prepare release v1.0.0-rc10" | Tathagata Das | 2014-05-25 | 1 | -1/+1
    This reverts commit d807023479ce10aec28ef3c1ab646ddefc2e663c.
* Revert "[maven-release-plugin] prepare for next development iteration" | Tathagata Das | 2014-05-25 | 1 | -1/+1
    This reverts commit 67dd53d2556f03ce292e6889128cf441f1aa48f8.
* [SPARK-1900 / 1918] PySpark on YARN is broken | Andrew Or | 2014-05-24 | 1 | -2/+3
    If I run the following on a YARN cluster

    ```
    bin/spark-submit sheep.py --master yarn-client
    ```

    it fails because of a mismatch in paths: `spark-submit` thinks that `sheep.py` resides on HDFS, and balks when it can't find the file there. A natural workaround is to add the `file:` prefix to the file:

    ```
    bin/spark-submit file:/path/to/sheep.py --master yarn-client
    ```

    However, this also fails. This time it is because python does not understand URI schemes.

    This PR fixes this by automatically resolving all paths passed as command line argument to `spark-submit` properly. This has the added benefit of keeping file and jar paths consistent across different cluster modes. For python, we strip the URI scheme before we actually try to run it.

    Much of the code is originally written by @mengxr. Tested on YARN cluster. More tests pending.

    Author: Andrew Or <andrewor14@gmail.com>

    Closes #853 from andrewor14/submit-paths and squashes the following commits:

    0bb097a [Andrew Or] Format path correctly before adding it to PYTHONPATH
    323b45c [Andrew Or] Include --py-files on PYTHONPATH for pyspark shell
    3c36587 [Andrew Or] Improve error messages (minor)
    854aa6a [Andrew Or] Guard against NPE if user gives pathological paths
    6638a6b [Andrew Or] Fix spark-shell jar paths after #849 went in
    3bb0359 [Andrew Or] Update more comments (minor)
    2a1f8a0 [Andrew Or] Update comments (minor)
    6af2c77 [Andrew Or] Merge branch 'master' of github.com:apache/spark into submit-paths
    a68c4d1 [Andrew Or] Handle Windows python file path correctly
    427a250 [Andrew Or] Resolve paths properly for Windows
    a591a4a [Andrew Or] Update tests for resolving URIs
    6c8621c [Andrew Or] Move resolveURIs to Utils
    db8255e [Andrew Or] Merge branch 'master' of github.com:apache/spark into submit-paths
    f542dce [Andrew Or] Fix outdated tests
    691c4ce [Andrew Or] Ignore special primary resource names
    5342ac7 [Andrew Or] Add missing space in error message
    02f77f3 [Andrew Or] Resolve command line arguments to spark-submit properly

    (cherry picked from commit 5081a0a9d47ca31900ea4de570de2cbb0e063105)
    Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>
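The path handling described in the SPARK-1900 commit above can be illustrated with a short sketch. This is not the Spark implementation: the function names `resolve_uri`, `resolve_uris`, and `strip_scheme` are hypothetical, and the sketch assumes POSIX paths (a Windows drive letter such as `C:` would be misread as a URI scheme here).

```python
import os
from urllib.parse import urlparse


def resolve_uri(path: str) -> str:
    """Return `path` unchanged if it already carries a URI scheme;
    otherwise treat it as a local file and return an absolute file: URI."""
    if urlparse(path).scheme:
        return path
    return "file:" + os.path.abspath(path)


def resolve_uris(paths: str) -> str:
    """Apply resolve_uri to each entry of a comma-separated list."""
    entries = (p.strip() for p in paths.split(","))
    return ",".join(resolve_uri(p) for p in entries if p)


def strip_scheme(uri: str) -> str:
    """Drop a file: scheme so that an interpreter that does not
    understand URIs receives a plain path; other URIs pass through."""
    parsed = urlparse(uri)
    return parsed.path if parsed.scheme == "file" else uri
```

Under these assumptions, a bare `/path/to/sheep.py` and an explicit `file:/path/to/sheep.py` resolve to the same URI, and `strip_scheme` recovers the plain path before the script is handed to Python.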
* [SPARK-1896] Respect spark.master (and --master) before MASTER in spark-shell | Andrew Or | 2014-05-22 | 1 | -3/+2
    The hierarchy for configuring the Spark master in the shell is as follows:

    ```
    MASTER > --master > spark.master (spark-defaults.conf)
    ```

    This is inconsistent with the way we run normal applications, which is:

    ```
    --master > spark.master (spark-defaults.conf) > MASTER
    ```

    I was trying to run a shell locally on a standalone cluster launched through the ec2 scripts, which automatically set `MASTER` in spark-env.sh. It was surprising to me that `--master` didn't take effect, considering that this is the way we tell users to set their masters [here](http://people.apache.org/~pwendell/spark-1.0.0-rc7-docs/scala-programming-guide.html#initializing-spark).

    Author: Andrew Or <andrewor14@gmail.com>

    Closes #846 from andrewor14/shell-master and squashes the following commits:

    2cb81c9 [Andrew Or] Respect spark.master before MASTER in REPL

    (cherry picked from commit cce77457e00aa5f1f4db3d50454cf257efb156ed)
    Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>
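The corrected precedence from the SPARK-1896 commit above (`--master`, then `spark.master`, then `MASTER`) amounts to a first-match lookup. A minimal Python sketch follows; `pick_master` and its `default` parameter are hypothetical names for illustration and do not mirror the actual REPL code.

```python
def pick_master(cli_master, conf, env, default=None):
    """Return the first master that is set, in the order
    --master > spark.master > MASTER, else `default`."""
    candidates = (
        cli_master,                # value of the --master flag, if given
        conf.get("spark.master"),  # e.g. loaded from spark-defaults.conf
        env.get("MASTER"),         # environment variable, lowest priority
    )
    for candidate in candidates:
        if candidate:
            return candidate
    return default
```

With this ordering, a `MASTER` exported by spark-env.sh no longer overrides an explicit `--master` on the command line.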
* [SPARK-1897] Respect spark.jars (and --jars) in spark-shell | Andrew Or | 2014-05-22 | 1 | -1/+7
    Spark shell currently overwrites `spark.jars` with `ADD_JARS`. In all modes except yarn-cluster, this means the `--jars` flag passed to `bin/spark-shell` is also discarded. However, in the [docs](http://people.apache.org/~pwendell/spark-1.0.0-rc7-docs/scala-programming-guide.html#initializing-spark), we explicitly tell the users to add the jars this way.

    Author: Andrew Or <andrewor14@gmail.com>

    Closes #849 from andrewor14/shell-jars and squashes the following commits:

    928a7e6 [Andrew Or] ',' -> "," (minor)
    afc357c [Andrew Or] Handle spark.jars == "" in SparkILoop, not SparkSubmit
    c6da113 [Andrew Or] Do not set spark.jars to ""
    d8549f7 [Andrew Or] Respect spark.jars and --jars in spark-shell

    (cherry picked from commit 8edbee7d1b4afc192d97ba192a5526affc464205)
    Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>
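The fix in the SPARK-1897 commit above is to combine the two jar sources rather than let one overwrite the other. A rough Python sketch of that idea follows, assuming both `ADD_JARS` and `spark.jars` hold comma-separated lists; `merged_jars` is a hypothetical name, and the merge order and de-duplication are assumptions of this sketch, not taken from the Spark source.

```python
def merged_jars(add_jars_env, spark_jars_conf):
    """Combine two comma-separated jar lists into one ordered list.

    Empty or missing inputs (None or "") contribute nothing, so an
    empty spark.jars value never produces a blank entry."""
    merged = []
    for source in (add_jars_env, spark_jars_conf):
        for entry in (source or "").split(","):
            entry = entry.strip()
            if entry and entry not in merged:
                merged.append(entry)
    return merged
```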
* [maven-release-plugin] prepare for next development iteration | Tathagata Das | 2014-05-20 | 1 | -1/+1
* [maven-release-plugin] prepare release v1.0.0-rc10 | Tathagata Das | 2014-05-20 | 1 | -1/+1
* Revert "[maven-release-plugin] prepare release v1.0.0-rc9" | Tathagata Das | 2014-05-19 | 1 | -1/+1
    This reverts commit 920f947eb5a22a679c0c3186cf69ee75f6041c75.
* Revert "[maven-release-plugin] prepare for next development iteration" | Tathagata Das | 2014-05-19 | 1 | -1/+1
    This reverts commit f8e611955096c5c1c7db5764b9d2851b1d295f0d.
* [maven-release-plugin] prepare for next development iteration | Patrick Wendell | 2014-05-17 | 1 | -1/+1
* [maven-release-plugin] prepare release v1.0.0-rc9 | Patrick Wendell | 2014-05-17 | 1 | -1/+1
* Revert "[maven-release-plugin] prepare release v1.0.0-rc8" | Patrick Wendell | 2014-05-16 | 1 | -1/+1
    This reverts commit 80eea0f111c06260ffaa780d2f3f7facd09c17bc.
* Revert "[maven-release-plugin] prepare for next development iteration" | Patrick Wendell | 2014-05-16 | 1 | -1/+1
    This reverts commit e5436b8c1a79ce108f3af402455ac5f6dc5d1eb3.
* [maven-release-plugin] prepare for next development iteration | Patrick Wendell | 2014-05-16 | 1 | -1/+1
* [maven-release-plugin] prepare release v1.0.0-rc8 | Patrick Wendell | 2014-05-16 | 1 | -1/+1
* Revert "[maven-release-plugin] prepare release v1.0.0-rc7" | Patrick Wendell | 2014-05-16 | 1 | -1/+1
    This reverts commit 9212b3e5bb5545ccfce242da8d89108e6fb1c464.
* Revert "[maven-release-plugin] prepare for next development iteration" | Patrick Wendell | 2014-05-16 | 1 | -1/+1
    This reverts commit c4746aa6fe4aaf383e69e34353114d36d1eb9ba6.
* [maven-release-plugin] prepare for next development iteration | Patrick Wendell | 2014-05-15 | 1 | -1/+1
* [maven-release-plugin] prepare release v1.0.0-rc7 | Patrick Wendell | 2014-05-15 | 1 | -1/+1
* Revert "[maven-release-plugin] prepare release v1.0.0-rc6" | Patrick Wendell | 2014-05-14 | 1 | -1/+1
    This reverts commit 54133abdce0246f6643a1112a5204afb2c4caa82.
* Revert "[maven-release-plugin] prepare for next development iteration" | Patrick Wendell | 2014-05-14 | 1 | -1/+1
    This reverts commit e480bcfbd269ae1d7a6a92cfb50466cf192fe1fb.
* [maven-release-plugin] prepare for next development iteration | Patrick Wendell | 2014-05-14 | 1 | -1/+1
* [maven-release-plugin] prepare release v1.0.0-rc6 | Patrick Wendell | 2014-05-14 | 1 | -1/+1
* Revert "[maven-release-plugin] prepare release v1.0.0-rc5" | Patrick Wendell | 2014-05-14 | 1 | -1/+1
    This reverts commit 18f062303303824139998e8fc8f4158217b0dbc3.
* Revert "[maven-release-plugin] prepare for next development iteration" | Patrick Wendell | 2014-05-14 | 1 | -1/+1
    This reverts commit d08e9604fc9958b7c768e91715c8152db2ed6fd0.
* [maven-release-plugin] prepare for next development iteration | Patrick Wendell | 2014-05-13 | 1 | -1/+1
* [maven-release-plugin] prepare release v1.0.0-rc5 | Patrick Wendell | 2014-05-13 | 1 | -1/+1
* Revert "[maven-release-plugin] prepare release v1.0.0-rc4" | Patrick Wendell | 2014-05-12 | 1 | -1/+1
    This reverts commit 3d0a44833ab50360bf9feccc861cb5e8c44a4866.
* Revert "[maven-release-plugin] prepare for next development iteration" | Patrick Wendell | 2014-05-12 | 1 | -1/+1
    This reverts commit 9772d85c6f3893d42044f4bab0e16f8b6287613a.
* [maven-release-plugin] prepare for next development iteration | Patrick Wendell | 2014-05-13 | 1 | -1/+1
* [maven-release-plugin] prepare release v1.0.0-rc4 | Patrick Wendell | 2014-05-13 | 1 | -1/+1
* Rollback versions for 1.0.0-rc4 | Patrick Wendell | 2014-05-12 | 1 | -1/+1
* [maven-release-plugin] prepare for next development iteration | Patrick Wendell | 2014-05-12 | 1 | -1/+1
* [maven-release-plugin] prepare release v1.0.0-rc4 | Patrick Wendell | 2014-05-12 | 1 | -1/+1
* SPARK-1798. Tests should clean up temp files | Sean Owen | 2014-05-12 | 3 | -6/+23
    Three issues related to temp files that tests generate; these should be touched up for hygiene but are not urgent.

    Modules have a log4j.properties which directs the unit-test.log output file to a directory like `[module]/target/unit-test.log`. But this ends up creating `[module]/[module]/target/unit-test.log` instead of the former.

    The `work/` directory is not deleted by "mvn clean", in the parent and in modules. Neither is the `checkpoint/` directory created under the various external modules.

    Many tests create a temp directory, which is not usually deleted. This can be largely resolved by calling `deleteOnExit()` at creation and trying to call `Utils.deleteRecursively` consistently to clean up, sometimes in an `@After` method.

    _If anyone seconds the motion, I can create a more significant change that introduces a new test trait along the lines of `LocalSparkContext`, which provides management of temp directories for subclasses to take advantage of._

    Author: Sean Owen <sowen@cloudera.com>

    Closes #732 from srowen/SPARK-1798 and squashes the following commits:

    5af578e [Sean Owen] Try to consistently delete test temp dirs and files, and set deleteOnExit() for each
    b21b356 [Sean Owen] Remove work/ and checkpoint/ dirs with mvn clean
    bdd0f41 [Sean Owen] Remove duplicate module dir in log4j.properties output path for tests

    (cherry picked from commit 7120a2979d0a9f0f54a88b2416be7ca10e74f409)
    Signed-off-by: Patrick Wendell <pwendell@gmail.com>
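The cleanup pattern described in the SPARK-1798 commit above (register deletion at creation, then delete explicitly after each test) has a close analogue in Python. The sketch below is only an illustration of the pattern, not the Spark test code: `create_temp_dir` and `delete_recursively` are hypothetical helpers, with `atexit` standing in for `deleteOnExit()` and `tearDown` standing in for an `@After` method.

```python
import atexit
import os
import shutil
import tempfile
import unittest


def create_temp_dir(prefix="spark-test-"):
    """Create a temp directory and register a best-effort removal at
    interpreter exit, as a safety net if explicit cleanup is skipped."""
    path = tempfile.mkdtemp(prefix=prefix)
    atexit.register(shutil.rmtree, path, ignore_errors=True)
    return path


def delete_recursively(path):
    """Remove a directory tree; a missing path is not an error."""
    shutil.rmtree(path, ignore_errors=True)


class TempDirTest(unittest.TestCase):
    def setUp(self):
        self.temp_dir = create_temp_dir()

    def tearDown(self):
        # Explicit cleanup after every test, pass or fail.
        delete_recursively(self.temp_dir)

    def test_writes_inside_temp_dir(self):
        target = os.path.join(self.temp_dir, "out.txt")
        with open(target, "w") as handle:
            handle.write("data")
        self.assertTrue(os.path.exists(target))
```

Doing both is deliberate: the explicit delete keeps the working tree clean between tests, and the exit hook covers tests that abort before cleanup runs.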
* [SPARK-1549] Add Python support to spark-submit | Matei Zaharia | 2014-05-06 | 1 | -2/+1
    This PR updates spark-submit to allow submitting Python scripts (currently only with deploy-mode=client, but that's all that was supported before) and updates the PySpark code to properly find various paths, etc. One significant change is that we assume we can always find the Python files either from the Spark assembly JAR (which will happen with the Maven assembly build in make-distribution.sh) or from SPARK_HOME (which will exist in local mode even if you use sbt assembly, and should be enough for testing). This means we no longer need a weird hack to modify the environment for YARN.

    This patch also updates the Python worker manager to run python with -u, which means unbuffered output (send it to our logs right away instead of waiting a while after stuff was written); this should simplify debugging.

    In addition, it fixes https://issues.apache.org/jira/browse/SPARK-1709, setting the main class from a JAR's Main-Class attribute if not specified by the user, and fixes a few help strings and style issues in spark-submit.

    In the future we may want to make the `pyspark` shell use spark-submit as well, but it seems unnecessary for 1.0.

    Author: Matei Zaharia <matei@databricks.com>

    Closes #664 from mateiz/py-submit and squashes the following commits:

    15e9669 [Matei Zaharia] Fix some uses of path.separator property
    051278c [Matei Zaharia] Small style fixes
    0afe886 [Matei Zaharia] Add license headers
    4650412 [Matei Zaharia] Add pyFiles to PYTHONPATH in executors, remove old YARN stuff, add tests
    15f8e1e [Matei Zaharia] Set PYTHONPATH in PythonWorkerFactory in case it wasn't set from outside
    47c0655 [Matei Zaharia] More work to make spark-submit work with Python:
    d4375bd [Matei Zaharia] Clean up description of spark-submit args a bit and add Python ones

    (cherry picked from commit 951a5d939863b42da83ac2569d5e9d7ed680e119)
    Signed-off-by: Matei Zaharia <matei@databricks.com>
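The SPARK-1709 part of the commit above falls back to a JAR's `Main-Class` manifest attribute when the user names no main class. As a rough illustration of that lookup (not the spark-submit code, which runs on the JVM), a JAR can be read as a zip archive in Python; `main_class_from_jar` is a hypothetical helper.

```python
import zipfile


def main_class_from_jar(jar):
    """Return the Main-Class attribute from a JAR's manifest, or None.

    `jar` may be a filesystem path or a binary file-like object."""
    with zipfile.ZipFile(jar) as archive:
        try:
            raw = archive.read("META-INF/MANIFEST.MF").decode("utf-8")
        except KeyError:
            return None  # the archive has no manifest

    # Manifest lines may be folded: a continuation line starts with a space.
    unfolded = raw.replace("\r\n", "\n").replace("\n ", "")
    for line in unfolded.split("\n"):
        if not line:
            break  # a blank line ends the main attribute section
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "main-class":
            return value.strip()
    return None
```

A caller would use the returned value only when no main class was given explicitly, which matches the "if not specified by the user" condition in the commit message.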
* [maven-release-plugin] prepare for next development iteration | Patrick Wendell | 2014-04-29 | 1 | -1/+1
* [maven-release-plugin] prepare release v1.0.0-rc3 | Patrick Wendell | 2014-04-29 | 1 | -1/+1
* Manual revert of rc2 version changes. | Patrick Wendell | 2014-04-28 | 1 | -1/+1
* Improved build configuration | witgo | 2014-04-28 | 1 | -14/+0
    1. Fix SPARK-1441: compile spark core error with hadoop 0.23.x
    2. Fix SPARK-1491: maven hadoop-provided profile fails to build
    3. Fix inconsistent dependency versions for org.scala-lang:* and org.apache.avro:*
    4. Reformat sql/catalyst/pom.xml, sql/hive/pom.xml, and sql/core/pom.xml (four-space indentation changed to two spaces)

    Author: witgo <witgo@qq.com>

    Closes #480 from witgo/format_pom and squashes the following commits:

    03f652f [witgo] review commit
    b452680 [witgo] Merge branch 'master' of https://github.com/apache/spark into format_pom
    bee920d [witgo] revert fix SPARK-1629: Spark Core missing commons-lang dependence
    7382a07 [witgo] Merge branch 'master' of https://github.com/apache/spark into format_pom
    6902c91 [witgo] fix SPARK-1629: Spark Core missing commons-lang dependence
    0da4bc3 [witgo] merge master
    d1718ed [witgo] Merge branch 'master' of https://github.com/apache/spark into format_pom
    e345919 [witgo] add avro dependency to yarn-alpha
    77fad08 [witgo] Merge branch 'master' of https://github.com/apache/spark into format_pom
    62d0862 [witgo] Fix org.scala-lang: * inconsistent versions dependency
    1a162d7 [witgo] Merge branch 'master' of https://github.com/apache/spark into format_pom
    934f24d [witgo] review commit
    cf46edc [witgo] exclude jruby
    06e7328 [witgo] Merge branch 'SparkBuild' into format_pom
    99464d2 [witgo] fix maven hadoop-provided profile fails to build
    0c6c1fc [witgo] Fix compile spark core error with hadoop 0.23.x
    6851bec [witgo] Maintain consistent SparkBuild.scala, pom.xml

    (cherry picked from commit 030f2c2126d5075576cd6d83a1ee7462c48b953b)

    Conflicts:
        sql/catalyst/pom.xml
        sql/core/pom.xml
        sql/hive/pom.xml
* SPARK-1619 Launch spark-shell with spark-submit | Patrick Wendell | 2014-04-24 | 1 | -2/+3
    This simplifies the shell a bunch and passes all arguments through to spark-submit.

    There is a tiny incompatibility from 0.9.1 which is that you can't put `-c` _or_ `--cores`, only `--cores`. However, spark-submit will give a good error message in this case, I don't think many people used this, and it's a trivial change for users.

    Author: Patrick Wendell <pwendell@gmail.com>

    Closes #542 from pwendell/spark-shell and squashes the following commits:

    9eb3e6f [Patrick Wendell] Updating Spark docs
    b552459 [Patrick Wendell] Andrew's feedback
    97720fa [Patrick Wendell] Review feedback
    aa2900b [Patrick Wendell] SPARK-1619 Launch spark-shell with spark-submit

    (cherry picked from commit dc3b640a0ab3501b678b591be3e99fbcf3badbec)
    Signed-off-by: Patrick Wendell <pwendell@gmail.com>
* SPARK-1586 Windows build fixes | Mridul Muralidharan | 2014-04-24 | 1 | -2/+4
    Unfortunately, this is not exhaustive - particularly hive tests still fail due to path issues.

    Author: Mridul Muralidharan <mridulm80@apache.org>

    This patch had conflicts when merged, resolved by
    Committer: Matei Zaharia <matei@databricks.com>

    Closes #505 from mridulm/windows_fixes and squashes the following commits:

    ef12283 [Mridul Muralidharan] Move to org.apache.commons.lang3 for StringEscapeUtils. Earlier version was buggy apparently
    cdae406 [Mridul Muralidharan] Remove leaked changes from > 2G fix branch
    3267f4b [Mridul Muralidharan] Fix build failures
    35b277a [Mridul Muralidharan] Fix Scalastyle failures
    bc69d14 [Mridul Muralidharan] Change from hardcoded path separator
    10c4d78 [Mridul Muralidharan] Use explicit encoding while using getBytes
    1337abd [Mridul Muralidharan] fix classpath while running in windows

    (cherry picked from commit 968c0187a12f5ae4a696c02c1ff088e998ed7edd)
    Signed-off-by: Matei Zaharia <matei@databricks.com>
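Two of the squashed changes in the SPARK-1586 commit above, replacing a hardcoded path separator and passing an explicit encoding to `getBytes`, are general portability habits. A small Python analogue is sketched below; the helper names `build_classpath` and `to_bytes` are hypothetical, and the sketch does not reproduce the JVM code the commit touched.

```python
import os


def build_classpath(entries):
    """Join classpath entries with the platform's separator
    (':' on POSIX, ';' on Windows) instead of a hardcoded one."""
    return os.pathsep.join(entries)


def to_bytes(text):
    """Encode with an explicit charset rather than relying on the
    platform default, which differs between systems."""
    return text.encode("utf-8")
```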