aboutsummaryrefslogtreecommitdiff
path: root/dev
Commit message (Collapse)AuthorAgeFilesLines
...
* [Release] Bring audit scripts up-to-dateAndrew Or2014-11-124-286/+75
| | | | | | | | This involves a few main changes: - Log all output message to the log file. Previously the log file was not useful because it did not indicate progress. - Remove hive-site.xml in sbt_hive_app to avoid interference - Add the appropriate repositories for new dependencies
* [Release] Correct make-distribution.sh log pathAndrew Or2014-11-121-1/+1
|
* Support cross building for Scala 2.11Prashant Sharma2014-11-115-14/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Let's give this another go using a version of Hive that shades its JLine dependency. Author: Prashant Sharma <prashant.s@imaginea.com> Author: Patrick Wendell <pwendell@gmail.com> Closes #3159 from pwendell/scala-2.11-prashant and squashes the following commits: e93aa3e [Patrick Wendell] Restoring -Phive-thriftserver profile and cleaning up build script. f65d17d [Patrick Wendell] Fixing build issue due to merge conflict a8c41eb [Patrick Wendell] Reverting dev/run-tests back to master state. 7a6eb18 [Patrick Wendell] Merge remote-tracking branch 'apache/master' into scala-2.11-prashant 583aa07 [Prashant Sharma] REVERT ME: removed hive thirftserver 3680e58 [Prashant Sharma] Revert "REVERT ME: Temporarily removing some Cli tests." 935fb47 [Prashant Sharma] Revert "Fixed by disabling a few tests temporarily." 925e90f [Prashant Sharma] Fixed by disabling a few tests temporarily. 2fffed3 [Prashant Sharma] Exclude groovy from sbt build, and also provide a way for such instances in future. 8bd4e40 [Prashant Sharma] Switched to gmaven plus, it fixes random failures observer with its predecessor gmaven. 5272ce5 [Prashant Sharma] SPARK_SCALA_VERSION related bugs. 2121071 [Patrick Wendell] Migrating version detection to PySpark b1ed44d [Patrick Wendell] REVERT ME: Temporarily removing some Cli tests. 1743a73 [Patrick Wendell] Removing decimal test that doesn't work with Scala 2.11 f5cad4e [Patrick Wendell] Add Scala 2.11 docs 210d7e1 [Patrick Wendell] Revert "Testing new Hive version with shaded jline" 48518ce [Patrick Wendell] Remove association of Hive and Thriftserver profiles. e9d0a06 [Patrick Wendell] Revert "Enable thritfserver for Scala 2.10 only" 67ec364 [Patrick Wendell] Guard building of thriftserver around Scala 2.10 check 8502c23 [Patrick Wendell] Enable thritfserver for Scala 2.10 only e22b104 [Patrick Wendell] Small fix in pom file ec402ab [Patrick Wendell] Various fixes 0be5a9d [Patrick Wendell] Testing new Hive version with shaded jline 4eaec65 [Prashant Sharma] Changed scripts to ignore target. 5167bea [Prashant Sharma] small correction a4fcac6 [Prashant Sharma] Run against scala 2.11 on jenkins. 80285f4 [Prashant Sharma] MAven equivalent of setting spark.executor.extraClasspath during tests. 034b369 [Prashant Sharma] Setting test jars on executor classpath during tests from sbt. d4874cb [Prashant Sharma] Fixed Python Runner suite. null check should be first case in scala 2.11. 6f50f13 [Prashant Sharma] Fixed build after rebasing with master. We should use ${scala.binary.version} instead of just 2.10 e56ca9d [Prashant Sharma] Print an error if build for 2.10 and 2.11 is spotted. 937c0b8 [Prashant Sharma] SCALA_VERSION -> SPARK_SCALA_VERSION cb059b0 [Prashant Sharma] Code review 0476e5e [Prashant Sharma] Scala 2.11 support with repl and all build changes.
* [Release] Log build output for each distributionAndrew Or2014-11-111-1/+2
|
* [SPARK-4000][Build] Uploads HiveCompatibilitySuite logsCheng Lian2014-11-101-1/+7
| | | | | | | | | | | | This is a follow up of #2845. In addition to unit-tests.log files, also upload failure output files generated by `HiveCompatibilitySuite` to Jenkins master. These files can be very helpful to debug Hive compatibility test failures. /cc pwendell marmbrus Author: Cheng Lian <lian@databricks.com> Closes #2993 from liancheng/upload-hive-compat-logs and squashes the following commits: 8e6247f [Cheng Lian] Uploads HiveCompatibilitySuite logs
* [SPARK-3573][MLLIB] Make MLlib's Vector compatible with SQL's SchemaRDDXiangrui Meng2014-11-031-1/+1
| | | | | | | | | | | | | | | Register MLlib's Vector as a SQL user-defined type (UDT) in both Scala and Python. With this PR, we can easily map a RDD[LabeledPoint] to a SchemaRDD, and then select columns or save to a Parquet file. Examples in Scala/Python are attached. The Scala code was copied from jkbradley. ~~This PR contains the changes from #3068 . I will rebase after #3068 is merged.~~ marmbrus jkbradley Author: Xiangrui Meng <meng@databricks.com> Closes #3070 from mengxr/SPARK-3573 and squashes the following commits: 3a0b6e5 [Xiangrui Meng] organize imports 236f0a0 [Xiangrui Meng] register vector as UDT and provide dataset examples
* [SPARK-3826][SQL]enable hive-thriftserver to support hive-0.13.1wangfei2014-10-311-3/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In #2241 hive-thriftserver is not enabled. This patch enable hive-thriftserver to support hive-0.13.1 by using a shim layer refer to #2241. 1 A light shim layer(code in sql/hive-thriftserver/hive-version) for each different hive version to handle api compatibility 2 New pom profiles "hive-default" and "hive-versions"(copy from #2241) to activate different hive version 3 SBT cmd for different version as follows: hive-0.12.0 --- sbt/sbt -Phive,hadoop-2.3 -Phive-0.12.0 assembly hive-0.13.1 --- sbt/sbt -Phive,hadoop-2.3 -Phive-0.13.1 assembly 4 Since hive-thriftserver depend on hive subproject, this patch should be merged with #2241 to enable hive-0.13.1 for hive-thriftserver Author: wangfei <wangfei1@huawei.com> Author: scwf <wangfei1@huawei.com> Closes #2685 from scwf/shim-thriftserver1 and squashes the following commits: f26f3be [wangfei] remove clean to save time f5cac74 [wangfei] remove local hivecontext test 578234d [wangfei] use new shaded hive 18fb1ff [wangfei] exclude kryo in hive pom fa21d09 [wangfei] clean package assembly/assembly 8a4daf2 [wangfei] minor fix 0d7f6cf [wangfei] address comments f7c93ae [wangfei] adding build with hive 0.13 before running tests bcf943f [wangfei] Merge branch 'master' of https://github.com/apache/spark into shim-thriftserver1 c359822 [wangfei] reuse getCommandProcessor in hiveshim 52674a4 [scwf] sql/hive included since examples depend on it 3529e98 [scwf] move hive module to hive profile f51ff4e [wangfei] update and fix conflicts f48d3a5 [scwf] Merge branch 'master' of https://github.com/apache/spark into shim-thriftserver1 41f727b [scwf] revert pom changes 13afde0 [scwf] fix small bug 4b681f4 [scwf] enable thriftserver in profile hive-0.13.1 0bc53aa [scwf] fixed when result filed is null dfd1c63 [scwf] update run-tests to run hive-0.12.0 default now c6da3ce [scwf] Merge branch 'master' of https://github.com/apache/spark into shim-thriftserver 7c66b8e [scwf] update pom according spark-2706 ae47489 [scwf] update and fix conflicts
* [SPARK-3997][Build]scalastyle should output the error locationGuoQiang Li2014-10-261-1/+1
| | | | | | | | Author: GuoQiang Li <witgo@qq.com> Closes #2846 from witgo/SPARK-3997 and squashes the following commits: d6a57f8 [GuoQiang Li] scalastyle should output the error location
* [HOTFIX][SQL] Temporarily turn off hive-server tests.Michael Armbrust2014-10-261-1/+1
| | | | | | | | | | The thirift server is not available in the default (hive13) profile yet which is breaking all SQL only PRs. This turns off these test until #2685 is merged. Author: Michael Armbrust <michael@databricks.com> Closes #2950 from marmbrus/fixTests and squashes the following commits: 1a6dfee [Michael Armbrust] [HOTFIX][SQL] Temporarily turn of hive-server tests.
* [SQL] Update Hive test harness for Hive 12 and 13Michael Armbrust2014-10-241-1/+1
| | | | | | | | | | | | | As part of the upgrade I also copy the newest version of the query tests, and whitelist a bunch of new ones that are now passing. Author: Michael Armbrust <michael@databricks.com> Closes #2936 from marmbrus/fix13tests and squashes the following commits: d9cbdab [Michael Armbrust] Remove user specific tests 65801cd [Michael Armbrust] style and rat 8f6b09a [Michael Armbrust] Update test harness to work with both Hive 12 and 13. f044843 [Michael Armbrust] Update Hive query tests and golden files to 0.13
* [SPARK-2706][SQL] Enable Spark to support Hive 0.13Zhan Zhang2014-10-241-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Given that a lot of users are trying to use hive 0.13 in spark, and the incompatibility between hive-0.12 and hive-0.13 on the API level I want to propose following approach, which has no or minimum impact on existing hive-0.12 support, but be able to jumpstart the development of hive-0.13 and future version support. Approach: Introduce “hive-version” property, and manipulate pom.xml files to support different hive version at compiling time through shim layer, e.g., hive-0.12.0 and hive-0.13.1. More specifically, 1. For each different hive version, there is a very light layer of shim code to handle API differences, sitting in sql/hive/hive-version, e.g., sql/hive/v0.12.0 or sql/hive/v0.13.1 2. Add a new profile hive-default active by default, which picks up all existing configuration and hive-0.12.0 shim (v0.12.0) if no hive.version is specified. 3. If user specifies different version (currently only 0.13.1 by -Dhive.version = 0.13.1), hive-versions profile will be activated, which pick up hive-version specific shim layer and configuration, mainly the hive jars and hive-version shim, e.g., v0.13.1. 4. With this approach, nothing is changed with current hive-0.12 support. No change by default: sbt/sbt -Phive For example: sbt/sbt -Phive -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 assembly To enable hive-0.13: sbt/sbt -Dhive.version=0.13.1 For example: sbt/sbt -Dhive.version=0.13.1 -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 assembly Note that in hive-0.13, hive-thriftserver is not enabled, which should be fixed by other Jira, and we don’t need -Phive with -Dhive.version in building (probably we should use -Phive -Dhive.version=xxx instead after thrift server is also supported in hive-0.13.1). Author: Zhan Zhang <zhazhan@gmail.com> Author: zhzhan <zhazhan@gmail.com> Author: Patrick Wendell <pwendell@gmail.com> Closes #2241 from zhzhan/spark-2706 and squashes the following commits: 3ece905 [Zhan Zhang] minor fix 410b668 [Zhan Zhang] solve review comments cbb4691 [Zhan Zhang] change run-test for new options 0d4d2ed [Zhan Zhang] rebase 497b0f4 [Zhan Zhang] Merge branch 'master' of https://github.com/apache/spark 8fad1cf [Zhan Zhang] change the pom file and make hive-0.13.1 as the default ab028d1 [Zhan Zhang] rebase 4a2e36d [Zhan Zhang] Merge branch 'master' of https://github.com/apache/spark 4cb1b93 [zhzhan] Merge pull request #1 from pwendell/pr-2241 b0478c0 [Patrick Wendell] Changes to simplify the build of SPARK-2706 2b50502 [Zhan Zhang] rebase a72c0d4 [Zhan Zhang] Merge branch 'master' of https://github.com/apache/spark cb22863 [Zhan Zhang] correct the typo 20f6cf7 [Zhan Zhang] solve compatability issue f7912a9 [Zhan Zhang] rebase and solve review feedback 301eb4a [Zhan Zhang] Merge branch 'master' of https://github.com/apache/spark 10c3565 [Zhan Zhang] address review comments 6bc9204 [Zhan Zhang] rebase and remove temparory repo d3aa3f2 [Zhan Zhang] Merge branch 'master' into spark-2706 cedcc6f [Zhan Zhang] Merge branch 'master' of https://github.com/apache/spark 3ced0d7 [Zhan Zhang] rebase d9b981d [Zhan Zhang] rebase and fix error due to rollback adf4924 [Zhan Zhang] Merge branch 'master' of https://github.com/apache/spark 3dd50e8 [Zhan Zhang] solve conflicts and remove unnecessary implicts d10bf00 [Zhan Zhang] Merge branch 'master' of https://github.com/apache/spark dc7bdb3 [Zhan Zhang] solve conflicts 7e0cc36 [Zhan Zhang] Merge branch 'master' of https://github.com/apache/spark d7c3e1e [Zhan Zhang] Merge branch 'master' into spark-2706 68deb11 [Zhan Zhang] Merge branch 'master' of https://github.com/apache/spark d48bd18 [Zhan Zhang] address review comments 3ee3b2b [Zhan Zhang] Merge branch 'master' of https://github.com/apache/spark 57ea52e [Zhan Zhang] Merge branch 'master' into spark-2706 2b0d513 [Zhan Zhang] Merge branch 'master' of https://github.com/apache/spark 9412d24 [Zhan Zhang] address review comments f4af934 [Zhan Zhang] rebase 1ccd7cc [Zhan Zhang] Merge branch 'master' of https://github.com/apache/spark 128b60b [Zhan Zhang] ignore 0.12.0 test cases for the time being af9feb9 [Zhan Zhang] Merge branch 'master' of https://github.com/apache/spark 5f5619f [Zhan Zhang] restructure the directory and different hive version support 05d3683 [Zhan Zhang] solve conflicts e4c1982 [Zhan Zhang] Merge branch 'master' of https://github.com/apache/spark 94b4fdc [Zhan Zhang] Spark-2706: hive-0.13.1 support on spark 87ebf3b [Zhan Zhang] Merge branch 'master' into spark-2706 921e914 [Zhan Zhang] Merge branch 'master' of https://github.com/apache/spark f896b2a [Zhan Zhang] Merge branch 'master' into spark-2706 789ea21 [Zhan Zhang] Merge branch 'master' of https://github.com/apache/spark cb53a2c [Zhan Zhang] Merge branch 'master' of https://github.com/apache/spark f6a8a40 [Zhan Zhang] revert ba14f28 [Zhan Zhang] test dbedff3 [Zhan Zhang] Merge remote-tracking branch 'upstream/master' 70964fe [Zhan Zhang] revert fe0f379 [Zhan Zhang] Merge branch 'master' of https://github.com/zhzhan/spark 70ffd93 [Zhan Zhang] revert 42585ec [Zhan Zhang] test 7d5fce2 [Zhan Zhang] test
* [SPARK-4000][BUILD] Sends archived unit tests logs to Jenkins masterCheng Lian2014-10-231-12/+41
| | | | | | | | | | | | | | This PR sends archived unit tests logs to the build history directory in Jenkins master, so that we can serve it via HTTP later to help debugging Jenkins build failures. pwendell JoshRosen Please help review, thanks! Author: Cheng Lian <lian@databricks.com> Closes #2845 from liancheng/log-archive and squashes the following commits: ac8d9d4 [Cheng Lian] Includes build number in messages posted to GitHub 68c7010 [Cheng Lian] Logs backup should be implemented in dev/run-tests-jenkins 4b912f7 [Cheng Lian] Sends archived unit tests logs to Jenkins master
* [SPARK-3843][Minor] Cleanup scalastyle.txt at the end of running dev/scalastyleKousuke Saruta2014-10-081-0/+2
| | | | | | | | | | dev/scalastyle create a log file 'scalastyle.txt'. it is overwrote per running but never deleted even though dev/mima and dev/lint-python delete their log files. Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp> Closes #2702 from sarutak/scalastyle-txt-cleanup and squashes the following commits: d6e238e [Kousuke Saruta] Fixed dev/scalastyle to cleanup scalastyle.txt
* HOTFIX: Use correct Hadoop profile in buildPatrick Wendell2014-10-081-1/+1
|
* [SPARK-3479] [Build] Report failed test categoryNicholas Chammas2014-10-063-35/+126
| | | | | | | | | | | | | | | | | | | | | | This PR allows SparkQA (i.e. Jenkins) to report in its posts to GitHub what category of test failed, if one can be determined. The failure categories are: * general failure * RAT checks failed * Scala style checks failed * Python style checks failed * Build failed * Spark unit tests failed * PySpark unit tests failed * MiMa checks failed This PR also fixes the diffing logic used to determine if a patch introduces new classes. Author: Nicholas Chammas <nicholas.chammas@gmail.com> Closes #2606 from nchammas/report-failed-test-category and squashes the following commits: d67df03 [Nicholas Chammas] report what test category failed
* HOTFIX: Fix unicode error in merge script.Patrick Wendell2014-10-051-2/+1
| | | | | | | | | | | | | | | | | | The merge script builds up a big command array and sometimes this contains both unicode and ascii strings. This doesn't work if you try to join them into a single string. Longer term a solution is to go and make sure the source of all strings is unicode. This patch provides a simpler solution... just print the array rather than joining. I actually prefer printing an array here anyways since joining on spaces is lossy in the case of arguments that themselves contain spaces. Author: Patrick Wendell <pwendell@gmail.com> Closes #2645 from pwendell/merge-script and squashes the following commits: 167b792 [Patrick Wendell] HOTFIX: Fix unicode error in merge script.
* [Build] Post commit hash with timeout messagesNicholas Chammas2014-09-301-2/+4
| | | | | | | | | | | | [By request](https://github.com/apache/spark/pull/2588#issuecomment-57266871), and because it also makes sense. Author: Nicholas Chammas <nicholas.chammas@gmail.com> Closes #2597 from nchammas/timeout-commit-hash and squashes the following commits: 3d90714 [Nicholas Chammas] Revert "testing: making timeout 1 minute" 2353c95 [Nicholas Chammas] testing: making timeout 1 minute e3a477e [Nicholas Chammas] post commit hash with timeout
* SPARK-3745 - fix check-license to properly download and check jarshane knapp2014-09-301-7/+8
| | | | | | | | | | for details, see: https://issues.apache.org/jira/browse/SPARK-3745 Author: shane knapp <incomplete@gmail.com> Closes #2596 from shaneknapp/SPARK-3745 and squashes the following commits: c95eea9 [shane knapp] SPARK-3745 - fix check-license to properly download and check jar
* [Build] Diff from branch pointNicholas Chammas2014-09-241-2/+2
| | | | | | | | | | | | | Sometimes Jenkins posts [spurious reports of new classes being added](https://github.com/apache/spark/pull/2339#issuecomment-56570170). I believe this stems from diffing the patch against `master`, as opposed to against `master...`, which starts from the commit the PR was branched from. This patch fixes that behavior. Author: Nicholas Chammas <nicholas.chammas@gmail.com> Closes #2512 from nchammas/diff-only-commits-ahead and squashes the following commits: c065599 [Nicholas Chammas] comment typo fix a453c67 [Nicholas Chammas] diff from branch point
* [Build] Fix passing of args to sbtNicholas Chammas2014-09-191-1/+6
| | | | | | | | | | | | | | | | | | | | | | | Simple mistake, simple fix: ```shell args="arg1 arg2 arg3" sbt $args # sbt sees 3 arguments sbt "$args" # sbt sees 1 argument ``` Should fix the problems we are seeing [here](https://amplab.cs.berkeley.edu/jenkins/job/Spark-Master-SBT/694/AMPLAB_JENKINS_BUILD_PROFILE=hadoop1.0,label=centos/console), for example. Author: Nicholas Chammas <nicholas.chammas@gmail.com> Closes #2462 from nchammas/fix-sbt-master-build and squashes the following commits: 4500c86 [Nicholas Chammas] warn about quoting 10018a6 [Nicholas Chammas] Revert "test hadoop1 build" 7d5356c [Nicholas Chammas] Revert "re-add bad quoting for testing" 061600c [Nicholas Chammas] re-add bad quoting for testing b2de56c [Nicholas Chammas] test hadoop1 build 43fb854 [Nicholas Chammas] unquote profile args
* [SPARK-3534] Add hive-thriftserver to SQL testsNicholas Chammas2014-09-171-1/+1
| | | | | | | | | | Addresses the problem pointed out in [this comment](https://github.com/apache/spark/pull/2441#issuecomment-55990116). Author: Nicholas Chammas <nicholas.chammas@gmail.com> Closes #2442 from nchammas/patch-1 and squashes the following commits: 7e68b60 [Nicholas Chammas] [SPARK-3534] Add hive-thriftserver to SQL tests
* [SPARK-3534] Fix expansion of testing arguments to sbtNicholas Chammas2014-09-171-4/+7
| | | | | | | | | | | | | | | | | | Testing arguments to `sbt` need to be passed as an array, not a single, long string. Fixes a bug introduced in #2420. Author: Nicholas Chammas <nicholas.chammas@gmail.com> Closes #2437 from nchammas/selective-testing and squashes the following commits: a9f9c1c [Nicholas Chammas] fix printing of sbt test arguments cf57cbf [Nicholas Chammas] fix sbt test arguments e33b978 [Nicholas Chammas] Merge pull request #2 from apache/master 0b47ca4 [Nicholas Chammas] Merge branch 'master' of github.com:nchammas/spark 8051486 [Nicholas Chammas] Merge pull request #1 from apache/master 03180a4 [Nicholas Chammas] Merge branch 'master' of github.com:nchammas/spark d4c5f43 [Nicholas Chammas] Merge pull request #6 from apache/master
* [SPARK-1455] [SPARK-3534] [Build] When possible, run SQL tests only.Nicholas Chammas2014-09-171-50/+106
| | | | | | | | | | | | | | | | | If the only files changed are related to SQL, then only run the SQL tests. This patch includes some cosmetic/maintainability refactoring. I would be more than happy to undo some of these changes if they are inappropriate. We can accept this patch mostly as-is and address the immediate need documented in [SPARK-3534](https://issues.apache.org/jira/browse/SPARK-3534), or we can keep it open until a satisfactory solution along the lines [discussed here](https://issues.apache.org/jira/browse/SPARK-1455?focusedCommentId=14136424&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14136424) is reached. Note: I had to hack this patch up to test it locally, so what I'm submitting here and what I tested are technically different. Author: Nicholas Chammas <nicholas.chammas@gmail.com> Closes #2420 from nchammas/selective-testing and squashes the following commits: db3fa2d [Nicholas Chammas] diff against master! f9e23f6 [Nicholas Chammas] when possible, run SQL tests only
* [SPARK-3433][BUILD] Fix for Mima false-positives with @DeveloperAPI and ↵Prashant Sharma2014-09-151-0/+8
| | | | | | | | | | | | | | | @Experimental annotations. Actually false positive reported was due to mima generator not picking up the new jars in presence of old jars(theoretically this should not have happened.). So as a workaround, ran them both separately and just append them together. Author: Prashant Sharma <prashant@apache.org> Author: Prashant Sharma <prashant.s@imaginea.com> Closes #2285 from ScrapCodes/mima-fix and squashes the following commits: 093c76f [Prashant Sharma] Update mima 59012a8 [Prashant Sharma] Update mima 35b6c71 [Prashant Sharma] SPARK-3433 Fix for Mima false-positives with @DeveloperAPI and @Experimental annotations.
* [SPARK-3425] do not set MaxPermSize for OpenJDK 1.8Matthew Farrellee2014-09-151-2/+2
| | | | | | | | | | Closes #2387 Author: Matthew Farrellee <matt@redhat.com> Closes #2301 from mattf/SPARK-3425 and squashes the following commits: 20f3c09 [Matthew Farrellee] [SPARK-3425] do not set MaxPermSize for OpenJDK 1.8
* [Build] Removed -Phive-thriftserver since this profile has been removedCheng Lian2014-09-091-1/+1
| | | | | | | | Author: Cheng Lian <lian.cs.zju@gmail.com> Closes #2269 from liancheng/clean-run-tests-profile and squashes the following commits: 08617bd [Cheng Lian] Removed -Phive-thriftserver since this profile has been removed
* SPARK-3337 Paranoid quoting in shell to allow install dirs with spaces within.Prashant Sharma2014-09-085-15/+15
| | | | | | | | | | | | ... Tested ! TBH, it isn't a great idea to have directory with spaces within. Because emacs doesn't like it then hadoop doesn't like it. and so on... Author: Prashant Sharma <prashant.s@imaginea.com> Closes #2229 from ScrapCodes/SPARK-3337/quoting-shell-scripts and squashes the following commits: d4ad660 [Prashant Sharma] SPARK-3337 Paranoid quoting in shell to allow install dirs with spaces within.
* [SPARK-3361] Expand PEP 8 checks to include EC2 script and Python examplesNicholas Chammas2014-09-051-2/+3
| | | | | | | | | | | | | | This PR resolves [SPARK-3361](https://issues.apache.org/jira/browse/SPARK-3361) by expanding the PEP 8 checks to cover the remaining Python code base: * The EC2 script * All Python / PySpark examples Author: Nicholas Chammas <nicholas.chammas@gmail.com> Closes #2297 from nchammas/pep8-rulez and squashes the following commits: 1e5ac9a [Nicholas Chammas] PEP 8 fixes to Python examples c3dbeff [Nicholas Chammas] PEP 8 fixes to EC2 script 65ef6e8 [Nicholas Chammas] expand PEP 8 checks
* [Build] suppress curl/wget progress barsNicholas Chammas2014-09-051-2/+2
| | | | | | | | | | | | | | In the Jenkins console output, `curl` gives us mountains of `#` symbols as it tries to show its download progress. ![noise from curl in Jenkins output](http://i.imgur.com/P2E7yUw.png) I don't think this is useful so I've changed things to suppress these progress bars. If there is actually some use to this, feel free to reject this proposal. Author: Nicholas Chammas <nicholas.chammas@gmail.com> Closes #2279 from nchammas/trim-test-output and squashes the following commits: 14a720c [Nicholas Chammas] suppress curl/wget progress bars
* [SPARK-3378] [DOCS] Replace the word "SparkSQL" with right word "Spark SQL"Kousuke Saruta2014-09-041-1/+1
| | | | | | | | | | Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp> Closes #2251 from sarutak/SPARK-3378 and squashes the following commits: 0bfe234 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-3378 bb5938f [Kousuke Saruta] Replaced rest of "SparkSQL" with "Spark SQL" 6df66de [Kousuke Saruta] Replaced "SparkSQL" with "Spark SQL"
* SPARK-3331 [BUILD] PEP8 tests fail because they check unzipped py4j codeSean Owen2014-09-021-1/+1
| | | | | | | | | | PEP8 tests run on files under "./python", but unzipped py4j code is found at "./python/build/py4j". Py4J code fails style checks and can fail ./dev/run-tests if this code is present locally. Author: Sean Owen <sowen@cloudera.com> Closes #2222 from srowen/SPARK-3331 and squashes the following commits: 34711ec [Sean Owen] Restrict lint check to pyspark/, since the local directory can contain unzipped py4j code in build/py4j
* [Spark QA] only check code files for new classesNicholas Chammas2014-08-301-6/+10
| | | | | | | | | | | | | | Look only at code files (`.py`, `.java`, and `.scala`) for new classes. Should get rid of false alarms like [the one reported here](https://github.com/apache/spark/pull/2014#issuecomment-52912040). Author: Nicholas Chammas <nicholas.chammas@gmail.com> Closes #2184 from nchammas/jenkins-ignore-noncode and squashes the following commits: 33786ac [Nicholas Chammas] break up long line 3f91a14 [Nicholas Chammas] rename array of source files 8b82a26 [Nicholas Chammas] [Spark QA] only check code files for new classes
* BUILD: Adding back CDH4 as per user requestsPatrick Wendell2014-08-291-0/+1
|
* [Spark QA] Link to console output on test time outnchammas2014-08-281-1/+1
| | | | | | | | | | | | When tests time out we should link to the Jenkins console output for easy review. We already do this for when tests start or complete normally. Here's [a recent example](https://github.com/apache/spark/pull/2109#issuecomment-53374032) of where this would be helpful. Author: nchammas <nicholas.chammas@gmail.com> Closes #2140 from nchammas/patch-1 and squashes the following commits: 3b26c8d [nchammas] [Spark QA] Link to console output on test time out
* Add line continuation for script to work w/ py2.7.5Matthew Farrellee2014-08-271-2/+2
| | | | | | | | | | | | | | | | Error was - $ SPARK_HOME=$PWD/dist ./dev/create-release/generate-changelist.py File "./dev/create-release/generate-changelist.py", line 128 if day < SPARK_REPO_CHANGE_DATE1 or ^ SyntaxError: invalid syntax Author: Matthew Farrellee <matt@redhat.com> Closes #2139 from mattf/master-fix-generate-changelist.py-0 and squashes the following commits: 6b3a900 [Matthew Farrellee] Add line continuation for script to work w/ py2.7.5
* HOTFIX: Don't build with YARN support for Mapr3Patrick Wendell2014-08-271-1/+1
|
* [SPARK-3126][SPARK-3127][SQL] Fixed HiveThriftServer2SuiteCheng Lian2014-08-201-1/+1
| | | | | | | | | | | | | | | | This PR fixes two issues: 1. Fixes wrongly quoted command line option in `HiveThriftServer2Suite` that makes test cases hang until timeout. 1. Asks `dev/run-test` to run Spark SQL tests when `bin/spark-sql` and/or `sbin/start-thriftserver.sh` are modified. Author: Cheng Lian <lian.cs.zju@gmail.com> Closes #2036 from liancheng/fix-thriftserver-test and squashes the following commits: f38c4eb [Cheng Lian] Fixed the same quotation issue in CliSuite 26b82a0 [Cheng Lian] Run SQL tests when dff contains bin/spark-sql and/or sbin/start-thriftserver.sh a87f83d [Cheng Lian] Extended timeout e5aa31a [Cheng Lian] Fixed metastore JDBC URI quotation
* BUILD: Bump Hadoop versions in the release build.Patrick Wendell2014-08-201-5/+5
| | | | Also, minor modifications to the MapR profile.
* SPARK-3092 [SQL]: Always include the thriftserver when -Phive is enabled.Patrick Wendell2014-08-203-7/+7
| | | | | | | | | | | | | Currently we have a separate profile called hive-thriftserver. I originally suggested this in case users did not want to bundle the thriftserver, but it's ultimately lead to a lot of confusion. Since the thriftserver is only a few classes, I don't see a really good reason to isolate it from the rest of Hive. So let's go ahead and just include it in the same profile to simplify things. This has been suggested in the past by liancheng. Author: Patrick Wendell <pwendell@gmail.com> Closes #2006 from pwendell/hiveserver and squashes the following commits: 742ea40 [Patrick Wendell] Merge remote-tracking branch 'apache/master' into hiveserver 034ad47 [Patrick Wendell] SPARK-3092: Always include the thriftserver when -Phive is enabled.
* [SPARK-3114] [PySpark] Fix Python UDFs in Spark SQL.Josh Rosen2014-08-181-4/+13
| | | | | | | | | | | | | | | | This fixes SPARK-3114, an issue where we inadvertently broke Python UDFs in Spark SQL. This PR modifiers the test runner script to always run the PySpark SQL tests, irrespective of whether SparkSQL itself has been modified. It also includes Davies' fix for the bug. Closes #2026. Author: Josh Rosen <joshrosen@apache.org> Author: Davies Liu <davies.liu@gmail.com> Closes #2027 from JoshRosen/pyspark-sql-fix and squashes the following commits: 9af2708 [Davies Liu] bugfix: disable compression of command 0d8d3a4 [Josh Rosen] Always run Python Spark SQL tests.
* SPARK-2884: Create binary builds in parallel with release script.Patrick Wendell2014-08-171-4/+5
|
* [SPARK-3076] [Jenkins] catch & report test timeoutsNicholas Chammas2014-08-161-28/+20
| | | | | | | | | | | | | * Remove unused code to get jq * Set timeout on tests and report gracefully on them Author: Nicholas Chammas <nicholas.chammas@gmail.com> Closes #1974 from nchammas/master and squashes the following commits: d1f1b6b [Nicholas Chammas] set timeout to realistic number 8b1ea41 [Nicholas Chammas] fix formatting 279526e [Nicholas Chammas] [SPARK-3076] catch & report test timeouts
* [SPARK-2912] [Spark QA] Include commit hash in Spark QA messagesNicholas Chammas2014-08-141-49/+138
| | | | | | | | | | | | | | | | | | | | | | You can find the [discussion that motivated this PR here](http://mail-archives.apache.org/mod_mbox/spark-dev/201408.mbox/%3CCABPQxssy0ri2QAz=cc9Tx+EXYWARm7pNcVm8apqCwc-esLbO4Qmail.gmail.com%3E). As described in [SPARK-2912](https://issues.apache.org/jira/browse/SPARK-2912), the goal of this PR (and related ones to come) is to include useful detail in Spark QA's messages that are intended to make a committer's job easier to do. Since this work depends on Jenkins, I cannot test this locally. Hence, I will be iterating via this PR. Notes: * This is a duplicate of a [previous PR](https://github.com/apache/spark/pull/1811), without the extraneous commits. * This PR also resolves an issue targeted by [another open PR](https://github.com/apache/spark/pull/1809). Closes #1809. Author: Nicholas Chammas <nicholas.chammas@gmail.com> Author: nchammas <nicholas.chammas@gmail.com> Closes #1816 from nchammas/master and squashes the following commits: c1be644 [Nicholas Chammas] [SPARK-2912] include commit hash in messages 8f641ac [nchammas] Merge pull request #7 from apache/master
* Make dev/mima runnable on Mac OS X.Reynold Xin2014-08-141-1/+3
| | | | | | | | | | Mac OS X's find is from the BSD variant that doesn't have -printf option. Author: Reynold Xin <rxin@apache.org> Closes #1953 from rxin/mima and squashes the following commits: e284afe [Reynold Xin] Make dev/mima runnable on Mac OS X.
* [SPARK-2894] spark-shell doesn't accept flagsKousuke Saruta2014-08-091-0/+2
| | | | | | | | | | | | | | | | | | | | As sryza reported, spark-shell doesn't accept any flags. The root cause is wrong usage of spark-submit in spark-shell and it come to the surface by #1801 Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp> Author: Cheng Lian <lian.cs.zju@gmail.com> Closes #1715, Closes #1864, and Closes #1861 Closes #1825 from sarutak/SPARK-2894 and squashes the following commits: 47f3510 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-2894 2c899ed [Kousuke Saruta] Removed useless code from java_gateway.py 98287ed [Kousuke Saruta] Removed useless code from java_gateway.py 513ad2e [Kousuke Saruta] Modified util.sh to enable to use option including white spaces 28a374e [Kousuke Saruta] Modified java_gateway.py to recognize arguments 5afc584 [Cheng Lian] Filter out spark-submit options when starting Python gateway e630d19 [Cheng Lian] Fixing pyspark and spark-shell CLI options
* HOTFIX: Support custom Java 7 locationPatrick Wendell2014-08-061-1/+8
|
* [SPARK-2627] [PySpark] have the build enforce PEP 8 automaticallyNicholas Chammas2014-08-064-2/+96
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As described in [SPARK-2627](https://issues.apache.org/jira/browse/SPARK-2627), we'd like Python code to automatically be checked for PEP 8 compliance by Jenkins. This pull request aims to do that. Notes: * We may need to install [`pep8`](https://pypi.python.org/pypi/pep8) on the build server. * I'm expecting tests to fail now that PEP 8 compliance is being checked as part of the build. I'm fine with cleaning up any remaining PEP 8 violations as part of this pull request. * I did not understand why the RAT and scalastyle reports are saved to text files. I did the same for the PEP 8 check, but only so that the console output style can match those for the RAT and scalastyle checks. The PEP 8 report is removed right after the check is complete. * Updates to the ["Contributing to Spark"](https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark) guide will be submitted elsewhere, as I don't believe that text is part of the Spark repo. Author: Nicholas Chammas <nicholas.chammas@gmail.com> Author: nchammas <nicholas.chammas@gmail.com> Closes #1744 from nchammas/master and squashes the following commits: 274b238 [Nicholas Chammas] [SPARK-2627] [PySpark] minor indentation changes 983d963 [nchammas] Merge pull request #5 from apache/master 1db5314 [nchammas] Merge pull request #4 from apache/master 0e0245f [Nicholas Chammas] [SPARK-2627] undo erroneous whitespace fixes bf30942 [Nicholas Chammas] [SPARK-2627] PEP8: comment spacing 6db9a44 [nchammas] Merge pull request #3 from apache/master 7b4750e [Nicholas Chammas] merge upstream changes 91b7584 [Nicholas Chammas] [SPARK-2627] undo unnecessary line breaks 44e3e56 [Nicholas Chammas] [SPARK-2627] use tox.ini to exclude files b09fae2 [Nicholas Chammas] don't wrap comments unnecessarily bfb9f9f [Nicholas Chammas] [SPARK-2627] keep up with the PEP 8 fixes 9da347f [nchammas] Merge pull request #2 from apache/master aa5b4b5 [Nicholas Chammas] [SPARK-2627] follow Spark bash style for if blocks d0a83b9 [Nicholas Chammas] [SPARK-2627] check that pep8 downloaded fine dffb5dd [Nicholas Chammas] [SPARK-2627] download pep8 at runtime a1ce7ae [Nicholas Chammas] [SPARK-2627] space out test report sections 21da538 [Nicholas Chammas] [SPARK-2627] it's PEP 8, not PEP8 6f4900b [Nicholas Chammas] [SPARK-2627] more misc PEP 8 fixes fe57ed0 [Nicholas Chammas] removing merge conflict backups 9c01d4c [nchammas] Merge pull request #1 from apache/master 9a66cb0 [Nicholas Chammas] resolving merge conflicts a31ccc4 [Nicholas Chammas] [SPARK-2627] miscellaneous PEP 8 fixes beaa9ac [Nicholas Chammas] [SPARK-2627] fail check on non-zero status 723ed39 [Nicholas Chammas] always delete the report file 0541ebb [Nicholas Chammas] [SPARK-2627] call Python linter from run-tests 12440fa [Nicholas Chammas] [SPARK-2627] add Scala linter 61c07b9 [Nicholas Chammas] [SPARK-2627] add Python linter 75ad552 [Nicholas Chammas] make check output style consistent
* [SPARK-2784][SQL] Deprecate hql() method in favor of a config option, ↵Michael Armbrust2014-08-031-4/+4
| | | | | | | | | | | | | | | | | | | | 'spark.sql.dialect' Many users have reported being confused by the distinction between the `sql` and `hql` methods. Specifically, many users think that `sql(...)` cannot be used to read hive tables. In this PR I introduce a new configuration option `spark.sql.dialect` that picks which dialect with be used for parsing. For SQLContext this must be set to `sql`. In `HiveContext` it defaults to `hiveql` but can also be set to `sql`. The `hql` and `hiveql` methods continue to act the same but are now marked as deprecated. **This is a possibly breaking change for some users unless they set the dialect manually, though this is unlikely.** For example: `hiveContex.sql("SELECT 1")` will now throw a parsing exception by default. Author: Michael Armbrust <michael@databricks.com> Closes #1746 from marmbrus/sqlLanguageConf and squashes the following commits: ad375cc [Michael Armbrust] Merge remote-tracking branch 'apache/master' into sqlLanguageConf 20c43f8 [Michael Armbrust] override function instead of just setting the value 7e4ae93 [Michael Armbrust] Deprecate hql() method in favor of a config option, 'spark.sql.dialect'
* [SPARK-2739][SQL] Rename registerAsTable to registerTempTableMichael Armbrust2014-08-021-1/+1
| | | | | | | | | | | | | There have been user complaints that the difference between `registerAsTable` and `saveAsTable` is too subtle. This PR addresses this by renaming `registerAsTable` to `registerTempTable`, which more clearly reflects what is happening. `registerAsTable` remains, but will cause a deprecation warning. Author: Michael Armbrust <michael@databricks.com> Closes #1743 from marmbrus/registerTempTable and squashes the following commits: d031348 [Michael Armbrust] Merge remote-tracking branch 'apache/master' into registerTempTable 4dff086 [Michael Armbrust] Fix .java files too 89a2f12 [Michael Armbrust] Merge remote-tracking branch 'apache/master' into registerTempTable 0b7b71e [Michael Armbrust] Rename registerAsTable to registerTempTable
* [SPARK-1981] Add AWS Kinesis streaming supportChris Fregly2014-08-026-4/+75
| | | | | | | | | | | | | | | | | | | | | | | | | Author: Chris Fregly <chris@fregly.com> Closes #1434 from cfregly/master and squashes the following commits: 4774581 [Chris Fregly] updated docs, renamed retry to retryRandom to be more clear, removed retries around store() method 0393795 [Chris Fregly] moved Kinesis examples out of examples/ and back into extras/kinesis-asl 691a6be [Chris Fregly] fixed tests and formatting, fixed a bug with JavaKinesisWordCount during union of streams 0e1c67b [Chris Fregly] Merge remote-tracking branch 'upstream/master' 74e5c7c [Chris Fregly] updated per TD's feedback. simplified examples, updated docs e33cbeb [Chris Fregly] Merge remote-tracking branch 'upstream/master' bf614e9 [Chris Fregly] per matei's feedback: moved the kinesis examples into the examples/ dir d17ca6d [Chris Fregly] per TD's feedback: updated docs, simplified the KinesisUtils api 912640c [Chris Fregly] changed the foundKinesis class to be a publically-avail class db3eefd [Chris Fregly] Merge remote-tracking branch 'upstream/master' 21de67f [Chris Fregly] Merge remote-tracking branch 'upstream/master' 6c39561 [Chris Fregly] parameterized the versions of the aws java sdk and kinesis client 338997e [Chris Fregly] improve build docs for kinesis 828f8ae [Chris Fregly] more cleanup e7c8978 [Chris Fregly] Merge remote-tracking branch 'upstream/master' cd68c0d [Chris Fregly] fixed typos and backward compatibility d18e680 [Chris Fregly] Merge remote-tracking branch 'upstream/master' b3b0ff1 [Chris Fregly] [SPARK-1981] Add AWS Kinesis streaming support