spark - Mirror of Apache Spark

	Commit message	Author	Age	Files	Lines
*	[SPARK-1517] Refactor release scripts to facilitate nightly publishing	Patrick Wendell	2015-08-11	1	-267/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This update contains some code changes to the release scripts that allow easier nightly publishing. I've been using these new scripts on Jenkins for cutting and publishing nightly snapshots for the last month or so, and it has been going well. I'd like to get them merged back upstream so this can be maintained by the community. The main changes are: 1. Separates the release tagging from various build possibilities for an already tagged release (`release-tag.sh` and `release-build.sh`). 2. Allow for injecting credentials through the environment, including GPG keys. This is then paired with secure key injection in Jenkins. 3. Support for copying build results to a remote directory, and also "rotating" results, e.g. the ability to keep the last N copies of binary or doc builds. I'm happy if anyone wants to take a look at this - it's not user facing but an internal utility used for generating releases. Author: Patrick Wendell <patrick@databricks.com> Closes #7411 from pwendell/release-script-updates and squashes the following commits: 74f9beb [Patrick Wendell] Moving maven build command to a variable 233ce85 [Patrick Wendell] [SPARK-1517] Refactor release scripts to facilitate nightly publishing
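The "rotating" behaviour described in item 3 above (keeping only the last N copies of a binary or doc build) can be sketched roughly as follows. This is a minimal illustration, not the logic in `release-build.sh`: the function name `rotate_builds`, the directory layout, and the assumption that build directories sort chronologically by name (for example, timestamped names) are all hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def rotate_builds(parent: Path, keep: int) -> list[str]:
    """Delete all but the `keep` newest build directories under `parent`.

    Assumption (not taken from the release scripts): each build lives in its
    own subdirectory whose name sorts chronologically, e.g. a date stamp.
    Returns the names of the directories that were removed, oldest first.
    """
    if keep < 0:
        raise ValueError("keep must be non-negative")
    builds = sorted(p for p in parent.iterdir() if p.is_dir())
    # Everything except the last `keep` entries is stale.
    stale = builds[:-keep] if keep else builds
    for path in stale:
        shutil.rmtree(path)
    return [p.name for p in stale]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for name in ("2015-08-08", "2015-08-09", "2015-08-10", "2015-08-11"):
            (root / name).mkdir()
        print("removed:", rotate_builds(root, keep=2))
        print("kept:", sorted(p.name for p in root.iterdir()))
```

Sorting by name rather than by modification time is a deliberate simplification here: copying results to a remote directory may not preserve timestamps, whereas a dated directory name survives the copy.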
*	[SPARK-9507] [BUILD] Remove dependency reduced POM hack now that shade plugin is updated	Sean Owen	2015-07-31	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Update to shade plugin 2.4.1, which removes the need for the dependency-reduced-POM workaround and the 'release' profile. Fix management of shade plugin version so children inherit it; bump assembly plugin version while here. See https://issues.apache.org/jira/browse/SPARK-8819 I verified that `mvn clean package -DskipTests` works with Maven 3.3.3. pwendell are you up for trying this for the 1.5.0 release? Author: Sean Owen <sowen@cloudera.com> Closes #7826 from srowen/SPARK-9507 and squashes the following commits: e0b0fd2 [Sean Owen] Update to shade plugin 2.4.1, which removes the need for the dependency-reduced-POM workaround and the 'release' profile. Fix management of shade plugin version so children inherit it; bump assembly plugin version while here
*	[SPARK-8401] [BUILD] Scala version switching build enhancements	Michael Allman	2015-07-21	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	These commits address a few minor issues in the Scala cross-version support in the build: 1. Correct two missing `${scala.binary.version}` pom file substitutions. 2. Don't update `scala.binary.version` in parent POM. This property is set through profiles. 3. Update the source of the generated scaladocs in `docs/_plugins/copy_api_dirs.rb`. 4. Factor common code out of `dev/change-version-to-*.sh` and add some validation. We also test `sed` to see if it's GNU sed and try `gsed` as an alternative if not. This prevents the script from running with a non-GNU sed. This is my original work and I license this work to the Spark project under the Apache License. Author: Michael Allman <michael@videoamp.com> Closes #6832 from mallman/scala-versions and squashes the following commits: cde2f17 [Michael Allman] Delete dev/change-version-to-*.sh, replacing them with single dev/change-scala-version.sh script that takes a version as argument 02296f2 [Michael Allman] Make the scala version change scripts cross-platform by restricting ourselves to POSIX sed syntax instead of looking for GNU sed ad9b40a [Michael Allman] Factor change-scala-version.sh out of change-version-to-*.sh, adding command line argument validation and testing for GNU sed bdd20bf [Michael Allman] Update source of scaladocs when changing Scala version 475088e [Michael Allman] Replace jackson-module-scala_2.10 with jackson-module-scala_${scala.binary.version}
*	[HOTFIX] Rename release-profile to release	Patrick Wendell	2015-07-06	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \|	when publishing releases. We named it as 'release-profile' because that is the Maven convention. However, it turns out this special name causes several other things to kick-in when we are creating releases that are not desirable. For instance, it triggers the javadoc plugin to run, which actually fails in our current build set-up. The fix is just to rename this to a different profile to have no collateral damage associated with its use.
*	[SPARK-8819] Fix build for maven 3.3.x	Andrew Or	2015-07-06	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a workaround for MSHADE-148, which leads to an infinite loop when building Spark with maven 3.3.x. This was originally caused by #6441, which added a bunch of test dependencies on the spark-core test module. Recently, it was revealed by #7193. This patch adds a `-Prelease` profile. If present, it will set `createDependencyReducedPom` to true. The consequences are: - If you are releasing Spark with this profile, you are fine as long as you use maven 3.2.x or before. - If you are releasing Spark without this profile, you will run into SPARK-8781. - If you are not releasing Spark but you are using this profile, you may run into SPARK-8819. - If you are not releasing Spark and you did not include this profile, you are fine. This is all documented in `pom.xml` and tested locally with both versions of maven. Author: Andrew Or <andrew@databricks.com> Closes #7219 from andrewor14/fix-maven-build and squashes the following commits: 1d37e87 [Andrew Or] Merge branch 'master' of github.com:apache/spark into fix-maven-build 3574ae4 [Andrew Or] Review comments f39199c [Andrew Or] Create a -Prelease profile that flags `createDependencyReducedPom`
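The four consequences listed above amount to a small decision table over two questions: are you cutting a release, and is `-Prelease` active? The sketch below only restates those four cases as a lookup; the function name `release_profile_outcome` and the wording of the returned strings are illustrative and do not come from `pom.xml`.

```python
def release_profile_outcome(releasing: bool, uses_release_profile: bool) -> str:
    """Return the consequence described in the commit message for one case.

    Hypothetical helper: it encodes the four bullet points of the message
    and nothing more.
    """
    outcomes = {
        (True, True): "fine, as long as you use Maven 3.2.x or before",
        (True, False): "you will run into SPARK-8781",
        (False, True): "you may run into SPARK-8819",
        (False, False): "fine",
    }
    return outcomes[(releasing, uses_release_profile)]


if __name__ == "__main__":
    for releasing in (True, False):
        for profile in (True, False):
            print(f"releasing={releasing!s:5} -Prelease={profile!s:5} -> "
                  f"{release_profile_outcome(releasing, profile)}")
```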
*	[SPARK-8027] [SPARKR] Move man pages creation to install-dev.sh	Shivaram Venkataraman	2015-06-04	1	-8/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This also helps us get rid of the sparkr-docs maven profile as docs are now built by just using -Psparkr when the roxygen2 package is available Related to discussion in #6567 cc pwendell srowen -- Let me know if this looks better Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu> Closes #6593 from shivaram/sparkr-pom-cleanup and squashes the following commits: b282241 [Shivaram Venkataraman] Remove sparkr-docs from release script as well 8f100a5 [Shivaram Venkataraman] Move man pages creation to install-dev.sh This also helps us get rid of the sparkr-docs maven profile as docs are now built by just using -Psparkr when the roxygen2 package is available
*	[SPARK-8027] [SPARKR] Add maven profile to build R package docs	Shivaram Venkataraman	2015-06-01	1	-8/+8
\| \| \| \| \| \| \| \| \| \| \| \|	Also use that profile in create-release.sh cc pwendell -- Note that this means that we need `knitr` and `roxygen` installed on the machines used for building the release. Let me know if you need help with that. Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu> Closes #6567 from shivaram/SPARK-8027 and squashes the following commits: 8dc8ecf [Shivaram Venkataraman] Add maven profile to build R package docs Also use that profile in create-release.sh
*	[MINOR] Add SparkR to create-release script	Shivaram Venkataraman	2015-05-22	1	-8/+8
\| \| \| \| \| \| \| \| \| \| \| \|	Enables the SparkR profiles for all the binary builds we create cc pwendell Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu> Closes #6371 from shivaram/sparkr-create-release and squashes the following commits: ca5a0b2 [Shivaram Venkataraman] Add -Psparkr to create-release.sh
*	[SPARK-7249] Updated Hadoop dependencies due to inconsistency in the versions	FavioVazquez	2015-05-14	1	-7/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Updated Hadoop dependencies due to inconsistency in the versions. Now the global properties are the ones used by the hadoop-2.2 profile, and the profile was set to empty but kept for backwards compatibility reasons. Changes proposed by vanzin resulting from previous pull-request https://github.com/apache/spark/pull/5783 that did not fix the problem correctly. Please let me know if this is the correct way of doing this, the comments of vanzin are in the pull-request mentioned. Author: FavioVazquez <favio.vazquezp@gmail.com> Closes #5786 from FavioVazquez/update-hadoop-dependencies and squashes the following commits: 11670e5 [FavioVazquez] - Added missing instance of -Phadoop-2.2 in create-release.sh 379f50d [FavioVazquez] - Added instances of -Phadoop-2.2 in create-release.sh, run-tests, scalastyle and building-spark.md - Reconstructed docs to not ask users to rely on default behavior 3f9249d [FavioVazquez] Merge branch 'master' of https://github.com/apache/spark into update-hadoop-dependencies 31bdafa [FavioVazquez] - Added missing instances in -Phadoop-1 in create-release.sh, run-tests and in the building-spark documentation cbb93e8 [FavioVazquez] - Added comment related to SPARK-3710 about hadoop-yarn-server-tests in Hadoop 2.2 that fails to pull some needed dependencies 83dc332 [FavioVazquez] - Cleaned up the main POM concerning the yarn profile - Erased hadoop-2.2 profile from yarn/pom.xml and its content was integrated into yarn/pom.xml 93f7624 [FavioVazquez] - Deleted unnecessary comments and <activation> tag on the YARN profile in the main POM 668d126 [FavioVazquez] - Moved <dependencies> <activation> and <properties> sections of the hadoop-2.2 profile in the YARN POM to the YARN profile in the root POM - Erased unnecessary hadoop-2.2 profile from the YARN POM fda6a51 [FavioVazquez] - Updated hadoop1 releases in create-release.sh due to changes in the default
hadoop version set - Erased unnecessary instance of -Dyarn.version=2.2.0 in create-release.sh - Prettify comment in yarn/pom.xml 0470587 [FavioVazquez] - Erased unnecessary instance of -Phadoop-2.2 -Dhadoop.version=2.2.0 in create-release.sh - Updated how the releases are made in the create-release.sh now that the default hadoop version is the 2.2.0 - Erased unnecessary instance of -Phadoop-2.2 -Dhadoop.version=2.2.0 in scalastyle - Erased unnecessary instance of -Phadoop-2.2 -Dhadoop.version=2.2.0 in run-tests - Better example given in the hadoop-third-party-distributions.md now that the default hadoop version is 2.2.0 a650779 [FavioVazquez] - Default value of avro.mapred.classifier has been set to hadoop2 in pom.xml - Cleaned up hadoop-2.3 and 2.4 profiles due to change in the default set in avro.mapred.classifier in pom.xml 199f40b [FavioVazquez] - Erased unnecessary CDH5-specific note in docs/building-spark.md - Remove example of instance -Phadoop-2.2 -Dhadoop.version=2.2.0 in docs/building-spark.md - Enabled hadoop-2.2 profile when the Hadoop version is 2.2.0, which is now the default. Added comment in the yarn/pom.xml to specify that. 88a8b88 [FavioVazquez] - Simplified Hadoop profiles due to new setting of global properties in the pom.xml file - Added comment to specify that the hadoop-2.2 profile is now the default hadoop profile in the pom.xml file - Erased hadoop-2.2 from related hadoop profiles now that it is a no-op in the make-distribution.sh file 70b8344 [FavioVazquez] - Fixed typo in the make-distribution.sh file and added hadoop-1 in the Related profiles 287fa2f [FavioVazquez] - Updated documentation about specifying the hadoop version in building-spark. Now it is clear that Spark will build against Hadoop 2.2.0 by default. - Added Cloudera CDH 5.3.3 without MapReduce example in the building-spark doc.
1354292 [FavioVazquez] - Fixed hadoop-1 version to match jenkins build profile in hadoop1.0 tests and documentation 6b4bfaf [FavioVazquez] - Cleanup in hadoop-2.x profiles since they contained mostly redundant stuff. 7e9955d [FavioVazquez] - Updated Hadoop dependencies due to inconsistency in the versions. Now the global properties are the ones used by the hadoop-2.2 profile, and the profile was set to empty but kept for backwards compatibility reasons 660decc [FavioVazquez] - Updated Hadoop dependencies due to inconsistency in the versions. Now the global properties are the ones used by the hadoop-2.2 profile, and the profile was set to empty but kept for backwards compatibility reasons ec91ce3 [FavioVazquez] - Updated protobuf-java version of com.google.protobuf dependency to fix blocking error when connecting to HDFS via the Hadoop Cloudera HDFS CDH5 (fix for 2.5.0-cdh5.3.3 version)
*	[SPARK-4925] Publish Spark SQL hive-thriftserver maven artifact	Misha Chernetsov	2015-04-27	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	turned on hive-thriftserver profile in release script Author: Misha Chernetsov <chernetsov@gmail.com> Closes #5429 from chernetsov/master and squashes the following commits: 9cc36af [Misha Chernetsov] [SPARK-4925] Publish Spark SQL hive-thriftserver maven artifact turned on hive-thriftserver profile in release script for scala 2.10
*	HOTFIX: Changes to release script.	Patrick Wendell	2015-03-12	1	-20/+21
\| \| \| \| \| \|	This fixes a bug in the release script and also properly sets things up so that Zinc launches multiple processes. I had done something similar in 0c9a8e but it didn't fully work.
*	BUILD: Minor tweaks to internal build scripts	Patrick Wendell	2015-03-03	1	-5/+19
\| \| \| \| \| \| \| \|	This adds two features: 1. The ability to publish with a different maven version than that specified in the release source. 2. Forking of different Zinc instances during the parallel dist creation (to help with some stability issues).
*	[SPARK-5944] [PySpark] fix version in Python API docs	Davies Liu	2015-02-25	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	use RELEASE_VERSION when building the Python API docs Author: Davies Liu <davies@databricks.com> Closes #4731 from davies/api_version and squashes the following commits: c9744c9 [Davies Liu] Update create-release.sh 08cbc3f [Davies Liu] fix python docs
*	SPARK-5542: Decouple publishing, packaging, and tagging in release script	Patrick Wendell	2015-02-02	1	-89/+99
\| \| \| \| \| \| \| \| \| \| \| \| \|	These are some changes to the build script to allow parts of it to be run independently. This has already been tested during the 1.2.1 release cycle. Author: Patrick Wendell <patrick@databricks.com> Author: Patrick Wendell <pwendell@gmail.com> Closes #4319 from pwendell/release-updates and squashes the following commits: dfe7ed9 [Patrick Wendell] Merge remote-tracking branch 'apache/master' into release-updates 478b072 [Patrick Wendell] Merge remote-tracking branch 'apache/master' into release-updates 126dd0c [Patrick Wendell] Allow decoupling Maven publishing from cutting release
*	SPARK-5308 [BUILD] MD5 / SHA1 hash format doesn't match standard Maven output	Sean Owen	2015-01-27	1	-2/+8
\| \| \| \| \| \| \| \| \| \| \|	Here's one way to make the hashes match what Maven's plugins would create. It takes a little extra footwork since OS X doesn't have the same command line tools. An alternative is just to make Maven output these of course - would that be better? I ask in case there is a reason I'm missing, like, we need to hash files that Maven doesn't build. Author: Sean Owen <sowen@cloudera.com> Closes #4161 from srowen/SPARK-5308 and squashes the following commits: 70d09d0 [Sean Owen] Use $(...) syntax e25eff8 [Sean Owen] Generate MD5, SHA1 hashes in a format like Maven's plugin
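As a rough illustration of the hash format discussed above: the sketch below assumes that the checksum files Maven's tooling writes next to an artifact contain only the bare lowercase hex digest, without the file name that tools such as `md5sum` append. That format is stated here as an assumption rather than taken from the commit. The helper name `write_checksums` is hypothetical, and the code is not a translation of the shell in the release script.

```python
import hashlib
import tempfile
from pathlib import Path


def write_checksums(artifact: Path) -> dict[str, str]:
    """Write `<artifact>.md5` and `<artifact>.sha1` next to `artifact`.

    Assumed format: each checksum file holds only the hex digest.
    Returns the digests keyed by extension.
    """
    data = artifact.read_bytes()
    digests = {}
    for ext, algorithm in (("md5", hashlib.md5), ("sha1", hashlib.sha1)):
        digest = algorithm(data).hexdigest()
        # Append the extension instead of replacing an existing suffix,
        # so `foo.jar` yields `foo.jar.md5`.
        artifact.with_name(artifact.name + "." + ext).write_text(digest)
        digests[ext] = digest
    return digests


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        jar = Path(tmp) / "example.jar"
        jar.write_bytes(b"not a real jar")
        for ext, digest in write_checksums(jar).items():
            print(f"example.jar.{ext}: {digest}")
```

Computing the digests in-process sidesteps the OS X tooling differences the message mentions, since `hashlib` behaves the same on both platforms.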
*	[SPARK-4501][Core] - Create build/mvn to automatically download maven/zinc/scalac	Brennon York	2014-12-27	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Creates a top level directory script (as `build/mvn`) to automatically download zinc and the specific version of scala used to easily build spark. This will also download and install maven if the user doesn't already have it and all packages are hosted under the `build/` directory. Tested on both Linux and OS X, and both work. All commands pass through to the maven binary so it acts exactly as a traditional maven call would. Author: Brennon York <brennon.york@capitalone.com> Closes #3707 from brennonyork/SPARK-4501 and squashes the following commits: 0e5a0e4 [Brennon York] minor incorrect doc verbiage (with -> this) 9b79e38 [Brennon York] fixed merge conflicts with dev/run-tests, properly quoted args in sbt/sbt, fixed bug where relative paths would fail if passed in from build/mvn d2d41b6 [Brennon York] added blurb about leveraging zinc with build/mvn b979c58 [Brennon York] updated the merge conflict c5634de [Brennon York] updated documentation to overview build/mvn, updated all points where sbt/sbt was referenced with build/sbt b8437ba [Brennon York] set progress bars for curl and wget when not run on jenkins, no progress bar when run on jenkins, moved sbt script to build/sbt, wrote stub and warning under sbt/sbt which calls build/sbt, modified build/sbt to use the correct directory, fixed bug in build/sbt-launch-lib.bash to correctly pull the sbt version be11317 [Brennon York] added switch to silence download progress only if AMPLAB_JENKINS is set 28d0a99 [Brennon York] updated to remove the python dependency, uses grep instead 7e785a6 [Brennon York] added silent and quiet flags to curl and wget respectively, added single echo output to denote start of a download if download is needed 14a5da0 [Brennon York] removed unnecessary zinc output on startup 1af4a94 [Brennon York] fixed bug with uppercase vs lowercase variable 3e8b9b3 [Brennon York] updated to
properly only restart zinc if it was freshly installed a680d12 [Brennon York] Added comments to functions and tested various mvn calls bb8cc9d [Brennon York] removed package files ef017e6 [Brennon York] removed OS complexities, setup generic install_app call, removed extra file complexities, removed help, removed forced install (defaults now), removed double-dash from cli 07bf018 [Brennon York] Updated to specifically handle pulling down the correct scala version f914dea [Brennon York] Beginning final portions of localized scala home 69c4e44 [Brennon York] working linux and osx installers for purely local mvn build 4a1609c [Brennon York] finalizing working linux install for maven to local ./build/apache-maven folder cbfcc68 [Brennon York] Changed the default sbt/sbt to build/sbt and added a build/mvn which will automatically download, install, and execute maven with zinc for easier build capability
*	[HOTFIX] Fixing two issues with the release script.	Patrick Wendell	2014-12-04	1	-11/+20
\| \| \| \| \| \| \| \| \| \| \|	1. The version replacement was still producing some false changes. 2. Uploads to the staging repo specifically. Author: Patrick Wendell <pwendell@gmail.com> Closes #3608 from pwendell/release-script and squashes the following commits: 3c63294 [Patrick Wendell] Fixing two issues with the release script:
*	[HOTFIX]: Adding back without-hive dist	Patrick Wendell	2014-11-25	1	-0/+1
\|
*	SPARK-4466: Provide support for publishing Scala 2.11 artifacts to Maven	Patrick Wendell	2014-11-17	1	-33/+105
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The maven release plug-in does not have support for publishing two separate sets of artifacts for a single release. Because of the way that Scala 2.11 support in Spark works, we have to write some customized code to do this. The good news is that the Maven release API is just a thin wrapper on doing git commits and pushing artifacts to the HTTP API of Apache's Sonatype server and this might overall make our deployment easier to understand. This was already used for the 1.2 snapshot, so I think it is working well. One other nice thing is this could be pretty easily extended to publish nightly snapshots. Author: Patrick Wendell <pwendell@gmail.com> Closes #3332 from pwendell/releases and squashes the following commits: 2fedaed [Patrick Wendell] Automate the opening and closing of Sonatype repos e2a24bb [Patrick Wendell] Fixing issue where we overrode non-spark version numbers 9df3a50 [Patrick Wendell] Adding TODO 1cc1749 [Patrick Wendell] Don't build the thriftserver for 2.11 933201a [Patrick Wendell] Make tagging of release commit eager d0388a6 [Patrick Wendell] Support Scala 2.11 build 4f4dc62 [Patrick Wendell] Change to 2.11 should not be included when committing new patch bf742e1 [Patrick Wendell] Minor fixes ffa1df2 [Patrick Wendell] Adding a Scala 2.11 package to test it 9ac4381 [Patrick Wendell] Addressing TODO b3105ff [Patrick Wendell] Removing commented out code d906803 [Patrick Wendell] Small fix 3f4d985 [Patrick Wendell] More work fcd54c2 [Patrick Wendell] Consolidating use of keys df2af30 [Patrick Wendell] Changes to release stuff
*	[Release] Correct make-distribution.sh log path	Andrew Or	2014-11-12	1	-1/+1
\|
*	Support cross building for Scala 2.11	Prashant Sharma	2014-11-11	1	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Let's give this another go using a version of Hive that shades its JLine dependency. Author: Prashant Sharma <prashant.s@imaginea.com> Author: Patrick Wendell <pwendell@gmail.com> Closes #3159 from pwendell/scala-2.11-prashant and squashes the following commits: e93aa3e [Patrick Wendell] Restoring -Phive-thriftserver profile and cleaning up build script. f65d17d [Patrick Wendell] Fixing build issue due to merge conflict a8c41eb [Patrick Wendell] Reverting dev/run-tests back to master state. 7a6eb18 [Patrick Wendell] Merge remote-tracking branch 'apache/master' into scala-2.11-prashant 583aa07 [Prashant Sharma] REVERT ME: removed hive thriftserver 3680e58 [Prashant Sharma] Revert "REVERT ME: Temporarily removing some Cli tests." 935fb47 [Prashant Sharma] Revert "Fixed by disabling a few tests temporarily." 925e90f [Prashant Sharma] Fixed by disabling a few tests temporarily. 2fffed3 [Prashant Sharma] Exclude groovy from sbt build, and also provide a way for such instances in future. 8bd4e40 [Prashant Sharma] Switched to gmaven plus, it fixes random failures observed with its predecessor gmaven. 5272ce5 [Prashant Sharma] SPARK_SCALA_VERSION related bugs. 2121071 [Patrick Wendell] Migrating version detection to PySpark b1ed44d [Patrick Wendell] REVERT ME: Temporarily removing some Cli tests. 1743a73 [Patrick Wendell] Removing decimal test that doesn't work with Scala 2.11 f5cad4e [Patrick Wendell] Add Scala 2.11 docs 210d7e1 [Patrick Wendell] Revert "Testing new Hive version with shaded jline" 48518ce [Patrick Wendell] Remove association of Hive and Thriftserver profiles.
e9d0a06 [Patrick Wendell] Revert "Enable thriftserver for Scala 2.10 only" 67ec364 [Patrick Wendell] Guard building of thriftserver around Scala 2.10 check 8502c23 [Patrick Wendell] Enable thriftserver for Scala 2.10 only e22b104 [Patrick Wendell] Small fix in pom file ec402ab [Patrick Wendell] Various fixes 0be5a9d [Patrick Wendell] Testing new Hive version with shaded jline 4eaec65 [Prashant Sharma] Changed scripts to ignore target. 5167bea [Prashant Sharma] small correction a4fcac6 [Prashant Sharma] Run against scala 2.11 on jenkins. 80285f4 [Prashant Sharma] Maven equivalent of setting spark.executor.extraClasspath during tests. 034b369 [Prashant Sharma] Setting test jars on executor classpath during tests from sbt. d4874cb [Prashant Sharma] Fixed Python Runner suite. null check should be first case in scala 2.11. 6f50f13 [Prashant Sharma] Fixed build after rebasing with master. We should use ${scala.binary.version} instead of just 2.10 e56ca9d [Prashant Sharma] Print an error if build for 2.10 and 2.11 is spotted. 937c0b8 [Prashant Sharma] SCALA_VERSION -> SPARK_SCALA_VERSION cb059b0 [Prashant Sharma] Code review 0476e5e [Prashant Sharma] Scala 2.11 support with repl and all build changes.
*	[Release] Log build output for each distribution	Andrew Or	2014-11-11	1	-1/+2
\|
*	BUILD: Adding back CDH4 as per user requests	Patrick Wendell	2014-08-29	1	-0/+1
\|
*	HOTFIX: Don't build with YARN support for Mapr3	Patrick Wendell	2014-08-27	1	-1/+1
\|
*	BUILD: Bump Hadoop versions in the release build.	Patrick Wendell	2014-08-20	1	-5/+5
\| \| \| \|	Also, minor modifications to the MapR profile.
*	SPARK-3092 [SQL]: Always include the thriftserver when -Phive is enabled.	Patrick Wendell	2014-08-20	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \|	Currently we have a separate profile called hive-thriftserver. I originally suggested this in case users did not want to bundle the thriftserver, but it has ultimately led to a lot of confusion. Since the thriftserver is only a few classes, I don't see a really good reason to isolate it from the rest of Hive. So let's go ahead and just include it in the same profile to simplify things. This has been suggested in the past by liancheng. Author: Patrick Wendell <pwendell@gmail.com> Closes #2006 from pwendell/hiveserver and squashes the following commits: 742ea40 [Patrick Wendell] Merge remote-tracking branch 'apache/master' into hiveserver 034ad47 [Patrick Wendell] SPARK-3092: Always include the thriftserver when -Phive is enabled.
*	SPARK-2884: Create binary builds in parallel with release script.	Patrick Wendell	2014-08-17	1	-4/+5
\|
*	HOTFIX: Support custom Java 7 location	Patrick Wendell	2014-08-06	1	-1/+8
\|
*	[SPARK-1981] Add AWS Kinesis streaming support	Chris Fregly	2014-08-02	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Author: Chris Fregly <chris@fregly.com> Closes #1434 from cfregly/master and squashes the following commits: 4774581 [Chris Fregly] updated docs, renamed retry to retryRandom to be more clear, removed retries around store() method 0393795 [Chris Fregly] moved Kinesis examples out of examples/ and back into extras/kinesis-asl 691a6be [Chris Fregly] fixed tests and formatting, fixed a bug with JavaKinesisWordCount during union of streams 0e1c67b [Chris Fregly] Merge remote-tracking branch 'upstream/master' 74e5c7c [Chris Fregly] updated per TD's feedback. simplified examples, updated docs e33cbeb [Chris Fregly] Merge remote-tracking branch 'upstream/master' bf614e9 [Chris Fregly] per matei's feedback: moved the kinesis examples into the examples/ dir d17ca6d [Chris Fregly] per TD's feedback: updated docs, simplified the KinesisUtils api 912640c [Chris Fregly] changed the foundKinesis class to be a publicly available class db3eefd [Chris Fregly] Merge remote-tracking branch 'upstream/master' 21de67f [Chris Fregly] Merge remote-tracking branch 'upstream/master' 6c39561 [Chris Fregly] parameterized the versions of the aws java sdk and kinesis client 338997e [Chris Fregly] improve build docs for kinesis 828f8ae [Chris Fregly] more cleanup e7c8978 [Chris Fregly] Merge remote-tracking branch 'upstream/master' cd68c0d [Chris Fregly] fixed typos and backward compatibility d18e680 [Chris Fregly] Merge remote-tracking branch 'upstream/master' b3b0ff1 [Chris Fregly] [SPARK-1981] Add AWS Kinesis streaming support
*	SPARK-2741 - Publish version of spark assembly which does not contain Hive	Brock Noland	2014-07-30	1	-0/+2
\| \| \| \| \| \| \| \| \| \|	Provide a version of the Spark tarball which does not package Hive. This is meant for Hive + Spark users. Author: Brock Noland <brock@apache.org> Closes #1667 from brockn/master and squashes the following commits: 5beafb2 [Brock Noland] SPARK-2741 - Publish version of spark assembly which does not contain Hive
*	[SPARK-2410][SQL] Merging Hive Thrift/JDBC server (with Maven profile fix)	Cheng Lian	2014-07-28	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	JIRA issue: [SPARK-2410](https://issues.apache.org/jira/browse/SPARK-2410) Another try for #1399 & #1600. Those two PRs broke Jenkins builds because we made a separate profile `hive-thriftserver` in sub-project `assembly`, but the `hive-thriftserver` module is defined outside the `hive-thriftserver` profile. Thus every pull request that doesn't touch SQL code also executes the test suites defined in `hive-thriftserver`, but the tests fail because the related .class files are not included in the assembly jar. In the most recent commit, module `hive-thriftserver` is moved into its own profile to fix this problem. All previous commits are squashed for clarity. Author: Cheng Lian <lian.cs.zju@gmail.com> Closes #1620 from liancheng/jdbc-with-maven-fix and squashes the following commits: 629988e [Cheng Lian] Moved hive-thriftserver module definition into its own profile ec3c7a7 [Cheng Lian] Cherry picked the Hive Thrift server
*	Revert "[SPARK-2410][SQL] Merging Hive Thrift/JDBC server"	Patrick Wendell	2014-07-27	1	-5/+5
\| \| \| \|	This reverts commit f6ff2a61d00d12481bfb211ae13d6992daacdcc2.
*	[SPARK-2410][SQL] Merging Hive Thrift/JDBC server	Cheng Lian	2014-07-27	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(This is a replacement of #1399, trying to fix potential `HiveThriftServer2` port collision between parallel builds. Please refer to [these comments](https://github.com/apache/spark/pull/1399#issuecomment-50212572) for details.) JIRA issue: [SPARK-2410](https://issues.apache.org/jira/browse/SPARK-2410) Merging the Hive Thrift/JDBC server from [branch-1.0-jdbc](https://github.com/apache/spark/tree/branch-1.0-jdbc). Thanks chenghao-intel for his initial contribution of the Spark SQL CLI. Author: Cheng Lian <lian.cs.zju@gmail.com> Closes #1600 from liancheng/jdbc and squashes the following commits: ac4618b [Cheng Lian] Uses random port for HiveThriftServer2 to avoid collision with parallel builds 090beea [Cheng Lian] Revert changes related to SPARK-2678, decided to move them to another PR 21c6cf4 [Cheng Lian] Updated Spark SQL programming guide docs fe0af31 [Cheng Lian] Reordered spark-submit options in spark-shell[.cmd] 199e3fb [Cheng Lian] Disabled MIMA for hive-thriftserver 1083e9d [Cheng Lian] Fixed failed test suites 7db82a1 [Cheng Lian] Fixed spark-submit application options handling logic 9cc0f06 [Cheng Lian] Starts beeline with spark-submit cfcf461 [Cheng Lian] Updated documents and build scripts for the newly added hive-thriftserver profile 061880f [Cheng Lian] Addressed all comments by @pwendell 7755062 [Cheng Lian] Adapts test suites to spark-submit settings 40bafef [Cheng Lian] Fixed more license header issues e214aab [Cheng Lian] Added missing license headers b8905ba [Cheng Lian] Fixed minor issues in spark-sql and start-thriftserver.sh f975d22 [Cheng Lian] Updated docs for Hive compatibility and Shark migration guide draft 3ad4e75 [Cheng Lian] Starts spark-sql shell with spark-submit a5310d1 [Cheng Lian] Make HiveThriftServer2 play well with spark-submit 61f39f4 [Cheng Lian] Starts Hive Thrift server via spark-submit 2c4c539 [Cheng Lian] Cherry picked 
the Hive Thrift server
*	Revert "[SPARK-2410][SQL] Merging Hive Thrift/JDBC server"	Michael Armbrust	2014-07-25	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit 06dc0d2c6b69c5d59b4d194ced2ac85bfe2e05e2. #1399 is making Jenkins fail. We should investigate and put this back after it's passing tests. Author: Michael Armbrust <michael@databricks.com> Closes #1594 from marmbrus/revertJDBC and squashes the following commits: 59748da [Michael Armbrust] Revert "[SPARK-2410][SQL] Merging Hive Thrift/JDBC server"
*	[SPARK-2410][SQL] Merging Hive Thrift/JDBC server	Cheng Lian	2014-07-25	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	JIRA issue: - Main: [SPARK-2410](https://issues.apache.org/jira/browse/SPARK-2410) - Related: [SPARK-2678](https://issues.apache.org/jira/browse/SPARK-2678) Cherry picked the Hive Thrift/JDBC server from [branch-1.0-jdbc](https://github.com/apache/spark/tree/branch-1.0-jdbc). (Thanks chenghao-intel for his initial contribution of the Spark SQL CLI.) TODO - [x] Use `spark-submit` to launch the server, the CLI and beeline - [x] Migration guideline draft for Shark users ---- Hit by a bug in `SparkSubmitArguments` while working on this PR: all application options that are recognized by `SparkSubmitArguments` are stolen as `SparkSubmit` options. For example: ```bash $ spark-submit --class org.apache.hive.beeline.BeeLine spark-internal --help ``` This actually shows usage information of `SparkSubmit` rather than `BeeLine`. ~~Fixed this bug here since the `spark-internal` related stuff also touches `SparkSubmitArguments` and I'd like to avoid conflict.~~ UPDATE The bug mentioned above is now tracked by [SPARK-2678](https://issues.apache.org/jira/browse/SPARK-2678). Decided to revert changes to this bug since it involves more subtle considerations and is worth a separate PR.
Author: Cheng Lian <lian.cs.zju@gmail.com> Closes #1399 from liancheng/thriftserver and squashes the following commits: 090beea [Cheng Lian] Revert changes related to SPARK-2678, decided to move them to another PR 21c6cf4 [Cheng Lian] Updated Spark SQL programming guide docs fe0af31 [Cheng Lian] Reordered spark-submit options in spark-shell[.cmd] 199e3fb [Cheng Lian] Disabled MIMA for hive-thriftserver 1083e9d [Cheng Lian] Fixed failed test suites 7db82a1 [Cheng Lian] Fixed spark-submit application options handling logic 9cc0f06 [Cheng Lian] Starts beeline with spark-submit cfcf461 [Cheng Lian] Updated documents and build scripts for the newly added hive-thriftserver profile 061880f [Cheng Lian] Addressed all comments by @pwendell 7755062 [Cheng Lian] Adapts test suites to spark-submit settings 40bafef [Cheng Lian] Fixed more license header issues e214aab [Cheng Lian] Added missing license headers b8905ba [Cheng Lian] Fixed minor issues in spark-sql and start-thriftserver.sh f975d22 [Cheng Lian] Updated docs for Hive compatibility and Shark migration guide draft 3ad4e75 [Cheng Lian] Starts spark-sql shell with spark-submit a5310d1 [Cheng Lian] Make HiveThriftServer2 play well with spark-submit 61f39f4 [Cheng Lian] Starts Hive Thrift server via spark-submit 2c4c539 [Cheng Lian] Cherry picked the Hive Thrift server
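The `SparkSubmitArguments` problem described in the commit message above (application options being consumed as launcher options) can be illustrated with a small sketch. This is not Spark's parser: `split_args` is a hypothetical function and the flag sets are a small illustrative subset of `spark-submit` options. It shows one way a launcher could stop interpreting options once it reaches the primary resource, so that everything after it is handed to the application untouched:

```python
# Illustrative subsets only; the real spark-submit accepts many more options.
LAUNCHER_FLAGS_WITH_VALUE = {"--class", "--master", "--name"}
LAUNCHER_SWITCHES = {"--help", "--verbose"}


def split_args(argv):
    """Split argv into (launcher options, primary resource, application args).

    Hypothetical sketch: launcher options are only recognized *before* the
    primary resource; everything after it belongs to the application.
    """
    launcher = {}
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg in LAUNCHER_FLAGS_WITH_VALUE:
            if i + 1 >= len(argv):
                raise ValueError(f"missing value for {arg}")
            launcher[arg] = argv[i + 1]
            i += 2
        elif arg in LAUNCHER_SWITCHES:
            launcher[arg] = True
            i += 1
        elif arg.startswith("-"):
            raise ValueError(f"unknown launcher option {arg}")
        else:
            # First positional token is the primary resource; stop parsing here.
            return launcher, arg, argv[i + 1:]
    return launcher, None, []


if __name__ == "__main__":
    opts, resource, app_args = split_args(
        ["--class", "org.apache.hive.beeline.BeeLine", "spark-internal", "--help"]
    )
    print(opts, resource, app_args)
```

With this split, the trailing `--help` in the example command ends up in the application arguments rather than triggering the launcher's own usage output.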
*	SPARK-2526: Simplify options in make-distribution.sh	Patrick Wendell	2014-07-17	1	-4/+5
\| \| \| \| \| \| \| \| \| \| \| \| \|	Right now we have a bunch of parallel logic in make-distribution.sh that's just extra work to maintain. We should just pass through Maven profiles in this case and keep the script simple. See the JIRA for more details. Author: Patrick Wendell <pwendell@gmail.com> Closes #1445 from pwendell/make-distribution.sh and squashes the following commits: f1294ea [Patrick Wendell] Simplify options in make-distribution.sh.
*	HOTFIX: Clean before building docs during release.	Patrick Wendell	2014-07-04	1	-0/+1
\| \| \| \| \| \|	If the docs are built after a Maven build has finished, the intermediate state somehow causes a compiler bug during sbt compilation. This just does a clean before attempting to build the docs.
*	HOTFIX: Don't build Javadoc in Maven when creating releases.	Patrick Wendell	2014-05-15	1	-2/+4
\| \| \| \| \| \| \| \| \| \|	Because we've added java package descriptions in some packages that don't have any Java files, running the Javadoc target hits this issue: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4492654 To fix this I've simply removed the javadoc target when publishing releases.
*	Adding hadoop-2.2 profile to the build	Patrick Wendell	2014-05-12	1	-2/+2
\|
*	BUILD: Include Hive with default packages when creating a release	Patrick Wendell	2014-05-12	1	-3/+3
\|
*	HOTFIX: minor change to release script	Patrick Wendell	2014-04-29	1	-1/+1
\|
*	HOTFIX: minor change to release script	Patrick Wendell	2014-04-29	1	-2/+4
\|
*	HOTFIX: Bug in release script	Patrick Wendell	2014-04-29	1	-0/+1
\|
*	Changes to dev release script	Patrick Wendell	2014-04-28	1	-27/+32
\|
*	Small changes to release script	Patrick Wendell	2014-04-24	1	-3/+1
\|
*	SPARK-1119 and other build improvements	Patrick Wendell	2014-04-23	1	-9/+9
\| \| \| \| \| \| \| \| \| \| \| \|	1. Makes assembly and examples jar naming consistent in maven/sbt. 2. Updates make-distribution.sh to use Maven and fixes some bugs. 3. Updates the create-release script to call make-distribution script. Author: Patrick Wendell <pwendell@gmail.com> Closes #502 from pwendell/make-distribution and squashes the following commits: 1a97f0d [Patrick Wendell] SPARK-1119 and other build improvements
*	Dev script: include RC name in git tag	Patrick Wendell	2014-04-21	1	-1/+1
\|
*	SPARK-1314: Use SPARK_HIVE to determine if we include Hive in packaging	Aaron Davidson	2014-04-06	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, we decided whether to include the datanucleus jars based on the existence of a spark-hive-assembly jar, which was incidentally built whenever "sbt assembly" is run. This means that a typical and previously supported pathway would start using hive jars. This patch has the following features/bug fixes: - Use of SPARK_HIVE (default false) to determine if we should include Hive in the assembly jar. - Analogous feature in Maven with -Phive (previously, there was no support for adding Hive to any of our jars produced by Maven) - assemble-deps fixed since we no longer use a different ASSEMBLY_DIR - avoid adding log message in compute-classpath.sh to the classpath :) Still TODO before mergeable: - We need to download the datanucleus jars outside of sbt. Perhaps we can have spark-class download them if SPARK_HIVE is set similar to how sbt downloads itself. - Spark SQL documentation updates. Author: Aaron Davidson <aaron@databricks.com> Closes #237 from aarondav/master and squashes the following commits: 5dc4329 [Aaron Davidson] Typo fixes dd4f298 [Aaron Davidson] Doc update dd1a365 [Aaron Davidson] Eliminate need for SPARK_HIVE at runtime by d/ling datanucleus from Maven a9269b5 [Aaron Davidson] [WIP] Use SPARK_HIVE to determine if we include Hive in packaging
*	SPARK-1167: Remove metrics-ganglia from default build due to LGPL issues...	Patrick Wendell	2014-03-11	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch removes Ganglia integration from the default build. It allows users willing to link against LGPL code to use Ganglia by adding build flags or linking against a new Spark artifact called spark-ganglia-lgpl. This brings Spark in line with the Apache policy on LGPL code enumerated here: https://www.apache.org/legal/3party.html#options-optional Author: Patrick Wendell <pwendell@gmail.com> Closes #108 from pwendell/ganglia and squashes the following commits: 326712a [Patrick Wendell] Responding to review feedback 5f28ee4 [Patrick Wendell] SPARK-1167: Remove metrics-ganglia from default build due to LGPL issues.
*	Add Jekyll tag to isolate "production-only" doc components.	Patrick Wendell	2014-03-02	1	-1/+1
\| \| \| \| \| \| \| \|	Author: Patrick Wendell <pwendell@gmail.com> Closes #56 from pwendell/jekyll-prod and squashes the following commits: 1bdc3a8 [Patrick Wendell] Add Jekyll tag to isolate "production-only" doc components.