spark - Mirror of Apache Spark

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	[SPARK-13456][SQL] fix creating encoders for case classes defined in Spark shell	Wenchen Fan	2016-03-21	2	-4/+68
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	## What changes were proposed in this pull request? case classes defined in REPL are wrapped by line classes, and we have a trick for scala 2.10 REPL to automatically register the wrapper classes to `OuterScope` so that we can use when create encoders. However, this trick doesn't work right after we upgrade to scala 2.11, and unfortunately the tests are only in scala 2.10, which makes this bug hidden until now. This PR moves the encoder tests to scala 2.11 `ReplSuite`, and fixes this bug by another approach(the previous trick can't port to scala 2.11 REPL): make `OuterScope` smarter that can detect classes defined in REPL and load the singleton of line wrapper classes automatically. ## How was this patch tested? the migrated encoder tests in `ReplSuite` Author: Wenchen Fan <wenchen@databricks.com> Closes #11410 from cloud-fan/repl.
*	[SPARK-13928] Move org.apache.spark.Logging into ↵	Wenchen Fan	2016-03-17	8	-13/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	org.apache.spark.internal.Logging ## What changes were proposed in this pull request? Logging was made private in Spark 2.0. If we move it, then users would be able to create a Logging trait themselves to avoid changing their own code. ## How was this patch tested? existing tests. Author: Wenchen Fan <wenchen@databricks.com> Closes #11764 from cloud-fan/logger.
*	[SPARK-13626][CORE] Avoid duplicate config deprecation warnings.	Marcelo Vanzin	2016-03-14	3	-28/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Three different things were needed to get rid of spurious warnings: - silence deprecation warnings when cloning configuration - change the way SparkHadoopUtil instantiates SparkConf to silence warnings - avoid creating new SparkConf instances where it's not needed. On top of that, I changed the way that Logging.scala detects the repl; now it uses a method that is overridden in the repl's Main class, and the hack in Utils.scala is not needed anymore. This makes the 2.11 repl behave like the 2.10 one and set the default log level to WARN, which is a lot better. Previously, this wasn't working because the 2.11 repl triggers log initialization earlier than the 2.10 one. I also removed and simplified some other code in the 2.11 repl's Main to avoid replicating logic that already exists elsewhere in Spark. Tested the 2.11 repl in local and yarn modes. Author: Marcelo Vanzin <vanzin@cloudera.com> Closes #11510 from vanzin/SPARK-13626.
*	[SPARK-3854][BUILD] Scala style: require spaces before `{`.	Dongjoon Hyun	2016-03-10	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	## What changes were proposed in this pull request? Since the opening curly brace, '{', has many usages as discussed in [SPARK-3854](https://issues.apache.org/jira/browse/SPARK-3854), this PR adds a ScalaStyle rule to prevent '){' pattern for the following majority pattern and fixes the code accordingly. If we enforce this in ScalaStyle from now, it will improve the Scala code quality and reduce review time. ``` // Correct: if (true) { println("Wow!") } // Incorrect: if (true){ println("Wow!") } ``` IntelliJ also shows new warnings based on this. ## How was this patch tested? Pass the Jenkins ScalaStyle test. Author: Dongjoon Hyun <dongjoon@apache.org> Closes #11637 from dongjoon-hyun/SPARK-3854.
*	[MINOR] Fix typos in comments and testcase name of code	Dongjoon Hyun	2016-03-03	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	## What changes were proposed in this pull request? This PR fixes typos in comments and testcase name of code. ## How was this patch tested? manual. Author: Dongjoon Hyun <dongjoon@apache.org> Closes #11481 from dongjoon-hyun/minor_fix_typos_in_code.
*	[SPARK-13583][CORE][STREAMING] Remove unused imports and add checkstyle rule	Dongjoon Hyun	2016-03-03	3	-4/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	## What changes were proposed in this pull request? After SPARK-6990, `dev/lint-java` keeps Java code healthy and helps PR review by saving much time. This issue aims remove unused imports from Java/Scala code and add `UnusedImports` checkstyle rule to help developers. ## How was this patch tested? ``` ./dev/lint-java ./build/sbt compile ``` Author: Dongjoon Hyun <dongjoon@apache.org> Closes #11438 from dongjoon-hyun/SPARK-13583.
*	[SPARK-13086][SHELL] Use the Scala REPL settings, to enable things like `-i ↵	Iulian Dragos	2016-02-09	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	file`. Now: ``` $ bin/spark-shell -i test.scala NOTE: SPARK_PREPEND_CLASSES is set, placing locally compiled Spark classes ahead of assembly. Setting default log level to "WARN". To adjust logging level use sc.setLogLevel(newLevel). 16/01/29 17:37:38 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 16/01/29 17:37:39 INFO Main: Created spark context.. Spark context available as sc (master = local[*], app id = local-1454085459000). 16/01/29 17:37:39 INFO Main: Created sql context.. SQL context available as sqlContext. Loading test.scala... hello Welcome to ____ __ / __/__ ___ _____/ /__ _\ \/ _ \/ _ `/ __/ '_/ /___/ .__/\_,_/_/ /_/\_\ version 2.0.0-SNAPSHOT /_/ Using Scala version 2.11.7 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_45) Type in expressions to have them evaluated. Type :help for more information. ``` Author: Iulian Dragos <jaguarul@gmail.com> Closes #10984 from dragos/issue/repl-eval-file.
*	[SPARK-6363][BUILD] Make Scala 2.11 the default Scala version	Josh Rosen	2016-01-30	3	-11/+13
\| \| \| \| \| \| \| \| \| \| \| \|	This patch changes Spark's build to make Scala 2.11 the default Scala version. To be clear, this does not mean that Spark will stop supporting Scala 2.10: users will still be able to compile Spark for Scala 2.10 by following the instructions on the "Building Spark" page; however, it does mean that Scala 2.11 will be the default Scala version used by our CI builds (including pull request builds). The Scala 2.11 compiler is faster than 2.10, so I think we'll be able to look forward to a slight speedup in our CI builds (it looks like it's about 2X faster for the Maven compile-only builds, for instance). After this patch is merged, I'll update Jenkins to add new compile-only jobs to ensure that Scala 2.10 compilation doesn't break. Author: Josh Rosen <joshrosen@databricks.com> Closes #10608 from JoshRosen/SPARK-6363.
*	[SPARK-12761][CORE] Remove duplicated code	Jakob Odersky	2016-01-13	1	-5/+1
\| \| \| \| \| \| \| \|	Removes some duplicated code that was reintroduced during a merge. Author: Jakob Odersky <jodersky@gmail.com> Closes #10711 from jodersky/repl-2.11-duplicate.
*	[SPARK-3873][TESTS] Import ordering fixes.	Marcelo Vanzin	2016-01-05	1	-4/+4
\| \| \| \| \| \|	Author: Marcelo Vanzin <vanzin@cloudera.com> Closes #10582 from vanzin/SPARK-3873-tests.
*	[SPARK-3873][STREAMING] Import order fixes for streaming.	Marcelo Vanzin	2015-12-31	1	-4/+3
\| \| \| \| \| \| \| \|	Also included a few miscelaneous other modules that had very few violations. Author: Marcelo Vanzin <vanzin@cloudera.com> Closes #10532 from vanzin/SPARK-3873-streaming.
*	[SPARK-12311][CORE] Restore previous value of "os.arch" property in test ↵	Kazuaki Ishizaki	2015-12-24	1	-6/+9
\| \| \| \| \| \| \| \| \| \| \| \|	suites after forcing to set specific value to "os.arch" property Restore the original value of os.arch property after each test Since some of tests forced to set the specific value to os.arch property, we need to set the original value. Author: Kazuaki Ishizaki <ishizaki@jp.ibm.com> Closes #10289 from kiszk/SPARK-12311.
*	[SPARK-11808] Remove Bagel.	Reynold Xin	2015-12-19	1	-6/+0
\| \| \| \| \| \|	Author: Reynold Xin <rxin@databricks.com> Closes #10395 from rxin/SPARK-11808.
*	Bump master version to 2.0.0-SNAPSHOT.	Reynold Xin	2015-12-19	1	-1/+1
\| \| \| \| \| \|	Author: Reynold Xin <rxin@databricks.com> Closes #10387 from rxin/version-bump.
*	[SPARK-12350][CORE] Don't log errors when requested stream is not found.	Marcelo Vanzin	2015-12-18	1	-2/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a client requests a non-existent stream, just send a failure message back, without logging any error on the server side (since it's not a server error). On the executor side, avoid error logs by translating any errors during transfer to a `ClassNotFoundException`, so that loading the class is retried on a the parent class loader. This can mask IO errors during transmission, but the most common cause is that the class is not served by the remote end. Author: Marcelo Vanzin <vanzin@cloudera.com> Closes #10337 from vanzin/SPARK-12350.
*	[SPARK-11563][CORE][REPL] Use RpcEnv to transfer REPL-generated classes.	Marcelo Vanzin	2015-12-10	5	-67/+85
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This avoids bringing up yet another HTTP server on the driver, and instead reuses the file server already managed by the driver's RpcEnv. As a bonus, the repl now inherits the security features of the network library. There's also a small change to create the directory for storing classes under the root temp dir for the application (instead of directly under java.io.tmpdir). Author: Marcelo Vanzin <vanzin@cloudera.com> Closes #9923 from vanzin/SPARK-11563.
*	[SPARK-11832][CORE] Process arguments in spark-shell for Scala 2.11	Jakob Odersky	2015-12-10	2	-13/+27
\| \| \| \| \| \| \| \|	Process arguments passed to the spark-shell. Fixes running the spark-shell from within a build environment. Author: Jakob Odersky <jodersky@gmail.com> Closes #9824 from jodersky/shell-2.11.
*	[SPARK-11929][CORE] Make the repl log4j configuration override the root logger.	Marcelo Vanzin	2015-11-24	2	-22/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the default Spark distribution, there are currently two separate log4j config files, with different default values for the root logger, so that when running the shell you have a different default log level. This makes the shell more usable, since the logs don't overwhelm the output. But if you install a custom log4j.properties, you lose that, because then it's going to be used no matter whether you're running a regular app or the shell. With this change, the overriding of the log level is done differently; the log level repl's main class (org.apache.spark.repl.Main) is used to define the root logger's level when running the shell, defaulting to WARN if it's not set explicitly. On a somewhat related change, the shell output about the "sc" variable was changed a bit to contain a little more useful information about the application, since when the root logger's log level is WARN, that information is never shown to the user. Author: Marcelo Vanzin <vanzin@cloudera.com> Closes #9816 from vanzin/shell-logging.
*	[SPARK-11818][REPL] Fix ExecutorClassLoader to lookup resources from …	Jungtaek Lim	2015-11-24	2	-1/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	…parent class loader Without patch, two additional tests of ExecutorClassLoaderSuite fails. - "resource from parent" - "resources from parent" Detailed explanation is here, https://issues.apache.org/jira/browse/SPARK-11818?focusedCommentId=15011202&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15011202 Author: Jungtaek Lim <kabhwan@gmail.com> Closes #9812 from HeartSaVioR/SPARK-11818.
*	[SPARK-11889][SQL] Fix type inference for GroupedDataset.agg in REPL	Michael Armbrust	2015-11-20	1	-0/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	In this PR I delete a method that breaks type inference for aggregators (only in the REPL) The error when this method is present is: ``` <console>:38: error: missing parameter type for expanded function ((x$2) => x$2._2) ds.groupBy(_._1).agg(sum(_._2), sum(_._3)).collect() ``` Author: Michael Armbrust <michael@databricks.com> Closes #9870 from marmbrus/dataset-repl-agg.
*	[SPARK-11636][SQL] Support classes defined in the REPL with Encoders	Michael Armbrust	2015-11-20	3	-5/+41
\| \| \| \| \| \| \| \| \|	#theScaryParts (i.e. changes to the repl, executor classloaders and codegen)... Author: Michael Armbrust <michael@databricks.com> Author: Yin Huai <yhuai@databricks.com> Closes #9825 from marmbrus/dataset-replClasses2.
*	[SPARK-6152] Use shaded ASM5 to support closure cleaning of Java 8 compiled ↵	Josh Rosen	2015-11-11	2	-5/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	classes This patch modifies Spark's closure cleaner (and a few other places) to use ASM 5, which is necessary in order to support cleaning of closures that were compiled by Java 8. In order to avoid ASM dependency conflicts, Spark excludes ASM from all of its dependencies and uses a shaded version of ASM 4 that comes from `reflectasm` (see [SPARK-782](https://issues.apache.org/jira/browse/SPARK-782) and #232). This patch updates Spark to use a shaded version of ASM 5.0.4 that was published by the Apache XBean project; the POM used to create the shaded artifact can be found at https://github.com/apache/geronimo-xbean/blob/xbean-4.4/xbean-asm5-shaded/pom.xml. http://movingfulcrum.tumblr.com/post/80826553604/asm-framework-50-the-missing-migration-guide was a useful resource while upgrading the code to use the new ASM5 opcodes. I also added a new regression tests in the `java8-tests` subproject; the existing tests were insufficient to catch this bug, which only affected Scala 2.11 user code which was compiled targeting Java 8. Author: Josh Rosen <joshrosen@databricks.com> Closes #9512 from JoshRosen/SPARK-6152.
*	[SPARK-10300] [BUILD] [TESTS] Add support for test tags in run-tests.py.	Marcelo Vanzin	2015-10-07	1	-0/+4
\| \| \| \| \| \|	Author: Marcelo Vanzin <vanzin@cloudera.com> Closes #8775 from vanzin/SPARK-10300.
*	Update version to 1.6.0-SNAPSHOT.	Reynold Xin	2015-09-15	1	-1/+1
\| \| \| \| \| \|	Author: Reynold Xin <rxin@databricks.com> Closes #8350 from rxin/1.6.
*	[SPARK-10549] scala 2.11 spark on yarn with security - Repl doesn't work	Tom Graves	2015-09-14	1	-1/+2
\| \| \| \| \| \| \| \| \|	Make this lazy so that it can set the yarn mode before creating the securityManager. Author: Tom Graves <tgraves@yahoo-inc.com> Author: Thomas Graves <tgraves@staydecay.corp.gq1.yahoo.com> Closes #8719 from tgravescs/SPARK-10549.
*	[SPARK-10227] fatal warnings with sbt on Scala 2.11	Luc Bourlier	2015-09-09	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	The bulk of the changes are on `transient` annotation on class parameter. Often the compiler doesn't generate a field for this parameters, so the the transient annotation would be unnecessary. But if the class parameter are used in methods, then fields are created. So it is safer to keep the annotations. The remainder are some potential bugs, and deprecated syntax. Author: Luc Bourlier <luc.bourlier@typesafe.com> Closes #8433 from skyluc/issue/sbt-2.11.
*	[SPARK-9602] remove "Akka/Actor" words from comments	CodingCat	2015-08-04	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	https://issues.apache.org/jira/browse/SPARK-9602 Although we have hidden Akka behind RPC interface, I found that the Akka/Actor-related comments are still spreading everywhere. To make it consistent, we shall remove "actor"/"akka" words from the comments... Author: CodingCat <zhunansjtu@gmail.com> Closes #7936 from CodingCat/SPARK-9602 and squashes the following commits: e8296a3 [CodingCat] remove actor words from comments
*	[SPARK-9180] fix spark-shell to accept --name option	Kenichi Maehashi	2015-07-22	2	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	This patch fixes [[SPARK-9180]](https://issues.apache.org/jira/browse/SPARK-9180). Users can now set the app name of spark-shell using `spark-shell --name "whatever"`. Author: Kenichi Maehashi <webmaster@kenichimaehashi.com> Closes #7512 from kmaehashi/fix-spark-shell-app-name and squashes the following commits: e24991a [Kenichi Maehashi] use setIfMissing instead of setAppName 18aa4ad [Kenichi Maehashi] fix spark-shell to accept --name option
*	[SPARK-9244] Increase some memory defaults	Matei Zaharia	2015-07-22	2	-9/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There are a few memory limits that people hit often and that we could make higher, especially now that memory sizes have grown. - spark.akka.frameSize: This defaults at 10 but is often hit for map output statuses in large shuffles. This memory is not fully allocated up-front, so we can just make this larger and still not affect jobs that never sent a status that large. We increase it to 128. - spark.executor.memory: Defaults at 512m, which is really small. We increase it to 1g. Author: Matei Zaharia <matei@databricks.com> Closes #7586 from mateiz/configs and squashes the following commits: ce0038a [Matei Zaharia] [SPARK-9244] Increase some memory defaults
*	[SPARK-9015] [BUILD] Clean project import in scala ide	Jan Prach	2015-07-16	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Cleanup maven for a clean import in scala-ide / eclipse. * remove groovy plugin which is really not needed at all * add-source from build-helper-maven-plugin is not needed as recent version of scala-maven-plugin do it automatically * add lifecycle-mapping plugin to hide a few useless warnings from ide Author: Jan Prach <jendap@gmail.com> Closes #7375 from jendap/clean-project-import-in-scala-ide and squashes the following commits: c4b4c0f [Jan Prach] fix whitespaces 5a83e07 [Jan Prach] Revert "remove java compiler warnings from java tests" 312007e [Jan Prach] scala-maven-plugin itself add scala sources by default f47d856 [Jan Prach] remove spark-1.4-staging repository c8a54db [Jan Prach] remove java compiler warnings from java tests 999a068 [Jan Prach] remove some maven warnings in scala ide 80fbdc5 [Jan Prach] remove groovy and gmavenplus plugin
*	[SPARK-8962] Add Scalastyle rule to ban direct use of Class.forName; fix ↵	Josh Rosen	2015-07-14	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	existing uses This pull request adds a Scalastyle regex rule which fails the style check if `Class.forName` is used directly. `Class.forName` always loads classes from the default / system classloader, but in a majority of cases, we should be using Spark's own `Utils.classForName` instead, which tries to load classes from the current thread's context classloader and falls back to the classloader which loaded Spark when the context classloader is not defined. <!-- Reviewable:start --> [<img src="https://reviewable.io/review_button.png" height=40 alt="Review on Reviewable"/>](https://reviewable.io/reviews/apache/spark/7350) <!-- Reviewable:end --> Author: Josh Rosen <joshrosen@databricks.com> Closes #7350 from JoshRosen/ban-Class.forName and squashes the following commits: e3e96f7 [Josh Rosen] Merge remote-tracking branch 'origin/master' into ban-Class.forName c0b7885 [Josh Rosen] Hopefully fix the last two cases d707ba7 [Josh Rosen] Fix uses of Class.forName that I missed in my first cleanup pass 046470d [Josh Rosen] Merge remote-tracking branch 'origin/master' into ban-Class.forName 62882ee [Josh Rosen] Fix uses of Class.forName or add exclusion. d9abade [Josh Rosen] Add stylechecker rule to ban uses of Class.forName
*	[SPARK-7944] [SPARK-8013] Remove most of the Spark REPL fork for Scala 2.11	Iulian Dragos	2015-07-10	10	-3173/+80
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This PR removes most of the code in the Spark REPL for Scala 2.11 and leaves just a couple of overridden methods in `SparkILoop` in order to: - change welcome message - restrict available commands (like `:power`) - initialize Spark context The two codebases have diverged and it's extremely hard to backport fixes from the upstream REPL. This somewhat radical step is absolutely necessary in order to fix other REPL tickets (like SPARK-8013 - Hive Thrift server for 2.11). BTW, the Scala REPL has fixed the serialization-unfriendly wrappers thanks to ScrapCodes's work in [#4522](https://github.com/scala/scala/pull/4522) All tests pass and I tried the `spark-shell` on our Mesos cluster with some simple jobs (including with additional jars), everything looked good. As soon as Scala 2.11.7 is out we need to upgrade and get a shaded `jline` dependency, clearing the way for SPARK-8013. /cc pwendell Author: Iulian Dragos <jaguarul@gmail.com> Closes #6903 from dragos/issue/no-spark-repl-fork and squashes the following commits: c596c6f [Iulian Dragos] Merge branch 'master' into issue/no-spark-repl-fork 2b1a305 [Iulian Dragos] Removed spaces around multiple imports. 0ce67a6 [Iulian Dragos] Remove -verbose flag for java compiler (added by mistake in an earlier commit). 10edaf9 [Iulian Dragos] Keep the jline dependency only in the 2.10 build. 529293b [Iulian Dragos] Add back Spark REPL files to rat-excludes, since they are part of the 2.10 real. d85370d [Iulian Dragos] Remove jline dependency from the Spark REPL. b541930 [Iulian Dragos] Merge branch 'master' into issue/no-spark-repl-fork 2b15962 [Iulian Dragos] Change jline dependency and bump Scala version. b300183 [Iulian Dragos] Rename package and add license on top of the file, remove files from rat-excludes and removed `-Yrepl-sync` per reviewer’s request. 9d46d85 [Iulian Dragos] Fix SPARK-7944. abcc7cb [Iulian Dragos] Remove the REPL forked code.
*	[SPARK-7977] [BUILD] Disallowing println	Jonathan Alter	2015-07-10	7	-0/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Author: Jonathan Alter <jonalter@users.noreply.github.com> Closes #7093 from jonalter/SPARK-7977 and squashes the following commits: ccd44cc [Jonathan Alter] Changed println to log in ThreadingSuite 7fcac3e [Jonathan Alter] Reverting to println in ThreadingSuite 10724b6 [Jonathan Alter] Changing some printlns to logs in tests eeec1e7 [Jonathan Alter] Merge branch 'master' of github.com:apache/spark into SPARK-7977 0b1dcb4 [Jonathan Alter] More println cleanup aedaf80 [Jonathan Alter] Merge branch 'master' of github.com:apache/spark into SPARK-7977 925fd98 [Jonathan Alter] Merge branch 'master' of github.com:apache/spark into SPARK-7977 0c16fa3 [Jonathan Alter] Replacing some printlns with logs 45c7e05 [Jonathan Alter] Merge branch 'master' of github.com:apache/spark into SPARK-7977 5c8e283 [Jonathan Alter] Allowing println in audit-release examples 5b50da1 [Jonathan Alter] Allowing printlns in example files ca4b477 [Jonathan Alter] Merge branch 'master' of github.com:apache/spark into SPARK-7977 83ab635 [Jonathan Alter] Fixing new printlns 54b131f [Jonathan Alter] Merge branch 'master' of github.com:apache/spark into SPARK-7977 1cd8a81 [Jonathan Alter] Removing some unnecessary comments and printlns b837c3a [Jonathan Alter] Disallowing println
*	[SPARK-8683] [BUILD] Depend on mockito-core instead of mockito-all	Josh Rosen	2015-06-27	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Spark's tests currently depend on `mockito-all`, which bundles Hamcrest and Objenesis classes. Instead, it should depend on `mockito-core`, which declares those libraries as Maven dependencies. This is necessary in order to fix a dependency conflict that leads to a NoSuchMethodError when using certain Hamcrest matchers. See https://github.com/mockito/mockito/wiki/Declaring-mockito-dependency for more details. Author: Josh Rosen <joshrosen@databricks.com> Closes #7061 from JoshRosen/mockito-core-instead-of-all and squashes the following commits: 70eccbe [Josh Rosen] Depend on mockito-core instead of mockito-all.
*	[SPARK-8461] [SQL] fix codegen with REPL class loader	Davies Liu	2015-06-19	1	-0/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The ExecutorClassLoader for REPL will cause Janino failed to find class for those in java.lang, so switch to use default class loader for Janino, which will also help performance. cc liancheng yhuai Author: Davies Liu <davies@databricks.com> Closes #6898 from davies/fix_class_loader and squashes the following commits: 24276d4 [Davies Liu] add regression test 4ff0457 [Davies Liu] address comment, refactor 7f5ffbe [Davies Liu] fix REPL class loader with codegen
*	[SPARK-7801] [BUILD] Updating versions to SPARK 1.5.0	Patrick Wendell	2015-06-03	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	Author: Patrick Wendell <patrick@databricks.com> Closes #6328 from pwendell/spark-1.5-update and squashes the following commits: 2f42d02 [Patrick Wendell] A few more excludes 4bebcf0 [Patrick Wendell] Update to RC4 61aaf46 [Patrick Wendell] Using new release candidate 55f1610 [Patrick Wendell] Another exclude 04b4f04 [Patrick Wendell] More issues with transient 1.4 changes 36f549b [Patrick Wendell] [SPARK-7801] [BUILD] Updating versions to SPARK 1.5.0
*	[SPARK-7558] Demarcate tests in unit-tests.log	Andrew Or	2015-05-29	4	-8/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Right now `unit-tests.log` are not of much value because we can't tell where the test boundaries are easily. This patch adds log statements before and after each test to outline the test boundaries, e.g.: ``` ===== TEST OUTPUT FOR o.a.s.serializer.KryoSerializerSuite: 'kryo with parallelize for primitive arrays' ===== 15/05/27 12:36:39.596 pool-1-thread-1-ScalaTest-running-KryoSerializerSuite INFO SparkContext: Starting job: count at KryoSerializerSuite.scala:230 15/05/27 12:36:39.596 dag-scheduler-event-loop INFO DAGScheduler: Got job 3 (count at KryoSerializerSuite.scala:230) with 4 output partitions (allowLocal=false) 15/05/27 12:36:39.596 dag-scheduler-event-loop INFO DAGScheduler: Final stage: ResultStage 3(count at KryoSerializerSuite.scala:230) 15/05/27 12:36:39.596 dag-scheduler-event-loop INFO DAGScheduler: Parents of final stage: List() 15/05/27 12:36:39.597 dag-scheduler-event-loop INFO DAGScheduler: Missing parents: List() 15/05/27 12:36:39.597 dag-scheduler-event-loop INFO DAGScheduler: Submitting ResultStage 3 (ParallelCollectionRDD[5] at parallelize at KryoSerializerSuite.scala:230), which has no missing parents ... 15/05/27 12:36:39.624 pool-1-thread-1-ScalaTest-running-KryoSerializerSuite INFO DAGScheduler: Job 3 finished: count at KryoSerializerSuite.scala:230, took 0.028563 s 15/05/27 12:36:39.625 pool-1-thread-1-ScalaTest-running-KryoSerializerSuite INFO KryoSerializerSuite: *** FINISHED o.a.s.serializer.KryoSerializerSuite: 'kryo with parallelize for primitive arrays' *** ... ``` Author: Andrew Or <andrew@databricks.com> Closes #6441 from andrewor14/demarcate-tests and squashes the following commits: 879b060 [Andrew Or] Fix compile after rebase d622af7 [Andrew Or] Merge branch 'master' of github.com:apache/spark into demarcate-tests 017c8ba [Andrew Or] Merge branch 'master' of github.com:apache/spark into demarcate-tests 7790b6c [Andrew Or] Fix tests after logical merge conflict c7460c0 [Andrew Or] Merge branch 'master' of github.com:apache/spark into demarcate-tests c43ffc4 [Andrew Or] Fix tests? 8882581 [Andrew Or] Fix tests ee22cda [Andrew Or] Fix log message fa9450e [Andrew Or] Merge branch 'master' of github.com:apache/spark into demarcate-tests 12d1e1b [Andrew Or] Various whitespace changes (minor) 69cbb24 [Andrew Or] Make all test suites extend SparkFunSuite instead of FunSuite bbce12e [Andrew Or] Fix manual things that cannot be covered through automation da0b12f [Andrew Or] Add core tests as dependencies in all modules f7d29ce [Andrew Or] Introduce base abstract class for all test suites
*	[SPARK-7726] Fix Scaladoc false errors	Iulian Dragos	2015-05-19	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	Visibility rules for static members are different in Scala and Java, and this case requires an explicit static import. Even though these are Java files, they are run through scaladoc, which enforces Scala rules. Also reverted the commit that reverts the upgrade to 2.11.6 Author: Iulian Dragos <jaguarul@gmail.com> Closes #6260 from dragos/issue/scaladoc-false-error and squashes the following commits: f2e998e [Iulian Dragos] Revert "[HOTFIX] Revert "[SPARK-7092] Update spark scala version to 2.11.6"" 0bad052 [Iulian Dragos] Fix scaladoc faux-error.
*	[HOTFIX] Revert "[SPARK-7092] Update spark scala version to 2.11.6"	Patrick Wendell	2015-05-19	1	-1/+1
\| \| \| \| \| \| \|	This reverts commit a11c8683c76c67f45749a1b50a0912a731fd2487. For more information see: https://issues.apache.org/jira/browse/SPARK-7726
*	[SPARK-6568] spark-shell.cmd --jars option does not accept the jar that has ↵	Masayoshi TSUZUKI	2015-05-13	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	space in its path escape spaces in the arguments. Author: Masayoshi TSUZUKI <tsudukim@oss.nttdata.co.jp> Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp> Closes #5447 from tsudukim/feature/SPARK-6568-2 and squashes the following commits: 3f9a188 [Masayoshi TSUZUKI] modified some errors. ed46047 [Masayoshi TSUZUKI] avoid scalastyle errors. 1784239 [Masayoshi TSUZUKI] removed Utils.formatPath. e03f289 [Masayoshi TSUZUKI] removed testWindows from Utils.resolveURI and Utils.resolveURIs. replaced SystemUtils.IS_OS_WINDOWS to Utils.isWindows. removed Utils.formatPath from PythonRunner.scala. 84c33d0 [Masayoshi TSUZUKI] - use resolveURI in nonLocalPaths - run tests for Windows path only on Windows 016128d [Masayoshi TSUZUKI] fixed to use File.toURI() 2c62e3b [Masayoshi TSUZUKI] Merge pull request #1 from sarutak/SPARK-6568-2 7019a8a [Masayoshi TSUZUKI] Merge branch 'master' of https://github.com/apache/spark into feature/SPARK-6568-2 45946ee [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-6568-2 10f1c73 [Kousuke Saruta] Added a comment 93c3c40 [Kousuke Saruta] Merge branch 'classpath-handling-fix' of github.com:sarutak/spark into SPARK-6568-2 649da82 [Kousuke Saruta] Fix classpath handling c7ba6a7 [Masayoshi TSUZUKI] [SPARK-6568] spark-shell.cmd --jars option does not accept the jar that has space in its path
*	[SPARK-7489] [SPARK SHELL] Spark shell crashes when compiled with scala 2.11	vinodkc	2015-05-08	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Spark shell crashes when compiled with scala 2.11 and SPARK_PREPEND_CLASSES=true There is a similar Resolved JIRA issue -SPARK-7470 and a PR https://github.com/apache/spark/pull/5997 , which handled same issue only in scala 2.10 Author: vinodkc <vinod.kc.in@gmail.com> Closes #6013 from vinodkc/fix_sqlcontext_exception_scala_2.11 and squashes the following commits: 119061c [vinodkc] Spark shell crashes when compiled with scala 2.11
*	[SPARK-7470] [SQL] Spark shell SQLContext crashes without hive	Andrew Or	2015-05-07	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This only happens if you have `SPARK_PREPEND_CLASSES` set. Then I built it with `build/sbt clean assembly compile` and just ran it with `bin/spark-shell`. ``` ... 15/05/07 17:07:30 INFO EventLoggingListener: Logging events to file:/tmp/spark-events/local-1431043649919 15/05/07 17:07:30 INFO SparkILoop: Created spark context.. Spark context available as sc. java.lang.NoClassDefFoundError: org/apache/hadoop/hive/conf/HiveConf at java.lang.Class.getDeclaredConstructors0(Native Method) at java.lang.Class.privateGetDeclaredConstructors(Class.java:2493) at java.lang.Class.getConstructor0(Class.java:2803) ... Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf at java.net.URLClassLoader$1.run(URLClassLoader.java:366) at java.net.URLClassLoader$1.run(URLClassLoader.java:355) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:354) at java.lang.ClassLoader.loadClass(ClassLoader.java:425) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) at java.lang.ClassLoader.loadClass(ClassLoader.java:358) ... 52 more <console>:10: error: not found: value sqlContext import sqlContext.implicits._ ^ <console>:10: error: not found: value sqlContext import sqlContext.sql ^ ``` yhuai marmbrus Author: Andrew Or <andrew@databricks.com> Closes #5997 from andrewor14/sql-shell-crash and squashes the following commits: 61147e6 [Andrew Or] Also expect NoClassDefFoundError
*	[SPARK-7092] Update spark scala version to 2.11.6	Prashant Sharma	2015-04-25	1	-1/+1
\| \| \| \| \| \| \| \|	Author: Prashant Sharma <prashant.s@imaginea.com> Closes #5662 from ScrapCodes/SPARK-7092/scala-update-2.11.6 and squashes the following commits: 58cf4f9 [Prashant Sharma] [SPARK-7092] Update spark scala version to 2.11.6
*	[SPARK-6758]block the right jetty package in log	WangTaoTheTonic	2015-04-09	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	https://issues.apache.org/jira/browse/SPARK-6758 I am not sure if it is ok to block them in test resources too (as we shade jetty in assembly?). Author: WangTaoTheTonic <wangtao111@huawei.com> Closes #5406 from WangTaoTheTonic/SPARK-6758 and squashes the following commits: e09605b [WangTaoTheTonic] block the right jetty package
*	[SPARK-6209] Clean up connections in ExecutorClassLoader after failing to ↵	Josh Rosen	2015-03-24	3	-20/+140
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	load classes (master branch PR) ExecutorClassLoader does not ensure proper cleanup of network connections that it opens. If it fails to load a class, it may leak partially-consumed InputStreams that are connected to the REPL's HTTP class server, causing that server to exhaust its thread pool, which can cause the entire job to hang. See [SPARK-6209](https://issues.apache.org/jira/browse/SPARK-6209) for more details, including a bug reproduction. This patch fixes this issue by ensuring proper cleanup of these resources. It also adds logging for unexpected error cases. This PR is an extended version of #4935 and adds a regression test. Author: Josh Rosen <joshrosen@databricks.com> Closes #4944 from JoshRosen/executorclassloader-leak-master-branch and squashes the following commits: e0e3c25 [Josh Rosen] Wrap try block around getReponseCode; re-enable keep-alive by closing error stream 961c284 [Josh Rosen] Roll back changes that were added to get the regression test to fail 7ee2261 [Josh Rosen] Add a failing regression test e2d70a3 [Josh Rosen] Properly clean up after errors in ExecutorClassLoader
*	[SPARK-6371] [build] Update version to 1.4.0-SNAPSHOT.	Marcelo Vanzin	2015-03-20	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Author: Marcelo Vanzin <vanzin@cloudera.com> Closes #5056 from vanzin/SPARK-6371 and squashes the following commits: 63220df [Marcelo Vanzin] Merge branch 'master' into SPARK-6371 6506f75 [Marcelo Vanzin] Use more fine-grained exclusion. 178ba71 [Marcelo Vanzin] Oops. 75b2375 [Marcelo Vanzin] Exclude VertexRDD in MiMA. a45a62c [Marcelo Vanzin] Work around MIMA warning. 1d8a670 [Marcelo Vanzin] Re-group jetty exclusion. 0e8e909 [Marcelo Vanzin] Ignore ml, don't ignore graphx. cef4603 [Marcelo Vanzin] Indentation. 296cf82 [Marcelo Vanzin] [SPARK-6371] [build] Update version to 1.4.0-SNAPSHOT.
*	SPARK-6338 [CORE] Use standard temp dir mechanisms in tests to avoid ↵	Sean Owen	2015-03-20	1	-4/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	orphaned temp files Use `Utils.createTempDir()` to replace other temp file mechanisms used in some tests, to further ensure they are cleaned up, and simplify Author: Sean Owen <sowen@cloudera.com> Closes #5029 from srowen/SPARK-6338 and squashes the following commits: 27b740a [Sean Owen] Fix hive-thriftserver tests that don't expect an existing dir 4a212fa [Sean Owen] Standardize a bit more temp dir management 9004081 [Sean Owen] Revert some added recursive-delete calls 57609e4 [Sean Owen] Use Utils.createTempDir() to replace other temp file mechanisms used in some tests, to further ensure they are cleaned up, and simplify
*	[SPARK-6299][CORE] ClassNotFoundException in standalone mode when running ↵	Kevin (Sangwoo) Kim	2015-03-16	2	-38/+62
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	groupByKey with class defined in REPL ``` case class ClassA(value: String) val rdd = sc.parallelize(List(("k1", ClassA("v1")), ("k1", ClassA("v2")) )) rdd.groupByKey.collect ``` This code used to be throw exception in spark-shell, because while shuffling ```JavaSerializer```uses ```defaultClassLoader``` which was defined like ```env.serializer.setDefaultClassLoader(urlClassLoader)```. It should be ```env.serializer.setDefaultClassLoader(replClassLoader)```, like ``` override def run() { val deserializeStartTime = System.currentTimeMillis() Thread.currentThread.setContextClassLoader(replClassLoader) ``` in TaskRunner. When ```replClassLoader``` cannot be defined, it's identical with ```urlClassLoader``` Author: Kevin (Sangwoo) Kim <sangwookim.me@gmail.com> Closes #5046 from swkimme/master and squashes the following commits: fa2b9ee [Kevin (Sangwoo) Kim] stylish test codes ( collect -> collect() ) 6e9620b [Kevin (Sangwoo) Kim] stylish test codes ( collect -> collect() ) d23e4e2 [Kevin (Sangwoo) Kim] stylish test codes ( collect -> collect() ) a4a3c8a [Kevin (Sangwoo) Kim] add 'class defined in repl - shuffle' test to ReplSuite bd00da5 [Kevin (Sangwoo) Kim] add 'class defined in repl - shuffle' test to ReplSuite c1b1fc7 [Kevin (Sangwoo) Kim] use REPL class loader for executor's serializer
*	[SPARK-3619] Part 2. Upgrade to Mesos 0.21 to work around MESOS-1688	Jongyoul Lee	2015-03-15	2	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	- MESOS_NATIVE_LIBRARY become deprecated - Chagned MESOS_NATIVE_LIBRARY to MESOS_NATIVE_JAVA_LIBRARY Author: Jongyoul Lee <jongyoul@gmail.com> Closes #4361 from jongyoul/SPARK-3619-1 and squashes the following commits: f1ea91f [Jongyoul Lee] Merge branch 'SPARK-3619-1' of https://github.com/jongyoul/spark into SPARK-3619-1 a6a00c2 [Jongyoul Lee] [SPARK-3619] Upgrade to Mesos 0.21 to work around MESOS-1688 - Removed 'Known issues' section 2e15a21 [Jongyoul Lee] [SPARK-3619] Upgrade to Mesos 0.21 to work around MESOS-1688 - MESOS_NATIVE_LIBRARY become deprecated - Chagned MESOS_NATIVE_LIBRARY to MESOS_NATIVE_JAVA_LIBRARY 0dace7b [Jongyoul Lee] [SPARK-3619] Upgrade to Mesos 0.21 to work around MESOS-1688 - MESOS_NATIVE_LIBRARY become deprecated - Chagned MESOS_NATIVE_LIBRARY to MESOS_NATIVE_JAVA_LIBRARY
*	[Docs] Replace references to SchemaRDD with DataFrame	Reynold Xin	2015-03-09	1	-3/+3
\| \| \| \| \| \| \| \|	Author: Reynold Xin <rxin@databricks.com> Closes #4952 from rxin/schemardd-df-reference and squashes the following commits: b2b1dbe [Reynold Xin] [Docs] Replace references to SchemaRDD with DataFrame