path: root/repl
Commit message    Author    Age    Files    Lines
...
* SPARK 1084.1 (resubmitted)    Sean Owen    2014-02-27    1    -2/+2
    (Ported from https://github.com/apache/incubator-spark/pull/637 )
    Author: Sean Owen <sowen@cloudera.com>
    Closes #31 from srowen/SPARK-1084.1 and squashes the following commits:
    6c4a32c [Sean Owen] Suppress warnings about legitimate unchecked array creations, or change code to avoid it
    f35b833 [Sean Owen] Fix two misc javadoc problems
    254e8ef [Sean Owen] Fix one new style error introduced in scaladoc warning commit
    5b2fce2 [Sean Owen] Fix scaladoc invocation warning, and enable javac warnings properly, with plugin config updates
    007762b [Sean Owen] Remove dead scaladoc links
    b8ff8cb [Sean Owen] Replace deprecated Ant <tasks> with <target>
* [SPARK-1089] fix the regression problem on ADD_JARS in 0.9    CodingCat    2014-02-26    1    -2/+7
    https://spark-project.atlassian.net/browse/SPARK-1089
    Copied from JIRA, reported by @ash211:
    "Using the ADD_JARS environment variable with spark-shell used to add the jar to both the shell and the various workers. Now it only adds to the workers and importing a custom class in the shell is broken. The workaround is to add custom jars to both ADD_JARS and SPARK_CLASSPATH. We should fix ADD_JARS so it works properly again. See various threads on the user list: https://mail-archives.apache.org/mod_mbox/incubator-spark-user/201402.mbox/%3CCAJbo4neMLiTrnm1XbyqomWmp0m+EUcg4yE-txuRGSVKOb5KLeA@mail.gmail.com%3E (another one that doesn't appear in the archives yet titled "ADD_JARS not working on 0.9")"
    The cause of this bug is two-fold in the current implementation of SparkILoop.scala:
    1. settings.classpath is not set properly when the process() method is invoked.
    2. Scala 2.10 behaves oddly here (I personally thought it is a bug): if we simply assign the value of a PathSettings object (like settings.classpath), the isDefault flag (which indicates whether the variable has been modified) is not updated, so PathResolver loads the default CLASSPATH environment variable value to calculate the path (see https://github.com/scala/scala/blob/2.10.x/src/compiler/scala/tools/util/PathResolver.scala#L215). What we have to do is set this flag manually (https://github.com/CodingCat/incubator-spark/blob/e3991d97ddc33e77645e4559b13bf78b9e68239a/repl/src/main/scala/org/apache/spark/repl/SparkILoop.scala#L884).
    Author: CodingCat <zhunansjtu@gmail.com>
    Closes #13 from CodingCat/SPARK-1089 and squashes the following commits:
    8af81e7 [CodingCat] impose non-null settings
    9aa2125 [CodingCat] code cleaning
    ce36676 [CodingCat] code cleaning
    e045582 [CodingCat] fix the regression problem on ADD_JARS in 0.9
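A minimal sketch of the classpath quirk described in the entry above, assuming Scala 2.10's scala.tools.nsc.Settings API; this is illustrative only, not the actual SparkILoop fix.

    import scala.tools.nsc.Settings

    // Assumption: ADD_JARS holds a comma-separated list of jar paths.
    val jars = sys.env.getOrElse("ADD_JARS", "").split(",").filter(_.nonEmpty)
    val joined = jars.mkString(java.io.File.pathSeparator)

    val settings = new Settings()

    // Per the entry above, a plain assignment can still be treated as "default",
    // in which case PathResolver falls back to the CLASSPATH environment variable:
    settings.classpath.value = joined

    // One way to make sure the classpath is registered as explicitly set is to feed
    // it through the settings' own argument parser instead of assigning the field:
    settings.processArgumentString("-classpath " + joined)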
* SPARK-1071: Tidy logging strategy and use of log4j    Sean Owen    2014-02-23    1    -4/+0
    Prompted by a recent thread on the mailing list, I tried and failed to see if Spark can be made independent of log4j. There are a few cases where control of the underlying logging is pretty useful, and to do that, you have to bind to a specific logger. Instead I propose some tidying that leaves Spark's use of log4j, but gets rid of warnings and should still enable downstream users to switch.
    The idea is to pipe everything (except log4j) through SLF4J, have Spark use SLF4J directly when logging, and, where Spark needs to output info (REPL and tests), bind from SLF4J to log4j. This leaves the same behavior in Spark. It means that downstream users who want to use something other than log4j should:
    - Exclude dependencies on log4j and slf4j-log4j12 from Spark
    - Include a dependency on log4j-over-slf4j
    - Include a dependency on another logger X, and another slf4j-X
    - Recreate any log config that Spark does, that is needed, in the other logger's config
    That sounds about right. Here are the key changes:
    - Include the jcl-over-slf4j shim everywhere by depending on it in core.
    - Exclude dependencies on commons-logging from third-party libraries.
    - Include the jul-to-slf4j shim everywhere by depending on it in core.
    - Exclude slf4j-* dependencies from third-party libraries to prevent collisions or warnings.
    - Added missing slf4j-log4j12 binding to GraphX and Bagel module tests.
    And minor/incidental changes:
    - Update to SLF4J 1.7.5, which happily matches Hadoop 2’s version and is a recommended update over 1.7.2.
    - Remove a duplicate HBase dependency declaration in SparkBuild.scala.
    - Remove a duplicate mockito dependency declaration that was causing warnings and bugging me.
    Author: Sean Owen <sowen@cloudera.com>
    Closes #570 from srowen/SPARK-1071 and squashes the following commits:
    52eac9f [Sean Owen] Add slf4j-over-log4j12 dependency to core (non-test) and remove it from things that depend on core.
    77a7fa9 [Sean Owen] SPARK-1071: Tidy logging strategy and use of log4j
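A hedged sketch (not from the commit) of what a downstream SBT build following that recipe might look like if it wants logback instead of log4j; the Spark coordinates and all version numbers here are illustrative assumptions.

    // build.sbt fragment
    libraryDependencies ++= Seq(
      ("org.apache.spark" %% "spark-core" % "1.0.0")
        .exclude("log4j", "log4j")                    // drop Spark's log4j
        .exclude("org.slf4j", "slf4j-log4j12"),       // drop the SLF4J-to-log4j binding
      "org.slf4j" % "log4j-over-slf4j" % "1.7.5",     // route residual log4j calls into SLF4J
      "ch.qos.logback" % "logback-classic" % "1.1.2"  // replacement logger with its own SLF4J binding
    )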
* [SPARK-1090] improvement on spark_shell (help information, configure memory)    CodingCat    2014-02-17    1    -1/+1
    https://spark-project.atlassian.net/browse/SPARK-1090
    spark-shell should print help information about its parameters and should allow the user to configure executor memory. There is no documentation about how to set --cores/-c in spark-shell, and users should also be able to set executor memory through command line options. In this PR I also check the format of the options passed by the user.
    Author: CodingCat <zhunansjtu@gmail.com>
    Closes #599 from CodingCat/spark_shell_improve and squashes the following commits:
    de5aa38 [CodingCat] add parameter to set driver memory
    915cbf8 [CodingCat] improvement on spark_shell (help information, configure memory)
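A small illustrative sketch of the kind of option-format check mentioned above (a hypothetical helper, not the actual spark-shell logic), validating a memory argument such as "512m" or "2g" before it is passed along.

    object MemoryArg {
      // accept forms like 512m, 1024M, 2g, 8G
      private val MemoryPattern = "([0-9]+)([mMgG])".r

      def isValid(arg: String): Boolean = arg match {
        case MemoryPattern(_, _) => true
        case _                   => false
      }
    }

    // MemoryArg.isValid("512m") == true; MemoryArg.isValid("lots") == false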
* Merge pull request #557 from ScrapCodes/style. Closes #557.    Patrick Wendell    2014-02-09    9    -6/+25
    SPARK-1058, Fix Style Errors and Add Scala Style to Spark Build.
    Author: Patrick Wendell <pwendell@gmail.com>
    Author: Prashant Sharma <scrapcodes@gmail.com>
    == Merge branch commits ==
    commit 1a8bd1c059b842cb95cc246aaea74a79fec684f4
    Author: Prashant Sharma <scrapcodes@gmail.com>
    Date: Sun Feb 9 17:39:07 2014 +0530
        scala style fixes
    commit f91709887a8e0b608c5c2b282db19b8a44d53a43
    Author: Patrick Wendell <pwendell@gmail.com>
    Date: Fri Jan 24 11:22:53 2014 -0800
        Adding scalastyle snapshot
* Merge pull request #542 from markhamstra/versionBump. Closes #542.    Mark Hamstra    2014-02-08    2    -2/+2
    Version number to 1.0.0-SNAPSHOT
    Since 0.9.0-incubating is done and out the door, we shouldn't be building 0.9.0-incubating-SNAPSHOT anymore. @pwendell
    Author: Mark Hamstra <markhamstra@gmail.com>
    == Merge branch commits ==
    commit 1b00a8a7c1a7f251b4bb3774b84b9e64758eaa71
    Author: Mark Hamstra <markhamstra@gmail.com>
    Date: Wed Feb 5 09:30:32 2014 -0800
        Version number to 1.0.0-SNAPSHOT
* Add missing header files    Patrick Wendell    2014-01-14    1    -0/+17
|
* Removing mentions in tests    Patrick Wendell    2014-01-12    1    -2/+0
|
* Merge pull request #327 from lucarosellini/master    Matei Zaharia    2014-01-08    3    -3/+73
|\
    Added ‘-i’ command line option to Spark REPL
    We had to create a new implementation of both scala.tools.nsc.CompilerCommand and scala.tools.nsc.Settings, because using scala.tools.nsc.GenericRunnerSettings would bring in other options (-howtorun, -save and -execute) which don’t make sense in Spark. Any new Spark specific command line option could now be added to the org.apache.spark.repl.SparkRunnerSettings class.
    Since the behavior of loading a script from the command line should be the same as loading it using the “:load” command inside the shell, the script should be loaded when the SparkContext is available; that’s why we had to move the call to ‘loadfiles(settings)’ _after_ the call to postInitialization(). This still doesn’t work if ‘isAsync = true’.
| * Added license header and removed @author tag    Luca Rosellini    2014-01-07    2    -4/+34
| |
| * Added ‘-i’ command line option to spark REPL.    Luca Rosellini    2014-01-03    3    -3/+43
    We had to create a new implementation of both scala.tools.nsc.CompilerCommand and scala.tools.nsc.Settings, because using scala.tools.nsc.GenericRunnerSettings would bring in other options (-howtorun, -save and -execute) which don’t make sense in Spark. Any new Spark specific command line option could now be added to the org.apache.spark.repl.SparkRunnerSettings class.
    Since the behavior of loading a script from the command line should be the same as loading it using the “:load” command inside the shell, the script should be loaded when the SparkContext is available; that’s why we had to move the call to ‘loadfiles(settings)’ _after_ the call to postInitialization(). This still doesn’t work if ‘isAsync = true’.
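A rough sketch of the settings class described above, assuming Scala 2.10's scala.tools.nsc.MutableSettings factories; the real SparkRunnerSettings may differ in detail.

    import scala.tools.nsc.Settings

    // A Settings subclass that adds a Spark-specific "-i <file>" option without
    // pulling in GenericRunnerSettings (and its -howtorun/-save/-execute options).
    class SparkRunnerSettings(error: String => Unit) extends Settings(error) {
      // MultiStringSetting lets "-i" be repeated for several script files.
      val loadfiles = MultiStringSetting(
        "-i",
        "file",
        "load a file (assumes the code is given interactively)")
    }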
* | Merge remote-tracking branch 'apache-github/master' into remove-binaries    Patrick Wendell    2014-01-03    1    -1/+0
|\ \
    Conflicts:
        core/src/test/scala/org/apache/spark/DriverSuite.scala
        docs/python-programming-guide.md
| * \ Merge branch 'scripts-reorg' of github.com:shane-huang/incubator-spark into spark-915-segregate-scripts    Prashant Sharma    2014-01-02    1    -1/+0
| |\ \
    Conflicts:
        bin/spark-shell
        core/pom.xml
        core/src/main/scala/org/apache/spark/SparkContext.scala
        core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/CoarseMesosSchedulerBackend.scala
        core/src/main/scala/org/apache/spark/ui/UIWorkloadGenerator.scala
        core/src/test/scala/org/apache/spark/DriverSuite.scala
        python/run-tests
        sbin/compute-classpath.sh
        sbin/spark-class
        sbin/stop-slaves.sh
| | * deprecate "spark" script and SPARK_CLASSPATH environment variable    Andrew xia    2013-10-12    1    -1/+0
| | |
* | | fixed review comments    Prashant Sharma    2014-01-03    1    -1/+3
|/ /
* | Miscellaneous fixes from code review.    Matei Zaharia    2014-01-01    1    -1/+1
    Also replaced SparkConf.getOrElse with just a "get" that takes a default value, and added getInt, getLong, etc. to make code that uses this simpler later on.
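A small hedged usage sketch of the SparkConf getters described above; the configuration keys and default values are illustrative only.

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
    // "get" now takes an explicit default value instead of the old getOrElse:
    val master  = conf.get("spark.master", "local[2]")
    // typed helpers for common cases:
    val cores   = conf.getInt("spark.cores.max", 2)
    val timeout = conf.getLong("spark.example.timeoutMs", 120000L)  // illustrative key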
* | Various fixes to configuration code    Matei Zaharia    2013-12-28    2    -7/+13
    - Got rid of global SparkContext.globalConf
    - Pass SparkConf to serializers and compression codecs
    - Made SparkConf public instead of private[spark]
    - Improved API of SparkContext and SparkConf
    - Switched executor environment vars to be passed through SparkConf
    - Fixed some places that were still using system properties
    - Fixed some tests, though others are still failing
    This still fails several tests in core, repl and streaming, likely due to properties not being set or cleared correctly (some of the tests run fine in isolation).
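A minimal sketch of the configuration style these changes move toward: an explicit SparkConf built by the caller and handed to SparkContext, rather than a global conf or system properties. The master, app name and environment variable below are placeholders.

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setMaster("local[2]")                       // placeholder master
      .setAppName("conf-example")                  // placeholder app name
      .setExecutorEnv("EXAMPLE_ENV", "value")      // executor env vars travel on the conf

    val sc = new SparkContext(conf)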
* | spark-544, introducing SparkConf and related configuration overhaul.    Prashant Sharma    2013-12-25    2    -8/+6
| |
* | Use scala.binary.version in POMs    Mark Hamstra    2013-12-15    1    -7/+7
| |
* | Review comments on the PR for scala 2.10 migration.    Prashant Sharma    2013-12-13    1    -1/+0
| |
* | Style fixes and addressed review comments at #221    Prashant Sharma    2013-12-10    1    -7/+7
| |
* | Incorporated Patrick's feedback comment on #211 and made maven build/dep-resolution at least a bit faster.    Prashant Sharma    2013-12-07    1    -1/+1
* | Fixed compile time warnings and formatting post merge.    Prashant Sharma    2013-11-26    1    -65/+74
| |
* | Various merge corrections    Aaron Davidson    2013-11-14    2    -12/+4
    I've diff'd this patch against my own -- since they were both created independently, this means that two sets of eyes have gone over all the merge conflicts that were created, so I'm feeling significantly more confident in the resulting PR.
    @rxin has looked at the changes to the repl and is resoundingly confident that they are correct.
* | Merge branch 'master' into scala-2.10    Raymond Liu    2013-11-14    1    -2/+34
|\ \
| * | Propagate the SparkContext local property from the thread that calls the spark-repl to the actual execution thread.    Reynold Xin    2013-11-09    2    -4/+42
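A hedged illustration of the mechanism involved: SparkContext local properties are per-thread, so a value set on the thread that calls into the repl has to be re-applied on the thread that actually executes the line. getLocalProperty/setLocalProperty are the public SparkContext API; the propagation helper itself is illustrative, not the commit's code.

    import org.apache.spark.SparkContext

    def runOnExecutionThread(sc: SparkContext)(body: => Unit): Unit = {
      // capture a local property from the calling thread
      val pool = sc.getLocalProperty("spark.scheduler.pool")
      val t = new Thread {
        override def run(): Unit = {
          // re-apply it on the execution thread before running the user's code
          if (pool != null) sc.setLocalProperty("spark.scheduler.pool", pool)
          body
        }
      }
      t.start()
      t.join()
    }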
* | | Merge branch 'master' into scala-2.10    Raymond Liu    2013-11-13    1    -7/+29
|\| |
| * | Makes Spark SIMR ready.    Ali Ghodsi    2013-10-24    1    -0/+14
| | |
| * | Spark shell exits if it cannot create SparkContext    Aaron Davidson    2013-10-17    1    -1/+8
    Mainly, this occurs if you provide a messed up MASTER url (one that doesn't match one of our regexes). Previously, we would default to Mesos, fail, and then start the shell anyway, except that any Spark command would fail.
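A rough sketch of the fail-fast behaviour described above, using a hypothetical helper; the real master-URL patterns and shell wiring live in SparkContext and SparkILoop and differ from this.

    import org.apache.spark.SparkContext

    // Hypothetical helper, not Spark API: exit the shell instead of limping on without a context.
    def createSparkContextOrExit(master: String): SparkContext = {
      try {
        new SparkContext(master, "Spark shell")
      } catch {
        case e: Exception =>
          // e.g. a MASTER url matching none of the supported forms (local, local[N], spark://host:port, ...)
          Console.err.println("Failed to create SparkContext: " + e.getMessage)
          sys.exit(1)
      }
    }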
* | | Merge branch 'master' into wip-merge-master    Prashant Sharma    2013-10-08    1    -8/+8
|\| |
    Conflicts:
        bagel/pom.xml
        core/pom.xml
        core/src/test/scala/org/apache/spark/ui/UISuite.scala
        examples/pom.xml
        mllib/pom.xml
        pom.xml
        project/SparkBuild.scala
        repl/pom.xml
        streaming/pom.xml
        tools/pom.xml
    In scala 2.10, a shorter representation is used for naming artifacts, so changed to shorter scala version for artifacts and made it a property in pom.
| * | Merging build changes in from 0.8    Patrick Wendell    2013-10-05    1    -10/+10
| |/
* | Merge branch 'master' into scala-2.10    Prashant Sharma    2013-10-01    2    -2/+2
|\|
    Conflicts:
        core/src/main/scala/org/apache/spark/ui/jobs/JobProgressUI.scala
        docs/_config.yml
        project/SparkBuild.scala
        repl/src/main/scala/org/apache/spark/repl/SparkILoop.scala
| * Bug fix in master build    Patrick Wendell    2013-09-26    1    -1/+1
| |
| * Update build version in master    Patrick Wendell    2013-09-24    1    -1/+1
| |
* | fixed maven build for scala 2.10    Prashant Sharma    2013-09-26    1    -6/+6
| |
* | ported repl improvements from master    Prashant Sharma    2013-09-15    2    -2/+11
| |
* | Merge branch 'master' of git://github.com/mesos/spark into scala-2.10    Prashant Sharma    2013-09-15    1    -12/+0
|\|
    Conflicts:
        core/src/main/scala/org/apache/spark/SparkContext.scala
        project/SparkBuild.scala
| * Minor YARN build cleanups    Jey Kottalam    2013-09-06    1    -12/+0
| |
* | Fixed repl suite    Prashant Sharma    2013-09-15    2    -7/+7
| |
* | Few more fixes to tests broken during merge    Prashant Sharma    2013-09-10    1    -50/+0
| |
* | Merged with master    Prashant Sharma    2013-09-06    18    -372/+369
|\|
| * Updated LICENSE with third-party licenses    Matei Zaharia    2013-09-02    1    -0/+17
| |
| * Move some classes to more appropriate packages:    Matei Zaharia    2013-09-01    1    -1/+1
    * RDD, *RDDFunctions -> org.apache.spark.rdd
    * Utils, ClosureCleaner, SizeEstimator -> org.apache.spark.util
    * JavaSerializer, KryoSerializer -> org.apache.spark.serializer
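For downstream code, the package moves above translate into import changes along these lines (a hedged illustration; it only compiles with the corresponding Spark artifact on the classpath):

    import org.apache.spark.rdd.RDD                    // was org.apache.spark.RDD
    import org.apache.spark.util.SizeEstimator         // was org.apache.spark.SizeEstimator
    import org.apache.spark.serializer.KryoSerializer  // was org.apache.spark.KryoSerializer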
| * Fix some URLs    Matei Zaharia    2013-09-01    1    -1/+1
| |
| * Initial work to rename package to org.apache.spark    Matei Zaharia    2013-09-01    12    -30/+30
| |
| * Synced sbt and maven builds    Mark Hamstra    2013-08-21    1    -0/+6
| |
| * Remove redundant dependencies from POMs    Jey Kottalam    2013-08-18    1    -15/+0
| |
| * Updates to repl and example POMs to match SBT build    Jey Kottalam    2013-08-16    1    -6/+0
| |
| * Maven build now also works with YARN    Jey Kottalam    2013-08-16    1    -63/+1
| |
| * Maven build now works with CDH hadoop-2.0.0-mr1    Jey Kottalam    2013-08-16    1    -54/+10
| |