spark - Mirror of Apache Spark

	Commit message (Collapse)	Author	Age	Files	Lines
*	Tweaks to Mesos docs	Matei Zaharia	2014-05-16	1	-37/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- Mention Apache downloads first - Shorten some wording Author: Matei Zaharia <matei@databricks.com> Closes #806 from mateiz/doc-update and squashes the following commits: d9345cd [Matei Zaharia] typo a179f8d [Matei Zaharia] Tweaks to Mesos docs (cherry picked from commit fed6303f29250bd5e656dbdd731b38938c933a61) Signed-off-by: Matei Zaharia <matei@databricks.com>
*	[SQL] Implement between in hql	Michael Armbrust	2014-05-16	3	-0/+21
\| \| \| \| \| \| \| \| \| \| \| \| \|	Author: Michael Armbrust <michael@databricks.com> Closes #804 from marmbrus/between and squashes the following commits: ae24672 [Michael Armbrust] add golden answer. d9997ef [Michael Armbrust] Implement between in hql. 9bd4433 [Michael Armbrust] Better error on parse failures. (cherry picked from commit 032d6632ad4ab88c97c9e568b63169a114220a02) Signed-off-by: Reynold Xin <rxin@apache.org>
*	bugfix: overflow of graphx Edge compare function	Zhen Peng	2014-05-16	2	-2/+47
\| \| \| \| \| \| \| \| \| \| \| \|	Author: Zhen Peng <zhenpeng01@baidu.com> Closes #769 from zhpengg/bugfix-graphx-edge-compare and squashes the following commits: 8a978ff [Zhen Peng] add ut for graphx Edge.lexicographicOrdering.compare 413c258 [Zhen Peng] there maybe a overflow for two Long's substraction (cherry picked from commit fa6de408a131a3e84350a60af74a92c323dfc5eb) Signed-off-by: Reynold Xin <rxin@apache.org>
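The overflow mentioned in this fix can be illustrated with a minimal Java sketch. The class and method names below are hypothetical and this is not the GraphX code; it only shows why comparing two `long` values by subtracting them and narrowing to `int` is unsafe, and how `Long.compare` avoids the problem.

```java
final class EdgeCompareSketch {
    // Hypothetical, simplified comparator: the subtraction can overflow,
    // and narrowing the difference to int drops the high 32 bits.
    static int unsafeCompare(long a, long b) {
        return (int) (a - b);
    }

    // Overflow-safe alternative: compare the two values directly.
    static int safeCompare(long a, long b) {
        return Long.compare(a, b);
    }

    public static void main(String[] args) {
        long big = 1L << 32;
        // The difference 2^32 truncates to 0, so the values look "equal".
        System.out.println(unsafeCompare(big, 0L));
        // The direct comparison reports that big is greater.
        System.out.println(safeCompare(big, 0L) > 0);
        // Long.MAX_VALUE - (-1) overflows, but Long.compare is unaffected.
        System.out.println(safeCompare(Long.MAX_VALUE, -1L) > 0);
    }
}
```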
*	[maven-release-plugin] prepare for next development iteration	Patrick Wendell	2014-05-16	21	-22/+22
*	[maven-release-plugin] prepare release v1.0.0-rc8	Patrick Wendell	2014-05-16	21	-24/+24
*	Revert "[maven-release-plugin] prepare release v1.0.0-rc7"	Patrick Wendell	2014-05-16	21	-24/+24
	This reverts commit 9212b3e5bb5545ccfce242da8d89108e6fb1c464.
*	Revert "[maven-release-plugin] prepare for next development iteration"	Patrick Wendell	2014-05-16	21	-22/+22
	This reverts commit c4746aa6fe4aaf383e69e34353114d36d1eb9ba6.
*	SPARK-1862: Support for MapR in the Maven build.	Patrick Wendell	2014-05-15	1	-1/+34
\| \| \| \| \| \| \| \| \| \| \|	Author: Patrick Wendell <pwendell@gmail.com> Closes #803 from pwendell/mapr-support and squashes the following commits: 8df60e4 [Patrick Wendell] SPARK-1862: Support for MapR in the Maven build. (cherry picked from commit 17702e280c4b0b030870962fcb3d50c3085ae862) Signed-off-by: Patrick Wendell <pwendell@gmail.com>
*	[Spark-1461] Deferred Expression Evaluation (short-circuit evaluation)	Cheng Hao	2014-05-15	2	-22/+53
	This patch unifies the foldable and nullable interfaces for Expression. 1) A non-deterministic UDF (such as Rand()) cannot be folded. 2) Short-circuit evaluation significantly improves the performance of expression evaluation; however, a stateful UDF must not be skipped during short-circuit evaluation (e.g. in the expression col1 > 0 and row_sequence() < 1000, row_sequence() cannot be skipped even if col1 > 0 is false). I brought in the concept of DeferredObject from Hive, which has two kinds of child classes (EagerResult / DeferredResult): the former requires the evaluation to be triggered before it is created, while the latter triggers the evaluation when its get() method is first called. Author: Cheng Hao <hao.cheng@intel.com> Closes #446 from chenghao-intel/expression_deferred_evaluation and squashes the following commits: d2729de [Cheng Hao] Fix the codestyle issues a08f09c [Cheng Hao] fix bug in or/and short-circuit evaluation af2236b [Cheng Hao] revert the short-circuit expression evaluation for IF b7861d2 [Cheng Hao] Add Support for Deferred Expression Evaluation (cherry picked from commit a20fea98811d98958567780815fcf0d4fb4e28d4) Signed-off-by: Reynold Xin <rxin@apache.org>
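The eager/deferred distinction described in this commit message can be sketched roughly as follows. This is a Java illustration under stated assumptions, not the Catalyst code: the names `DeferredObject`, `EagerResult`, `DeferredResult` and `get()` come from the message, while the enclosing class, the `and` helper, and the choice to cache the deferred value after the first call are assumptions made for the example. SQL null handling is ignored.

```java
import java.util.function.Supplier;

final class DeferredEvalSketch {
    interface DeferredObject<T> {
        T get();
    }

    // Holds a value that was already computed before the object was created.
    static final class EagerResult<T> implements DeferredObject<T> {
        private final T value;

        EagerResult(T value) {
            this.value = value;
        }

        public T get() {
            return value;
        }
    }

    // Runs the computation on the first get() call and caches the result
    // (the caching is an assumption of this sketch).
    static final class DeferredResult<T> implements DeferredObject<T> {
        private final Supplier<T> compute;
        private boolean evaluated = false;
        private T value;

        DeferredResult(Supplier<T> compute) {
            this.compute = compute;
        }

        public T get() {
            if (!evaluated) {
                value = compute.get();
                evaluated = true;
            }
            return value;
        }
    }

    // Short-circuit AND: the right operand is only evaluated if the left is true.
    static boolean and(DeferredObject<Boolean> left, DeferredObject<Boolean> right) {
        return left.get() && right.get();
    }

    public static void main(String[] args) {
        int[] calls = {0};
        DeferredObject<Boolean> right = new DeferredResult<>(() -> {
            calls[0]++;
            return true;
        });
        System.out.println(and(new EagerResult<>(false), right)); // right is not evaluated
        System.out.println(calls[0]);
        System.out.println(and(new EagerResult<>(true), right));  // right is evaluated now
        System.out.println(calls[0]);
    }
}
```

An expression with side effects, such as the row_sequence() case mentioned in the message, would have to be passed eagerly so that it is never skipped.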
*	SPARK-1860: Do not cleanup application work/ directories by default	Aaron Davidson	2014-05-15	2	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This causes an unrecoverable error for applications that are running for longer than 7 days that have jars added to the SparkContext, as the jars are cleaned up even though the application is still running. Author: Aaron Davidson <aaron@databricks.com> Closes #800 from aarondav/shitty-defaults and squashes the following commits: a573fbb [Aaron Davidson] SPARK-1860: Do not cleanup application work/ directories by default (cherry picked from commit bb98ecafce196ecc5bc3a1e4cc9264df7b752c6a) Signed-off-by: Patrick Wendell <pwendell@gmail.com>
*	Typos in Spark	Huajian Mao	2014-05-15	2	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	Author: Huajian Mao <huajianmao@gmail.com> Closes #798 from huajianmao/patch-1 and squashes the following commits: 208a454 [Huajian Mao] A typo in Task 1b515af [Huajian Mao] A typo in the message (cherry picked from commit 94c5139607ec876782e594012a108ebf55fa97db) Signed-off-by: Reynold Xin <rxin@apache.org>
*	Fixes a misplaced comment.	Prashant Sharma	2014-05-15	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixes a misplaced comment from #785. @pwendell Author: Prashant Sharma <prashant.s@imaginea.com> Closes #788 from ScrapCodes/patch-1 and squashes the following commits: 3ef6a69 [Prashant Sharma] Update package-info.java 67d9461 [Prashant Sharma] Update package-info.java (cherry picked from commit e1e3416c4e5f6f32983597d74866dbb809cf6a5e) Signed-off-by: Reynold Xin <rxin@apache.org>
*	[SQL] Fix tiny/small ints from HiveMetastore.	Michael Armbrust	2014-05-15	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \|	Author: Michael Armbrust <michael@databricks.com> Closes #797 from marmbrus/smallInt and squashes the following commits: 2db9dae [Michael Armbrust] Fix tiny/small ints from HiveMetastore. (cherry picked from commit a4aafe5f9fb191533400caeafddf04986492c95f) Signed-off-by: Reynold Xin <rxin@apache.org>
*	SPARK-1803 Replaced colon in filenames with a dash	Stevo Slavić	2014-05-15	16	-15/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch replaces colon in several filenames with dash to make these filenames Windows compatible. Author: Stevo Slavić <sslavic@gmail.com> Author: Stevo Slavic <sslavic@gmail.com> Closes #739 from sslavic/SPARK-1803 and squashes the following commits: 3ec66eb [Stevo Slavic] Removed extra empty line which was causing test to fail b967cc3 [Stevo Slavić] Aligned tests and names of test resources 2b12776 [Stevo Slavić] Fixed a typo in file name 1c5dfff [Stevo Slavić] Replaced colon in file name with dash 8f5bf7f [Stevo Slavić] Replaced colon in file name with dash c5b5083 [Stevo Slavić] Replaced colon in file name with dash a49801f [Stevo Slavić] Replaced colon in file name with dash 401d99e [Stevo Slavić] Replaced colon in file name with dash 40a9621 [Stevo Slavić] Replaced colon in file name with dash 4774580 [Stevo Slavić] Replaced colon in file name with dash 004f8bb [Stevo Slavić] Replaced colon in file name with dash d6a3e2c [Stevo Slavić] Replaced colon in file name with dash b585126 [Stevo Slavić] Replaced colon in file name with dash 028e48a [Stevo Slavić] Replaced colon in file name with dash ece0507 [Stevo Slavić] Replaced colon in file name with dash 84f5d2f [Stevo Slavić] Replaced colon in file name with dash 2fc7854 [Stevo Slavić] Replaced colon in file name with dash 9e1467d [Stevo Slavić] Replaced colon in file name with dash (cherry picked from commit e66e31be51f396c8f6b7a45119b8b31c4d8cdf79) Signed-off-by: Reynold Xin <rxin@apache.org>
*	SPARK-1851. Upgrade Avro dependency to 1.7.6 so Spark can read Avro file...	Sandy Ryza	2014-05-15	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	...s Author: Sandy Ryza <sandy@cloudera.com> Closes #795 from sryza/sandy-spark-1851 and squashes the following commits: 79c8227 [Sandy Ryza] SPARK-1851. Upgrade Avro dependency to 1.7.6 so Spark can read Avro files (cherry picked from commit 08e7606a964e3d1ac1d565f33651ff0035c75044) Signed-off-by: Patrick Wendell <pwendell@gmail.com>
*	[SPARK-1741][MLLIB] add predict(JavaRDD) to RegressionModel, ↵	Xiangrui Meng	2014-05-15	6	-2/+76
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ClassificationModel, and KMeans `model.predict` returns a RDD of Scala primitive type (Int/Double), which is recognized as Object in Java. Adding predict(JavaRDD) could make life easier for Java users. Added tests for KMeans, LinearRegression, and NaiveBayes. Will update examples after https://github.com/apache/spark/pull/653 gets merged. cc: @srowen Author: Xiangrui Meng <meng@databricks.com> Closes #670 from mengxr/predict-javardd and squashes the following commits: b77ccd8 [Xiangrui Meng] Merge branch 'master' into predict-javardd 43caac9 [Xiangrui Meng] add predict(JavaRDD) to RegressionModel, ClassificationModel, and KMeans (cherry picked from commit d52761d67f42ad4d2ff02d96f0675fb3ab709f38) Signed-off-by: Patrick Wendell <pwendell@gmail.com>
*	[SPARK-1819] [SQL] Fix GetField.nullable.	Takuya UESHIN	2014-05-15	2	-1/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	`GetField.nullable` should be `true` not only when `field.nullable` is `true` but also when `child.nullable` is `true`. Author: Takuya UESHIN <ueshin@happy-camper.st> Closes #757 from ueshin/issues/SPARK-1819 and squashes the following commits: 8781a11 [Takuya UESHIN] Modify a test to use named parameters. 5bfc77d [Takuya UESHIN] Fix GetField.nullable. (cherry picked from commit 94c9d6f59859ebc77fae112c2c42c64b7a4d7f83) Signed-off-by: Reynold Xin <rxin@apache.org>
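The rule stated in this message reduces to a logical OR. The following Java sketch uses a hypothetical helper name and plain booleans rather than the Catalyst expression classes; it only spells out the truth table.

```java
final class GetFieldNullableSketch {
    // Extracting a field can yield null if either the containing value
    // (the child) or the field itself may be null.
    static boolean getFieldNullable(boolean childNullable, boolean fieldNullable) {
        return childNullable || fieldNullable;
    }

    public static void main(String[] args) {
        System.out.println(getFieldNullable(false, false));
        System.out.println(getFieldNullable(true, false));
        System.out.println(getFieldNullable(false, true));
        System.out.println(getFieldNullable(true, true));
    }
}
```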
*	[SPARK-1845] [SQL] Use AllScalaRegistrar for SparkSqlSerializer to register ↵	Takuya UESHIN	2014-05-15	4	-26/+66
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	serializers of ... ...Scala collections. When I execute `orderBy` or `limit` for `SchemaRDD` including `ArrayType` or `MapType`, `SparkSqlSerializer` throws the following exception: ``` com.esotericsoftware.kryo.KryoException: Class cannot be created (missing no-arg constructor): scala.collection.immutable.$colon$colon ``` or ``` com.esotericsoftware.kryo.KryoException: Class cannot be created (missing no-arg constructor): scala.collection.immutable.Vector ``` or ``` com.esotericsoftware.kryo.KryoException: Class cannot be created (missing no-arg constructor): scala.collection.immutable.HashMap$HashTrieMap ``` and so on. This is because registrations of serializers for each concrete collections are missing in `SparkSqlSerializer`. I believe it should use `AllScalaRegistrar`. `AllScalaRegistrar` covers a lot of serializers for concrete classes of `Seq`, `Map` for `ArrayType`, `MapType`. Author: Takuya UESHIN <ueshin@happy-camper.st> Closes #790 from ueshin/issues/SPARK-1845 and squashes the following commits: d1ed992 [Takuya UESHIN] Use AllScalaRegistrar for SparkSqlSerializer to register serializers of Scala collections. (cherry picked from commit db8cc6f28abe4326cea6f53feb604920e4867a27) Signed-off-by: Reynold Xin <rxin@apache.org>
*	SPARK-1846 Ignore logs directory in RAT checks	Andrew Ash	2014-05-15	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	https://issues.apache.org/jira/browse/SPARK-1846 Author: Andrew Ash <andrew@andrewash.com> Closes #793 from ash211/SPARK-1846 and squashes the following commits: 3f50db5 [Andrew Ash] SPARK-1846 Ignore logs directory in RAT checks (cherry picked from commit 3abe2b734a5578966f671c34f1de34b4446b90f1) Signed-off-by: Patrick Wendell <pwendell@gmail.com>
*	HOTFIX: Don't build Javadoc in Maven when creating releases.	Patrick Wendell	2014-05-15	1	-2/+4
	Because we've added Java package descriptions in some packages that don't have any Java files, running the Javadoc target hits this issue: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4492654. To fix this, I've simply removed the Javadoc target when publishing releases.
*	[maven-release-plugin] prepare for next development iteration	Patrick Wendell	2014-05-15	21	-22/+22
*	[maven-release-plugin] prepare release v1.0.0-rc7	Patrick Wendell	2014-05-15	21	-24/+24
*	Revert "[maven-release-plugin] prepare release v1.0.0-rc6"	Patrick Wendell	2014-05-14	21	-24/+24
	This reverts commit 54133abdce0246f6643a1112a5204afb2c4caa82.
*	Revert "[maven-release-plugin] prepare for next development iteration"	Patrick Wendell	2014-05-14	21	-22/+22
	This reverts commit e480bcfbd269ae1d7a6a92cfb50466cf192fe1fb.
*	fix different versions of commons-lang dependency and apache/spark#746 addendum	witgo	2014-05-14	2	-5/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Author: witgo <witgo@qq.com> Closes #754 from witgo/commons-lang and squashes the following commits: 3ebab31 [witgo] merge master f3b8fa2 [witgo] merge master 2083fae [witgo] repeat definition 5599cdb [witgo] multiple version of sbt dependency c1b66a1 [witgo] fix different versions of commons-lang dependency (cherry picked from commit bae07e36a6e0fb7982405316646b452b4ff06acc) Signed-off-by: Patrick Wendell <pwendell@gmail.com>
*	Package docs	Prashant Sharma	2014-05-14	51	-1/+1116
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a few changes based on the original patch by @scrapcodes. Author: Prashant Sharma <prashant.s@imaginea.com> Author: Patrick Wendell <pwendell@gmail.com> Closes #785 from pwendell/package-docs and squashes the following commits: c32b731 [Patrick Wendell] Changes based on Prashant's patch c0463d3 [Prashant Sharma] added eof new line ce8bf73 [Prashant Sharma] Added eof new line to all files. 4c35f2e [Prashant Sharma] SPARK-1563 Add package-info.java and package.scala files for all packages that appear in docs (cherry picked from commit 46324279dae2fa803267d788f7c56b0ed643b4c8) Signed-off-by: Patrick Wendell <pwendell@gmail.com>
*	Documentation: Encourage use of reduceByKey instead of groupByKey.	Patrick Wendell	2014-05-14	4	-0/+32
\| \| \| \| \| \| \| \| \| \| \| \|	Author: Patrick Wendell <pwendell@gmail.com> Closes #784 from pwendell/group-by-key and squashes the following commits: 9b4505f [Patrick Wendell] Small fix 6347924 [Patrick Wendell] Documentation: Encourage use of reduceByKey instead of groupByKey. (cherry picked from commit 21570b463388194877003318317aafd842800cac) Signed-off-by: Patrick Wendell <pwendell@gmail.com>
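The usual reason for preferring reduceByKey is that it combines values per key within each partition before the shuffle, whereas groupByKey sends every record across. The sketch below is a plain-Java analogy with no Spark dependency: the partitions are ordinary lists and every name in it is hypothetical. It counts how many records would cross the simulated shuffle in each style.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class CombineBeforeShuffleSketch {
    // groupByKey style: every (word, 1) record crosses the shuffle.
    static int recordsShuffledWithoutCombine(List<List<String>> partitions) {
        int records = 0;
        for (List<String> partition : partitions) {
            records += partition.size();
        }
        return records;
    }

    // reduceByKey style, step 1: sum the counts inside each partition.
    static List<Map<String, Integer>> combinePerPartition(List<List<String>> partitions) {
        List<Map<String, Integer>> combined = new ArrayList<>();
        for (List<String> partition : partitions) {
            Map<String, Integer> local = new HashMap<>();
            for (String word : partition) {
                local.merge(word, 1, Integer::sum);
            }
            combined.add(local);
        }
        return combined;
    }

    // Only one record per distinct key per partition crosses the shuffle.
    static int recordsShuffledWithCombine(List<Map<String, Integer>> combined) {
        int records = 0;
        for (Map<String, Integer> local : combined) {
            records += local.size();
        }
        return records;
    }

    // reduceByKey style, step 2: merge the partial sums after the shuffle.
    static Map<String, Integer> mergeCombined(List<Map<String, Integer>> combined) {
        Map<String, Integer> total = new TreeMap<>();
        for (Map<String, Integer> local : combined) {
            local.forEach((word, count) -> total.merge(word, count, Integer::sum));
        }
        return total;
    }

    public static void main(String[] args) {
        List<List<String>> partitions = List.of(
                List.of("a", "a", "b"),
                List.of("a", "b", "b", "b"));
        List<Map<String, Integer>> combined = combinePerPartition(partitions);
        System.out.println(recordsShuffledWithoutCombine(partitions));
        System.out.println(recordsShuffledWithCombine(combined));
        System.out.println(mergeCombined(combined));
    }
}
```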
*	Add language tabs and Python version to interactive part of quick-start	Matei Zaharia	2014-05-14	2	-20/+133
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is an addition of some stuff that was missed in https://issues.apache.org/jira/browse/SPARK-1567. I've also updated the doc to show submitting the Python application with spark-submit. Author: Matei Zaharia <matei@databricks.com> Closes #782 from mateiz/spark-1567-extra and squashes the following commits: 6f8f2aa [Matei Zaharia] tweaks 9ed9874 [Matei Zaharia] tweaks ae67c3e [Matei Zaharia] tweak b303ba3 [Matei Zaharia] tweak 1433a4d [Matei Zaharia] Add language tabs and Python version to interactive part of quick-start guide (cherry picked from commit f10de042b8e86adf51b70bae2d8589a5cbf02935) Signed-off-by: Patrick Wendell <pwendell@gmail.com>
*	[SPARK-1840] SparkListenerBus prints out scary error message when terminated ↵	Tathagata Das	2014-05-14	1	-0/+2
	normally Running the SparkPi example gave this error. ``` Pi is roughly 3.14374 14/05/14 18:16:19 ERROR Utils: Uncaught exception in thread SparkListenerBus scala.runtime.NonLocalReturnControl$mcV$sp ``` This is due to the catch-all in the SparkListenerBus, which logged a control throwable used internally by Scala. Author: Tathagata Das <tathagata.das1565@gmail.com> Closes #783 from tdas/controlexception-fix and squashes the following commits: a466c8d [Tathagata Das] Ignored control exceptions when logging all exceptions. (cherry picked from commit ad4e60ee7e2c49c24a9972312915f7f7253c7679) Signed-off-by: Patrick Wendell <pwendell@gmail.com>
*	default task number misleading in several places	Chen Chao	2014-05-14	1	-8/+10
	The definition private[streaming] def defaultPartitioner(numPartitions: Int = self.ssc.sc.defaultParallelism){ new HashPartitioner(numPartitions) } shows that the default number of tasks in Spark Streaming depends on defaultParallelism in SparkContext, which is determined by the config property spark.default.parallelism. For the property "spark.default.parallelism", see https://github.com/apache/spark/pull/389 Author: Chen Chao <crazyjvm@gmail.com> Closes #766 from CrazyJvm/patch-7 and squashes the following commits: 0b7efba [Chen Chao] Update streaming-programming-guide.md cc5b66c [Chen Chao] default task number misleading in several places (cherry picked from commit 2f639957f0bf70dddf1e698aa9e26007fb58bc67) Signed-off-by: Reynold Xin <rxin@apache.org>
*	[SPARK-1826] fix the head notation of package object dsl	wangfei	2014-05-14	1	-9/+12
\| \| \| \| \| \| \| \| \| \| \| \|	Author: wangfei <scnbwf@yeah.net> Closes #765 from scwf/dslfix and squashes the following commits: d2d1a9d [wangfei] Update package.scala 66ff53b [wangfei] fix the head notation of package object dsl (cherry picked from commit 44165fc91a31e6293a79031c89571e139d2c5356) Signed-off-by: Reynold Xin <rxin@apache.org>
*	[Typo] propertes -> properties	andrewor14	2014-05-14	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	Author: andrewor14 <andrewor14@gmail.com> Closes #780 from andrewor14/submit-typo and squashes the following commits: e70e057 [andrewor14] propertes -> properties (cherry picked from commit 9ad096d55a3d8410f04056ebc87dbd8cba391870) Signed-off-by: Reynold Xin <rxin@apache.org>
*	[SPARK-1696][MLLIB] use alpha in dense dspr	Xiangrui Meng	2014-05-14	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	It doesn't affect existing code because only `alpha = 1.0` is used in the code. Author: Xiangrui Meng <meng@databricks.com> Closes #778 from mengxr/mllib-dspr-fix and squashes the following commits: a37402e [Xiangrui Meng] use alpha in dense dspr (cherry picked from commit e3d72a74ad007c2bf279d6a74cdaca948bdf0ddd) Signed-off-by: Reynold Xin <rxin@apache.org>
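For context, dspr is the BLAS symmetric rank-1 update A := alpha * x * x^T + A on a packed matrix. The sketch below is a minimal plain-Java version, assuming the upper triangle is stored column by column; the class name is hypothetical and this is not the MLlib code. It shows where alpha enters the update: dropping the factor would go unnoticed only when alpha is 1.0.

```java
import java.util.Arrays;

final class DsprSketch {
    // A := alpha * x * x^T + A, where packedUpper holds the upper triangle
    // of A column by column: (0,0), (0,1), (1,1), (0,2), (1,2), (2,2), ...
    static void dspr(double alpha, double[] x, double[] packedUpper) {
        int n = x.length;
        int idx = 0;
        for (int j = 0; j < n; j++) {
            double scaled = alpha * x[j]; // alpha must be applied here
            for (int i = 0; i <= j; i++) {
                packedUpper[idx++] += scaled * x[i];
            }
        }
    }

    public static void main(String[] args) {
        double[] packed = new double[3]; // 2x2 symmetric matrix, all zeros
        dspr(2.0, new double[] {1.0, 2.0}, packed);
        System.out.println(Arrays.toString(packed));
    }
}
```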
*	[FIX] do not load defaults when testing SparkConf in pyspark	Xiangrui Meng	2014-05-14	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	The default constructor loads default properties, which can fail the test. Author: Xiangrui Meng <meng@databricks.com> Closes #775 from mengxr/pyspark-conf-fix and squashes the following commits: 83ef6c4 [Xiangrui Meng] do not load defaults when testing SparkConf in pyspark (cherry picked from commit 94c6c06ea13032b80610b3f54401d2ef2aa4874a) Signed-off-by: Reynold Xin <rxin@apache.org>
*	SPARK-1833 - Have an empty SparkContext constructor.	Patrick Wendell	2014-05-14	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \|	This is nicer than relying on new SparkContext(new SparkConf()) Author: Patrick Wendell <pwendell@gmail.com> Closes #774 from pwendell/spark-context and squashes the following commits: ef9f12f [Patrick Wendell] SPARK-1833 - Have an empty SparkContext constructor. (cherry picked from commit 65533c7ec03e7eedf5cd9756822863ab6f034ec9) Signed-off-by: Patrick Wendell <pwendell@gmail.com>
*	SPARK-1829 Sub-second durations shouldn't round to "0 s"	Andrew Ash	2014-05-14	1	-0/+6
	Durations up to 99 ms are shown as "99 ms"; durations from 0.1 s up to 0.9 s are shown as "0.1 s". https://issues.apache.org/jira/browse/SPARK-1829 Compare the first image to the second here: http://imgur.com/RaLEsSZ,7VTlgfo#0 Author: Andrew Ash <andrew@andrewash.com> Closes #768 from ash211/spark-1829 and squashes the following commits: 1c15b8e [Andrew Ash] SPARK-1829 Format sub-second durations more appropriately (cherry picked from commit a3315d7f4c7584dae2ee0aa33c6ec9e97b229b48) Signed-off-by: Reynold Xin <rxin@apache.org>
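The formatting rule described above might look like the following Java sketch. The class and method names are hypothetical, and the handling of durations of one second or more (whole seconds) is an assumption made only to keep the example complete; it is not taken from the Spark UI code.

```java
import java.util.Locale;

final class DurationFormatSketch {
    static String formatDuration(long millis) {
        if (millis < 100) {
            return millis + " ms"; // below 0.1 s: show milliseconds
        }
        if (millis < 1000) {
            // sub-second: one decimal place instead of rounding to "0 s"
            return String.format(Locale.ROOT, "%.1f s", millis / 1000.0);
        }
        return (millis / 1000) + " s"; // assumption: whole seconds otherwise
    }

    public static void main(String[] args) {
        System.out.println(formatDuration(99));
        System.out.println(formatDuration(100));
        System.out.println(formatDuration(900));
        System.out.println(formatDuration(5000));
    }
}
```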
*	Fix: sbt test throw an java.lang.OutOfMemoryError: PermGen space	witgo	2014-05-14	2	-0/+6
\| \| \| \| \| \| \| \| \| \| \|	Author: witgo <witgo@qq.com> Closes #773 from witgo/sbt_javaOptions and squashes the following commits: 26c7d38 [witgo] Improve sbt configuration (cherry picked from commit fde82c1549c78f1eebbb21ec34e60befbbff65f5) Signed-off-by: Reynold Xin <rxin@apache.org>
*	[maven-release-plugin] prepare for next development iteration	Patrick Wendell	2014-05-14	21	-22/+22
*	[maven-release-plugin] prepare release v1.0.0-rc6	Patrick Wendell	2014-05-14	21	-24/+24
*	Adding back hive support	Patrick Wendell	2014-05-14	1	-3/+3
*	Revert "[maven-release-plugin] prepare release v1.0.0-rc5"	Patrick Wendell	2014-05-14	21	-22/+22
	This reverts commit 18f062303303824139998e8fc8f4158217b0dbc3.
*	Revert "[maven-release-plugin] prepare for next development iteration"	Patrick Wendell	2014-05-14	21	-22/+22
	This reverts commit d08e9604fc9958b7c768e91715c8152db2ed6fd0.
*	[SPARK-1620] Handle uncaught exceptions in function run by Akka scheduler	Mark Hamstra	2014-05-14	5	-18/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If the intended behavior was that uncaught exceptions thrown in functions being run by the Akka scheduler would end up being handled by the default uncaught exception handler set in Executor, and if that behavior is, in fact, correct, then this is a way to accomplish that. I'm not certain, though, that we shouldn't be doing something different to handle uncaught exceptions from some of these scheduled functions. In any event, this PR covers all of the cases I comment on in [SPARK-1620](https://issues.apache.org/jira/browse/SPARK-1620). Author: Mark Hamstra <markhamstra@gmail.com> Closes #622 from markhamstra/SPARK-1620 and squashes the following commits: 071d193 [Mark Hamstra] refactored post-SPARK-1772 1a6a35e [Mark Hamstra] another style fix d30eb94 [Mark Hamstra] scalastyle 3573ecd [Mark Hamstra] Use wrapped try/catch in Utils.tryOrExit 8fc0439 [Mark Hamstra] Make functions run by the Akka scheduler use Executor's UncaughtExceptionHandler (cherry picked from commit 17f3075bc4aa8cbed165f7b367f70e84b1bc8db9) Signed-off-by: Patrick Wendell <pwendell@gmail.com>
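The "wrapped try/catch" idea named in the squashed commits can be sketched in Java as below. This is an illustration under assumptions, not Spark's `Utils.tryOrExit`: the class and method names are hypothetical, and the handler is passed in explicitly instead of being looked up from the Executor.

```java
final class TryOrHandleSketch {
    // Wraps a scheduled task so that anything it throws is forwarded to the
    // given uncaught-exception handler instead of being lost by the scheduler.
    static Runnable wrap(Runnable block, Thread.UncaughtExceptionHandler handler) {
        return () -> {
            try {
                block.run();
            } catch (Throwable t) {
                handler.uncaughtException(Thread.currentThread(), t);
            }
        };
    }

    public static void main(String[] args) {
        Runnable task = wrap(
                () -> { throw new IllegalStateException("boom"); },
                (thread, error) -> System.out.println("handled: " + error.getMessage()));
        task.run();
    }
}
```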
*	SPARK-1828: Created forked version of hive-exec that doesn't bundle other ↵	Patrick Wendell	2014-05-14	2	-6/+6
	dependencies See https://issues.apache.org/jira/browse/SPARK-1828 for more information. This is being submitted to Jenkins for testing. The dependency won't fully propagate in Maven Central for a few more hours. Author: Patrick Wendell <pwendell@gmail.com> Closes #767 from pwendell/hive-shaded and squashes the following commits: ea10ac5 [Patrick Wendell] SPARK-1828: Created forked version of hive-exec that doesn't bundle other dependencies (cherry picked from commit d58cb33ffa9e98a64cecea7b40ce7bfbed145079) Signed-off-by: Patrick Wendell <pwendell@gmail.com>
*	SPARK-1818 Freshen Mesos documentation	Andrew Ash	2014-05-14	2	-28/+174
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Place more emphasis on using precompiled binary versions of Spark and Mesos instead of encouraging the reader to compile from source. Author: Andrew Ash <andrew@andrewash.com> Closes #756 from ash211/spark-1818 and squashes the following commits: 7ef3b33 [Andrew Ash] Brief explanation of the interactions between Spark and Mesos e7dea8e [Andrew Ash] Add troubleshooting and debugging section 956362d [Andrew Ash] Don't need to pass spark.executor.uri into the spark shell de3353b [Andrew Ash] Wrap to 100char 7ebf6ef [Andrew Ash] Polish on the section on Mesos Master URLs 3dcc2c1 [Andrew Ash] Use --tgz parameter of make-distribution 41b68ed [Andrew Ash] Period at end of sentence; formatting on :5050 8bf2c53 [Andrew Ash] Update site.MESOS_VERSIOn to match /pom.xml 74f2040 [Andrew Ash] SPARK-1818 Freshen Mesos documentation (cherry picked from commit d1d41ccee49a5c093cb61c791c01f64f2076b83e) Signed-off-by: Patrick Wendell <pwendell@gmail.com>
*	SPARK-1827. LICENSE and NOTICE files need a refresh to contain transitive ↵	Sean Owen	2014-05-14	3	-6/+671
	dependency info LICENSE and NOTICE policy is explained here: http://www.apache.org/dev/licensing-howto.html http://www.apache.org/legal/3party.html This leads to the following changes. First, this change enables two extensions to maven-shade-plugin in assembly/ that will try to include and merge all NOTICE and LICENSE files. This can't hurt. This generates a consolidated NOTICE file that I manually added to NOTICE. Next, a list of all dependencies and their licenses was generated: `mvn ... license:aggregate-add-third-party` to create: `target/generated-sources/license/THIRD-PARTY.txt` Each dependency is listed with one or more licenses. Determine the most-compatible license for each if there is more than one. For "unknown" license dependencies, I manually evaluated their licenses. Many are actually Apache projects or components of projects covered already. The only non-trivial one was Colt, which has its own (compatible) license. I ignored Apache-licensed and public domain dependencies as these require no further action (beyond NOTICE above). BSD and MIT licenses (permissive Category A licenses) are evidently supposed to be mentioned in LICENSE, so I added a section without output from the THIRD-PARTY.txt file appropriately. Everything else, Category B licenses, are evidently mentioned in NOTICE (?) Same there. LICENSE contained some license statements for source code that is redistributed. I left this as I think that is the right place to put it. Author: Sean Owen <sowen@cloudera.com> Closes #770 from srowen/SPARK-1827 and squashes the following commits: a764504 [Sean Owen] Add LICENSE and NOTICE info for all transitive dependencies as of 1.0 (cherry picked from commit 2e5a7cde223c8bf6d34e46b27ac94a965441584d) Signed-off-by: Patrick Wendell <pwendell@gmail.com>
*	Fixed streaming examples docs to use run-example instead of spark-submit	Tathagata Das	2014-05-14	18	-95/+130
\| \| \| \| \| \| \| \| \| \| \| \| \|	Pretty self-explanatory Author: Tathagata Das <tathagata.das1565@gmail.com> Closes #722 from tdas/example-fix and squashes the following commits: 7839979 [Tathagata Das] Minor changes. 0673441 [Tathagata Das] Fixed java docs of java streaming example e687123 [Tathagata Das] Fixed scala style errors. 9b8d112 [Tathagata Das] Fixed streaming examples docs to use run-example instead of spark-submit.
*	[SPARK-1769] Executor loss causes NPE race condition	Andrew Or	2014-05-14	5	-26/+35
	This PR replaces the Schedulable data structures in Pool.scala with thread-safe ones from Java. Note that Scala's `with SynchronizedBuffer` trait is soon to be deprecated in 2.11 because it is ["inherently unreliable"](http://www.scala-lang.org/api/2.11.0/index.html#scala.collection.mutable.SynchronizedBuffer). We should slowly drift away from `SynchronizedBuffer` in other places too. Note that this PR introduces an API-breaking change; `sc.getAllPools` now returns an Array rather than an ArrayBuffer. This is because we want this method to return an immutable copy rather than one that may confuse users who try to modify the copy, which has no effect on the original data structure. Author: Andrew Or <andrewor14@gmail.com> Closes #762 from andrewor14/pool-npe and squashes the following commits: 383e739 [Andrew Or] JavaConverters -> JavaConversions 3f32981 [Andrew Or] Merge branch 'master' of github.com:apache/spark into pool-npe 769be19 [Andrew Or] Assorted minor changes 2189247 [Andrew Or] Merge branch 'master' of github.com:apache/spark into pool-npe 05ad9e9 [Andrew Or] Fix test - contains is not the same as containsKey 0921ea0 [Andrew Or] var -> val 07d720c [Andrew Or] Synchronize Schedulable data structures (cherry picked from commit 69f750228f3ec8537a93da08e712596fa8004143) Signed-off-by: Aaron Davidson <aaron@databricks.com>
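The two ideas in this message, a thread-safe Java collection for the live data and a copied array for callers, can be sketched as below. The class is hypothetical, pool entries are reduced to names, and the choice of `ConcurrentLinkedQueue` is an assumption of the example rather than something stated in the message.

```java
import java.util.concurrent.ConcurrentLinkedQueue;

final class PoolSnapshotSketch {
    // Thread-safe queue: concurrent adds and iteration need no external lock.
    private final ConcurrentLinkedQueue<String> pools = new ConcurrentLinkedQueue<>();

    void addPool(String name) {
        pools.add(name);
    }

    // Returns a fresh array copy; changing it does not touch the live queue.
    String[] getAllPools() {
        return pools.toArray(new String[0]);
    }

    public static void main(String[] args) {
        PoolSnapshotSketch sketch = new PoolSnapshotSketch();
        sketch.addPool("default");
        String[] snapshot = sketch.getAllPools();
        snapshot[0] = "changed";
        System.out.println(sketch.getAllPools()[0]); // the live queue is unchanged
    }
}
```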
*	Fix dep exclusion: avro-ipc, not avro, depends on netty.	Marcelo Vanzin	2014-05-14	1	-6/+4
\| \| \| \| \| \| \| \|	Author: Marcelo Vanzin <vanzin@cloudera.com> Closes #763 from vanzin/netty-dep-hell and squashes the following commits: dfb6ce2 [Marcelo Vanzin] Fix dep exclusion: avro-ipc, not avro, depends on netty.
*	SPARK-1801. expose InterruptibleIterator and TaskKilledException in deve...	Koert Kuipers	2014-05-14	2	-3/+11
\| \| \| \| \| \| \| \| \| \| \| \| \|	...loper api Author: Koert Kuipers <koert@tresata.com> Closes #764 from koertkuipers/feat-rdd-developerapi and squashes the following commits: 8516dd2 [Koert Kuipers] SPARK-1801. expose InterruptibleIterator and TaskKilledException in developer api (cherry picked from commit b22952fa1f21c0b93208846b5e1941a9d2578c6f) Signed-off-by: Aaron Davidson <aaron@databricks.com>