Commit message (Author, Date, Files changed, -Lines/+Lines)
* [SPARK-10065] [SQL] Avoid the extra copy when generating unsafe arrays (Wenchen Fan, 2015-09-10, 1 file changed, -60/+24)
  The reason for the extra copy is that we iterated the array twice: once to calculate the elements' data size and once to copy the elements into the array buffer. A simple solution is to follow `createCodeForStruct`: grow the buffer dynamically when needed, so the data size does not have to be known ahead of time. This PR also includes some typo and style fixes, and some minor refactoring to make sure `input.primitive` is always a variable name, not code, when generating unsafe code.
  Author: Wenchen Fan <cloud0fan@outlook.com> Closes #8496 from cloud-fan/avoid-copy.
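The grow-on-demand pattern this commit describes can be sketched in plain Python (purely illustrative; the actual change is in Spark's generated Java code, and the helper below is hypothetical):

```python
def write_elements(elements, initial_capacity=16):
    """Write variable-length elements into one buffer, growing it on demand
    instead of pre-computing the total size with a separate first pass."""
    buf = bytearray(initial_capacity)
    cursor = 0
    for elem in elements:
        data = elem.encode("utf-8")
        needed = cursor + len(data)
        if needed > len(buf):                  # grow: double capacity until it fits
            new_size = max(len(buf), 1)
            while new_size < needed:
                new_size *= 2
            buf.extend(bytearray(new_size - len(buf)))
        buf[cursor:cursor + len(data)] = data  # single copy per element
        cursor += len(data)
    return bytes(buf[:cursor])
```

The point is that each element is copied exactly once; the occasional buffer growth amortizes to constant cost per byte.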
* [SPARK-10497] [BUILD] [TRIVIAL] Handle both locations for JIRAError with python-jira (Holden Karau, 2015-09-10, 1 file changed, -1/+5)
  The location of JIRAError has moved between old and new versions of the python-jira package. Longer term it probably makes sense to pin to specific versions (as mentioned in https://issues.apache.org/jira/browse/SPARK-10498), but for now this makes the release tools work with both new and old versions of python-jira.
  Author: Holden Karau <holden@pigscanfly.ca> Closes #8661 from holdenk/SPARK-10497-release-utils-does-not-work-with-new-jira-python.
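Handling a class that moved between library versions usually means trying each known location in order. A small generic helper sketching that pattern (the `import_first` name and the commented-out jira module paths are made up for illustration):

```python
import importlib


def import_first(paths):
    """Try a list of 'module:attr' locations and return the first attribute
    that can be imported; raise ImportError if none can."""
    for path in paths:
        module_name, attr = path.split(":")
        try:
            return getattr(importlib.import_module(module_name), attr)
        except (ImportError, AttributeError):
            continue
    raise ImportError("none of %s could be imported" % (paths,))


# Hypothetical usage; check the module paths against your python-jira version:
# JIRAError = import_first(["jira.exceptions:JIRAError", "jira.utils:JIRAError"])
```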
* [MINOR] [MLLIB] [ML] [DOC] Fixed typo: label for negative result should be 0.0 (original: 1.0) (Sean Paradiso, 2015-09-09, 1 file changed, -1/+1)
  Small typo in the example for `LabeledPoint` in the MLlib docs.
  Author: Sean Paradiso <seanparadiso@gmail.com> Closes #8680 from sparadiso/docs_mllib_smalltypo.
* [SPARK-9772] [PYSPARK] [ML] Add Python API for ml.feature.VectorSlicer (Yanbo Liang, 2015-09-09, 1 file changed, -5/+90)
  Add a Python API for ml.feature.VectorSlicer.
  Author: Yanbo Liang <ybliang8@gmail.com> Closes #8102 from yanboliang/SPARK-9772.
* [SPARK-9730] [SQL] Add full outer join support for SortMergeJoin (Liang-Chi Hsieh, 2015-09-09, 5 files changed, -34/+259)
  This PR is based on #8383; thanks to viirya. JIRA: https://issues.apache.org/jira/browse/SPARK-9730 This patch adds full outer join support to SortMergeJoin. A new class, SortMergeFullJoinScanner, is added to scan rows from the left and right iterators. FullOuterIterator is simply a wrapper of type RowIterator that consumes joined rows from SortMergeFullJoinScanner. Closes #8383.
  Author: Liang-Chi Hsieh <viirya@appier.com> Author: Davies Liu <davies@databricks.com> Closes #8579 from davies/smj_fullouter.
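The scanner's core idea, merging two sorted inputs while emitting null-padded rows for keys present on only one side, can be sketched in Python (a simplification assuming unique, pre-sorted keys; Spark's scanner additionally buffers groups of rows with duplicate keys):

```python
def full_outer_sort_merge(left, right):
    """Full outer join of two lists of (key, value) pairs sorted by key.
    Emits (key, left_value_or_None, right_value_or_None) triples."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, lv = left[i]
        rk, rv = right[j]
        if lk == rk:               # match on both sides
            out.append((lk, lv, rv)); i += 1; j += 1
        elif lk < rk:              # left key has no right match
            out.append((lk, lv, None)); i += 1
        else:                      # right key has no left match
            out.append((rk, None, rv)); j += 1
    # One side is exhausted; the remainder of the other side is unmatched.
    out.extend((k, v, None) for k, v in left[i:])
    out.extend((k, None, v) for k, v in right[j:])
    return out
```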
* [SPARK-10461] [SQL] Make sure `input.primitive` is always a variable name, not code, in `GenerateUnsafeProjection` (Wenchen Fan, 2015-09-09, 5 files changed, -67/+75)
  When we generate unsafe code inside `createCodeForXXX`, we always assign `input.primitive` to a temp variable in case it is expression code. This PR refactors to make sure `input.primitive` is always a variable name, with some other typo and style fixes.
  Author: Wenchen Fan <cloud0fan@outlook.com> Closes #8613 from cloud-fan/minor.
* [SPARK-10481] [YARN] SPARK_PREPEND_CLASSES make spark-yarn related jar could n… (Jeff Zhang, 2015-09-09, 1 file changed, -1/+4)
  Throw a more readable exception. Please help review. Thanks.
  Author: Jeff Zhang <zjffdu@apache.org> Closes #8649 from zjffdu/SPARK-10481.
* [SPARK-10117] [MLLIB] Implement SQL data source API for reading LIBSVM data (lewuathe, 2015-09-09, 4 files changed, -0/+256)
  It is convenient to implement a data source API for the LIBSVM format, for better integration with DataFrames and the ML pipeline API. Two options are implemented:
  * `numFeatures`: specify the dimension of the features vector
  * `featuresType`: specify the type of the output vector; `sparse` is the default
  Author: lewuathe <lewuathe@me.com> Closes #8537 from Lewuathe/SPARK-10117 and squashes the following commits: 986999d [lewuathe] Change unit test phrase 11d513f [lewuathe] Fix some reviews 21600a4 [lewuathe] Merge branch 'master' into SPARK-10117 9ce63c7 [lewuathe] Rewrite service loader file 1fdd2df [lewuathe] Merge branch 'SPARK-10117' of github.com:Lewuathe/spark into SPARK-10117 ba3657c [lewuathe] Merge branch 'master' into SPARK-10117 0ea1c1c [lewuathe] LibSVMRelation is registered into META-INF 4f40891 [lewuathe] Improve test suites 5ab62ab [lewuathe] Merge branch 'master' into SPARK-10117 8660d0e [lewuathe] Fix Java unit test b56a948 [lewuathe] Merge branch 'master' into SPARK-10117 2c12894 [lewuathe] Remove unnecessary tag 7d693c2 [lewuathe] Resolv conflict 62010af [lewuathe] Merge branch 'master' into SPARK-10117 a97ee97 [lewuathe] Fix some points aef9564 [lewuathe] Fix 70ee4dd [lewuathe] Add Java test 3fd8dce [lewuathe] [SPARK-10117] Implement SQL data source API for reading LIBSVM data 40d3027 [lewuathe] Add Java test 7056d4a [lewuathe] Merge branch 'master' into SPARK-10117 99accaa [lewuathe] [SPARK-10117] Implement SQL data source API for reading LIBSVM data
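For reference, a LIBSVM line is a label followed by sparse 1-based `index:value` pairs. A minimal Python parser shows the format the data source reads (illustrative only; this is not Spark's reader, and the function name is made up):

```python
def parse_libsvm_line(line, num_features):
    """Parse one LIBSVM-format line, e.g. '1 1:0.5 3:2.0', into
    (label, dense_feature_list). Indices in the file are 1-based."""
    parts = line.strip().split()
    label = float(parts[0])
    features = [0.0] * num_features
    for item in parts[1:]:
        idx, val = item.split(":")
        features[int(idx) - 1] = float(val)  # shift to 0-based
    return label, features
```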
* [SPARK-10227] Fatal warnings with sbt on Scala 2.11 (Luc Bourlier, 2015-09-09, 60 files changed, -151/+158)
  The bulk of the changes concern the `transient` annotation on class parameters. Often the compiler doesn't generate a field for these parameters, so the transient annotation would be unnecessary; but if the class parameters are used in methods, fields are created, so it is safer to keep the annotations. The remainder are some potential bugs and deprecated syntax.
  Author: Luc Bourlier <luc.bourlier@typesafe.com> Closes #8433 from skyluc/issue/sbt-2.11.
* [SPARK-10249] [ML] [DOC] Add Python code example to StopWordsRemover user guide (Yuhao Yang, 2015-09-08, 1 file changed, -0/+19)
  JIRA: https://issues.apache.org/jira/browse/SPARK-10249 Update the user guide now that Python support has been added.
  Author: Yuhao Yang <hhbyyh@gmail.com> Closes #8620 from hhbyyh/swPyDocExample.
* [SPARK-9654] [ML] [PYSPARK] Add IndexToString to PySpark (Holden Karau, 2015-09-08, 3 files changed, -6/+73)
  Adds IndexToString to PySpark.
  Author: Holden Karau <holden@pigscanfly.ca> Closes #7976 from holdenk/SPARK-9654-add-string-indexer-inverse-in-pyspark.
* [SPARK-10094] PySpark ML feature transformers marked as experimental (noelsmith, 2015-09-08, 1 file changed, -0/+52)
  Modified class-level docstrings to mark all feature transformers in pyspark.ml as experimental.
  Author: noelsmith <mail@noelsmith.com> Closes #8623 from noel-smith/SPARK-10094-mark-pyspark-ml-trans-exp.
* [SPARK-10373] [PYSPARK] Move @since into pyspark from sql (Davies Liu, 2015-09-08, 9 files changed, -25/+23)
  cc mengxr
  Author: Davies Liu <davies@databricks.com> Closes #8657 from davies/move_since.
* [SPARK-10464] [MLLIB] Add WeibullGenerator for RandomDataGenerator (Yanbo Liang, 2015-09-08, 2 files changed, -3/+40)
  Add WeibullGenerator for RandomDataGenerator; #8611 needs WeibullGenerator to generate random data based on the Weibull distribution.
  Author: Yanbo Liang <ybliang8@gmail.com> Closes #8622 from yanboliang/spark-10464.
* [SPARK-9834] [MLLIB] Implement weighted least squares via normal equation (Xiangrui Meng, 2015-09-08, 4 files changed, -1/+438)
  The goal of this PR is a weighted least squares implementation that takes the normal-equation approach, and hence can provide R-like summary statistics and support IRLS (used by GLMs). The tests match R's lm and glmnet. There are a couple of TODOs that can be addressed in future PRs:
  * consolidate summary statistics aggregators
  * move `dspr` to `BLAS`
  * etc.
  It would be nice to have this merged first because it blocks a couple of other features. dbtsai
  Author: Xiangrui Meng <meng@databricks.com> Closes #8588 from mengxr/SPARK-9834.
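The normal-equation approach solves (X^T W X) beta = X^T W y directly instead of iterating over the data with gradient steps. A tiny pure-Python sketch of that idea (no regularization, statistics, or numerical hardening, all of which the real implementation needs):

```python
def weighted_least_squares(X, y, w):
    """Solve (X^T W X) beta = X^T W y for beta, where X is a list of rows,
    y the targets, and w per-instance weights, via Gaussian elimination."""
    n, p = len(X), len(X[0])
    # Build the normal-equation system A beta = b.
    A = [[sum(w[k] * X[k][i] * X[k][j] for k in range(n)) for j in range(p)]
         for i in range(p)]
    b = [sum(w[k] * X[k][i] * y[k] for k in range(n)) for i in range(p)]
    # Gaussian elimination with partial pivoting.
    for col in range(p):
        piv = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, p):
            f = A[r][col] / A[col][col]
            for c in range(col, p):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    beta = [0.0] * p
    for i in reversed(range(p)):
        beta[i] = (b[i] - sum(A[i][j] * beta[j] for j in range(i + 1, p))) / A[i][i]
    return beta
```

With an intercept column of ones as the first feature, this recovers the usual (weighted) linear fit in a single solve.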
* [SPARK-10071] [STREAMING] Output a warning when writing QueueInputDStream and throw a better exception when reading QueueInputDStream (zsxwing, 2015-09-08, 3 files changed, -13/+30)
  Output a warning when serializing QueueInputDStream rather than throwing an exception, to allow unit tests to use it. Moreover, this PR also throws a better exception when deserializing QueueInputDStream, so that the user can find the problem easily. The previous exception was hard to understand: https://issues.apache.org/jira/browse/SPARK-8553
  Author: zsxwing <zsxwing@gmail.com> Closes #8624 from zsxwing/SPARK-10071 and squashes the following commits: 847cfa8 [zsxwing] Output a warning when writing QueueInputDStream and throw a better exception when reading QueueInputDStream
* [RELEASE] Add more contributors & only show names in release notes (Reynold Xin, 2015-09-08, 2 files changed, -8/+39)
  Author: Reynold Xin <rxin@databricks.com> Closes #8660 from rxin/contrib.
* [HOTFIX] Fix build break caused by #8494 (Michael Armbrust, 2015-09-08, 1 file changed, -2/+2)
  Author: Michael Armbrust <michael@databricks.com> Closes #8659 from marmbrus/testBuildBreak.
* [SPARK-10327] [SQL] Cache Table is not working while subquery has alias in its project list (Cheng Hao, 2015-09-08, 2 files changed, -3/+28)

  ```scala
  import org.apache.spark.sql.hive.execution.HiveTableScan
  sql("select key, value, key + 1 from src").registerTempTable("abc")
  cacheTable("abc")

  val sparkPlan = sql(
    """select a.key, b.key, c.key from
      |abc a join abc b on a.key=b.key
      |join abc c on a.key=c.key""".stripMargin).queryExecution.sparkPlan

  assert(sparkPlan.collect { case e: InMemoryColumnarTableScan => e }.size === 3) // failed
  assert(sparkPlan.collect { case e: HiveTableScan => e }.size === 0) // failed
  ```

  The actual plan is:

  ```
  == Parsed Logical Plan ==
  'Project [unresolvedalias('a.key),unresolvedalias('b.key),unresolvedalias('c.key)]
   'Join Inner, Some(('a.key = 'c.key))
    'Join Inner, Some(('a.key = 'b.key))
     'UnresolvedRelation [abc], Some(a)
     'UnresolvedRelation [abc], Some(b)
    'UnresolvedRelation [abc], Some(c)

  == Analyzed Logical Plan ==
  key: int, key: int, key: int
  Project [key#14,key#61,key#66]
   Join Inner, Some((key#14 = key#66))
    Join Inner, Some((key#14 = key#61))
     Subquery a
      Subquery abc
       Project [key#14,value#15,(key#14 + 1) AS _c2#16]
        MetastoreRelation default, src, None
     Subquery b
      Subquery abc
       Project [key#61,value#62,(key#61 + 1) AS _c2#58]
        MetastoreRelation default, src, None
    Subquery c
     Subquery abc
      Project [key#66,value#67,(key#66 + 1) AS _c2#63]
       MetastoreRelation default, src, None

  == Optimized Logical Plan ==
  Project [key#14,key#61,key#66]
   Join Inner, Some((key#14 = key#66))
    Project [key#14,key#61]
     Join Inner, Some((key#14 = key#61))
      Project [key#14]
       InMemoryRelation [key#14,value#15,_c2#16], true, 10000, StorageLevel(true, true, false, true, 1), (Project [key#14,value#15,(key#14 + 1) AS _c2#16]), Some(abc)
      Project [key#61]
       MetastoreRelation default, src, None
    Project [key#66]
     MetastoreRelation default, src, None

  == Physical Plan ==
  TungstenProject [key#14,key#61,key#66]
   BroadcastHashJoin [key#14], [key#66], BuildRight
    TungstenProject [key#14,key#61]
     BroadcastHashJoin [key#14], [key#61], BuildRight
      ConvertToUnsafe
       InMemoryColumnarTableScan [key#14], (InMemoryRelation [key#14,value#15,_c2#16], true, 10000, StorageLevel(true, true, false, true, 1), (Project [key#14,value#15,(key#14 + 1) AS _c2#16]), Some(abc))
      ConvertToUnsafe
       HiveTableScan [key#61], (MetastoreRelation default, src, None)
      ConvertToUnsafe
       HiveTableScan [key#66], (MetastoreRelation default, src, None)
  ```

  Author: Cheng Hao <hao.cheng@intel.com> Closes #8494 from chenghao-intel/weird_cache.
* [SPARK-10492] [STREAMING] [DOCUMENTATION] Update Streaming documentation about rate limiting and backpressure (Tathagata Das, 2015-09-08, 2 files changed, -1/+25)
  Author: Tathagata Das <tathagata.das1565@gmail.com> Closes #8656 from tdas/SPARK-10492 and squashes the following commits: 986cdd6 [Tathagata Das] Added information on backpressure
* [SPARK-10468] [MLLIB] Verify schema before DataFrame select API call (Vinod K C, 2015-09-08, 2 files changed, -5/+2)
  Loader.checkSchema was called to verify the schema after dataframe.select(...); schema verification should be done before dataframe.select(...).
  Author: Vinod K C <vinod.kc@huawei.com> Closes #8636 from vinodkc/fix_GaussianMixtureModel_load_verification.
* [SPARK-10441] [SQL] Save data correctly to JSON (Yin Huai, 2015-09-08, 9 files changed, -8/+205)
  https://issues.apache.org/jira/browse/SPARK-10441
  Author: Yin Huai <yhuai@databricks.com> Closes #8597 from yhuai/timestampJson.
* [SPARK-10470] [ML] ml.IsotonicRegressionModel.copy should set parent (Yanbo Liang, 2015-09-08, 2 files changed, -1/+6)
  A copied model must have the same parent, but ml.IsotonicRegressionModel.copy did not set the parent. This fixes it and adds a test case.
  Author: Yanbo Liang <ybliang8@gmail.com> Closes #8637 from yanboliang/spark-10470.
* [SPARK-10316] [SQL] Respect nondeterministic expressions in PhysicalOperation (Wenchen Fan, 2015-09-08, 2 files changed, -30/+20)
  We did a lot of special handling for non-deterministic expressions in `Optimizer`, but `PhysicalOperation` just collects all Projects and Filters and messes that up. We should respect the operator order imposed by non-deterministic expressions in `PhysicalOperation`.
  Author: Wenchen Fan <cloud0fan@outlook.com> Closes #8486 from cloud-fan/fix.
* [SPARK-10480] [ML] Fix ML.LinearRegressionModel.copy() (Yanbo Liang, 2015-09-08, 2 files changed, -2/+4)
  This PR fixes two model `copy()` related issues. [SPARK-10480](https://issues.apache.org/jira/browse/SPARK-10480): `ML.LinearRegressionModel.copy()` ignored the argument `extra`, so setting that parameter had no effect. [SPARK-10479](https://issues.apache.org/jira/browse/SPARK-10479): `ML.LogisticRegressionModel.copy()` should copy the model summary if available.
  Author: Yanbo Liang <ybliang8@gmail.com> Closes #8641 from yanboliang/linear-regression-copy.
* [SPARK-9170] [SQL] Use OrcStructInspector to be case preserving when writing ORC files (Liang-Chi Hsieh, 2015-09-08, 2 files changed, -21/+40)
  JIRA: https://issues.apache.org/jira/browse/SPARK-9170 `StandardStructObjectInspector` will implicitly lowercase column names, but the ORC format doesn't have such a requirement. In fact, there is an `OrcStructInspector` specified for the ORC format; we should use it when serializing rows to ORC files, making writes case preserving.
  Author: Liang-Chi Hsieh <viirya@appier.com> Closes #7520 from viirya/use_orcstruct.
* Docs small fixes (Jacek Laskowski, 2015-09-08, 2 files changed, -19/+19)
  Author: Jacek Laskowski <jacek@japila.pl> Closes #8629 from jaceklaskowski/docs-fixes.
* [DOC] Added R to the list of languages with "high-level API" support in the main README (Stephen Hopper, 2015-09-08, 2 files changed, -10/+10)
  Author: Stephen Hopper <shopper@shopper-osx.local> Closes #8646 from enragedginger/master.
* [SPARK-9767] Remove ConnectionManager (Reynold Xin, 2015-09-07, 21 files changed, -3855/+651)
  We introduced the Netty network module for shuffle in Spark 1.2 and have had it on by default for three releases. The old ConnectionManager is difficult to maintain. If we merge the patch now, by the time it is released ConnectionManager will have been off by default for a year. It's time to remove it.
  Author: Reynold Xin <rxin@databricks.com> Closes #8161 from rxin/SPARK-9767.
* [SPARK-10013] [ML] [JAVA] [TEST] Remove java assert from Java unit tests (Holden Karau, 2015-09-05, 4 files changed, -52/+54)
  From JIRA: we should use assertTrue, etc. instead, to make sure the asserts are not ignored in tests.
  Author: Holden Karau <holden@pigscanfly.ca> Closes #8607 from holdenk/SPARK-10013-remove-java-assert-from-java-unit-tests.
* [SPARK-10434] [SQL] Fixes Parquet schema of arrays that may contain null (Cheng Lian, 2015-09-05, 2 files changed, -9/+10)
  To keep full compatibility of the Parquet write path with Spark 1.4, we should rename the innermost field name of arrays that may contain null from "array_element" to "array". Please refer to [SPARK-10434] [1] for more details. [1]: https://issues.apache.org/jira/browse/SPARK-10434
  Author: Cheng Lian <lian@databricks.com> Closes #8586 from liancheng/spark-10434/fix-parquet-array-type.
* [SPARK-10440] [STREAMING] [DOCS] Update Python API stuff in the programming guides and Python docs (Tathagata Das, 2015-09-04, 4 files changed, -12/+33)
  - Fixed information around Python API tags in streaming programming guides
  - Added missing stuff in Python docs
  Author: Tathagata Das <tathagata.das1565@gmail.com> Closes #8595 from tdas/SPARK-10440.
* [HOTFIX] [SQL] Fixes compilation error (Cheng Lian, 2015-09-04, 1 file changed, -1/+1)
  Jenkins master builders are currently broken by a merge conflict between PR #8584 and PR #8155.
  Author: Cheng Lian <lian@databricks.com> Closes #8614 from liancheng/hotfix/fix-pr-8155-8584-conflict.
* [SPARK-9925] [SQL] [TESTS] Set SQLConf.SHUFFLE_PARTITIONS.key correctly for tests (Yin Huai, 2015-09-04, 7 files changed, -21/+90)
  This PR fixes the failed test and the conflict for #8155. https://issues.apache.org/jira/browse/SPARK-9925 Closes #8155.
  Author: Yin Huai <yhuai@databricks.com> Author: Davies Liu <davies@databricks.com> Closes #8602 from davies/shuffle_partitions.
* [SPARK-10402] [DOCS] [ML] Add defaults to the scaladoc for params in ml/ (Holden Karau, 2015-09-04, 10 files changed, -2/+16)
  We should make sure the scaladoc for params includes their default values throughout the models in ml/.
  Author: Holden Karau <holden@pigscanfly.ca> Closes #8591 from holdenk/SPARK-10402-add-scaladoc-for-default-values-of-params-in-ml.
* [SPARK-10311] [STREAMING] Reload appId and attemptId when app starts with checkpoint file in cluster mode (xutingjun, 2015-09-04, 1 file changed, -0/+2)
  Author: xutingjun <xutingjun@huawei.com> Closes #8477 from XuTingjun/streaming-attempt.
* [SPARK-10454] [SPARK CORE] Wait for empty event queue (robbins, 2015-09-04, 1 file changed, -0/+1)
  Author: robbins <robbins@uk.ibm.com> Closes #8605 from robbinspg/DAGSchedulerSuite-fix.
* [SPARK-9669] [MESOS] Support PySpark on Mesos cluster mode (Timothy Chen, 2015-09-04, 3 files changed, -16/+41)
  Support running PySpark in cluster mode on Mesos! This doesn't upload any scripts, so running against a remote Mesos cluster requires the user to specify the script via an accessible URI.
  Author: Timothy Chen <tnachen@gmail.com> Closes #8349 from tnachen/mesos_python.
* [SPARK-10450] [SQL] Minor improvements to readability / style / typos etc. (Andrew Or, 2015-09-04, 5 files changed, -15/+15)
  Author: Andrew Or <andrew@databricks.com> Closes #8603 from andrewor14/minor-sql-changes.
* [SPARK-10176] [SQL] Show partially analyzed plans when checkAnswer fails to analyze (Wenchen Fan, 2015-09-04, 90 files changed, -999/+908)
  This PR takes over https://github.com/apache/spark/pull/8389. It improves `checkAnswer` to print the partially analyzed plan in addition to the user-friendly error message, in order to aid debugging failing tests. In doing so, I ran into a conflict with the various ways that we bring a SQLContext into the tests: depending on the trait, we refer to the current context as `sqlContext`, `_sqlContext`, `ctx` or `hiveContext`, with access modifiers `public`, `protected` and `private` depending on the defining class. I propose we refactor as follows:
  1. All tests should only refer to a `protected sqlContext` when testing general features, and a `protected hiveContext` when a method only exists on `HiveContext`.
  2. All tests should only import `testImplicits._` (i.e., don't import `TestHive.implicits._`).
  Author: Wenchen Fan <cloud0fan@outlook.com> Closes #8584 from cloud-fan/cleanupTests.
* MAINTENANCE: Automated closing of pull requests (Michael Armbrust, 2015-09-04, 0 files changed, -0/+0)
  This commit exists to close the following pull requests on GitHub: Closes #1890 (requested by andrewor14, JoshRosen) Closes #3558 (requested by JoshRosen, marmbrus) Closes #3890 (requested by marmbrus) Closes #3895 (requested by andrewor14, marmbrus) Closes #4055 (requested by andrewor14) Closes #4105 (requested by andrewor14) Closes #4812 (requested by marmbrus) Closes #5109 (requested by andrewor14) Closes #5178 (requested by andrewor14) Closes #5298 (requested by marmbrus) Closes #5393 (requested by marmbrus) Closes #5449 (requested by andrewor14) Closes #5468 (requested by marmbrus) Closes #5715 (requested by marmbrus) Closes #6192 (requested by marmbrus) Closes #6319 (requested by marmbrus) Closes #6326 (requested by marmbrus) Closes #6349 (requested by marmbrus) Closes #6380 (requested by andrewor14) Closes #6554 (requested by marmbrus) Closes #6696 (requested by marmbrus) Closes #6868 (requested by marmbrus) Closes #6951 (requested by marmbrus) Closes #7129 (requested by marmbrus) Closes #7188 (requested by marmbrus) Closes #7358 (requested by marmbrus) Closes #7379 (requested by marmbrus) Closes #7628 (requested by marmbrus) Closes #7715 (requested by marmbrus) Closes #7782 (requested by marmbrus) Closes #7914 (requested by andrewor14) Closes #8051 (requested by andrewor14) Closes #8269 (requested by andrewor14) Closes #8448 (requested by andrewor14) Closes #8576 (requested by andrewor14)
* [MINOR] Minor style fix in SparkR (Shivaram Venkataraman, 2015-09-04, 1 file changed, -1/+1)
  `dev/lintr-r` passes on my machine now.
  Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu> Closes #8601 from shivaram/sparkr-style-fix.
* [SPARK-10003] Improve readability of DAGScheduler (Andrew Or, 2015-09-03, 1 file changed, -37/+9)
  Note: this is not intended to be in Spark 1.5! This patch rewrites some code in the `DAGScheduler` to make it more readable. In particular:
  - there were blocks of code that were unnecessary and have been removed for simplicity
  - there were abstractions that were unnecessary and made the code hard to navigate
  - other minor changes
  Author: Andrew Or <andrew@databricks.com> Closes #8217 from andrewor14/dag-scheduler-readability and squashes the following commits: 57abca3 [Andrew Or] Move comment back into if case 574fb1e [Andrew Or] Merge branch 'master' of github.com:apache/spark into dag-scheduler-readability 64a9ed2 [Andrew Or] Remove unnecessary code + minor code rewrites
* [SPARK-10421] [BUILD] Exclude curator artifacts from tachyon dependencies (Marcelo Vanzin, 2015-09-03, 1 file changed, -0/+8)
  This avoids them being mistakenly pulled in instead of the newer ones that Spark actually uses. Spark only depends on these artifacts transitively, so sometimes maven just decides to pick tachyon's version of the dependency, for whatever reason.
  Author: Marcelo Vanzin <vanzin@cloudera.com> Closes #8577 from vanzin/SPARK-10421.
* [SPARK-10435] Spark submit should fail fast for Mesos cluster mode with R (Andrew Or, 2015-09-03, 1 file changed, -0/+3)
  It's not supported yet, so we should error with a clear message.
  Author: Andrew Or <andrew@databricks.com> Closes #8590 from andrewor14/mesos-cluster-r-guard.
* [SPARK-9591] [CORE] Job may fail due to an exception while getting a remote block (jeanlyn, 2015-09-03, 3 files changed, -2/+80)
  [SPARK-9591](https://issues.apache.org/jira/browse/SPARK-9591) When getting a broadcast variable, we can fetch the block from several locations, but currently connecting to a lost block manager (one idle long enough to be removed by the driver, e.g. when using dynamic resource allocation) causes the task to fail, and in the worst case causes the job to fail.
  Author: jeanlyn <jeanlyn92@gmail.com> Closes #7927 from jeanlyn/catch_exception.
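The fix's idea is to catch the per-location failure and move on to the next replica instead of failing the task on the first dead block manager. A Python sketch of that retry-over-locations loop (illustrative; the helper name and error handling are made up, not Spark's code):

```python
def fetch_from_any(locations, fetch):
    """Try each replica location in turn, tolerating per-location failures;
    fail only when every location has failed."""
    errors = []
    for loc in locations:
        try:
            return fetch(loc)
        except Exception as e:  # e.g. a connection error to a removed executor
            errors.append((loc, e))
    raise RuntimeError("block unavailable at all locations: %s" % errors)
```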
* [SPARK-10430] [CORE] Added hashCode methods in AccumulableInfo and RDDOperationScope (Vinod K C, 2015-09-03, 4 files changed, -1/+26)
  Author: Vinod K C <vinod.kc@huawei.com> Closes #8581 from vinodkc/fix_RDDOperationScope_Hashcode.
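The motivation behind adding those methods is the equals/hashCode contract: a class that defines equality must hash equal objects identically, or hash-based collections misbehave. A Python analogue of the same rule (the class and field names below are illustrative, not AccumulableInfo's actual fields):

```python
class AccumInfo:
    """Value object whose __hash__ is kept consistent with __eq__."""

    def __init__(self, acc_id, name, value):
        self.acc_id, self.name, self.value = acc_id, name, value

    def __eq__(self, other):
        return (isinstance(other, AccumInfo)
                and (self.acc_id, self.name, self.value)
                == (other.acc_id, other.name, other.value))

    def __hash__(self):
        # Hash exactly the fields compared by __eq__.
        return hash((self.acc_id, self.name, self.value))
```

Without `__hash__`, two equal instances could land in different hash buckets, so sets and dict keys would treat them as distinct.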
* [SPARK-9672] [MESOS] Don't include SPARK_ENV_LOADED when passing env vars (Pat Shields, 2015-09-03, 2 files changed, -4/+25)
  This contribution is my original work and I license the work to the project under the project's open source license.
  Author: Pat Shields <yeoldefortran@gmail.com> Closes #7979 from pashields/env-loading-on-driver.
* [SPARK-9869] [STREAMING] Wait for all event notifications before asserting results (robbins, 2015-09-03, 1 file changed, -0/+3)
  Author: robbins <robbins@uk.ibm.com> Closes #8589 from robbinspg/InputStreamSuite-fix.
* [SPARK-10431] [CORE] Fix intermittent test failure: wait for event queue to be clear (robbins, 2015-09-03, 1 file changed, -0/+4)
  Author: robbins <robbins@uk.ibm.com> Closes #8582 from robbinspg/InputOutputMetricsSuite.