spark - Mirror of Apache Spark

	Commit message	Author	Age	Files	Lines
*	[SPARK-4085] Propagate FetchFailedException when Spark fails to read local ↵	Reynold Xin	2014-12-03	3	-13/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	shuffle file. cc aarondav kayousterhout pwendell This should go into 1.2? Author: Reynold Xin <rxin@databricks.com> Closes #3579 from rxin/SPARK-4085 and squashes the following commits: 255b4fd [Reynold Xin] Updated test. f9814d9 [Reynold Xin] Code review feedback. 2afaf35 [Reynold Xin] [SPARK-4085] Propagate FetchFailedException when Spark fails to read local shuffle file.
*	[SPARK-4498][core] Don't transition ExecutorInfo to RUNNING until Driver ↵	Mark Hamstra	2014-12-03	2	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	adds Executor The ExecutorInfo only reaches the RUNNING state if the Driver is alive to send the ExecutorStateChanged message to master. Else, appInfo.resetRetryCount() is never called and failing Executors will eventually exceed ApplicationState.MAX_NUM_RETRY, resulting in the application being removed from the master's accounting. JoshRosen Author: Mark Hamstra <markhamstra@gmail.com> Closes #3550 from markhamstra/SPARK-4498 and squashes the following commits: 8f543b1 [Mark Hamstra] Don't transition ExecutorInfo to RUNNING until Executor is added by Driver
*	[SPARK-4552][SQL] Avoid exception when reading empty parquet data through Hive	Michael Armbrust	2014-12-03	3	-45/+62
\| \| \| \| \| \| \| \| \| \| \|	This is a very small fix that catches one specific exception and returns an empty table. #3441 will address this in a more principled way. Author: Michael Armbrust <michael@databricks.com> Closes #3586 from marmbrus/fixEmptyParquet and squashes the following commits: 2781d9f [Michael Armbrust] Handle empty lists for newParquet 04dd376 [Michael Armbrust] Avoid exception when reading empty parquet data through Hive
*	[HOT FIX] [YARN] Check whether `/lib` exists before listing its files	Andrew Or	2014-12-03	1	-12/+15
\| \| \| \| \| \| \| \| \| \|	This is caused by a975dc32799bb8a14f9e1c76defaaa7cfbaf8b53 Author: Andrew Or <andrew@databricks.com> Closes #3589 from andrewor14/yarn-hot-fix and squashes the following commits: a4fad5f [Andrew Or] Check whether lib directory exists before listing its files
*	[SPARK-4642] Add description about spark.yarn.queue to running-on-YARN document.	Masayoshi TSUZUKI	2014-12-03	1	-1/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Added a description of this parameter: - spark.yarn.queue Modified the description of the default value of this parameter: - spark.yarn.submit.file.replication Author: Masayoshi TSUZUKI <tsudukim@oss.nttdata.co.jp> Closes #3500 from tsudukim/feature/SPARK-4642 and squashes the following commits: ce99655 [Masayoshi TSUZUKI] better gramatically. 21cf624 [Masayoshi TSUZUKI] Removed intentionally undocumented properties. 88cac9b [Masayoshi TSUZUKI] [SPARK-4642] Documents about running-on-YARN needs update
*	[SPARK-4715][Core] Make sure tryToAcquire won't return a negative value	zsxwing	2014-12-03	2	-3/+19
\| \| \| \| \| \| \| \| \| \|	ShuffleMemoryManager.tryToAcquire may return a negative value. The unit test demonstrates this bug. It will output `0 did not equal -200 granted is negative`. Author: zsxwing <zsxwing@gmail.com> Closes #3575 from zsxwing/SPARK-4715 and squashes the following commits: a193ae6 [zsxwing] Make sure tryToAcquire won't return a negative value
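The negative grant described in SPARK-4715 can be illustrated with a small model. This is only a sketch and not Spark's `ShuffleMemoryManager`: the function names, the fair-share rule, and the numbers are assumptions, chosen so that the unclamped version reproduces a grant of -200.

```python
def grant_unclamped(requested, held, pool_size, active_threads):
    """Hypothetical model: each thread may hold at most pool_size / active_threads."""
    fair_share = pool_size // active_threads
    # If the thread already holds more than its fair share, this goes negative.
    return min(requested, fair_share - held)


def grant_clamped(requested, held, pool_size, active_threads):
    """Same model, but the headroom is clamped at zero before granting."""
    fair_share = pool_size // active_threads
    return min(requested, max(0, fair_share - held))


print(grant_unclamped(100, 700, 1000, 2))  # -200
print(grant_clamped(100, 700, 1000, 2))    # 0
```

Clamping the headroom rather than the final result keeps the "grant at most what was requested" rule unchanged for the normal case.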
*	[SPARK-4701] Typo in sbt/sbt	Masayoshi TSUZUKI	2014-12-03	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \|	Modified typo. Author: Masayoshi TSUZUKI <tsudukim@oss.nttdata.co.jp> Closes #3560 from tsudukim/feature/SPARK-4701 and squashes the following commits: ed2a3f1 [Masayoshi TSUZUKI] Another whitespace position error. 1af3a35 [Masayoshi TSUZUKI] [SPARK-4701] Typo in sbt/sbt
*	SPARK-2624 add datanucleus jars to the container in yarn-cluster	Jim Lim	2014-12-03	3	-0/+157
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If `spark-submit` finds the datanucleus jars, it adds them to the driver's classpath, but does not add it to the container. This patch modifies the yarn deployment class to copy all `datanucleus-*` jars found in `[spark-home]/libs` to the container. Author: Jim Lim <jim@quixey.com> Closes #3238 from jimjh/SPARK-2624 and squashes the following commits: 3633071 [Jim Lim] SPARK-2624 update documentation and comments fe95125 [Jim Lim] SPARK-2624 keep java imports together 6c31fe0 [Jim Lim] SPARK-2624 update documentation 6690fbf [Jim Lim] SPARK-2624 add tests d28d8e9 [Jim Lim] SPARK-2624 add spark.yarn.datanucleus.dir option 84e6cba [Jim Lim] SPARK-2624 add datanucleus jars to the container in yarn-cluster
*	[SPARK-4717][MLlib] Optimize BLAS library to avoid de-reference multiple ↵	DB Tsai	2014-12-03	1	-39/+60
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	times in loop Have a local reference to `values` and `indices` array in the `Vector` object so JVM can locate the value with one operation call. See `SPARK-4581` for similar optimization, and the bytecode analysis. Author: DB Tsai <dbtsai@alpinenow.com> Closes #3577 from dbtsai/blasopt and squashes the following commits: 62d38c4 [DB Tsai] formating 0316cef [DB Tsai] first commit
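The idea of taking local references before a loop can be sketched as follows. The commit is about JVM bytecode; this Python version is only an analogy, and the `SparseVector` class and `dot_sparse_dense` function here are hypothetical stand-ins, not MLlib code.

```python
class SparseVector:
    """Hypothetical minimal sparse vector: parallel arrays of indices and values."""

    def __init__(self, size, indices, values):
        self.size = size
        self.indices = indices
        self.values = values


def dot_sparse_dense(x, y):
    """Dot product of a sparse vector x with a dense list y."""
    # Look the fields up once, outside the loop, instead of
    # dereferencing x.indices and x.values on every iteration.
    indices = x.indices
    values = x.values
    total = 0.0
    for k in range(len(indices)):
        total += values[k] * y[indices[k]]
    return total


x = SparseVector(4, [0, 2], [1.0, 3.0])
print(dot_sparse_dense(x, [2.0, 5.0, 4.0, 1.0]))  # 14.0
```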
*	[SPARK-4708][MLLib] Make k-mean runs two/three times faster with ↵	DB Tsai	2014-12-03	5	-68/+70
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	dense/sparse sample Note that the usage of `breezeSquaredDistance` in `org.apache.spark.mllib.util.MLUtils.fastSquaredDistance` is in the critical path, and `breezeSquaredDistance` is slow. We should replace it with our own implementation. Here is the benchmark against mnist8m dataset. Before DenseVector: 70.04secs SparseVector: 59.05secs With this PR DenseVector: 30.58secs SparseVector: 21.14secs Author: DB Tsai <dbtsai@alpinenow.com> Closes #3565 from dbtsai/kmean and squashes the following commits: 08bc068 [DB Tsai] restyle de24662 [DB Tsai] address feedback b185a77 [DB Tsai] cleanup 4554ddd [DB Tsai] first commit
*	[SPARK-4710] [mllib] Eliminate MLlib compilation warnings	Joseph K. Bradley	2014-12-03	2	-8/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Renamed StreamingKMeans to StreamingKMeansExample to avoid warning about name conflict with StreamingKMeans class. Added import to DecisionTreeRunner to eliminate warning. CC: mengxr Author: Joseph K. Bradley <joseph@databricks.com> Closes #3568 from jkbradley/ml-compilation-warnings and squashes the following commits: 64d6bc4 [Joseph K. Bradley] Updated DecisionTreeRunner.scala and StreamingKMeans.scala to eliminate compilation warnings, including renaming StreamingKMeans to StreamingKMeansExample.
*	[SPARK-4397][Core] Change the 'since' value of '@deprecated' to '1.3.0'	zsxwing	2014-12-03	1	-18/+18
\| \| \| \| \| \| \| \| \| \|	As #3262 wasn't merged to branch 1.2, the `since` value of `deprecated` should be '1.3.0'. Author: zsxwing <zsxwing@gmail.com> Closes #3573 from zsxwing/SPARK-4397-version and squashes the following commits: 1daa03c [zsxwing] Change the 'since' value to '1.3.0'
*	[SPARK-4672][Core]Checkpoint() should clear f to shorten the serialization chain	JerryLead	2014-12-02	1	-3/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The related JIRA is https://issues.apache.org/jira/browse/SPARK-4672 The f closure of `PartitionsRDD(ZippedPartitionsRDD2)` contains a `$outer` that references EdgeRDD/VertexRDD, which causes the task's serialization chain to become very long in iterative GraphX applications. As a result, a StackOverflow error will occur. If we set "f = null" in `clearDependencies()`, checkpoint() can cut off the long serialization chain. More details and explanation can be found in the JIRA. Author: JerryLead <JerryLead@163.com> Author: Lijie Xu <csxulijie@gmail.com> Closes #3545 from JerryLead/my_core and squashes the following commits: f7faea5 [JerryLead] checkpoint() should clear the f to avoid StackOverflow error c0169da [JerryLead] Merge branch 'master' of https://github.com/apache/spark 52799e3 [Lijie Xu] Merge pull request #1 from apache/master
*	[SPARK-4672][GraphX]Non-transient PartitionsRDDs will lead to StackOverflow ↵	JerryLead	2014-12-02	2	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	error The related JIRA is https://issues.apache.org/jira/browse/SPARK-4672 In a nutshell, if `val partitionsRDD` in EdgeRDDImpl and VertexRDDImpl are non-transient, the serialization chain can become very long in iterative algorithms and finally lead to the StackOverflow error. More details and explanation can be found in the JIRA. Author: JerryLead <JerryLead@163.com> Author: Lijie Xu <csxulijie@gmail.com> Closes #3544 from JerryLead/my_graphX and squashes the following commits: 628f33c [JerryLead] set PartitionsRDD to be transient in EdgeRDDImpl and VertexRDDImpl c0169da [JerryLead] Merge branch 'master' of https://github.com/apache/spark 52799e3 [Lijie Xu] Merge pull request #1 from apache/master
*	[SPARK-4672][GraphX]Perform checkpoint() on PartitionsRDD to shorten the lineage	JerryLead	2014-12-02	2	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The related JIRA is https://issues.apache.org/jira/browse/SPARK-4672 Iterative GraphX applications always have long lineage, while checkpoint() on EdgeRDD and VertexRDD themselves cannot shorten the lineage. In contrast, if we perform checkpoint() on their PartitionsRDD, the long lineage can be cut off. Moreover, the existing operations such as cache() in this code are performed on the PartitionsRDD, so checkpoint() should work the same way. More details and explanation can be found in the JIRA. Author: JerryLead <JerryLead@163.com> Author: Lijie Xu <csxulijie@gmail.com> Closes #3549 from JerryLead/my_graphX_checkpoint and squashes the following commits: d1aa8d8 [JerryLead] Perform checkpoint() on PartitionsRDD not VertexRDD and EdgeRDD themselves ff08ed4 [JerryLead] Merge branch 'master' of https://github.com/apache/spark c0169da [JerryLead] Merge branch 'master' of https://github.com/apache/spark 52799e3 [Lijie Xu] Merge pull request #1 from apache/master
*	[Release] Translate unknown author names automatically	Andrew Or	2014-12-02	2	-18/+111
\|
*	Minor nit style cleanup in GraphX.	Reynold Xin	2014-12-02	1	-1/+1
\|
*	[SPARK-4695][SQL] Get result using executeCollect	wangfei	2014-12-02	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \|	Use `executeCollect` to collect the result, because executeCollect is Spark SQL's custom implementation of collect, which is better than the RDD's collect. Author: wangfei <wangfei1@huawei.com> Closes #3547 from scwf/executeCollect and squashes the following commits: a5ab68e [wangfei] Revert "adding debug info" a60d680 [wangfei] fix test failure 0db7ce8 [wangfei] adding debug info 184c594 [wangfei] using executeCollect instead collect
*	[SPARK-4670] [SQL] wrong symbol for bitwise not	Daoyuan Wang	2014-12-02	2	-10/+25
\| \| \| \| \| \| \| \| \| \| \| \|	We should use `~` instead of `-` for bitwise NOT. Author: Daoyuan Wang <daoyuan.wang@intel.com> Closes #3528 from adrian-wang/symbol and squashes the following commits: affd4ad [Daoyuan Wang] fix code gen test case 56efb79 [Daoyuan Wang] ensure bitwise NOT over byte and short persist data type f55fbae [Daoyuan Wang] wrong symbol for bitwise not
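The difference between the two symbols is easy to check on two's-complement integers. The snippet below is a plain Python illustration, not Spark SQL code; the helper names are made up for the example.

```python
def bitwise_not(x):
    """Flip every bit of x (two's-complement semantics)."""
    return ~x


def negate(x):
    """Arithmetic negation, which is what `-` computes."""
    return -x


print(bitwise_not(5))  # -6
print(negate(5))       # -5

# The two are related by ~x == -x - 1, so they never agree.
print(bitwise_not(5) == negate(5) - 1)  # True

# Viewed as an unsigned 8-bit pattern: 00000101 becomes 11111010.
print(format(bitwise_not(5) & 0xFF, "08b"))  # 11111010
```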
*	[SPARK-4593][SQL] Return null when denominator is 0	Daoyuan Wang	2014-12-02	4	-5/+83
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	SELECT max(1/0) FROM src would return a very large number, which is obviously not right. For hive-0.12, hive would return `Infinity` for 1/0, while for hive-0.13.1, it is `NULL` for 1/0. I think it is better to keep our behavior consistent with the newer Hive version. This PR ensures that when the divisor is 0, the result of the expression is NULL, the same as in hive-0.13.1. Author: Daoyuan Wang <daoyuan.wang@intel.com> Closes #3443 from adrian-wang/div and squashes the following commits: 2e98677 [Daoyuan Wang] fix code gen for divide 0 85c28ba [Daoyuan Wang] temp 36236a5 [Daoyuan Wang] add test cases 6f5716f [Daoyuan Wang] fix comments cee92bd [Daoyuan Wang] avoid evaluation 2 times 22ecd9a [Daoyuan Wang] fix style cf28c58 [Daoyuan Wang] divide fix 2dfe50f [Daoyuan Wang] return null when divider is 0 of Double type
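The NULL-on-zero-divisor behavior can be modeled in a few lines. This is a sketch under assumptions, not the Catalyst implementation: `sql_divide` and `sql_max` are hypothetical helpers, and Python's `None` stands in for SQL `NULL`.

```python
def sql_divide(numerator, denominator):
    """Divide with SQL-style NULL handling; None plays the role of NULL."""
    if numerator is None or denominator is None:
        return None  # NULL in, NULL out
    if denominator == 0:
        return None  # zero divisor yields NULL instead of Infinity
    return numerator / denominator


def sql_max(values):
    """MAX aggregate that skips NULLs and returns NULL for an all-NULL input."""
    present = [v for v in values if v is not None]
    return max(present) if present else None


print(sql_divide(1, 0))             # None
print(sql_max([sql_divide(1, 0)]))  # None
print(sql_max([sql_divide(6, 3), sql_divide(1, 0)]))  # 2.0
```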
*	[SPARK-4676][SQL] JavaSchemaRDD.schema may throw NullType MatchError if sql ↵	YanTangZhai	2014-12-02	5	-0/+59
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	has null val jsc = new org.apache.spark.api.java.JavaSparkContext(sc) val jhc = new org.apache.spark.sql.hive.api.java.JavaHiveContext(jsc) val nrdd = jhc.hql("select null from spark_test.for_test") println(nrdd.schema) Then the error is thrown as follows: scala.MatchError: NullType (of class org.apache.spark.sql.catalyst.types.NullType$) at org.apache.spark.sql.types.util.DataTypeConversions$.asJavaDataType(DataTypeConversions.scala:43) Author: YanTangZhai <hakeemzhai@tencent.com> Author: yantangzhai <tyz0303@163.com> Author: Michael Armbrust <michael@databricks.com> Closes #3538 from YanTangZhai/MatchNullType and squashes the following commits: e052dff [yantangzhai] [SPARK-4676] [SQL] JavaSchemaRDD.schema may throw NullType MatchError if sql has null 4b4bb34 [yantangzhai] [SPARK-4676] [SQL] JavaSchemaRDD.schema may throw NullType MatchError if sql has null 896c7b7 [yantangzhai] fix NullType MatchError in JavaSchemaRDD when sql has null 6e643f8 [YanTangZhai] Merge pull request #11 from apache/master e249846 [YanTangZhai] Merge pull request #10 from apache/master d26d982 [YanTangZhai] Merge pull request #9 from apache/master 76d4027 [YanTangZhai] Merge pull request #8 from apache/master 03b62b0 [YanTangZhai] Merge pull request #7 from apache/master 8a00106 [YanTangZhai] Merge pull request #6 from apache/master cbcba66 [YanTangZhai] Merge pull request #3 from apache/master cdef539 [YanTangZhai] Merge pull request #1 from apache/master
*	[SPARK-4663][sql]add finally to avoid resource leak	baishuo	2014-12-02	1	-4/+7
\| \| \| \| \| \| \| \| \| \|	Author: baishuo <vc_java@hotmail.com> Closes #3526 from baishuo/master-trycatch and squashes the following commits: d446e14 [baishuo] correct the code style b36bf96 [baishuo] correct the code style ae0e447 [baishuo] add finally to avoid resource leak
*	[SPARK-4536][SQL] Add sqrt and abs to Spark SQL DSL	Kousuke Saruta	2014-12-02	4	-1/+74
\| \| \| \| \| \| \| \| \| \| \| \| \|	Spark SQL has embedded sqrt and abs, but the DSL doesn't support those functions. Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp> Closes #3401 from sarutak/dsl-missing-operator and squashes the following commits: 07700cf [Kousuke Saruta] Modified Literal(null, NullType) to Literal(null) in DslQuerySuite 8f366f8 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into dsl-missing-operator 1b88e2e [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into dsl-missing-operator 0396f89 [Kousuke Saruta] Added sqrt and abs to Spark SQL DSL
*	Indent license header properly for interfaces.scala.	Reynold Xin	2014-12-02	1	-17/+15
\| \| \| \| \| \| \| \| \| \|	A very small nit update. Author: Reynold Xin <rxin@databricks.com> Closes #3552 from rxin/license-header and squashes the following commits: df8d1a4 [Reynold Xin] Indent license header properly for interfaces.scala.
*	[SPARK-4686] Link to allowed master URLs is broken	Kay Ousterhout	2014-12-02	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	The link points to the old scala programming guide; it should point to the submitting applications page. This should be backported to 1.1.2 (it's been broken as of 1.0). Author: Kay Ousterhout <kayousterhout@gmail.com> Closes #3542 from kayousterhout/SPARK-4686 and squashes the following commits: a8fc43b [Kay Ousterhout] [SPARK-4686] Link to allowed master URLs is broken
*	[SPARK-4397][Core] Cleanup 'import SparkContext._' in core	zsxwing	2014-12-02	36	-44/+8
\| \| \| \| \| \| \| \| \| \|	This PR cleans up `import SparkContext._` in core for SPARK-4397(#3262) to prove it really works well. Author: zsxwing <zsxwing@gmail.com> Closes #3530 from zsxwing/SPARK-4397-cleanup and squashes the following commits: 04e2273 [zsxwing] Cleanup 'import SparkContext._' in core
*	[SPARK-4611][MLlib] Implement the efficient vector norm	DB Tsai	2014-12-02	4	-6/+79
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The vector norm in breeze is implemented by `activeIterator` which is known to be very slow. In this PR, an efficient vector norm is implemented, and with this API, `Normalizer` and `k-means` have big performance improvement. Here is the benchmark against mnist8m dataset. a) `Normalizer` Before DenseVector: 68.25secs SparseVector: 17.01secs With this PR DenseVector: 12.71secs SparseVector: 2.73secs b) `k-means` Before DenseVector: 83.46secs SparseVector: 61.60secs With this PR DenseVector: 70.04secs SparseVector: 59.05secs Author: DB Tsai <dbtsai@alpinenow.com> Closes #3462 from dbtsai/norm and squashes the following commits: 63c7165 [DB Tsai] typo 0c3637f [DB Tsai] add import org.apache.spark.SparkContext._ back 6fa616c [DB Tsai] address feedback 9b7cb56 [DB Tsai] move norm to static method 0b632e6 [DB Tsai] kmeans dbed124 [DB Tsai] style c1a877c [DB Tsai] first commit
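A p-norm computed directly over an array of stored values might look like the sketch below. It is an illustration only, not the MLlib code from this PR; the function name and signature are assumptions. For a sparse vector only the stored (non-zero) values need to be passed, since zeros contribute nothing to any p-norm.

```python
import math


def norm(values, p):
    """p-norm of a sequence of numbers, for p >= 1 (p may be math.inf)."""
    if p < 1:
        raise ValueError("p must be >= 1")
    if p == 1:
        return sum(abs(v) for v in values)
    if p == 2:
        return math.sqrt(sum(v * v for v in values))
    if p == math.inf:
        return max((abs(v) for v in values), default=0.0)
    # General case: (sum |v|^p)^(1/p)
    return sum(abs(v) ** p for v in values) ** (1.0 / p)


print(norm([3.0, -4.0], 2))              # 5.0
print(norm([1.0, -2.0, 3.0], 1))         # 6.0
print(norm([1.0, -2.0, 3.0], math.inf))  # 3.0
```

Special-casing p = 1, 2, and infinity avoids the generic power computation for the most common norms.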
*	MAINTENANCE: Automated closing of pull requests.	Patrick Wendell	2014-12-01	0	-0/+0
\| \| \| \| \| \| \| \| \| \| \| \|	This commit exists to close the following pull requests on Github: Closes #1612 (close requested by 'marmbrus') Closes #2723 (close requested by 'marmbrus') Closes #1737 (close requested by 'marmbrus') Closes #2252 (close requested by 'marmbrus') Closes #2029 (close requested by 'marmbrus') Closes #2386 (close requested by 'marmbrus') Closes #2997 (close requested by 'marmbrus')
*	[SPARK-4268][SQL] Use #::: to get benefit from Stream in ↵	zsxwing	2014-12-01	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	SqlLexical.allCaseVersions In addition, using `s.isEmpty` to eliminate the string comparison. Author: zsxwing <zsxwing@gmail.com> Closes #3132 from zsxwing/SPARK-4268 and squashes the following commits: 358e235 [zsxwing] Improvement of allCaseVersions
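The benefit of building the case versions lazily can be sketched with a Python generator, which plays a role loosely similar to a Scala `Stream` here. This is not the `SqlLexical` code: the function below is a hypothetical re-creation, and its output order is an assumption.

```python
import itertools


def all_case_versions(s, prefix=""):
    """Lazily yield every upper/lower-case variant of s."""
    if not s:  # emptiness check instead of comparing against ""
        yield prefix
    else:
        # Each branch is only expanded when the consumer asks for more items.
        yield from all_case_versions(s[1:], prefix + s[0].lower())
        yield from all_case_versions(s[1:], prefix + s[0].upper())


print(list(all_case_versions("ab")))  # ['ab', 'aB', 'Ab', 'AB']

# Only the first two of the 64 variants are ever constructed here.
print(list(itertools.islice(all_case_versions("select"), 2)))  # ['select', 'selecT']
```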
*	[SPARK-4529] [SQL] support view with column alias	Daoyuan Wang	2014-12-01	2	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Support view definition like CREATE VIEW view3(valoo) TBLPROPERTIES ("fear" = "factor") AS SELECT upper(value) FROM src WHERE key=86; [valoo as the alias of upper(value)]. This is missing part of SPARK-4239, for a fully view support. Author: Daoyuan Wang <daoyuan.wang@intel.com> Closes #3396 from adrian-wang/viewcolumn and squashes the following commits: 4d001d0 [Daoyuan Wang] support view with column alias
*	[SQL][DOC] Date type in SQL programming guide	Daoyuan Wang	2014-12-01	1	-0/+23
\| \| \| \| \| \| \| \|	Author: Daoyuan Wang <daoyuan.wang@intel.com> Closes #3535 from adrian-wang/datedoc and squashes the following commits: 18ff1ed [Daoyuan Wang] [DOC] Date type
*	[SQL] Minor fix for doc and comment	wangfei	2014-12-01	3	-5/+7
\| \| \| \| \| \| \| \|	Author: wangfei <wangfei1@huawei.com> Closes #3533 from scwf/sql-doc1 and squashes the following commits: 962910b [wangfei] doc and comment fix
*	[SPARK-4658][SQL] Code documentation issue in DDL of datasource API	ravipesala	2014-12-01	2	-3/+3
\| \| \| \| \| \| \| \| \|	Author: ravipesala <ravindra.pesala@huawei.com> Closes #3516 from ravipesala/ddl_doc and squashes the following commits: d101fdf [ravipesala] Style issues fixed d2238cd [ravipesala] Corrected documentation
*	[SPARK-4650][SQL] Supporting multi column support in countDistinct function ↵	ravipesala	2014-12-01	2	-1/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	like count(distinct c1,c2..) in Spark SQL Supporting multi column support in countDistinct function like count(distinct c1,c2..) in Spark SQL Author: ravipesala <ravindra.pesala@huawei.com> Author: Michael Armbrust <michael@databricks.com> Closes #3511 from ravipesala/countdistinct and squashes the following commits: cc4dbb1 [ravipesala] style 070e12a [ravipesala] Supporting multi column support in count(distinct c1,c2..) in Spark SQL
*	[SPARK-4358][SQL] Let BigDecimal do checking type compatibility	Liang-Chi Hsieh	2014-12-01	1	-8/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Remove hardcoding max and min values for types. Let BigDecimal do checking type compatibility. Author: Liang-Chi Hsieh <viirya@gmail.com> Closes #3208 from viirya/more_numericLit and squashes the following commits: e9834b4 [Liang-Chi Hsieh] Remove byte and short types for number literal. 1bd1825 [Liang-Chi Hsieh] Fix Indentation and make the modification clearer. cf1a997 [Liang-Chi Hsieh] Modified for comment to add a rule of analysis that adds a cast. 91fe489 [Liang-Chi Hsieh] add Byte and Short. 1bdc69d [Liang-Chi Hsieh] Let BigDecimal do checking type compatibility.
*	[SQL] add @group tab in limit() and count()	Jacky Li	2014-12-01	1	-0/+4
\| \| \| \| \| \| \| \| \| \|	group tab is missing for scaladoc Author: Jacky Li <jacky.likun@gmail.com> Closes #3458 from jackylk/patch-7 and squashes the following commits: 0121a70 [Jacky Li] add @group tab in limit() and count()
*	[SPARK-4258][SQL][DOC] Documents spark.sql.parquet.filterPushdown	Cheng Lian	2014-12-01	1	-6/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Documents `spark.sql.parquet.filterPushdown`, explains why it's turned off by default and when it's safe to be turned on. Author: Cheng Lian <lian@databricks.com> Closes #3440 from liancheng/parquet-filter-pushdown-doc and squashes the following commits: 2104311 [Cheng Lian] Documents spark.sql.parquet.filterPushdown
*	Documentation: add description for repartitionAndSortWithinPartitions	Madhu Siddalingaiah	2014-12-01	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \|	Author: Madhu Siddalingaiah <madhu@madhu.com> Closes #3390 from msiddalingaiah/master and squashes the following commits: cbccbfe [Madhu Siddalingaiah] Documentation: replace <b> with <code> (again) 332f7a2 [Madhu Siddalingaiah] Documentation: replace <b> with <code> cd2b05a [Madhu Siddalingaiah] Merge remote-tracking branch 'upstream/master' 0fc12d7 [Madhu Siddalingaiah] Documentation: add description for repartitionAndSortWithinPartitions
*	[SPARK-4661][Core] Minor code and docs cleanup	zsxwing	2014-12-01	3	-3/+2
\| \| \| \| \| \| \| \|	Author: zsxwing <zsxwing@gmail.com> Closes #3521 from zsxwing/SPARK-4661 and squashes the following commits: 03cbe3f [zsxwing] Minor code and docs cleanup
*	[SPARK-4664][Core] Throw an exception when spark.akka.frameSize > 2047	zsxwing	2014-12-01	1	-1/+8
\| \| \| \| \| \| \| \| \| \|	If `spark.akka.frameSize` > 2047, it will overflow and become negative. Should have some assertion in `maxFrameSizeBytes` to warn people. Author: zsxwing <zsxwing@gmail.com> Closes #3527 from zsxwing/SPARK-4664 and squashes the following commits: 0089c7a [zsxwing] Throw an exception when spark.akka.frameSize > 2047
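The overflow at 2048 follows from 32-bit signed arithmetic: 2048 * 1024 * 1024 equals 2^31, one more than the largest `Int`. Python integers do not overflow, so the sketch below wraps values by hand to mimic a JVM `Int`. The helper names are hypothetical, and `max_frame_size_bytes` here is a stand-in for the check, not Spark's method.

```python
INT32_MAX = 2**31 - 1


def to_int32(n):
    """Wrap a Python int to a signed 32-bit value, as JVM Int arithmetic would."""
    n &= 0xFFFFFFFF
    return n - 0x100000000 if n > INT32_MAX else n


def max_frame_size_bytes(frame_size_mb):
    """Convert a frame size in MB to bytes, rejecting values that would overflow."""
    if frame_size_mb > 2047:
        raise ValueError("frame size should not be greater than 2047 MB")
    return frame_size_mb * 1024 * 1024


print(to_int32(2047 * 1024 * 1024))  # 2146435072
print(to_int32(2048 * 1024 * 1024))  # -2147483648
```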
*	SPARK-2192 [BUILD] Examples Data Not in Binary Distribution	Sean Owen	2014-12-01	1	-0/+3
\| \| \| \| \| \| \| \| \| \|	Simply, add data/ to distributions. This adds about 291KB (compressed) to the tarball, FYI. Author: Sean Owen <sowen@cloudera.com> Closes #3480 from srowen/SPARK-2192 and squashes the following commits: 47688f1 [Sean Owen] Add data/ to distributions
*	Fix wrong file name pattern in .gitignore	Kousuke Saruta	2014-12-01	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	In .gitignore, there is an entry for spark--bin.tar.gz but considering make-distribution.sh, the name pattern should be spark--bin-*.tgz. This change is really small, so I didn't open an issue in JIRA. If it's needed, please let me know. Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp> Closes #3529 from sarutak/fix-wrong-tgz-pattern and squashes the following commits: de3c70a [Kousuke Saruta] Fixed wrong file name pattern in .gitignore
*	[SPARK-4632] version update	Prabeesh K	2014-11-30	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Author: Prabeesh K <prabsmails@gmail.com> Closes #3495 from prabeesh/master and squashes the following commits: ab03d50 [Prabeesh K] Update pom.xml 8c6437e [Prabeesh K] Revert e10b40a [Prabeesh K] version update dbac9eb [Prabeesh K] Revert ec0b1c3 [Prabeesh K] [SPARK-4632] version update a835505 [Prabeesh K] [SPARK-4632] version update 831391b [Prabeesh K] [SPARK-4632] version update
*	MAINTENANCE: Automated closing of pull requests.	Patrick Wendell	2014-11-30	0	-0/+0
\| \| \| \| \| \| \| \|	This commit exists to close the following pull requests on Github: Closes #2915 (close requested by 'JoshRosen') Closes #3140 (close requested by 'JoshRosen') Closes #3366 (close requested by 'JoshRosen')
*	[DOC] Fixes formatting typo in SQL programming guide	Cheng Lian	2014-11-30	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \|	Author: Cheng Lian <lian@databricks.com> Closes #3498 from liancheng/fix-sql-doc-typo and squashes the following commits: 865ecd7 [Cheng Lian] Fixes formatting typo in SQL programming guide
*	[SPARK-4656][Doc] Typo in Programming Guide markdown	lewuathe	2014-11-30	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	Grammatical error in Programming Guide document Author: lewuathe <lewuathe@me.com> Closes #3412 from Lewuathe/typo-programming-guide and squashes the following commits: a3e2f00 [lewuathe] Typo in Programming Guide markdown
*	[SPARK-4623]Add the some error infomation if using spark-sql in yarn-cluster ↵	carlmartin	2014-11-30	2	-0/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	mode If spark-sql is used in yarn-cluster mode, print an error message, just as the spark shell does in yarn-cluster mode. Author: carlmartin <carlmartinmax@gmail.com> Author: huangzhaowei <carlmartinmax@gmail.com> Closes #3479 from SaintBacchus/sparkSqlShell and squashes the following commits: 35829a9 [carlmartin] improve the description of comment e6c1eb7 [carlmartin] add a comment in bin/spark-sql to remind user who wants to change the class f1c5c8d [carlmartin] Merge branch 'master' into sparkSqlShell 8e112c5 [huangzhaowei] singular form ec957bc [carlmartin] Add the some error infomation if using spark-sql in yarn-cluster mode 7bcecc2 [carlmartin] Merge branch 'master' of https://github.com/apache/spark into codereview 4fad75a [carlmartin] Add the Error infomation using spark-sql in yarn-cluster mode
*	SPARK-2143 [WEB UI] Add Spark version to UI footer	Sean Owen	2014-11-30	1	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \|	This PR adds the Spark version number to the UI footer; this is how it looks: ![screen shot 2014-11-21 at 22 58 40](https://cloud.githubusercontent.com/assets/822522/5157738/f4822094-7316-11e4-98f1-333a535fdcfa.png) Author: Sean Owen <sowen@cloudera.com> Closes #3410 from srowen/SPARK-2143 and squashes the following commits: e9b3a7a [Sean Owen] Add Spark version to footer
*	[DOCS][BUILD] Add instruction to use change-version-to-2.11.sh in 'Building ↵	Takuya UESHIN	2014-11-30	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \|	for Scala 2.11'. To build with Scala 2.11, we have to execute `change-version-to-2.11.sh` before running Maven, otherwise inter-module dependencies are broken. Author: Takuya UESHIN <ueshin@happy-camper.st> Closes #3361 from ueshin/docs/building-spark_2.11 and squashes the following commits: 1d29126 [Takuya UESHIN] Add instruction to use change-version-to-2.11.sh in 'Building for Scala 2.11'.
*	SPARK-4507: PR merge script should support closing multiple JIRA tickets	Takayuki Hasegawa	2014-11-29	1	-7/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This will fix SPARK-4507. For pull requests that reference multiple JIRAs in their titles, it would be helpful if the PR merge script offered to close all of them. Author: Takayuki Hasegawa <takayuki.hasegawa0311@gmail.com> Closes #3428 from hase1031/SPARK-4507 and squashes the following commits: bf6d64b [Takayuki Hasegawa] SPARK-4507: try to resolve issue when no JIRAs in title 401224c [Takayuki Hasegawa] SPARK-4507: moved codes as before ce89021 [Takayuki Hasegawa] SPARK-4507: PR merge script should support closing multiple JIRA tickets