spark - Mirror of Apache Spark

	Commit message	Author	Age	Files	Lines
*	Preparing development version 1.4.0-SNAPSHOT	Patrick Wendell	2015-06-02	30	-30/+30
\|
*	Preparing Spark release v1.4.0-rc4	Patrick Wendell	2015-06-02	30	-30/+30
\|
*	[SPARK-8038] [SQL] [PYSPARK] fix Column.when() and otherwise()	Davies Liu	2015-06-02	1	-3/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Thanks ogirardot, closes #6580 cc rxin JoshRosen Author: Davies Liu <davies@databricks.com> Closes #6590 from davies/when and squashes the following commits: c0f2069 [Davies Liu] fix Column.when() and otherwise() (cherry picked from commit 605ddbb27c8482fc0107b21c19d4e4ae19348f35) Signed-off-by: Reynold Xin <rxin@databricks.com>
*	[SPARK-8014] [SQL] Avoid premature metadata discovery when writing a ↵	Cheng Lian	2015-06-02	5	-32/+67
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	HadoopFsRelation with a save mode other than Append The current code references the schema of the DataFrame to be written before checking the save mode. This triggers expensive metadata discovery prematurely. For save modes other than `Append`, this metadata discovery is useless, since we either ignore the result (for `Ignore` and `ErrorIfExists`) or delete existing files (for `Overwrite`) later. This PR fixes the issue by deferring metadata discovery until after the save mode check. Author: Cheng Lian <lian@databricks.com> Closes #6583 from liancheng/spark-8014 and squashes the following commits: 1aafabd [Cheng Lian] Updates comments 088abaa [Cheng Lian] Avoids schema merging and partition discovery when data schema and partition schema are defined 8fbd93f [Cheng Lian] Fixes SPARK-8014 (cherry picked from commit 686a45f0b9c50ede2a80854ed6a155ee8a9a4f5c) Signed-off-by: Yin Huai <yhuai@databricks.com>
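The ordering described in this commit can be sketched in plain Python (not Spark's Scala code). The save mode is checked first, and the costly discovery step runs only when appending. The names `SaveMode`, `save`, `discover_metadata`, `delete_existing`, and `write` are hypothetical stand-ins for illustration, not the HadoopFsRelation implementation.

```python
from enum import Enum


class SaveMode(Enum):
    APPEND = "append"
    OVERWRITE = "overwrite"
    ERROR_IF_EXISTS = "error"
    IGNORE = "ignore"


def save(mode, path_exists, discover_metadata, delete_existing, write):
    """Check the save mode first; call discover_metadata() only when appending.

    All three callables are hypothetical hooks supplied by the caller.
    """
    if path_exists:
        if mode is SaveMode.ERROR_IF_EXISTS:
            raise FileExistsError("target path already exists")
        if mode is SaveMode.IGNORE:
            return "skipped"  # nothing is written, so discovery is not needed
        if mode is SaveMode.OVERWRITE:
            delete_existing()  # old files are removed, their metadata is irrelevant
        elif mode is SaveMode.APPEND:
            discover_metadata()  # appended data must agree with existing data
    write()
    return "written"
```

With this ordering, an `OVERWRITE` or `IGNORE` call never pays for discovery, which is the saving the commit message points at.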
*	[SPARK-7985] [ML] [MLlib] [Docs] Remove "fittingParamMap" references. ↵	Mike Dusenberry	2015-06-02	11	-14/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Updating ML Doc "Estimator, Transformer, and Param" examples. Updating ML Doc's "Estimator, Transformer, and Param" example to use `model.extractParamMap` instead of `model.fittingParamMap`, which no longer exists. mengxr, I believe this addresses (part of) the update documentation TODO list item from [PR 5820](https://github.com/apache/spark/pull/5820). Author: Mike Dusenberry <dusenberrymw@gmail.com> Closes #6514 from dusenberrymw/Fix_ML_Doc_Estimator_Transformer_Param_Example and squashes the following commits: 6366e1f [Mike Dusenberry] Updating instances of model.extractParamMap to model.parent.extractParamMap, since the Params of the parent Estimator could possibly differ from thos of the Model. d850e0e [Mike Dusenberry] Removing all references to "fittingParamMap" throughout Spark, since it has been removed. 0480304 [Mike Dusenberry] Updating the ML Doc "Estimator, Transformer, and Param" Java example to use model.extractParamMap() instead of model.fittingParamMap(), which no longer exists. 7d34939 [Mike Dusenberry] Updating ML Doc "Estimator, Transformer, and Param" example to use model.extractParamMap instead of model.fittingParamMap, which no longer exists. (cherry picked from commit ad06727fe985ca243ebdaaba55cd7d35a4749d0a) Signed-off-by: Joseph K. Bradley <joseph@databricks.com>
*	[MINOR] Enable PySpark SQL readerwriter and window tests	Josh Rosen	2015-06-02	1	-0/+2
\| \| \| \| \| \| \| \| \| \|	PySpark SQL's `readerwriter` and `window` doctests weren't being run by our test runner script; this patch re-enables them. Author: Josh Rosen <joshrosen@databricks.com> Closes #6542 from JoshRosen/enable-more-pyspark-sql-tests and squashes the following commits: 9f46ce4 [Josh Rosen] Enable PySpark SQL readerwriter and window tests.
*	[SPARK-8015] [FLUME] Remove Guava dependency from flume-sink.	Marcelo Vanzin	2015-06-02	4	-7/+77
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The minimal change would be to disable shading of Guava in the module, and rely on the transitive dependency from other libraries instead. But since Guava's use is so localized, I think it's better to just not use it instead, so I replaced that code and removed all traces of Guava from the module's build. Author: Marcelo Vanzin <vanzin@cloudera.com> Closes #6555 from vanzin/SPARK-8015 and squashes the following commits: c0ceea8 [Marcelo Vanzin] Add comments about dependency management. c38228d [Marcelo Vanzin] Add guava dep in test scope. b7a0349 [Marcelo Vanzin] Add libthrift exclusion. 6e0942d [Marcelo Vanzin] Add comment in pom. 2d79260 [Marcelo Vanzin] [SPARK-8015] [flume] Remove Guava dependency from flume-sink. (cherry picked from commit 0071bd8d31f13abfe73b9d141a818412d374dce0) Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>
*	[SPARK-8037] [SQL] Ignores files whose name starts with dot in HadoopFsRelation	Cheng Lian	2015-06-03	3	-6/+26
\| \| \| \| \| \| \| \| \| \| \|	Author: Cheng Lian <lian@databricks.com> Closes #6581 from liancheng/spark-8037 and squashes the following commits: d08e97b [Cheng Lian] Ignores files whose name starts with dot in HadoopFsRelation (cherry picked from commit 1bb5d716c0351cd0b4c11b397fd778f30db39bd9) Signed-off-by: Cheng Lian <lian@databricks.com>
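As a rough illustration of the rule named in this commit title, the following Python sketch filters out paths whose file name starts with a dot. The function name `visible_data_files` and the sample paths are assumptions made for the example; the actual change lives in Spark's Scala code.

```python
import os


def visible_data_files(paths):
    """Return the paths whose final component does not start with a dot."""
    return [p for p in paths if not os.path.basename(p).startswith(".")]


# Hypothetical directory listing: only the first entry is kept.
listing = ["/data/t/part-00000", "/data/t/.part-00000.crc", "/data/t/.hidden"]
kept = visible_data_files(listing)
```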
*	[HOT-FIX] Add EvaluatedType back to RDG	Yin Huai	2015-06-02	1	-0/+2
\| \| \| \| \| \| \| \| \| \|	https://github.com/apache/spark/commit/87941ff8c49a6661f22c31aa7b84ac1fce768135 accidentally removed the EvaluatedType. Author: Yin Huai <yhuai@databricks.com> Closes #6589 from yhuai/getBackEvaluatedType and squashes the following commits: 618c2eb [Yin Huai] Add EvaluatedType back.
*	[SPARK-7432] [MLLIB] fix flaky CrossValidator doctest	Xiangrui Meng	2015-06-02	1	-10/+9
\| \| \| \| \| \| \| \| \| \| \| \| \|	The new test uses CV to compare `maxIter=0` and `maxIter=1`, and validate on the evaluation result. jkbradley Author: Xiangrui Meng <meng@databricks.com> Closes #6572 from mengxr/SPARK-7432 and squashes the following commits: c236bb8 [Xiangrui Meng] fix flacky cv doctest (cherry picked from commit bd97840d5ccc3f0bfde1e5cfc7abeac9681997ab) Signed-off-by: Xiangrui Meng <meng@databricks.com>
*	Preparing development version 1.4.0-SNAPSHOT	Patrick Wendell	2015-06-02	30	-30/+30
\|
*	Preparing Spark release v1.4.0-rc4	Patrick Wendell	2015-06-02	30	-30/+30
\|
*	[SPARK-8021] [SQL] [PYSPARK] make Python read/write API consistent with Scala	Davies Liu	2015-06-02	1	-27/+94
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	add schema()/format()/options() for reader, add mode()/format()/options()/partitionBy() for writer cc rxin yhuai pwendell Author: Davies Liu <davies@databricks.com> Closes #6578 from davies/readwrite and squashes the following commits: 720d293 [Davies Liu] address comments b65dfa2 [Davies Liu] Update readwriter.py 1299ab6 [Davies Liu] make Python API consistent with Scala (cherry picked from commit 445647a1a36e1e24076a9fe506492fac462c66ad) Signed-off-by: Patrick Wendell <patrick@databricks.com>
*	[SPARK-8023][SQL] Add "deterministic" attribute to Expression to avoid ↵	Yin Huai	2015-06-02	6	-3/+136
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	collapsing nondeterministic projects. This closes #6570. Author: Yin Huai <yhuai@databricks.com> Author: Reynold Xin <rxin@databricks.com> Closes #6573 from rxin/deterministic and squashes the following commits: 356cd22 [Reynold Xin] Added unit test for the optimizer. da3fde1 [Reynold Xin] Merge pull request #6570 from yhuai/SPARK-8023 da56200 [Yin Huai] Comments. e38f264 [Yin Huai] Comment. f9d6a73 [Yin Huai] Add a deterministic method to Expression. (cherry picked from commit 0f80990bfac1e9969644952d1d8edaf7d26fb436) Signed-off-by: Reynold Xin <rxin@databricks.com> Conflicts: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/random.scala
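Why such a flag matters can be sketched as follows. If a lower projection computes a random value once and an upper projection uses it twice, merging the two would inline the random expression twice and change the result. The Python below is a toy model under that assumption; `Expr` and `can_collapse` are hypothetical names, not Catalyst classes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expr:
    sql: str
    deterministic: bool = True


def can_collapse(lower_project):
    """Allow merging two stacked projections only if every expression in the
    lower one is deterministic; otherwise inlining could evaluate it more
    than once and change the result."""
    return all(e.deterministic for e in lower_project)


safe = can_collapse([Expr("a + 1"), Expr("b")])
unsafe = can_collapse([Expr("a"), Expr("rand()", deterministic=False)])
```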
*	[SPARK-8020] [SQL] Spark SQL conf in spark-defaults.conf make metadataHive ↵	Yin Huai	2015-06-02	1	-3/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	get constructed too early https://issues.apache.org/jira/browse/SPARK-8020 Author: Yin Huai <yhuai@databricks.com> Closes #6571 from yhuai/SPARK-8020-1 and squashes the following commits: 0398f5b [Yin Huai] First populate the SQLConf and then construct executionHive and metadataHive. (cherry picked from commit 7b7f7b6c6fd903e2ecfc886d29eaa9df58adcfc3) Signed-off-by: Yin Huai <yhuai@databricks.com>
*	[SPARK-6917] [SQL] DecimalType is not read back when non-native type exists	Davies Liu	2015-06-01	2	-1/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	cc yhuai Author: Davies Liu <davies@databricks.com> Closes #6558 from davies/decimalType and squashes the following commits: c877ca8 [Davies Liu] Update ParquetConverter.scala 48cc57c [Davies Liu] Update ParquetConverter.scala b43845c [Davies Liu] add test 3b4a94f [Davies Liu] DecimalType is not read back when non-native type exists (cherry picked from commit bcb47ad7718b843fbd25cd1e228a7b7e6e5b8686) Signed-off-by: Reynold Xin <rxin@databricks.com>
*	[SPARK-7582] [MLLIB] user guide for StringIndexer	Xiangrui Meng	2015-06-01	2	-0/+193
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This PR adds a Java unit test and user guide for `StringIndexer`. I put it before `OneHotEncoder` because they are closely related. jkbradley Author: Xiangrui Meng <meng@databricks.com> Closes #6561 from mengxr/SPARK-7582 and squashes the following commits: 4bba4f1 [Xiangrui Meng] fix example ba1cd1b [Xiangrui Meng] fix style 7fa18d1 [Xiangrui Meng] add user guide for StringIndexer 136cb93 [Xiangrui Meng] add a Java unit test for StringIndexer (cherry picked from commit 0221c7f0efe2512f3ae3839b83aa8abb0806d516) Signed-off-by: Xiangrui Meng <meng@databricks.com>
*	Fixed typo in the previous commit.	Reynold Xin	2015-06-01	1	-1/+1
\| \| \| \| \|	(cherry picked from commit b53a0116473a03607c5be3e4135151b4932acc06) Signed-off-by: Reynold Xin <rxin@databricks.com>
*	[SPARK-7965] [SPARK-7972] [SQL] Handle expressions containing multiple ↵	Yin Huai	2015-06-01	3	-32/+134
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	window expressions and make parser match window frames in case insensitive way JIRAs: https://issues.apache.org/jira/browse/SPARK-7965 https://issues.apache.org/jira/browse/SPARK-7972 Author: Yin Huai <yhuai@databricks.com> Closes #6524 from yhuai/7965-7972 and squashes the following commits: c12c79c [Yin Huai] Add doc for returned value. de64328 [Yin Huai] Address rxin's comments. fc9b1ad [Yin Huai] wip 2996da4 [Yin Huai] scala style 20b65b7 [Yin Huai] Handle expressions containing multiple window expressions. 9568b21 [Yin Huai] case insensitive matches 41f633d [Yin Huai] Failed test case. (cherry picked from commit e797dba58e8cafdd30683dd1e0263f00ce30ccc0) Signed-off-by: Reynold Xin <rxin@databricks.com>
*	[SPARK-8025][Streaming]Add JavaDoc style deprecation for deprecated ↵	zsxwing	2015-06-01	3	-0/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Streaming methods Scala `deprecated` annotation actually doesn't show up in JavaDoc. Author: zsxwing <zsxwing@gmail.com> Closes #6564 from zsxwing/SPARK-8025 and squashes the following commits: 2faa2bb [zsxwing] Add JavaDoc style deprecation for deprecated Streaming methods (cherry picked from commit 7f74bb3bc6d29c53e67af6b6eec336f2d083322a) Signed-off-by: Reynold Xin <rxin@databricks.com>
*	[minor doc] Add exploratory data analysis warning for ↵	Reynold Xin	2015-06-01	2	-0/+15
\| \| \| \| \| \| \| \| \| \| \| \| \|	DataFrame.stat.freqItem API Author: Reynold Xin <rxin@databricks.com> Closes #6569 from rxin/freqItemsWarning and squashes the following commits: 7eec145 [Reynold Xin] [minor doc] Add exploratory data analysis warning for DataFrame.stat.freqItem API. (cherry picked from commit 4c868b9943a2d86107d1f15f8df9830aac36fb75) Signed-off-by: Reynold Xin <rxin@databricks.com>
*	[SPARK-8027] [SPARKR] Add maven profile to build R package docs	Shivaram Venkataraman	2015-06-01	2	-8/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Also use that profile in create-release.sh cc pwendell -- Note that this means that we need `knitr` and `roxygen` installed on the machines used for building the release. Let me know if you need help with that. Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu> Closes #6567 from shivaram/SPARK-8027 and squashes the following commits: 8dc8ecf [Shivaram Venkataraman] Add maven profile to build R package docs Also use that profile in create-release.sh (cherry picked from commit cae9306c4f437c722baa57593fe83f4b7d82dbff) Signed-off-by: Shivaram Venkataraman <shivaram@cs.berkeley.edu>
*	[SPARK-8026][SQL] Add Column.alias to Scala/Java DataFrame API	Reynold Xin	2015-06-01	2	-0/+18
\| \| \| \| \| \| \| \| \| \| \|	Author: Reynold Xin <rxin@databricks.com> Closes #6565 from rxin/alias and squashes the following commits: 286d880 [Reynold Xin] [SPARK-8026][SQL] Add Column.alias to Scala/Java DataFrame API (cherry picked from commit 89f642a0e8c3a6bc9149a0bb413f1a8939cb0283) Signed-off-by: Reynold Xin <rxin@databricks.com>
*	[SPARK-7982][SQL] DataFrame.stat.crosstab should use 0 instead of null for ↵	Reynold Xin	2015-06-01	2	-5/+8
\| \| \| \| \| \| \| \| \| \| \| \| \|	pairs that don't appear Author: Reynold Xin <rxin@databricks.com> Closes #6566 from rxin/crosstab and squashes the following commits: e0ace1c [Reynold Xin] [SPARK-7982][SQL] DataFrame.stat.crosstab should use 0 instead of null for pairs that don't appear (cherry picked from commit 6396cc0303ceabea53c4df436ffa50b82b7e233f) Signed-off-by: Reynold Xin <rxin@databricks.com>
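The behaviour in this commit title, a zero count instead of a null for pairs that never occur, can be illustrated with a small pure-Python contingency table. This is a sketch only; the function `crosstab` below is a hypothetical helper and not the DataFrame.stat.crosstab implementation.

```python
from collections import Counter


def crosstab(pairs):
    """Build a contingency table from (row_key, col_key) pairs.

    Combinations that never occur get 0 rather than None, because a
    Counter returns 0 for missing keys.
    """
    pairs = list(pairs)
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    return {r: {c: counts[(r, c)] for c in cols} for r in rows}


table = crosstab([("a", "x"), ("a", "x"), ("b", "y")])
```

Here the pairs ("a", "y") and ("b", "x") never appear, so their cells hold 0.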
*	[SPARK-8028] [SPARKR] Use addJar instead of setJars in SparkR	Shivaram Venkataraman	2015-06-01	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This prevents the spark.jars from being cleared while using `--packages` or `--jars` cc pwendell davies brkyvz Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu> Closes #6568 from shivaram/SPARK-8028 and squashes the following commits: 3a9cf1f [Shivaram Venkataraman] Use addJar instead of setJars in SparkR This prevents the spark.jars from being cleared (cherry picked from commit 6b44278ef7cd2a278dfa67e8393ef30775c72726) Signed-off-by: Shivaram Venkataraman <shivaram@cs.berkeley.edu>
*	[MINOR] [UI] Improve error message on log page	Andrew Or	2015-06-01	2	-0/+74
\| \| \| \| \|	Currently, if a bad log type is specified, we get a blank page. We should provide a more informative error message.
*	[SPARK-7958] [STREAMING] Handled exception in StreamingContext.start() to ↵	Tathagata Das	2015-06-01	3	-4/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	prevent leaking of actors StreamingContext.start() can throw an exception because DStream.validateAtStart() fails (say, checkpoint directory not set for StateDStream). But by then JobScheduler, JobGenerator, and ReceiverTracker have already started, along with their actors. Those cannot be shut down, because the only way to do that is to call StreamingContext.stop(), which cannot be called as the context has not been marked as ACTIVE. The solution in this PR is to stop the internal scheduler if start throws an exception, and mark the context as STOPPED. Author: Tathagata Das <tathagata.das1565@gmail.com> Closes #6559 from tdas/SPARK-7958 and squashes the following commits: 20b2ec1 [Tathagata Das] Added synchronized 790b617 [Tathagata Das] Handled exception in StreamingContext.start() (cherry picked from commit 2f9c7519d6a3f867100979b5e7ced3f72b7d9adc) Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>
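The cleanup pattern described here can be sketched as a toy state machine in Python. The class `Context`, its `scheduler` and `validate` arguments, and the state strings are assumptions for illustration and do not mirror the real StreamingContext code.

```python
class Context:
    """Toy stand-in for a context with INITIALIZED/ACTIVE/STOPPED states."""

    def __init__(self, scheduler, validate):
        self.scheduler = scheduler  # hypothetical object with start()/stop()
        self.validate = validate    # hypothetical check that may raise
        self.state = "INITIALIZED"

    def start(self):
        if self.state != "INITIALIZED":
            raise RuntimeError("context already started or stopped")
        try:
            self.scheduler.start()
            self.validate()
            self.state = "ACTIVE"
        except Exception:
            # Stop whatever was started, so nothing leaks, then re-raise.
            self.scheduler.stop()
            self.state = "STOPPED"
            raise
```

Without the `except` branch, a failed `start()` would leave the scheduler running while the context is not ACTIVE, so no later call could stop it.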
*	[SPARK-7899] [PYSPARK] Fix Python 3 pyspark/sql/types module conflict	Michael Nazario	2015-06-01	7	-63/+43
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This PR makes the types module in `pyspark/sql/types` work with pylint static analysis by removing the dynamic naming of the `pyspark/sql/_types` module to `pyspark/sql/types`. Tests are now loaded using `$PYSPARK_DRIVER_PYTHON -m module` rather than `$PYSPARK_DRIVER_PYTHON module.py`. The old method adds the location of `module.py` to `sys.path`, so this change prevents accidental use of relative paths in Python. Author: Michael Nazario <mnazario@palantir.com> Closes #6439 from mnazario/feature/SPARK-7899 and squashes the following commits: 366ef30 [Michael Nazario] Remove hack on random.py bb8b04d [Michael Nazario] Make doctests consistent with other tests 6ee4f75 [Michael Nazario] Change test scripts to use "-m" 673528f [Michael Nazario] Move _types back to types
*	[SPARK-7584] [MLLIB] User guide for VectorAssembler	Xiangrui Meng	2015-06-01	2	-0/+192
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This PR adds a section in the user guide for `VectorAssembler` with code examples in Python/Java/Scala. It also adds a unit test in Java. jkbradley Author: Xiangrui Meng <meng@databricks.com> Closes #6556 from mengxr/SPARK-7584 and squashes the following commits: 11313f6 [Xiangrui Meng] simplify Java example 0cd47f3 [Xiangrui Meng] update user guide fd36292 [Xiangrui Meng] update Java unit test ce61ca0 [Xiangrui Meng] add Java unit test for VectorAssembler e399942 [Xiangrui Meng] scala/python example code (cherry picked from commit 90c606925e7ec8f65f28e2290a0048f64af8c6a6) Signed-off-by: Xiangrui Meng <meng@databricks.com>
*	[SPARK-7497] [PYSPARK] [STREAMING] fix streaming flaky tests	Davies Liu	2015-06-01	1	-8/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Increase the duration and timeout in streaming python tests. Author: Davies Liu <davies@databricks.com> Closes #6239 from davies/flaky_tests and squashes the following commits: d6aee8f [Davies Liu] fix window tests 26317f7 [Davies Liu] Merge branch 'master' of github.com:apache/spark into flaky_tests 7947db6 [Davies Liu] fix streaming flaky tests (cherry picked from commit b7ab0299b03ae833d5811f380e4594837879f8ae) Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>
*	[DOC] Minor modification to Streaming docs with regards to parallel data ↵	Nishkam Ravi	2015-06-01	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	receiving pwendell tdas Author: Nishkam Ravi <nravi@cloudera.com> Author: nishkamravi2 <nishkamravi@gmail.com> Author: nravi <nravi@c1704.halxg.cloudera.com> Closes #6544 from nishkamravi2/master_nravi and squashes the following commits: 46e8c03 [Nishkam Ravi] Slight modification to streaming docs (cherry picked from commit e7c7e51f2ec158d12a8429f753225c746f92d513) Signed-off-by: Sean Owen <sowen@cloudera.com>
*	[SPARK-7978] [SQL] [PYSPARK] DecimalType should not be singleton	Davies Liu	2015-05-31	2	-2/+25
\| \| \| \| \| \| \| \| \| \| \| \|	Author: Davies Liu <davies@databricks.com> Closes #6532 from davies/decimal and squashes the following commits: c7fcbce [Davies Liu] Update tests.py 1425359 [Davies Liu] DecimalType should not be singleton (cherry picked from commit 91777a1c3ad3b3ec7b65d5a0413209a9baf6b36a) Signed-off-by: Reynold Xin <rxin@databricks.com>
*	[HOTFIX] Remove trailing whitespace to fix Scalastyle checks	Josh Rosen	2015-05-31	3	-5/+5
\| \| \| \|	866652c903d06d1cb4356283e0741119d84dcc21 enabled this check.
*	[SPARK-7227] [SPARKR] Support fillna / dropna in R DataFrame.	Sun Rui	2015-05-31	6	-3/+267
\| \| \| \| \| \| \| \| \| \| \| \|	Author: Sun Rui <rui.sun@intel.com> Closes #6183 from sun-rui/SPARK-7227 and squashes the following commits: dd6f5b3 [Sun Rui] Rename readEnv() back to readMap(). Add alias na.omit() for dropna(). 41cf725 [Sun Rui] [SPARK-7227][SPARKR] Support fillna / dropna in R DataFrame. (cherry picked from commit 46576ab303e50c54c3bd464f8939953efe644574) Signed-off-by: Shivaram Venkataraman <shivaram@cs.berkeley.edu>
*	[SPARK-3850] Turn style checker on for trailing whitespaces.	Reynold Xin	2015-05-31	3	-2/+5
\| \| \| \| \| \| \| \| \| \| \|	Author: Reynold Xin <rxin@databricks.com> Closes #6541 from rxin/trailing-whitespace-on and squashes the following commits: f72ebe4 [Reynold Xin] [SPARK-3850] Turn style checker on for trailing whitespaces. (cherry picked from commit 866652c903d06d1cb4356283e0741119d84dcc21) Signed-off-by: Reynold Xin <rxin@databricks.com>
*	[SPARK-7949] [MLLIB] [DOC] update document with some missing save/load	Yuhao Yang	2015-05-31	3	-6/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	add save load for examples: KMeansModel PowerIterationClusteringModel Word2VecModel IsotonicRegressionModel Author: Yuhao Yang <hhbyyh@gmail.com> Closes #6498 from hhbyyh/docSaveLoad and squashes the following commits: 7f9f06d [Yuhao Yang] add missing imports c604cad [Yuhao Yang] Merge remote-tracking branch 'upstream/master' into docSaveLoad 1dd77cc [Yuhao Yang] update document with some missing save/load (cherry picked from commit 0674700303da3e4737d73f5fabd2a925ec712f63) Signed-off-by: Joseph K. Bradley <joseph@databricks.com>
*	[SPARK-3850] Trim trailing spaces for MLlib.	Reynold Xin	2015-05-31	30	-189/+189
\| \| \| \| \| \| \| \| \| \| \|	Author: Reynold Xin <rxin@databricks.com> Closes #6534 from rxin/whitespace-mllib and squashes the following commits: 38926e3 [Reynold Xin] [SPARK-3850] Trim trailing spaces for MLlib. (cherry picked from commit e1067d0ad1c32c678c23d76d7653b51770795831) Signed-off-by: Reynold Xin <rxin@databricks.com>
*	[MINOR] Add license for dagre-d3 and graphlib-dot	zsxwing	2015-05-31	1	-0/+46
\| \| \| \| \| \| \| \| \| \| \| \| \|	Add license for dagre-d3 and graphlib-dot Author: zsxwing <zsxwing@gmail.com> Closes #6539 from zsxwing/LICENSE and squashes the following commits: 82b0475 [zsxwing] Add license for dagre-d3 and graphlib-dot (cherry picked from commit d1d2def2f5f91e86f340656421170d1097f14854) Signed-off-by: Andrew Or <andrew@databricks.com>
*	[SPARK-7979] Enforce structural type checker.	Reynold Xin	2015-05-31	5	-3/+14
\| \| \| \| \| \| \| \| \| \| \| \| \|	Author: Reynold Xin <rxin@databricks.com> Closes #6536 from rxin/structural-type-checker and squashes the following commits: f833151 [Reynold Xin] Fixed compilation. 633f9a1 [Reynold Xin] Fixed typo. d1fa804 [Reynold Xin] [SPARK-7979] Enforce structural type checker. (cherry picked from commit 4b5f12bac939a2f47a3a61365b5325d849b7b51f) Signed-off-by: Reynold Xin <rxin@databricks.com>
*	[SPARK-3850] Trim trailing spaces for SQL.	Reynold Xin	2015-05-31	36	-82/+82
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Author: Reynold Xin <rxin@databricks.com> Closes #6535 from rxin/whitespace-sql and squashes the following commits: de50316 [Reynold Xin] [SPARK-3850] Trim trailing spaces for SQL. (cherry picked from commit 63a50be13d32b9e5f3aad8d1a6ba5362f17a252f) Signed-off-by: Reynold Xin <rxin@databricks.com> Conflicts: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala sql/catalyst/src/main/scala/org/apache/spark/sql/types/StructType.scala sql/catalyst/src/test/scala/org/apache/spark/sql/types/DataTypeSuite.scala sql/core/src/test/scala/org/apache/spark/sql/DataFrameStatSuite.scala
*	[SPARK-3850] Trim trailing spaces for examples/streaming/yarn.	Reynold Xin	2015-05-31	20	-91/+87
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Author: Reynold Xin <rxin@databricks.com> Closes #6530 from rxin/trim-whitespace-1 and squashes the following commits: 7b7b3a0 [Reynold Xin] Reset again. dc14597 [Reynold Xin] Reset scalastyle. cd556c4 [Reynold Xin] YARN, Kinesis, Flume. 4223fe1 [Reynold Xin] [SPARK-3850] Trim trailing spaces for examples/streaming. (cherry picked from commit 564bc11e9827915c8652bc06f4bd591809dea4b1) Signed-off-by: Reynold Xin <rxin@databricks.com>
*	[SPARK-3850] Trim trailing spaces for core.	Reynold Xin	2015-05-31	46	-113/+113
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Author: Reynold Xin <rxin@databricks.com> Closes #6533 from rxin/whitespace-2 and squashes the following commits: 038314c [Reynold Xin] [SPARK-3850] Trim trailing spaces for core. (cherry picked from commit 74fdc97c7206c6d715f128ef7c46055e0bb90760) Signed-off-by: Reynold Xin <rxin@databricks.com> Conflicts: core/src/main/scala/org/apache/spark/storage/TachyonBlockManager.scala core/src/test/scala/org/apache/spark/serializer/KryoSerializerSuite.scala
*	[SPARK-7975] Add style checker to disallow overriding equals covariantly.	Reynold Xin	2015-05-31	3	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Author: Reynold Xin <rxin@databricks.com> This patch had conflicts when merged, resolved by Committer: Reynold Xin <rxin@databricks.com> Closes #6527 from rxin/covariant-equals and squashes the following commits: e7d7784 [Reynold Xin] [SPARK-7975] Enforce CovariantEqualsChecker (cherry picked from commit 7896e99b2a0a160bd0b6c5c11cf40b6cbf4a65cf) Signed-off-by: Reynold Xin <rxin@databricks.com>
*	[SQL] [MINOR] Adds @deprecated Scaladoc entry for SchemaRDD	Cheng Lian	2015-05-30	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \|	Author: Cheng Lian <lian@databricks.com> Closes #6529 from liancheng/schemardd-deprecation-fix and squashes the following commits: 49765c2 [Cheng Lian] Adds @deprecated Scaladoc entry for SchemaRDD (cherry picked from commit 8764dccebd44292ab6f6834640199aad451459c5) Signed-off-by: Reynold Xin <rxin@databricks.com>
*	[SPARK-7976] Add style checker to disallow overriding finalize.	Reynold Xin	2015-05-30	2	-1/+3
\| \| \| \| \| \| \| \| \| \| \|	Author: Reynold Xin <rxin@databricks.com> Closes #6528 from rxin/style-finalizer and squashes the following commits: a2211ca [Reynold Xin] [SPARK-7976] Enable NoFinalizeChecker. (cherry picked from commit 084fef76e90116c6465cd6fad7c0197c3e4d4313) Signed-off-by: Reynold Xin <rxin@databricks.com>
*	Update documentation for the new DataFrame reader/writer interface.	Reynold Xin	2015-05-30	1	-60/+66
\| \| \| \| \| \| \| \| \| \| \| \|	Author: Reynold Xin <rxin@databricks.com> Closes #6522 from rxin/sql-doc-1.4 and squashes the following commits: c227be7 [Reynold Xin] Updated link. 040b6d7 [Reynold Xin] Update documentation for the new DataFrame reader/writer interface. (cherry picked from commit 00a7137900d45188673da85cbcef4f02b7a266c1) Signed-off-by: Reynold Xin <rxin@databricks.com>
*	[SPARK-7971] Add JavaDoc style deprecation for deprecated DataFrame methods	Reynold Xin	2015-05-30	3	-12/+70
\| \| \| \| \| \| \| \| \| \| \| \| \|	Scala deprecated annotation actually doesn't show up in JavaDoc. Author: Reynold Xin <rxin@databricks.com> Closes #6523 from rxin/df-deprecated-javadoc and squashes the following commits: 26da2b2 [Reynold Xin] [SPARK-7971] Add JavaDoc style deprecation for deprecated DataFrame methods. (cherry picked from commit c63e1a742b3e87e79a4466e9bd0b927a24645756) Signed-off-by: Reynold Xin <rxin@databricks.com>
*	[SQL] Tighten up visibility for JavaDoc.	Reynold Xin	2015-05-30	8	-14/+32
\| \| \| \| \| \| \| \| \| \| \| \| \|	I went through all the JavaDocs and tightened up visibility. Author: Reynold Xin <rxin@databricks.com> Closes #6526 from rxin/sql-1.4-visibility-for-docs and squashes the following commits: bc37d1e [Reynold Xin] Tighten up visibility for JavaDoc. (cherry picked from commit 14b314dc2cad7bbf23976347217c676d338e0a2d) Signed-off-by: Reynold Xin <rxin@databricks.com>
*	[SPARK-5610] [DOC] update genjavadocSettings to use the patched version of ↵	Xiangrui Meng	2015-05-30	2	-4/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	genjavadoc This PR updates `genjavadocSettings` to use a patched version of `genjavadoc-plugin` that hides package private classes/methods/interfaces in the generated Java API doc. The patch can be found at: https://github.com/typesafehub/genjavadoc/compare/master...mengxr:spark-1.4. It wasn't merged into the main repo because there exist corner cases where a package private Scala class has to be a Java public class in order to compile. This doesn't seem to apply to the Spark codebase, so we release a patched version under `org.spark-project` and use it in the Spark build. brkyvz is publishing the artifacts to Maven Central. We need more people to audit the generated APIs and make sure we don't have false negatives. Classes currently listed under `org.apache.spark.rdd`: ![screen shot 2015-05-29 at 12 48 52 pm](https://cloud.githubusercontent.com/assets/829644/7891396/28fb9daa-0601-11e5-8ed8-4e9522d25a71.png) After this PR: ![screen shot 2015-05-29 at 12 48 23 pm](https://cloud.githubusercontent.com/assets/829644/7891408/408e210e-0601-11e5-975c-ff0a02eb5c91.png) cc: pwendell rxin srowen Author: Xiangrui Meng <meng@databricks.com> Closes #6506 from mengxr/SPARK-5610 and squashes the following commits: 489c785 [Xiangrui Meng] update genjavadocSettings to use the patched version of genjavadoc (cherry picked from commit 2b258e1c0784c8ca958bf94cd9e75fa17f104448) Signed-off-by: Reynold Xin <rxin@databricks.com>
*	[SPARK-7920] [MLLIB] Make MLlib ChiSqSelector Serializable (& Fix Related ↵	Mike Dusenberry	2015-05-30	2	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Documentation Example). The MLlib ChiSqSelector class is not serializable, and so the example in the ChiSqSelector documentation fails. Also, that example is missing the import of ChiSqSelector. This PR makes ChiSqSelector extend Serializable in MLlib, and adds the ChiSqSelector import statement to the associated example in the documentation. Author: Mike Dusenberry <dusenberrymw@gmail.com> Closes #6462 from dusenberrymw/Make_ChiSqSelector_Serializable_and_Fix_Related_Docs_Example and squashes the following commits: 9cb2f94 [Mike Dusenberry] Make MLlib ChiSqSelector Serializable. d9003bf [Mike Dusenberry] Add missing import in MLlib ChiSqSelector Docs Scala example. (cherry picked from commit 1281a3518802bfa624618236e6b9b59bc0e78585) Signed-off-by: Joseph K. Bradley <joseph@databricks.com>