spark - Mirror of Apache Spark

	Commit message (Collapse)	Author	Age	Files	Lines
*	[SPARK-10331] [MLLIB] Update example code in ml-guide	Xiangrui Meng	2015-08-29	1	-215/+147
\| \| \| \| \| \| \| \| \| \| \| \|	* The example code was added in 1.2, before `createDataFrame`. This PR switches to `createDataFrame`. Java code still uses JavaBean. * assume `sqlContext` is available * fix some minor issues from previous code review jkbradley srowen feynmanliang Author: Xiangrui Meng <meng@databricks.com> Closes #8518 from mengxr/SPARK-10331.
*	[SPARK-10348] [MLLIB] updates ml-guide	Xiangrui Meng	2015-08-29	2	-52/+78
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	* replace `ML Dataset` by `DataFrame` to unify the abstraction * ML algorithms -> pipeline components to describe the main concept * remove Scala API doc links from the main guide * `Section Title` -> `Section tile` to be consistent with other section titles in MLlib guide * modified lines break at 100 chars or periods jkbradley feynmanliang Author: Xiangrui Meng <meng@databricks.com> Closes #8517 from mengxr/SPARK-10348.
*	[SPARK-9986] [SPARK-9991] [SPARK-9993] [SQL] Create a simple test framework ↵	zsxwing	2015-08-29	14	-55/+509
\| \| \| \| \| \| \| \| \| \| \| \|	for local operators This PR includes the following changes: - Add `LocalNodeTest` for local operator tests and add unit tests for FilterNode and ProjectNode. - Add `LimitNode` and `UnionNode` and their unit tests to show how to use `LocalNodeTest`. (SPARK-9991, SPARK-9993) Author: zsxwing <zsxwing@gmail.com> Closes #8464 from zsxwing/local-execution.
*	[SPARK-10339] [SPARK-10334] [SPARK-10301] [SQL] Partitioned table scan can ↵	Yin Huai	2015-08-29	3	-42/+65
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	OOM driver and throw a better error message when users need to enable parquet schema merging This fixes the problem that scanning partitioned table causes driver have a high memory pressure and takes down the cluster. Also, with this fix, we will be able to correctly show the query plan of a query consuming partitioned tables. https://issues.apache.org/jira/browse/SPARK-10339 https://issues.apache.org/jira/browse/SPARK-10334 Finally, this PR squeeze in a "quick fix" for SPARK-10301. It is not a real fix, but it just throw a better error message to let user know what to do. Author: Yin Huai <yhuai@databricks.com> Closes #8515 from yhuai/partitionedTableScan.
*	[SPARK-10330] Use SparkHadoopUtil TaskAttemptContext reflection methods in ↵	Josh Rosen	2015-08-29	5	-12/+28
\| \| \| \| \| \| \| \| \| \|	more places SparkHadoopUtil contains methods that use reflection to work around TaskAttemptContext binary incompatibilities between Hadoop 1.x and 2.x. We should use these methods in more places. Author: Josh Rosen <joshrosen@databricks.com> Closes #8499 from JoshRosen/use-hadoop-reflection-in-more-places.
*	[SPARK-10226] [SQL] Fix exclamation mark issue in SparkSQL	wangwei	2015-08-29	1	-0/+1
\| \| \| \| \| \| \| \|	When I tested the latest version of spark with exclamation mark, I got some errors. Then I reseted the spark version and found that commit id "a2409d1c8e8ddec04b529ac6f6a12b5993f0eeda" brought the bug. With jline version changing from 0.9.94 to 2.12 after this commit, exclamation mark would be treated as a special character in ConsoleReader. Author: wangwei <wangwei82@huawei.com> Closes #8420 from small-wang/jline-SPARK-10226.
*	[SPARK-10344] [SQL] Add tests for extraStrategies	Michael Armbrust	2015-08-29	2	-1/+68
\| \| \| \| \| \| \| \|	Actually using this API requires access to a lot of classes that we might make private by accident. I've added some tests to prevent this. Author: Michael Armbrust <michael@databricks.com> Closes #8516 from marmbrus/extraStrategiesTests.
*	[SPARK-10289] [SQL] A direct write API for testing Parquet	Cheng Lian	2015-08-29	2	-24/+160
\| \| \| \| \| \| \| \| \| \| \| \|	This PR introduces a direct write API for testing Parquet. It's a DSL flavored version of the [`writeDirect` method] [1] comes with parquet-avro testing code. With this API, it's much easier to construct arbitrary Parquet structures. It's especially useful when adding regression tests for various compatibility corner cases. Sample usage of this API can be found in the new test case added in `ParquetThriftCompatibilitySuite`. [1]: https://github.com/apache/parquet-mr/blob/apache-parquet-1.8.1/parquet-avro/src/test/java/org/apache/parquet/avro/TestArrayCompatibility.java#L945-L972 Author: Cheng Lian <lian@databricks.com> Closes #8454 from liancheng/spark-10289/parquet-testing-direct-write-api.
*	[SPARK-10350] [DOC] [SQL] Removed duplicated option description from SQL guide	GuoQiang Li	2015-08-29	1	-10/+0
\| \| \| \| \| \|	Author: GuoQiang Li <witgo@qq.com> Closes #8520 from witgo/SPARK-10350.
*	[SPARK-9910] [ML] User guide for train validation split	martinzapletal	2015-08-28	3	-0/+287
\| \| \| \| \| \|	Author: martinzapletal <zapletal-martin@email.cz> Closes #8377 from zapletal-martin/SPARK-9910.
*	[SPARK-9803] [SPARKR] Add subset and transform + tests	felixcheung	2015-08-28	4	-17/+85
\| \| \| \| \| \| \| \| \| \| \| \|	Add subset and transform Also reorganize `[` & `[[` to subset instead of select Note: for transform, transform is very similar to mutate. Spark doesn't seem to replace existing column with the name in mutate (ie. `mutate(df, age = df$age + 2)` - returned DataFrame has 2 columns with the same name 'age'), so therefore not doing that for now in transform. Though it is clearly stated it should replace column with matching name (should I open a JIRA for mutate/transform?) Author: felixcheung <felixcheung_m@hotmail.com> Closes #8503 from felixcheung/rsubset_transform.
*	[SPARK-10323] [SQL] fix nullability of In/InSet/ArrayContain	Davies Liu	2015-08-28	7	-97/+138
\| \| \| \| \| \| \| \|	After this PR, In/InSet/ArrayContain will return null if value is null, instead of false. They also will return null even if there is a null in the set/array. Author: Davies Liu <davies@databricks.com> Closes #8492 from davies/fix_in.
*	[SPARK-9671] [MLLIB] re-org user guide and add migration guide	Xiangrui Meng	2015-08-28	3	-106/+95
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This PR updates the MLlib user guide and adds migration guide for 1.4->1.5. * merge migration guide for `spark.mllib` and `spark.ml` packages * remove dependency section from `spark.ml` guide * move the paragraph about `spark.mllib` and `spark.ml` to the top and recommend `spark.ml` * move Sam's talk to footnote to make the section focus on dependencies Minor changes to code examples and other wording will be in a separate PR. jkbradley srowen feynmanliang Author: Xiangrui Meng <meng@databricks.com> Closes #8498 from mengxr/SPARK-9671.
*	[SPARK-10336][example] fix not being able to set intercept in LR example	Shuo Xiang	2015-08-28	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \|	`fitIntercept` is a command line option but not set in the main program. dbtsai Author: Shuo Xiang <sxiang@pinterest.com> Closes #8510 from coderxiang/intercept and squashes the following commits: 57c9b7d [Shuo Xiang] fix not being able to set intercept in LR example
*	[SPARK-9284] [TESTS] Allow all tests to run without an assembly.	Marcelo Vanzin	2015-08-28	10	-37/+122
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change aims at speeding up the dev cycle a little bit, by making sure that all tests behave the same w.r.t. where the code to be tested is loaded from. Namely, that means that tests don't rely on the assembly anymore, rather loading all needed classes from the build directories. The main change is to make sure all build directories (classes and test-classes) are added to the classpath of child processes when running tests. YarnClusterSuite required some custom code since the executors are run differently (i.e. not through the launcher library, like standalone and Mesos do). I also found a couple of tests that could leak a SparkContext on failure, and added code to handle those. With this patch, it's possible to run the following command from a clean source directory and have all tests pass: mvn -Pyarn -Phadoop-2.4 -Phive-thriftserver install Author: Marcelo Vanzin <vanzin@cloudera.com> Closes #7629 from vanzin/SPARK-9284.
*	[SPARK-10325] Override hashCode() for public Row	Josh Rosen	2015-08-28	2	-0/+22
\| \| \| \| \| \| \| \| \| \|	This commit fixes an issue where the public SQL `Row` class did not override `hashCode`, causing it to violate the hashCode() + equals() contract. To fix this, I simply ported the `hashCode` implementation from the 1.4.x version of `Row`. Author: Josh Rosen <joshrosen@databricks.com> Closes #8500 from JoshRosen/SPARK-10325 and squashes the following commits: 51ffea1 [Josh Rosen] Override hashCode() for public Row.
*	[SPARK-8952] [SPARKR] - Wrap normalizePath calls with suppressWarnings	Luciano Resende	2015-08-28	2	-3/+3
\| \| \| \| \| \| \| \|	This is based on davies comment on SPARK-8952 which suggests to only call normalizePath() when path starts with '~' Author: Luciano Resende <lresende@apache.org> Closes #8343 from lresende/SPARK-8952.
*	[SPARK-9890] [DOC] [ML] User guide for CountVectorizer	Yuhao Yang	2015-08-28	1	-0/+109
\| \| \| \| \| \| \| \| \| \|	jira: https://issues.apache.org/jira/browse/SPARK-9890 document with Scala and java examples Author: Yuhao Yang <hhbyyh@gmail.com> Closes #8487 from hhbyyh/cvDoc.
*	[YARN] [MINOR] Avoid hard code port number in YarnShuffleService test	jerryshao	2015-08-28	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Current port number is fixed as default (7337) in test, this will introduce port contention exception, better to change to a random number in unit test. squito , seems you're author of this unit test, mind taking a look at this fix? Thanks a lot. ``` [info] - executor state kept across NM restart * FAILED * (597 milliseconds) [info] org.apache.hadoop.service.ServiceStateException: java.net.BindException: Address already in use [info] at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59) [info] at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172) [info] at org.apache.spark.network.yarn.YarnShuffleServiceSuite$$anonfun$1.apply$mcV$sp(YarnShuffleServiceSuite.scala:72) [info] at org.apache.spark.network.yarn.YarnShuffleServiceSuite$$anonfun$1.apply(YarnShuffleServiceSuite.scala:70) [info] at org.apache.spark.network.yarn.YarnShuffleServiceSuite$$anonfun$1.apply(YarnShuffleServiceSuite.scala:70) [info] at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22) [info] at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85) [info] at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104) [info] at org.scalatest.Transformer.apply(Transformer.scala:22) [info] at org.scalatest.Transformer.apply(Transformer.scala:20) [info] at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166) [info] at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:42) ... ``` Author: jerryshao <sshao@hortonworks.com> Closes #8502 from jerryshao/avoid-hardcode-port.
*	typo in comment	Dharmesh Kakadia	2015-08-28	1	-1/+1
\| \| \| \| \| \|	Author: Dharmesh Kakadia <dharmeshkakadia@users.noreply.github.com> Closes #8497 from dharmeshkakadia/patch-2.
*	Fix DynamodDB/DynamoDB typo in Kinesis Integration doc	Keiji Yoshida	2015-08-28	1	-1/+1
\| \| \| \| \| \| \| \|	Fix DynamodDB/DynamoDB typo in Kinesis Integration doc Author: Keiji Yoshida <yoshida.keiji.84@gmail.com> Closes #8501 from yosssi/patch-1.
*	[SPARK-10295] [CORE] Dynamic allocation in Mesos does not release when RDDs ↵	Sean Owen	2015-08-28	1	-5/+0
\| \| \| \| \| \| \| \| \| \| \| \|	are cached Remove obsolete warning about dynamic allocation not working with cached RDDs See discussion in https://issues.apache.org/jira/browse/SPARK-10295 Author: Sean Owen <sowen@cloudera.com> Closes #8489 from srowen/SPARK-10295.
*	[SPARK-10260] [ML] Add @Since annotation to ml.clustering	Yu ISHIKAWA	2015-08-28	1	-3/+29
\| \| \| \| \| \| \| \| \|	### JIRA [[SPARK-10260] Add Since annotation to ml.clustering - ASF JIRA](https://issues.apache.org/jira/browse/SPARK-10260) Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com> Closes #8455 from yu-iskw/SPARK-10260.
*	[SPARK-10328] [SPARKR] Fix generic for na.omit	Shivaram Venkataraman	2015-08-28	4	-6/+27
\| \| \| \| \| \| \| \| \| \|	S3 function is at https://stat.ethz.ch/R-manual/R-patched/library/stats/html/na.fail.html Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu> Author: Shivaram Venkataraman <shivaram.venkataraman@gmail.com> Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com> Closes #8495 from shivaram/na-omit-fix.
*	[SPARK-10188] [PYSPARK] Pyspark CrossValidator with RMSE selects incorrect model	noelsmith	2015-08-27	3	-1/+104
\| \| \| \| \| \| \| \| \| \| \| \| \|	* Added isLargerBetter() method to Pyspark Evaluator to match the Scala version. * JavaEvaluator delegates isLargerBetter() to underlying Scala object. * Added check for isLargerBetter() in CrossValidator to determine whether to use argmin or argmax. * Added test cases for where smaller is better (RMSE) and larger is better (R-Squared). (This contribution is my original work and that I license the work to the project under Sparks' open source license) Author: noelsmith <mail@noelsmith.com> Closes #8399 from noel-smith/pyspark-rmse-xval-fix.
*	[SPARK-SQL] [MINOR] Fixes some typos in HiveContext	Cheng Lian	2015-08-27	2	-5/+5
\| \| \| \| \| \|	Author: Cheng Lian <lian@databricks.com> Closes #8481 from liancheng/hive-context-typo.
*	[SPARK-9905] [ML] [DOC] Adds LinearRegressionSummary user guide	Feynman Liang	2015-08-27	1	-13/+127
\| \| \| \| \| \| \| \| \| \| \|	* Adds user guide for `LinearRegressionSummary` * Fixes unresolved issues in #8197 CC jkbradley mengxr Author: Feynman Liang <fliang@databricks.com> Closes #8491 from feynmanliang/SPARK-9905.
*	[SPARK-9911] [DOC] [ML] Update Userguide for Evaluator	MechCoder	2015-08-27	1	-0/+13
\| \| \| \| \| \| \| \|	I added a small note about the different types of evaluator and the metrics used. Author: MechCoder <manojkumarsivaraj334@gmail.com> Closes #8304 from MechCoder/multiclass_evaluator.
*	[SPARK-8505] [SPARKR] Add settings to kick `lint-r` from `./dev/run-test.py`	Yu ISHIKAWA	2015-08-27	5	-12/+47
\| \| \| \| \| \| \| \| \| \| \| \|	JoshRosen we'd like to check the SparkR source code with the `dev/lint-r` script on the Jenkins. I tried to incorporate the script into `dev/run-test.py`. Could you review it when you have time? shivaram I modified `dev/lint-r` and `dev/lint-r.R` to install lintr package into a local directory(`R/lib/`) and to exit with a lint status. Could you review it? - [[SPARK-8505] Add settings to kick `lint-r` from `./dev/run-test.py` - ASF JIRA](https://issues.apache.org/jira/browse/SPARK-8505) Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com> Closes #7883 from yu-iskw/SPARK-8505.
*	[SPARK-10321] sizeInBytes in HadoopFsRelation	Davies Liu	2015-08-27	1	-0/+2
\| \| \| \| \| \| \| \| \| \|	Having sizeInBytes in HadoopFsRelation to enable broadcast join. cc marmbrus Author: Davies Liu <davies@databricks.com> Closes #8490 from davies/sizeInByte.
*	[SPARK-10287] [SQL] Fixes JSONRelation refreshing on read path	Yin Huai	2015-08-27	4	-25/+7
\| \| \| \| \| \| \| \| \| \|	https://issues.apache.org/jira/browse/SPARK-10287 After porting json to HadoopFsRelation, it seems hard to keep the behavior of picking up new files automatically for JSON. This PR removes this behavior, so JSON is consistent with others (ORC and Parquet). Author: Yin Huai <yhuai@databricks.com> Closes #8469 from yhuai/jsonRefresh.
*	[SPARK-9680] [MLLIB] [DOC] StopWordsRemovers user guide and Java ↵	Feynman Liang	2015-08-27	2	-3/+171
\| \| \| \| \| \| \| \| \| \| \| \| \|	compatibility test * Adds user guide for ml.feature.StopWordsRemovers, ran code examples on my machine * Cleans up scaladocs for public methods * Adds test for Java compatibility * Follow up Python user guide code example is tracked by SPARK-10249 Author: Feynman Liang <fliang@databricks.com> Closes #8436 from feynmanliang/SPARK-10230.
*	[SPARK-9906] [ML] User guide for LogisticRegressionSummary	MechCoder	2015-08-27	1	-16/+133
\| \| \| \| \| \| \| \| \| \|	User guide for LogisticRegression summaries Author: MechCoder <manojkumarsivaraj334@gmail.com> Author: Manoj Kumar <mks542@nyu.edu> Author: Feynman Liang <fliang@databricks.com> Closes #8197 from MechCoder/log_summary_user_guide.
*	[SPARK-9901] User guide for RowMatrix Tall-and-skinny QR	Yuhao Yang	2015-08-27	1	-1/+10
\| \| \| \| \| \| \| \| \| \|	jira: https://issues.apache.org/jira/browse/SPARK-9901 The jira covers only the document update. I can further provide example code for QR (like the ones for SVD and PCA) in a separate PR. Author: Yuhao Yang <hhbyyh@gmail.com> Closes #8462 from hhbyyh/qrDoc.
*	[SPARK-10315] remove document on spark.akka.failure-detector.threshold	CodingCat	2015-08-27	1	-10/+0
\| \| \| \| \| \| \| \| \| \|	https://issues.apache.org/jira/browse/SPARK-10315 this parameter is not used any longer and there is some mistake in the current document , should be 'akka.remote.watch-failure-detector.threshold' Author: CodingCat <zhunansjtu@gmail.com> Closes #8483 from CodingCat/SPARK_10315.
*	[SPARK-9148] [SPARK-10252] [SQL] Update SQL Programming Guide	Michael Armbrust	2015-08-27	1	-19/+73
\| \| \| \| \| \|	Author: Michael Armbrust <michael@databricks.com> Closes #8441 from marmbrus/documentation.
*	[SPARK-10182] [MLLIB] GeneralizedLinearModel doesn't unpersist cached data	Vyacheslav Baranov	2015-08-27	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \|	`GeneralizedLinearModel` creates a cached RDD when building a model. It's inconvenient, since these RDDs flood the memory when building several models in a row, so useful data might get evicted from the cache. The proposed solution is to always cache the dataset & remove the warning. There's a caveat though: input dataset gets evaluated twice, in line 270 when fitting `StandardScaler` for the first time, and when running optimizer for the second time. So, it might worth to return removed warning. Another possible solution is to disable caching entirely & return removed warning. I don't really know what approach is better. Author: Vyacheslav Baranov <slavik.baranov@gmail.com> Closes #8395 from SlavikBaranov/SPARK-10182.
*	[SPARK-10257] [MLLIB] Removes Guava from all spark.mllib Java tests	Feynman Liang	2015-08-27	14	-74/+71
\| \| \| \| \| \| \| \| \| \| \| \|	* Replaces instances of `Lists.newArrayList` with `Arrays.asList` * Replaces `commons.lang.StringUtils` over `com.google.collections.Strings` * Replaces `List` interface over `ArrayList` implementations This PR along with #8445 #8446 #8447 completely removes all `com.google.collections.Lists` dependencies within mllib's Java tests. Author: Feynman Liang <fliang@databricks.com> Closes #8451 from feynmanliang/SPARK-10257.
*	[SPARK-9613] [HOTFIX] Fix usage of JavaConverters removed in Scala 2.11	Jacek Laskowski	2015-08-27	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix for [JavaConverters.asJavaListConverter](http://www.scala-lang.org/api/2.10.5/index.html#scala.collection.JavaConverters$) being removed in 2.11.7 and hence the build fails with the 2.11 profile enabled. Tested with the default 2.10 and 2.11 profiles. BUILD SUCCESS in both cases. Build for 2.10: ./build/mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.7.1 -DskipTests clean install and 2.11: ./dev/change-scala-version.sh 2.11 ./build/mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.7.1 -Dscala-2.11 -DskipTests clean install Author: Jacek Laskowski <jacek@japila.pl> Closes #8479 from jaceklaskowski/SPARK-9613-hotfix.
*	[SPARK-10256] [ML] Removes guava dependency from spark.ml.classification ↵	Feynman Liang	2015-08-27	1	-2/+2
\| \| \| \| \| \| \| \|	JavaTests Author: Feynman Liang <fliang@databricks.com> Closes #8447 from feynmanliang/SPARK-10256.
*	[SPARK-10255] [ML] Removes Guava dependencies from spark.ml.param JavaTests	Feynman Liang	2015-08-27	2	-6/+6
\| \| \| \| \| \|	Author: Feynman Liang <fliang@databricks.com> Closes #8446 from feynmanliang/SPARK-10255.
*	[SPARK-10254] [ML] Removes Guava dependencies in spark.ml.feature JavaTests	Feynman Liang	2015-08-27	11	-30/+35
\| \| \| \| \| \| \| \| \|	* Replaces `com.google.common` dependencies with `java.util.Arrays` * Small clean up in `JavaNormalizerSuite` Author: Feynman Liang <fliang@databricks.com> Closes #8445 from feynmanliang/SPARK-10254.
*	[DOCS] [STREAMING] [KAFKA] Fix typo in exactly once semantics	Moussa Taifi	2015-08-27	1	-1/+1
\| \| \| \| \| \| \| \| \|	Fix Typo in exactly once semantics [Semantics of output operations] link Author: Moussa Taifi <moutai10@gmail.com> Closes #8468 from moutai/patch-3.
*	[SPARK-10251] [CORE] some common types are not registered for Kryo Serializat…	Ram Sriharsha	2015-08-26	2	-1/+64
\| \| \| \| \| \| \| \|	…ion by default Author: Ram Sriharsha <rsriharsha@hw11853.local> Closes #8465 from harsha2010/SPARK-10251.
*	[SPARK-10219] [SPARKR] Fix varargsToEnv and add test case	Shivaram Venkataraman	2015-08-26	2	-1/+8
\| \| \| \| \| \| \| \|	cc sun-rui davies Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu> Closes #8475 from shivaram/varargs-fix.
*	[SPARK-9964] [PYSPARK] [SQL] PySpark DataFrameReader accept RDD of String ↵	Yanbo Liang	2015-08-26	1	-6/+22
\| \| \| \| \| \| \| \| \| \| \|	for JSON PySpark DataFrameReader should could accept an RDD of Strings (like the Scala version does) for JSON, rather than only taking a path. If this PR is merged, it should be duplicated to cover the other input types (not just JSON). Author: Yanbo Liang <ybliang8@gmail.com> Closes #8444 from yanboliang/spark-9964.
*	[SPARK-9424] [SQL] Parquet programming guide updates for 1.5	Cheng Lian	2015-08-26	1	-8/+37
\| \| \| \| \| \|	Author: Cheng Lian <lian@databricks.com> Closes #8467 from liancheng/spark-9424/parquet-docs-for-1.5.
*	[MINOR] [SPARKR] Fix some validation problems in SparkR	Yu ISHIKAWA	2015-08-26	3	-11/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Getting rid of some validation problems in SparkR https://github.com/apache/spark/pull/7883 cc shivaram ``` inst/tests/test_Serde.R:26:1: style: Trailing whitespace is superfluous. ^~ inst/tests/test_Serde.R:34:1: style: Trailing whitespace is superfluous. ^~ inst/tests/test_Serde.R:37:38: style: Trailing whitespace is superfluous. expect_equal(class(x), "character") ^~ inst/tests/test_Serde.R:50:1: style: Trailing whitespace is superfluous. ^~ inst/tests/test_Serde.R:55:1: style: Trailing whitespace is superfluous. ^~ inst/tests/test_Serde.R:60:1: style: Trailing whitespace is superfluous. ^~ inst/tests/test_sparkSQL.R:611:1: style: Trailing whitespace is superfluous. ^~ R/DataFrame.R:664:1: style: Trailing whitespace is superfluous. ^~~~~~~~~~~~~~ R/DataFrame.R:670:55: style: Trailing whitespace is superfluous. df <- data.frame(row.names = 1 : nrow) ^~~~~~~~~~~~~~~~ R/DataFrame.R:672:1: style: Trailing whitespace is superfluous. ^~~~~~~~~~~~~~ R/DataFrame.R:686:49: style: Trailing whitespace is superfluous. df[[names[colIndex]]] <- vec ^~~~~~~~~~~~~~~~~~ ``` Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com> Closes #8474 from yu-iskw/minor-fix-sparkr.
*	[SPARK-10308] [SPARKR] Add %in% to the exported namespace	Shivaram Venkataraman	2015-08-26	1	-3/+4
\| \| \| \| \| \| \| \| \| \|	I also checked all the other functions defined in column.R, functions.R and DataFrame.R and everything else looked fine. cc yu-iskw Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu> Closes #8473 from shivaram/in-namespace.
*	[SPARK-10305] [SQL] fix create DataFrame from Python class	Davies Liu	2015-08-26	2	-0/+18
\| \| \| \| \| \| \| \|	cc jkbradley Author: Davies Liu <davies@databricks.com> Closes #8470 from davies/fix_create_df.