spark - Mirror of Apache Spark

	Commit message (Collapse)	Author	Age	Files	Lines
*	Fixing the type of the sentiment happiness value	Yury Liavitski	2016-03-07	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	## What changes were proposed in this pull request? Added the conversion to int for the 'happiness value' read from the file. Otherwise, later on line 75 the multiplication will multiply a string by a number, yielding values like "-2-2" instead of -4. ## How was this patch tested? Tested manually. Author: Yury Liavitski <seconds.before@gmail.com> Author: Yury Liavitski <yury.liavitski@il111.ice.local> Closes #11540 from heliocentrist/fix-sentiment-value-type.
*	[MINOR] Fix typos in comments and testcase name of code	Dongjoon Hyun	2016-03-03	5	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	## What changes were proposed in this pull request? This PR fixes typos in comments and testcase name of code. ## How was this patch tested? manual. Author: Dongjoon Hyun <dongjoon@apache.org> Closes #11481 from dongjoon-hyun/minor_fix_typos_in_code.
*	[SPARK-13013][DOCS] Replace example code in mllib-clustering.md using ↵	Xin Ren	2016-03-03	5	-3/+186
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	include_example Replace example code in mllib-clustering.md using include_example https://issues.apache.org/jira/browse/SPARK-13013 The example code in the user guide is embedded in the markdown and hence it is not easy to test. It would be nice to automatically test them. This JIRA is to discuss options to automate example code testing and see what we can do in Spark 1.6. Goal is to move actual example code to spark/examples and test compilation in Jenkins builds. Then in the markdown, we can reference part of the code to show in the user guide. This requires adding a Jekyll tag that is similar to https://github.com/jekyll/jekyll/blob/master/lib/jekyll/tags/include.rb, e.g., called include_example. `{% include_example scala/org/apache/spark/examples/mllib/KMeansExample.scala %}` Jekyll will find `examples/src/main/scala/org/apache/spark/examples/mllib/KMeansExample.scala` and pick code blocks marked "example" and replace code block in `{% highlight %}` in the markdown. See more sub-tasks in parent ticket: https://issues.apache.org/jira/browse/SPARK-11337 Author: Xin Ren <iamshrek@126.com> Closes #11116 from keypointt/SPARK-13013.
*	[SPARK-13583][CORE][STREAMING] Remove unused imports and add checkstyle rule	Dongjoon Hyun	2016-03-03	23	-31/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	## What changes were proposed in this pull request? After SPARK-6990, `dev/lint-java` keeps Java code healthy and helps PR review by saving much time. This issue aims remove unused imports from Java/Scala code and add `UnusedImports` checkstyle rule to help developers. ## How was this patch tested? ``` ./dev/lint-java ./build/sbt compile ``` Author: Dongjoon Hyun <dongjoon@apache.org> Closes #11438 from dongjoon-hyun/SPARK-13583.
*	[SPARK-13423][WIP][CORE][SQL][STREAMING] Static analysis fixes for 2.x	Sean Owen	2016-03-03	7	-8/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	## What changes were proposed in this pull request? Make some cross-cutting code improvements according to static analysis. These are individually up for discussion since they exist in separate commits that can be reverted. The changes are broadly: - Inner class should be static - Mismatched hashCode/equals - Overflow in compareTo - Unchecked warnings - Misuse of assert, vs junit.assert - get(a) + getOrElse(b) -> getOrElse(a,b) - Array/String .size -> .length (occasionally, -> .isEmpty / .nonEmpty) to avoid implicit conversions - Dead code - tailrec - exists(_ == ) -> contains find + nonEmpty -> exists filter + size -> count - reduce(_+_) -> sum map + flatten -> map The most controversial may be .size -> .length simply because of its size. It is intended to avoid implicits that might be expensive in some places. ## How was the this patch tested? Existing Jenkins unit tests. Author: Sean Owen <sowen@cloudera.com> Closes #11292 from srowen/SPARK-13423.
*	[HOT-FIX] Recover some deprecations for 2.10 compatibility.	Dongjoon Hyun	2016-03-03	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	## What changes were proposed in this pull request? #11479 [SPARK-13627] broke 2.10 compatibility: [2.10-Build](https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Compile/job/spark-master-compile-maven-scala-2.10/292/console) At this moment, we need to support both 2.10 and 2.11. This PR recovers some deprecated methods which were replace by [SPARK-13627]. ## How was this patch tested? Jenkins build: Both 2.10, 2.11. Author: Dongjoon Hyun <dongjoon@apache.org> Closes #11488 from dongjoon-hyun/hotfix_compatibility_with_2.10.
*	[SPARK-13627][SQL][YARN] Fix simple deprecation warnings.	Dongjoon Hyun	2016-03-02	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	## What changes were proposed in this pull request? This PR aims to fix the following deprecation warnings. * MethodSymbolApi.paramss--> paramLists * AnnotationApi.tpe -> tree.tpe * BufferLike.readOnly -> toList. * StandardNames.nme -> termNames * scala.tools.nsc.interpreter.AbstractFileClassLoader -> scala.reflect.internal.util.AbstractFileClassLoader * TypeApi.declarations-> decls ## How was this patch tested? Check the compile build log and pass the tests. ``` ./build/sbt ``` Author: Dongjoon Hyun <dongjoon@apache.org> Closes #11479 from dongjoon-hyun/SPARK-13627.
*	[MINOR][STREAMING] Replace deprecated `apply` with `create` in example.	Dongjoon Hyun	2016-03-02	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	## What changes were proposed in this pull request? Twitter Algebird deprecated `apply` in HyperLogLog.scala. ``` deprecated("Use toHLL", since = "0.10.0 / 2015-05") def apply[T <% Array[Byte]](t: T) = create(t) ``` This PR replace the deprecated usage `apply` with new `create` according to the upstream change. ## How was this patch tested? manual. ``` /bin/spark-submit --class org.apache.spark.examples.streaming.TwitterAlgebirdHLL examples/target/scala-2.11/spark-examples-2.0.0-SNAPSHOT-hadoop2.2.0.jar ``` Author: Dongjoon Hyun <dongjoon@apache.org> Closes #11451 from dongjoon-hyun/replace_deprecated_hll_apply.
*	[SPARK-11381][DOCS] Replace example code in mllib-linear-methods.md using ↵	Dongjoon Hyun	2016-02-26	4	-0/+261
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	include_example ## What changes were proposed in this pull request? This PR replaces example codes in `mllib-linear-methods.md` using `include_example` by doing the followings: * Extracts the example codes(Scala,Java,Python) as files in `example` module. * Merges some dialog-style examples into a single file. * Hide redundant codes in HTML for the consistency with other docs. ## How was the this patch tested? manual test. This PR can be tested by document generations, `SKIP_API=1 jekyll build`. Author: Dongjoon Hyun <dongjoon@apache.org> Closes #11320 from dongjoon-hyun/SPARK-11381.
*	[SPARK-13457][SQL] Removes DataFrame RDD operations	Cheng Lian	2016-02-27	6	-8/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	## What changes were proposed in this pull request? This is another try of PR #11323. This PR removes DataFrame RDD operations except for `foreach` and `foreachPartitions` (they are actions rather than transformations). Original calls are now replaced by calls to methods of `DataFrame.rdd`. PR #11323 was reverted because it introduced a regression: both `DataFrame.foreach` and `DataFrame.foreachPartitions` wrap underlying RDD operations with `withNewExecutionId` to track Spark jobs. But they are removed in #11323. ## How was the this patch tested? No extra tests are added. Existing tests should do the work. Author: Cheng Lian <lian@databricks.com> Closes #11388 from liancheng/remove-df-rdd-ops.
*	Revert "[SPARK-13457][SQL] Removes DataFrame RDD operations"	Davies Liu	2016-02-25	6	-9/+8
\| \| \| \|	This reverts commit 157fe64f3ecbd13b7286560286e50235eecfe30e.
*	[SPARK-13457][SQL] Removes DataFrame RDD operations	Cheng Lian	2016-02-25	6	-8/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	## What changes were proposed in this pull request? This PR removes DataFrame RDD operations. Original calls are now replaced by calls to methods of `DataFrame.rdd`. ## How was the this patch tested? No extra tests are added. Existing tests should do the work. Author: Cheng Lian <lian@databricks.com> Closes #11323 from liancheng/remove-df-rdd-ops.
*	[SPARK-13012][DOCUMENTATION] Replace example code in ml-guide.md using ↵	Devaraj K	2016-02-22	4	-0/+376
\| \| \| \| \| \| \| \| \| \|	include_example Replaced example code in ml-guide.md using include_example Author: Devaraj K <devaraj@apache.org> Closes #11053 from devaraj-kavali/SPARK-13012.
*	[SPARK-13016][DOCUMENTATION] Replace example code in ↵	Devaraj K	2016-02-22	3	-0/+176
\| \| \| \| \| \| \| \| \| \| \|	mllib-dimensionality-reduction.md using include_example Replaced example example code in mllib-dimensionality-reduction.md using include_example Author: Devaraj K <devaraj@apache.org> Closes #11132 from devaraj-kavali/SPARK-13016.
*	[SPARK-12953][EXAMPLES] RDDRelation writer set overwrite mode	shijinkui	2016-02-17	1	-4/+3
\| \| \| \| \| \| \| \| \| \| \| \| \|	https://issues.apache.org/jira/browse/SPARK-12953 fix error when run RDDRelation.main(): "path file:/Users/sjk/pair.parquet already exists" Set DataFrameWriter's mode to SaveMode.Overwrite Author: shijinkui <shijinkui666@163.com> Closes #10864 from shijinkui/set_mode.
*	[SPARK-12247][ML][DOC] Documentation for spark.ml's ALS and collaborative ↵	BenFradet	2016-02-16	2	-182/+82
\| \| \| \| \| \| \| \| \| \|	filtering in general This documents the implementation of ALS in `spark.ml` with example code in scala, java and python. Author: BenFradet <benjamin.fradet@gmail.com> Closes #10411 from BenFradet/SPARK-12247.
*	[SPARK-13018][DOCS] Replace example code in mllib-pmml-model-export.md using ↵	Xin Ren	2016-02-15	1	-0/+59
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	include_example Replace example code in mllib-pmml-model-export.md using include_example https://issues.apache.org/jira/browse/SPARK-13018 The example code in the user guide is embedded in the markdown and hence it is not easy to test. It would be nice to automatically test them. This JIRA is to discuss options to automate example code testing and see what we can do in Spark 1.6. Goal is to move actual example code to spark/examples and test compilation in Jenkins builds. Then in the markdown, we can reference part of the code to show in the user guide. This requires adding a Jekyll tag that is similar to https://github.com/jekyll/jekyll/blob/master/lib/jekyll/tags/include.rb, e.g., called include_example. `{% include_example scala/org/apache/spark/examples/mllib/PMMLModelExportExample.scala %}` Jekyll will find `examples/src/main/scala/org/apache/spark/examples/mllib/PMMLModelExportExample.scala` and pick code blocks marked "example" and replace code block in `{% highlight %}` in the markdown. See more sub-tasks in parent ticket: https://issues.apache.org/jira/browse/SPARK-11337 Author: Xin Ren <iamshrek@126.com> Closes #11126 from keypointt/SPARK-13018.
*	[SPARK-12363][MLLIB] Remove setRun and fix PowerIterationClustering failed test	Liang-Chi Hsieh	2016-02-13	1	-32/+21
\| \| \| \| \| \| \| \| \| \| \|	JIRA: https://issues.apache.org/jira/browse/SPARK-12363 This issue is pointed by yanboliang. When `setRuns` is removed from PowerIterationClustering, one of the tests will be failed. I found that some `dstAttr`s of the normalized graph are not correct values but 0.0. By setting `TripletFields.All` in `mapTriplets` it can work. Author: Liang-Chi Hsieh <viirya@gmail.com> Author: Xiangrui Meng <meng@databricks.com> Closes #10539 from viirya/fix-poweriter.
*	[SPARK-13170][STREAMING] Investigate replacing SynchronizedQueue as it is ↵	Sean Owen	2016-02-09	1	-3/+5
\| \| \| \| \| \| \| \| \| \|	deprecated Replace SynchronizeQueue with synchronized access to a Queue Author: Sean Owen <sowen@cloudera.com> Closes #11111 from srowen/SPARK-13170.
*	[SPARK-13177][EXAMPLES] Update ActorWordCount example to not directly use ↵	sachin aggarwal	2016-02-09	1	-4/+4
\| \| \| \| \| \| \| \|	low level linked list as it is deprecated. Author: sachin aggarwal <different.sachin@gmail.com> Closes #11113 from agsachin/master.
*	[SPARK-7799][SPARK-12786][STREAMING] Add "streaming-akka" project	Shixiong Zhu	2016-01-20	2	-20/+30
\| \| \| \| \| \| \| \| \| \| \| \| \|	Include the following changes: 1. Add "streaming-akka" project and org.apache.spark.streaming.akka.AkkaUtils for creating an actorStream 2. Remove "StreamingContext.actorStream" and "JavaStreamingContext.actorStream" 3. Update the ActorWordCount example and add the JavaActorWordCount example 4. Make "streaming-zeromq" depend on "streaming-akka" and update the codes accordingly Author: Shixiong Zhu <shixiong@databricks.com> Closes #10744 from zsxwing/streaming-akka-2.
*	[SPARK-12667] Remove block manager's internal "external block store" API	Reynold Xin	2016-01-15	2	-143/+0
\| \| \| \| \| \| \| \| \| \|	This pull request removes the external block store API. This is rarely used, and the file system interface is actually a better, more standard way to interact with external storage systems. There are some other things to remove also, as pointed out by JoshRosen. We will do those as follow-up pull requests. Author: Reynold Xin <rxin@databricks.com> Closes #10752 from rxin/remove-offheap.
*	[SPARK-12692][BUILD][STREAMING] Scala style: Fix the style violation (Space ↵	Kousuke Saruta	2016-01-11	1	-7/+7
\| \| \| \| \| \| \| \| \| \| \|	before "," or ":") Fix the style violation (space before , and :). This PR is a followup for #10643. Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp> Closes #10685 from sarutak/SPARK-12692-followup-streaming.
*	[SPARK-12603][MLLIB] PySpark MLlib GaussianMixtureModel should support ↵	Yanbo Liang	2016-01-11	1	-0/+6
\| \| \| \| \| \| \| \| \| \|	single instance predict/predictSoft PySpark MLlib ```GaussianMixtureModel``` should support single instance ```predict/predictSoft``` just like Scala do. Author: Yanbo Liang <ybliang8@gmail.com> Closes #10552 from yanboliang/spark-12603.
*	[SPARK-12692][BUILD][MLLIB] Scala style: Fix the style violation (Space ↵	Kousuke Saruta	2016-01-10	3	-3/+3
\| \| \| \| \| \| \| \| \| \| \|	before "," or ":") Fix the style violation (space before , and :). This PR is a followup for #10643. Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp> Closes #10684 from sarutak/SPARK-12692-followup-mllib.
*	[SPARK-12618][CORE][STREAMING][SQL] Clean up build warnings: 2.0.0 edition	Sean Owen	2016-01-08	2	-2/+2
\| \| \| \| \| \| \| \|	Fix most build warnings: mostly deprecated API usages. I'll annotate some of the changes below. CC rxin who is leading the charge to remove the deprecated APIs. Author: Sean Owen <sowen@cloudera.com> Closes #10570 from srowen/SPARK-12618.
*	[SPARK-12510][STREAMING] Refactor ActorReceiver to support Java	Shixiong Zhu	2016-01-07	1	-5/+4
\| \| \| \| \| \| \| \| \| \| \| \| \|	This PR includes the following changes: 1. Rename `ActorReceiver` to `ActorReceiverSupervisor` 2. Remove `ActorHelper` 3. Add a new `ActorReceiver` for Scala and `JavaActorReceiver` for Java 4. Add `JavaActorWordCount` example Author: Shixiong Zhu <shixiong@databricks.com> Closes #10457 from zsxwing/java-actor-stream.
*	[STREAMING][DOCS][EXAMPLES] Minor fixes	Jacek Laskowski	2016-01-07	1	-6/+4
\| \| \| \| \| \|	Author: Jacek Laskowski <jacek@japila.pl> Closes #10603 from jaceklaskowski/streaming-actor-custom-receiver.
*	[SPARK-12615] Remove some deprecated APIs in RDD/SparkContext	Reynold Xin	2016-01-05	2	-11/+2
\| \| \| \| \| \| \| \|	I looked at each case individually and it looks like they can all be removed. The only one that I had to think twice was toArray (I even thought about un-deprecating it, until I realized it was a problem in Java to have toArray returning java.util.List). Author: Reynold Xin <rxin@databricks.com> Closes #10569 from rxin/SPARK-12615.
*	[SPARK-3873][EXAMPLES] Import ordering fixes.	Marcelo Vanzin	2016-01-04	106	-154/+147
\| \| \| \| \| \|	Author: Marcelo Vanzin <vanzin@cloudera.com> Closes #10575 from vanzin/SPARK-3873-examples.
*	[SPARK-12481][CORE][STREAMING][SQL] Remove usage of Hadoop deprecated APIs ↵	Sean Owen	2016-01-02	2	-6/+2
\| \| \| \| \| \| \| \| \| \|	and reflection that supported 1.x Remove use of deprecated Hadoop APIs now that 2.2+ is required Author: Sean Owen <sowen@cloudera.com> Closes #10446 from srowen/SPARK-12481.
*	[SPARK-12588] Remove HttpBroadcast in Spark 2.0.	Reynold Xin	2015-12-30	1	-5/+3
\| \| \| \| \| \| \| \|	We switched to TorrentBroadcast in Spark 1.1, and HttpBroadcast has been undocumented since then. It's time to remove it in Spark 2.0. Author: Reynold Xin <rxin@databricks.com> Closes #10531 from rxin/SPARK-12588.
*	[SPARK-12429][STREAMING][DOC] Add Accumulator and Broadcast example for ↵	Shixiong Zhu	2015-12-22	1	-5/+61
\| \| \| \| \| \| \| \| \| \|	Streaming This PR adds Scala, Java and Python examples to show how to use Accumulator and Broadcast in Spark Streaming to support checkpointing. Author: Shixiong Zhu <shixiong@databricks.com> Closes #10385 from zsxwing/accumulator-broadcast-example.
*	[SPARK-9057][STREAMING] Twitter example joining to static RDD of word ↵	Jeff L	2015-12-18	1	-0/+96
\| \| \| \| \| \| \| \| \| \|	sentiment values Example of joining a static RDD of word sentiments to a streaming RDD of Tweets in order to demo the usage of the transform() method. Author: Jeff L <sha0lin@alumni.carnegiemellon.edu> Closes #8431 from Agent007/SPARK-9057.
*	[SPARK-12410][STREAMING] Fix places that use '.' and '\|' directly in split	Shixiong Zhu	2015-12-17	1	-1/+1
\| \| \| \| \| \| \| \|	String.split accepts a regular expression, so we should escape "." and "\|". Author: Shixiong Zhu <shixiong@databricks.com> Closes #10361 from zsxwing/reg-bug.
*	[SPARK-6518][MLLIB][EXAMPLE][DOC] Add example code and user guide for ↵	Yu ISHIKAWA	2015-12-16	1	-0/+60
\| \| \| \| \| \| \| \| \| \| \|	bisecting k-means This PR includes only an example code in order to finish it quickly. I'll send another PR for the docs soon. Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com> Closes #9952 from yu-iskw/SPARK-6518.
*	[SPARK-12215][ML][DOC] User guide section for KMeans in spark.ml	Yu ISHIKAWA	2015-12-16	1	-26/+23
\| \| \| \| \| \| \| \|	cc jkbradley Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com> Closes #10244 from yu-iskw/SPARK-12215.
*	[MINOR][ML] Rename weights to coefficients for examples/DeveloperApiExample	Yanbo Liang	2015-12-15	1	-8/+8
\| \| \| \| \| \| \| \| \| \|	Rename ```weights``` to ```coefficients``` for examples/DeveloperApiExample. cc mengxr jkbradley Author: Yanbo Liang <ybliang8@gmail.com> Closes #10280 from yanboliang/spark-coefficients.
*	[SPARK-12199][DOC] Follow-up: Refine example code in ml-features.md	Xusen Yin	2015-12-12	1	-0/+0
\| \| \| \| \| \| \| \| \| \| \| \|	https://issues.apache.org/jira/browse/SPARK-12199 Follow-up PR of SPARK-11551. Fix some errors in ml-features.md mengxr Author: Xusen Yin <yinxusen@gmail.com> Closes #10193 from yinxusen/SPARK-12199.
*	[SPARK-11978][ML] Move dataset_example.py to examples/ml and rename to ↵	Yanbo Liang	2015-12-11	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	dataframe_example.py Since ```Dataset``` has a new meaning in Spark 1.6, we should rename it to avoid confusion. #9873 finished the work of Scala example, here we focus on the Python one. Move dataset_example.py to ```examples/ml``` and rename to ```dataframe_example.py```. BTW, fix minor missing issues of #9873. cc mengxr Author: Yanbo Liang <ybliang8@gmail.com> Closes #9957 from yanboliang/SPARK-11978.
*	[SPARK-12244][SPARK-12245][STREAMING] Rename trackStateByKey to mapWithState ↵	Tathagata Das	2015-12-09	1	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	and change tracking function signature SPARK-12244: Based on feedback from early users and personal experience attempting to explain it, the name trackStateByKey had two problem. "trackState" is a completely new term which really does not give any intuition on what the operation is the resultant data stream of objects returned by the function is called in docs as the "emitted" data for the lack of a better. "mapWithState" makes sense because the API is like a mapping function like (Key, Value) => T with State as an additional parameter. The resultant data stream is "mapped data". So both problems are solved. SPARK-12245: From initial experiences, not having the key in the function makes it hard to return mapped stuff, as the whole information of the records is not there. Basically the user is restricted to doing something like mapValue() instead of map(). So adding the key as a parameter. Author: Tathagata Das <tathagata.das1565@gmail.com> Closes #10224 from tdas/rename.
*	[SPARK-11551][DOC] Replace example code in ml-features.md using include_example	Xusen Yin	2015-12-09	18	-0/+929
\| \| \| \| \| \| \| \| \|	PR on behalf of somideshmukh, thanks! Author: Xusen Yin <yinxusen@gmail.com> Author: somideshmukh <somilde@us.ibm.com> Closes #10219 from yinxusen/SPARK-11551.
*	[SPARK-12159][ML] Add user guide section for IndexToString transformer	BenFradet	2015-12-08	1	-0/+60
\| \| \| \| \| \| \| \|	Documentation regarding the `IndexToString` label transformer with code snippets in Scala/Java/Python. Author: BenFradet <benjamin.fradet@gmail.com> Closes #10166 from BenFradet/SPARK-12159.
*	[SPARK-11605][MLLIB] ML 1.6 QA: API: Java compatibility, docs	Yuhao Yang	2015-12-08	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	jira: https://issues.apache.org/jira/browse/SPARK-11605 Check Java compatibility for MLlib for this release. fix: 1. `StreamingTest.registerStream` needs java friendly interface. 2. `GradientBoostedTreesModel.computeInitialPredictionAndError` and `GradientBoostedTreesModel.updatePredictionError` has java compatibility issue. Mark them as `developerAPI`. TBD: [updated] no fix for now per discussion. `org.apache.spark.mllib.classification.LogisticRegressionModel` `public scala.Option<java.lang.Object> getThreshold();` has wrong return type for Java invocation. `SVMModel` has the similar issue. Yet adding a `scala.Option<java.util.Double> getThreshold()` would result in an overloading error due to the same function signature. And adding a new function with different name seems to be not necessary. cc jkbradley feynmanliang Author: Yuhao Yang <hhbyyh@gmail.com> Closes #10102 from hhbyyh/javaAPI.
*	[SPARK-10393] use ML pipeline in LDA example	Yuhao Yang	2015-12-08	1	-113/+40
\| \| \| \| \| \| \| \| \| \| \|	jira: https://issues.apache.org/jira/browse/SPARK-10393 Since the logic of the text processing part has been moved to ML estimators/transformers, replace the related code in LDA Example with the ML pipeline. Author: Yuhao Yang <hhbyyh@gmail.com> Author: yuhaoyang <yuhao@zhanglipings-iMac.local> Closes #8551 from hhbyyh/ldaExUpdate.
*	[SPARK-11551][DOC][EXAMPLE] Revert PR #10002	Cheng Lian	2015-12-08	18	-928/+0
\| \| \| \| \| \| \| \| \| \|	This reverts PR #10002, commit 78209b0ccaf3f22b5e2345dfb2b98edfdb746819. The original PR wasn't tested on Jenkins before being merged. Author: Cheng Lian <lian@databricks.com> Closes #10200 from liancheng/revert-pr-10002.
*	[SPARK-11958][SPARK-11957][ML][DOC] SQLTransformer user guide and example code	Yanbo Liang	2015-12-07	1	-0/+45
\| \| \| \| \| \| \| \|	Add ```SQLTransformer``` user guide, example code and make Scala API doc more clear. Author: Yanbo Liang <ybliang8@gmail.com> Closes #10006 from yanboliang/spark-11958.
*	[SPARK-11551][DOC][EXAMPLE] Replace example code in ml-features.md using ↵	somideshmukh	2015-12-07	18	-0/+928
\| \| \| \| \| \| \| \| \| \| \| \| \|	include_example Made new patch contaning only markdown examples moved to exmaple/folder. Ony three java code were not shfted since they were contaning compliation error ,these classes are 1)StandardScale 2)NormalizerExample 3)VectorIndexer Author: Xusen Yin <yinxusen@gmail.com> Author: somideshmukh <somilde@us.ibm.com> Closes #10002 from somideshmukh/SomilBranch1.33.
*	[SPARK-11963][DOC] Add docs for QuantileDiscretizer	Xusen Yin	2015-12-07	1	-0/+49
\| \| \| \| \| \| \| \|	https://issues.apache.org/jira/browse/SPARK-11963 Author: Xusen Yin <yinxusen@gmail.com> Closes #9962 from yinxusen/SPARK-11963.
*	[SPARK-12084][CORE] Fix codes that uses ByteBuffer.array incorrectly	Shixiong Zhu	2015-12-04	1	-1/+4
\| \| \| \| \| \| \| \| \| \|	`ByteBuffer` doesn't guarantee all contents in `ByteBuffer.array` are valid. E.g, a ByteBuffer returned by `ByteBuffer.slice`. We should not use the whole content of `ByteBuffer` unless we know that's correct. This patch fixed all places that use `ByteBuffer.array` incorrectly. Author: Shixiong Zhu <shixiong@databricks.com> Closes #10083 from zsxwing/bytebuffer-array.