Commit message | Author | Age | Files | Lines
* Preparing Spark release v1.5.0-rc3 (tag: v1.5.0-rc3)Patrick Wendell2015-08-3133-33/+33
|
* [SPARK-10341] [SQL] fix memory starving in unsafe SMJDavies Liu2015-08-313-6/+42
| | | | | | | | | | | | | | | In SMJ, the first ExternalSorter could consume all the memory before spilling, so the second cannot even acquire its first page. Until we have a better memory allocator, SMJ should call prepare() before calling compute() on any of its children. cc rxin JoshRosen Author: Davies Liu <davies@databricks.com> Closes #8511 from davies/smj_memory. (cherry picked from commit 540bdee93103a73736d282b95db6a8cda8f6a2b1) Signed-off-by: Reynold Xin <rxin@databricks.com>
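A minimal, self-contained Scala sketch of the ordering the message describes (hypothetical Pool/Sorter types, not Spark's internal memory manager): reserving the first page for both children before either starts consuming prevents the first sorter from starving the second.

```scala
class Pool(var free: Long) {
  def acquire(bytes: Long): Boolean =
    if (bytes <= free) { free -= bytes; true } else false
}

class Sorter(pool: Pool, pageSize: Long) {
  // prepare() only reserves the first page; compute() may grab many more.
  def prepare(): Unit = require(pool.acquire(pageSize), "could not reserve first page")
  def compute(records: Iterator[Int]): Unit =
    records.foreach(_ => if (!pool.acquire(pageSize)) { /* spill (elided) */ })
}

object SmjOrderingSketch extends App {
  val pool = new Pool(free = 8)
  val left = new Sorter(pool, pageSize = 1)
  val right = new Sorter(pool, pageSize = 1)
  left.prepare(); right.prepare()          // reserve for both children first ...
  left.compute(Iterator.range(0, 100))     // ... then let each consume (and spill) freely
  right.compute(Iterator.range(0, 100))
}
```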
* [SPARK-10369] [STREAMING] Don't remove ReceiverTrackingInfo when deregistering a receiver, since we may reuse it laterzsxwing2015-08-312-2/+53
| | | | | | | | | | | | | `deregisterReceiver` should not remove `ReceiverTrackingInfo`. Otherwise, restarting the receiver throws `java.util.NoSuchElementException: key not found`. Author: zsxwing <zsxwing@gmail.com> Closes #8538 from zsxwing/SPARK-10369. (cherry picked from commit 4a5fe091658b1d06f427e404a11a84fc84f953c5) Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>
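A generic sketch of the pattern (plain Scala collections and a hypothetical TrackingInfo type, not the streaming internals): keep the entry and flip its state on deregistration instead of removing it, so a later restart can still look it up.

```scala
object ReceiverTrackingSketch {
  import scala.collection.mutable

  case class TrackingInfo(name: String, var active: Boolean = true)

  val tracking = mutable.Map(1 -> TrackingInfo("r1"), 2 -> TrackingInfo("r2"))

  // Keep the entry; only mark it inactive.
  def deregister(id: Int): Unit = tracking(id).active = false

  // Restarting would hit java.util.NoSuchElementException: key not found
  // if deregister had removed the entry (`tracking -= id`) instead.
  def restart(id: Int): Unit = tracking(id).active = true
}
```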
* [SPARK-10354] [MLLIB] fix some apparent memory issues in k-means|| initializationXiangrui Meng2015-08-301-7/+14
| | | | | | | | | | | | | | | | | | * do not cache the first cost RDD * change the following cost RDD's cache level to MEMORY_AND_DISK * remove the Vector wrapper to save an object per instance Further improvements will be addressed in SPARK-10329 cc: yu-iskw HuJiayin Author: Xiangrui Meng <meng@databricks.com> Closes #8526 from mengxr/SPARK-10354. (cherry picked from commit f0f563a3c43fc9683e6920890cce44611c0c5f4b) Signed-off-by: Xiangrui Meng <meng@databricks.com>
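For the storage-level part, a spark-shell style sketch (the RDD here is illustrative; the k-means|| specifics are elided): MEMORY_AND_DISK lets the cost RDD spill to disk instead of dropping partitions or failing under memory pressure.

```scala
import org.apache.spark.storage.StorageLevel

// `sc` is an existing SparkContext, as in spark-shell; this RDD just stands in
// for the per-point cost RDD computed during k-means|| initialization.
val costs = sc.parallelize(1 to 1000000).map(i => i.toDouble)
costs.persist(StorageLevel.MEMORY_AND_DISK)   // spill to disk rather than drop partitions
costs.sum()
costs.unpersist()
```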
* [SPARK-10353] [MLLIB] BLAS gemm not scaling when beta = 0.0 for some subset of matrix multiplicationsBurak Yavuz2015-08-302-16/+15
| | | | | | | | | | | | | | mengxr jkbradley rxin It would be great if this fix made it into RC3! Author: Burak Yavuz <brkyvz@gmail.com> Closes #8525 from brkyvz/blas-scaling. (cherry picked from commit 8d2ab75d3b71b632f2394f2453af32f417cb45e5) Signed-off-by: Xiangrui Meng <meng@databricks.com>
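A plain-Scala illustration of the gemm contract at issue (a naive implementation, not the MLlib BLAS code): C := alpha*A*B + beta*C, and when beta == 0.0 the prior contents of C must be overwritten rather than scaled into the result.

```scala
// Naive dense gemm over Array[Array[Double]]; assumes conforming dimensions.
def gemm(alpha: Double, a: Array[Array[Double]], b: Array[Array[Double]],
         beta: Double, c: Array[Array[Double]]): Unit = {
  val m = a.length; val k = b.length; val n = b(0).length
  for (i <- 0 until m; j <- 0 until n) {
    var sum = 0.0
    for (l <- 0 until k) sum += a(i)(l) * b(l)(j)
    // beta == 0.0 means "ignore whatever was in C", even if it held NaN or garbage.
    c(i)(j) = alpha * sum + (if (beta == 0.0) 0.0 else beta * c(i)(j))
  }
}
```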
* [SPARK-10331] [MLLIB] Update example code in ml-guideXiangrui Meng2015-08-291-215/+147
| | | | | | | | | | | | | | | * The example code was added in 1.2, before `createDataFrame`. This PR switches to `createDataFrame`. Java code still uses JavaBean. * assume `sqlContext` is available * fix some minor issues from previous code review jkbradley srowen feynmanliang Author: Xiangrui Meng <meng@databricks.com> Closes #8518 from mengxr/SPARK-10331. (cherry picked from commit ca69fc8efda8a3e5442ffa16692a2b1eb86b7673) Signed-off-by: Xiangrui Meng <meng@databricks.com>
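The flavor of the updated examples, sketched from the 1.5-era guide (assumes an existing `sqlContext`, as the guide itself now does; data is illustrative):

```scala
import org.apache.spark.mllib.linalg.Vectors

// Build the training DataFrame directly instead of going through an RDD of LabeledPoint.
val training = sqlContext.createDataFrame(Seq(
  (1.0, Vectors.dense(0.0, 1.1, 0.1)),
  (0.0, Vectors.dense(2.0, 1.0, -1.0)),
  (1.0, Vectors.dense(0.0, 1.2, -0.5))
)).toDF("label", "features")

training.show()
```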
* [SPARK-10348] [MLLIB] updates ml-guideXiangrui Meng2015-08-292-52/+78
| | | | | | | | | | | | | | | | | * replace `ML Dataset` with `DataFrame` to unify the abstraction * ML algorithms -> pipeline components, to describe the main concept * remove Scala API doc links from the main guide * `Section Title` -> `Section title`, to be consistent with other section titles in the MLlib guide * break modified lines at 100 chars or at periods jkbradley feynmanliang Author: Xiangrui Meng <meng@databricks.com> Closes #8517 from mengxr/SPARK-10348. (cherry picked from commit 905fbe498bdd29116468628e6a2a553c1fd57165) Signed-off-by: Xiangrui Meng <meng@databricks.com>
* [SPARK-10339] [SPARK-10334] [SPARK-10301] [SQL] Partitioned table scan can OOM the driver; throw a better error message when users need to enable Parquet schema mergingYin Huai2015-08-293-42/+65
| | | | | | | | | | | | | | | | | This fixes the problem that scanning a partitioned table puts the driver under high memory pressure and can take down the cluster. With this fix, we will also be able to correctly show the query plan of a query consuming partitioned tables. https://issues.apache.org/jira/browse/SPARK-10339 https://issues.apache.org/jira/browse/SPARK-10334 Finally, this PR squeezes in a "quick fix" for SPARK-10301. It is not a real fix; it just throws a better error message to let users know what to do. Author: Yin Huai <yhuai@databricks.com> Closes #8515 from yhuai/partitionedTableScan. (cherry picked from commit 097a7e36e0bf7290b1879331375bacc905583bd3) Signed-off-by: Michael Armbrust <michael@databricks.com>
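For the SPARK-10301 part, the setting the improved error message points users at is Parquet schema merging; a hedged sketch of how it is typically enabled in the 1.5-era API (the path is illustrative):

```scala
// Per-read, via the data source option ...
val merged = sqlContext.read.option("mergeSchema", "true").parquet("/path/to/partitioned_table")

// ... or globally, via the SQL conf (off by default in 1.5 for performance reasons).
sqlContext.setConf("spark.sql.parquet.mergeSchema", "true")
```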
* [SPARK-10330] Use SparkHadoopUtil TaskAttemptContext reflection methods in more placesJosh Rosen2015-08-295-12/+28
| | | | | | | | | | | | SparkHadoopUtil contains methods that use reflection to work around TaskAttemptContext binary incompatibilities between Hadoop 1.x and 2.x. We should use these methods in more places. Author: Josh Rosen <joshrosen@databricks.com> Closes #8499 from JoshRosen/use-hadoop-reflection-in-more-places. (cherry picked from commit 6a6f3c91ee1f63dd464eb03d156d02c1a5887d88) Signed-off-by: Michael Armbrust <michael@databricks.com>
* [SPARK-10226] [SQL] Fix exclamation mark issue in SparkSQLwangwei2015-08-291-0/+1
| | | | | | | | | | | When I tested the latest version of Spark with an exclamation mark, I got some errors. Rolling back Spark versions showed that commit "a2409d1c8e8ddec04b529ac6f6a12b5993f0eeda" introduced the bug: after the jline version changed from 0.9.94 to 2.12 in that commit, the exclamation mark is treated as a special character in ConsoleReader. Author: wangwei <wangwei82@huawei.com> Closes #8420 from small-wang/jline-SPARK-10226. (cherry picked from commit 277148b285748e863f2b9fdf6cf12963977f91ca) Signed-off-by: Michael Armbrust <michael@databricks.com>
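In jline 2.x the '!' character triggers history/event expansion; a small sketch of the kind of change that addresses it (assuming the ConsoleReader wiring in the CLI driver; not a claim about the exact patch):

```scala
import jline.console.ConsoleReader

val reader = new ConsoleReader()
// Disable event expansion so literals such as "yes!" reach the SQL parser unchanged.
reader.setExpandEvents(false)
```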
* [SPARK-10344] [SQL] Add tests for extraStrategiesMichael Armbrust2015-08-292-1/+68
| | | | | | | | | | | Actually using this API requires access to a lot of classes that we might make private by accident. I've added some tests to prevent this. Author: Michael Armbrust <michael@databricks.com> Closes #8516 from marmbrus/extraStrategiesTests. (cherry picked from commit 5c3d16a9b91bb9a458d3ba141f7bef525cf3d285) Signed-off-by: Yin Huai <yhuai@databricks.com>
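A hedged sketch of the developer API those tests exercise (1.5-era class names from memory; assumes an existing `sqlContext`). A real strategy would pattern-match logical plans and emit physical operators; this one simply defers to the built-in planner by returning Nil.

```scala
import org.apache.spark.sql.Strategy
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import org.apache.spark.sql.execution.SparkPlan

object NoopStrategy extends Strategy {
  // Returning Nil means "I don't handle this plan"; the default strategies take over.
  def apply(plan: LogicalPlan): Seq[SparkPlan] = Nil
}

sqlContext.experimental.extraStrategies = NoopStrategy :: Nil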
* [SPARK-10350] [DOC] [SQL] Removed duplicated option description from SQL guideGuoQiang Li2015-08-291-10/+0
| | | | | | | | | Author: GuoQiang Li <witgo@qq.com> Closes #8520 from witgo/SPARK-10350. (cherry picked from commit 5369be806848f43cb87c76504258c4e7de930c90) Signed-off-by: Michael Armbrust <michael@databricks.com>
* [SPARK-9910] [ML] User guide for train validation splitmartinzapletal2015-08-283-0/+287
| | | | | | | | | Author: martinzapletal <zapletal-martin@email.cz> Closes #8377 from zapletal-martin/SPARK-9910. (cherry picked from commit e8ea5bafee9ca734edf62021145d0c2d5491cba8) Signed-off-by: Xiangrui Meng <meng@databricks.com>
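A compact sketch in the spirit of the new guide section (assumes an existing `training` DataFrame with label/features columns):

```scala
import org.apache.spark.ml.evaluation.RegressionEvaluator
import org.apache.spark.ml.regression.LinearRegression
import org.apache.spark.ml.tuning.{ParamGridBuilder, TrainValidationSplit}

val lr = new LinearRegression()
val paramGrid = new ParamGridBuilder()
  .addGrid(lr.regParam, Array(0.1, 0.01))
  .addGrid(lr.elasticNetParam, Array(0.0, 0.5, 1.0))
  .build()

// 80% of the data is used for training, the remaining 20% for validation.
val tvs = new TrainValidationSplit()
  .setEstimator(lr)
  .setEvaluator(new RegressionEvaluator())
  .setEstimatorParamMaps(paramGrid)
  .setTrainRatio(0.8)

val model = tvs.fit(training)
```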
* [SPARK-9803] [SPARKR] Add subset and transform + testsfelixcheung2015-08-284-17/+85
| | | | | | | | | | | | | | | Add subset and transform. Also reorganize `[` & `[[` to use subset instead of select. Note on transform: transform is very similar to mutate, but Spark doesn't seem to replace an existing column with the same name in mutate (i.e. `mutate(df, age = df$age + 2)` returns a DataFrame with two columns named 'age'), so transform doesn't do that for now either, even though the documentation clearly states it should replace the column with the matching name (should I open a JIRA for mutate/transform?). Author: felixcheung <felixcheung_m@hotmail.com> Closes #8503 from felixcheung/rsubset_transform. (cherry picked from commit 2a4e00ca4d4e7a148b4ff8ce0ad1c6d517cee55f) Signed-off-by: Shivaram Venkataraman <shivaram@cs.berkeley.edu>
* [SPARK-10326] [YARN] Fix app submission on windows.Marcelo Vanzin2015-08-281-1/+1
| | | | | | Author: Marcelo Vanzin <vanzin@cloudera.com> Closes #8493 from vanzin/SPARK-10326.
* [SPARK-10323] [SQL] fix nullability of In/InSet/ArrayContainDavies Liu2015-08-287-97/+138
| | | | | | | | | | | After this PR, In/InSet/ArrayContains return null instead of false when the tested value is null. They also return null (rather than false) when no match is found but the set/array contains a null. Author: Davies Liu <davies@databricks.com> Closes #8492 from davies/fix_in. (cherry picked from commit bb7f35239385ec74b5ee69631b5480fbcee253e4) Signed-off-by: Davies Liu <davies.liu@gmail.com>
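A quick illustration of the three-valued semantics described above, spark-shell style (assumes an existing `sqlContext`; the third case is standard SQL behavior, where a definite match still wins):

```scala
sqlContext.sql("SELECT NULL IN (1, 2)").show()   // null: the tested value is null
sqlContext.sql("SELECT 1 IN (2, NULL)").show()   // null: no match found, but the list contains null (was false before)
sqlContext.sql("SELECT 1 IN (1, NULL)").show()   // true: a definite match
```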
* [SPARK-9671] [MLLIB] re-org user guide and add migration guideXiangrui Meng2015-08-283-106/+95
| | | | | | | | | | | | | | | | | | | | This PR updates the MLlib user guide and adds migration guide for 1.4->1.5. * merge migration guide for `spark.mllib` and `spark.ml` packages * remove dependency section from `spark.ml` guide * move the paragraph about `spark.mllib` and `spark.ml` to the top and recommend `spark.ml` * move Sam's talk to footnote to make the section focus on dependencies Minor changes to code examples and other wording will be in a separate PR. jkbradley srowen feynmanliang Author: Xiangrui Meng <meng@databricks.com> Closes #8498 from mengxr/SPARK-9671. (cherry picked from commit 88032ecaf0455886aed7a66b30af80dae7f6cff7) Signed-off-by: Xiangrui Meng <meng@databricks.com>
* [SPARK-10336][example] fix not being able to set intercept in LR exampleShuo Xiang2015-08-281-0/+1
| | | | | | | | | | | | | | | `fitIntercept` is a command-line option but was never set in the main program. dbtsai Author: Shuo Xiang <sxiang@pinterest.com> Closes #8510 from coderxiang/intercept and squashes the following commits: 57c9b7d [Shuo Xiang] fix not being able to set intercept in LR example (cherry picked from commit 45723214e694b9a440723e9504c562e6393709f3) Signed-off-by: DB Tsai <dbt@netflix.com>
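The gist of the one-line fix, sketched with a hypothetical `Params` holder standing in for the example's parsed CLI options:

```scala
import org.apache.spark.ml.classification.LogisticRegression

// Hypothetical parsed command-line options; the point is wiring the flag through.
case class Params(fitIntercept: Boolean = true, regParam: Double = 0.0)
val params = Params(fitIntercept = false)

// Pass the parsed flag to the estimator instead of silently keeping the default.
val lr = new LogisticRegression()
  .setFitIntercept(params.fitIntercept)
  .setRegParam(params.regParam)
```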
* [SPARK-10325] Override hashCode() for public RowJosh Rosen2015-08-282-1/+23
| | | | | | | | | | | | | | | | This commit fixes an issue where the public SQL `Row` class did not override `hashCode`, causing it to violate the hashCode() + equals() contract. To fix this, I simply ported the `hashCode` implementation from the 1.4.x version of `Row`. Author: Josh Rosen <joshrosen@databricks.com> Closes #8500 from JoshRosen/SPARK-10325 and squashes the following commits: 51ffea1 [Josh Rosen] Override hashCode() for public Row. (cherry picked from commit d3f87dc39480f075170817bbd00142967a938078) Signed-off-by: Michael Armbrust <michael@databricks.com> Conflicts: sql/catalyst/src/main/scala/org/apache/spark/sql/Row.scala
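The contract being restored, shown with a stand-alone class rather than Spark's Row: equal values must produce equal hash codes, or hash-based collections misbehave silently.

```scala
import scala.util.hashing.MurmurHash3

class SimpleRow(val values: Seq[Any]) {
  override def equals(o: Any): Boolean = o match {
    case that: SimpleRow => this.values == that.values
    case _ => false
  }
  // Hash the same fields equals() compares, so equal rows land in the same bucket.
  override def hashCode(): Int = MurmurHash3.seqHash(values)
}

val a = new SimpleRow(Seq(1, "x"))
val b = new SimpleRow(Seq(1, "x"))
assert(a == b && a.hashCode == b.hashCode)
assert(Set(a, b).size == 1)   // without hashCode, this would (wrongly) be 2
```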
* [SPARK-8952] [SPARKR] - Wrap normalizePath calls with suppressWarningsLuciano Resende2015-08-282-3/+3
| | | | | | | | | | | This is based on davies' comment on SPARK-8952, which suggests only calling normalizePath() when the path starts with '~'. Author: Luciano Resende <lresende@apache.org> Closes #8343 from lresende/SPARK-8952. (cherry picked from commit 499e8e154bdcc9d7b2f685b159e0ddb4eae48fe4) Signed-off-by: Shivaram Venkataraman <shivaram@cs.berkeley.edu>
* [SPARK-9890] [DOC] [ML] User guide for CountVectorizerYuhao Yang2015-08-281-0/+109
| | | | | | | | | | | | | jira: https://issues.apache.org/jira/browse/SPARK-9890 document with Scala and java examples Author: Yuhao Yang <hhbyyh@gmail.com> Closes #8487 from hhbyyh/cvDoc. (cherry picked from commit e2a843090cb031f6aa774f6d9c031a7f0f732ee1) Signed-off-by: Xiangrui Meng <meng@databricks.com>
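Roughly the shape of the new guide example (assumes an existing `sqlContext`; data and column names are illustrative):

```scala
import org.apache.spark.ml.feature.{CountVectorizer, CountVectorizerModel}

val df = sqlContext.createDataFrame(Seq(
  (0, Array("a", "b", "c")),
  (1, Array("a", "b", "b", "c", "a"))
)).toDF("id", "words")

// Learn a vocabulary of at most 3 terms, each of which must appear in at least 2 documents.
val cvModel: CountVectorizerModel = new CountVectorizer()
  .setInputCol("words")
  .setOutputCol("features")
  .setVocabSize(3)
  .setMinDF(2)
  .fit(df)

cvModel.transform(df).select("features").show()
```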
* typo in commentDharmesh Kakadia2015-08-281-1/+1
| | | | | | | | | Author: Dharmesh Kakadia <dharmeshkakadia@users.noreply.github.com> Closes #8497 from dharmeshkakadia/patch-2. (cherry picked from commit 71a077f6c16c8816eae13341f645ba50d997f63d) Signed-off-by: Sean Owen <sowen@cloudera.com>
* Fix DynamodDB/DynamoDB typo in Kinesis Integration docKeiji Yoshida2015-08-281-1/+1
| | | | | | | | | | | Fix DynamodDB/DynamoDB typo in Kinesis Integration doc Author: Keiji Yoshida <yoshida.keiji.84@gmail.com> Closes #8501 from yosssi/patch-1. (cherry picked from commit 18294cd8710427076caa86bfac596de67089d57e) Signed-off-by: Sean Owen <sowen@cloudera.com>
* [SPARK-10295] [CORE] Dynamic allocation in Mesos does not release when RDDs are cachedSean Owen2015-08-281-5/+0
| | | | | | | | | | | | | | Remove the obsolete warning about dynamic allocation not working with cached RDDs. See discussion in https://issues.apache.org/jira/browse/SPARK-10295 Author: Sean Owen <sowen@cloudera.com> Closes #8489 from srowen/SPARK-10295. (cherry picked from commit cc39803062119c1d14611dc227b9ed0ed1284d38) Signed-off-by: Sean Owen <sowen@cloudera.com>
* [SPARK-10035] [SQL] Parquet filters does not process EqualNullSafe filter.hyukjinkwon2015-08-282-139/+37
| | | | | | | | | | | | | | | | | | | | | | As I talked with Lian, 1. I added EqualNullSafe to ParquetFilters - It uses the same equality comparison filter as EqualTo, since the Parquet filter actually performs a null-safe equality comparison. 2. Updated the test code (ParquetFilterSuite) - Convert catalyst.Expression to sources.Filter - Removed Cast since only Literal is picked up as a proper Filter in DataSourceStrategy - Added EqualNullSafe comparison 3. Removed the deprecated createFilter for catalyst.Expression Author: hyukjinkwon <gurwls223@gmail.com> Author: 권혁진 <gurwls223@gmail.com> Closes #8275 from HyukjinKwon/master. (cherry picked from commit ba5f7e1842f2c5852b5309910c0d39926643da69) Signed-off-by: Cheng Lian <lian@databricks.com>
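For context, the null-safe equality operator on the DataFrame side, spark-shell style (assumes an existing `sqlContext`; the path is illustrative): unlike `===`, `<=>` never yields null, which is what lets it map onto Parquet's equality filter.

```scala
import org.apache.spark.sql.functions.lit

val df = sqlContext.read.parquet("/path/to/data")

df.filter(df("value") <=> lit(null)).show()  // rows where value IS NULL (plain `===` would match nothing)
df.filter(df("value") <=> lit(42)).show()    // ordinary equality, but null-safe; eligible for Parquet push-down after this change
```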
* [SPARK-10328] [SPARKR] Fix generic for na.omitShivaram Venkataraman2015-08-283-5/+26
| | | | | | | | | | | | | S3 function is at https://stat.ethz.ch/R-manual/R-patched/library/stats/html/na.fail.html Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu> Author: Shivaram Venkataraman <shivaram.venkataraman@gmail.com> Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com> Closes #8495 from shivaram/na-omit-fix. (cherry picked from commit 2f99c37273c1d82e2ba39476e4429ea4aaba7ec6) Signed-off-by: Shivaram Venkataraman <shivaram@cs.berkeley.edu>
* [SPARK-10188] [PYSPARK] Pyspark CrossValidator with RMSE selects incorrect modelnoelsmith2015-08-273-1/+104
| | | | | | | | | | | | | | | | * Added isLargerBetter() method to the Pyspark Evaluator to match the Scala version. * JavaEvaluator delegates isLargerBetter() to the underlying Scala object. * Added a check for isLargerBetter() in CrossValidator to determine whether to use argmin or argmax. * Added test cases for where smaller is better (RMSE) and larger is better (R-Squared). (This contribution is my original work and I license the work to the project under Spark's open source license) Author: noelsmith <mail@noelsmith.com> Closes #8399 from noel-smith/pyspark-rmse-xval-fix. (cherry picked from commit 7583681e6b0824d7eed471dc4d8fa0b2addf9ffc) Signed-off-by: Joseph K. Bradley <joseph@databricks.com>
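The selection logic in a nutshell (plain Scala, mirroring what the fix makes CrossValidator do):

```scala
// Pick argmax for metrics where larger is better (R^2, areaUnderROC),
// argmin for metrics where smaller is better (RMSE).
def bestIndex(avgMetrics: Array[Double], isLargerBetter: Boolean): Int =
  if (isLargerBetter) avgMetrics.indexOf(avgMetrics.max)
  else avgMetrics.indexOf(avgMetrics.min)

assert(bestIndex(Array(0.9, 0.4, 0.7), isLargerBetter = false) == 1)  // e.g. RMSE
assert(bestIndex(Array(0.9, 0.4, 0.7), isLargerBetter = true)  == 0)  // e.g. R^2
```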
* [SPARK-SQL] [MINOR] Fixes some typos in HiveContextCheng Lian2015-08-272-5/+5
| | | | | | | | | Author: Cheng Lian <lian@databricks.com> Closes #8481 from liancheng/hive-context-typo. (cherry picked from commit 89b943438512fcfb239c268b43431397de46cbcf) Signed-off-by: Reynold Xin <rxin@databricks.com>
* [SPARK-9905] [ML] [DOC] Adds LinearRegressionSummary user guideFeynman Liang2015-08-271-13/+127
| | | | | | | | | | | | | | * Adds user guide for `LinearRegressionSummary` * Fixes unresolved issues in #8197 CC jkbradley mengxr Author: Feynman Liang <fliang@databricks.com> Closes #8491 from feynmanliang/SPARK-9905. (cherry picked from commit af0e1249b1c881c0fa7a921fd21fd2c27214b980) Signed-off-by: Xiangrui Meng <meng@databricks.com>
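Roughly what the new guide section covers (assumes a `training` DataFrame with label/features columns):

```scala
import org.apache.spark.ml.regression.LinearRegression

val lr = new LinearRegression().setMaxIter(10).setRegParam(0.3)
val model = lr.fit(training)

// Training diagnostics exposed by the summary in 1.5.
val summary = model.summary
println(s"numIterations: ${summary.objectiveHistory.length}")
println(s"RMSE: ${summary.rootMeanSquaredError}")
println(s"r2: ${summary.r2}")
summary.residuals.show()
```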
* [SPARK-9911] [DOC] [ML] Update Userguide for EvaluatorMechCoder2015-08-271-0/+13
| | | | | | | | | | | I added a small note about the different types of evaluator and the metrics used. Author: MechCoder <manojkumarsivaraj334@gmail.com> Closes #8304 from MechCoder/multiclass_evaluator. (cherry picked from commit 30734d45fbbb269437c062241a9161e198805a76) Signed-off-by: Xiangrui Meng <meng@databricks.com>
* [SPARK-10321] sizeInBytes in HadoopFsRelationDavies Liu2015-08-271-0/+2
| | | | | | | | | | | | | Add sizeInBytes to HadoopFsRelation to enable broadcast joins. cc marmbrus Author: Davies Liu <davies@databricks.com> Closes #8490 from davies/sizeInByte. (cherry picked from commit 54cda0deb6bebf1470f16ba5bcc6c4fb842bdac1) Signed-off-by: Michael Armbrust <michael@databricks.com>
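Why it matters, in one configuration knob: the planner broadcasts a relation only when its reported statistics fall under the threshold, so a HadoopFsRelation that reports a realistic sizeInBytes becomes eligible (the DataFrames below are hypothetical):

```scala
// Relations whose sizeInBytes is below this threshold are broadcast in equi-joins.
sqlContext.setConf("spark.sql.autoBroadcastJoinThreshold", (10 * 1024 * 1024).toString)

val joined = largeDf.join(smallDf, "id")   // smallDf can now be planned as a broadcast hash join
```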
* [SPARK-10287] [SQL] Fixes JSONRelation refreshing on read pathYin Huai2015-08-274-25/+7
| | | | | | | | | | | | | https://issues.apache.org/jira/browse/SPARK-10287 After porting json to HadoopFsRelation, it seems hard to keep the behavior of picking up new files automatically for JSON. This PR removes this behavior, so JSON is consistent with others (ORC and Parquet). Author: Yin Huai <yhuai@databricks.com> Closes #8469 from yhuai/jsonRefresh. (cherry picked from commit b3dd569ad40905f8861a547a1e25ed3ca8e1d272) Signed-off-by: Yin Huai <yhuai@databricks.com>
* [SPARK-9680] [MLLIB] [DOC] StopWordsRemover user guide and Java compatibility testFeynman Liang2015-08-272-3/+171
| | | | | | | | | | | | | | | * Adds user guide for ml.feature.StopWordsRemover; ran code examples on my machine * Cleans up scaladocs for public methods * Adds test for Java compatibility * Follow-up: the Python user guide code example is tracked by SPARK-10249 Author: Feynman Liang <fliang@databricks.com> Closes #8436 from feynmanliang/SPARK-10230. (cherry picked from commit 5bfe9e1111d9862084586549a7dc79476f67bab9) Signed-off-by: Xiangrui Meng <meng@databricks.com>
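Roughly the new guide's Scala example (assumes an existing `sqlContext`; data is illustrative):

```scala
import org.apache.spark.ml.feature.StopWordsRemover

val remover = new StopWordsRemover()
  .setInputCol("raw")
  .setOutputCol("filtered")

val dataSet = sqlContext.createDataFrame(Seq(
  (0, Seq("I", "saw", "the", "red", "balloon")),
  (1, Seq("Mary", "had", "a", "little", "lamb"))
)).toDF("id", "raw")

remover.transform(dataSet).show()   // stop words such as "the" and "a" are dropped from "filtered"
```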
* [SPARK-9906] [ML] User guide for LogisticRegressionSummaryMechCoder2015-08-271-16/+133
| | | | | | | | | | | | | User guide for LogisticRegression summaries Author: MechCoder <manojkumarsivaraj334@gmail.com> Author: Manoj Kumar <mks542@nyu.edu> Author: Feynman Liang <fliang@databricks.com> Closes #8197 from MechCoder/log_summary_user_guide. (cherry picked from commit c94ecdfc5b3c0fe6c38a170dc2af9259354dc9e3) Signed-off-by: Joseph K. Bradley <joseph@databricks.com>
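And the logistic counterpart described in this guide section (again assuming a binary-labeled `training` DataFrame):

```scala
import org.apache.spark.ml.classification.{BinaryLogisticRegressionSummary, LogisticRegression}

val lrModel = new LogisticRegression().setMaxIter(10).fit(training)

// The generic summary can be downcast for binary-classification metrics.
val binarySummary = lrModel.summary.asInstanceOf[BinaryLogisticRegressionSummary]
println(s"areaUnderROC: ${binarySummary.areaUnderROC}")
binarySummary.roc.show()   // DataFrame of (FPR, TPR) points
```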
* [SPARK-9901] User guide for RowMatrix Tall-and-skinny QRYuhao Yang2015-08-271-1/+10
| | | | | | | | | | | | | jira: https://issues.apache.org/jira/browse/SPARK-9901 The jira covers only the document update. I can further provide example code for QR (like the ones for SVD and PCA) in a separate PR. Author: Yuhao Yang <hhbyyh@gmail.com> Closes #8462 from hhbyyh/qrDoc. (cherry picked from commit 6185cdd2afcd492b77ff225b477b3624e3bc7bb2) Signed-off-by: Xiangrui Meng <meng@databricks.com>
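The API the document now describes, sketched in spark-shell style (assumes an existing `sc`; the matrix is illustrative):

```scala
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.linalg.distributed.RowMatrix

val rows = sc.parallelize(Seq(
  Vectors.dense(3.0, 1.0),
  Vectors.dense(4.0, 2.0),
  Vectors.dense(0.0, 5.0)
))
val mat = new RowMatrix(rows)

// QR of a tall-and-skinny matrix: Q comes back as a RowMatrix, R as a small local matrix.
val qr = mat.tallSkinnyQR(computeQ = true)
println(qr.R)
```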
* [SPARK-10315] remove document on spark.akka.failure-detector.thresholdCodingCat2015-08-271-10/+0
| | | | | | | | | | | | | https://issues.apache.org/jira/browse/SPARK-10315 This parameter is no longer used, and the current documentation has a mistake: the actual Akka setting is 'akka.remote.watch-failure-detector.threshold'. Author: CodingCat <zhunansjtu@gmail.com> Closes #8483 from CodingCat/SPARK_10315. (cherry picked from commit 84baa5e9b5edc8c55871fbed5057324450bf097f) Signed-off-by: Sean Owen <sowen@cloudera.com>
* [SPARK-9148] [SPARK-10252] [SQL] Update SQL Programming GuideMichael Armbrust2015-08-271-19/+73
| | | | | | | | | Author: Michael Armbrust <michael@databricks.com> Closes #8441 from marmbrus/documentation. (cherry picked from commit dc86a227e4fc8a9d8c3e8c68da8dff9298447fd0) Signed-off-by: Michael Armbrust <michael@databricks.com>
* [DOCS] [STREAMING] [KAFKA] Fix typo in exactly once semanticsMoussa Taifi2015-08-271-1/+1
| | | | | | | | | | | | Fix Typo in exactly once semantics [Semantics of output operations] link Author: Moussa Taifi <moutai10@gmail.com> Closes #8468 from moutai/patch-3. (cherry picked from commit 9625d13d575c97bbff264f6a94838aae72c9202d) Signed-off-by: Sean Owen <sowen@cloudera.com>
* [SPARK-10219] [SPARKR] Fix varargsToEnv and add test caseShivaram Venkataraman2015-08-262-1/+8
| | | | | | | | | | | cc sun-rui davies Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu> Closes #8475 from shivaram/varargs-fix. (cherry picked from commit e936cf8088a06d6aefce44305f3904bbeb17b432) Signed-off-by: Shivaram Venkataraman <shivaram@cs.berkeley.edu>
* [SPARK-9424] [SQL] Parquet programming guide updates for 1.5Cheng Lian2015-08-261-8/+37
| | | | | | Author: Cheng Lian <lian@databricks.com> Closes #8467 from liancheng/spark-9424/parquet-docs-for-1.5.
* [SPARK-10308] [SPARKR] Add %in% to the exported namespaceShivaram Venkataraman2015-08-261-3/+4
| | | | | | | | | | | | | I also checked all the other functions defined in column.R, functions.R and DataFrame.R and everything else looked fine. cc yu-iskw Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu> Closes #8473 from shivaram/in-namespace. (cherry picked from commit ad7f0f160be096c0fdae6e6cf7e3b6ba4a606de7) Signed-off-by: Shivaram Venkataraman <shivaram@cs.berkeley.edu>
* [SPARK-10305] [SQL] fix create DataFrame from Python classDavies Liu2015-08-262-0/+18
| | | | | | | | | | | cc jkbradley Author: Davies Liu <davies@databricks.com> Closes #8470 from davies/fix_create_df. (cherry picked from commit d41d6c48207159490c1e1d9cc54015725cfa41b2) Signed-off-by: Davies Liu <davies.liu@gmail.com>
* [SPARK-10241] [MLLIB] update since versions in mllib.recommendationXiangrui Meng2015-08-262-5/+25
| | | | | | | | | | | | | Same as #8421 but for `mllib.recommendation`. cc srowen coderxiang Author: Xiangrui Meng <meng@databricks.com> Closes #8432 from mengxr/SPARK-10241. (cherry picked from commit 086d4681df3ebfccfc04188262c10482f44553b0) Signed-off-by: Xiangrui Meng <meng@databricks.com>
* [SPARK-9665] [MLLIB] audit MLlib API annotationsXiangrui Meng2015-08-261-4/+8
| | | | | | | | | | | | | I only found `ml.NaiveBayes` missing `Experimental` annotation. This PR doesn't cover Python APIs. cc jkbradley Author: Xiangrui Meng <meng@databricks.com> Closes #8452 from mengxr/SPARK-9665. (cherry picked from commit 6519fd06cc8175c9182ef16cf8a37d7f255eb846) Signed-off-by: Joseph K. Bradley <joseph@databricks.com>
* [SPARK-9316] [SPARKR] Add support for filtering using `[` (synonym for filter / select)felixcheung2015-08-252-1/+48
| | | | | | | | | | | | | | | | | | Add support for ``` df[df$name == "Smith", c(1,2)] df[df$age %in% c(19, 30), 1:2] ``` shivaram Author: felixcheung <felixcheung_m@hotmail.com> Closes #8394 from felixcheung/rsubset. (cherry picked from commit 75d4773aa50e24972c533e8b48697fde586429eb) Signed-off-by: Shivaram Venkataraman <shivaram@cs.berkeley.edu>
* [SPARK-10236] [MLLIB] update since versions in mllib.featureXiangrui Meng2015-08-258-16/+21
| | | | | | | | | | | | | | | | Same as #8421 but for `mllib.feature`. cc dbtsai Author: Xiangrui Meng <meng@databricks.com> Closes #8449 from mengxr/SPARK-10236.feature and squashes the following commits: 0e8d658 [Xiangrui Meng] remove unnecessary comment ad70b03 [Xiangrui Meng] update since versions in mllib.feature (cherry picked from commit 321d7759691bed9867b1f0470f12eab2faa50aff) Signed-off-by: DB Tsai <dbt@netflix.com>
* [SPARK-10235] [MLLIB] update since versions in mllib.regressionXiangrui Meng2015-08-258-29/+47
| | | | | | | | | | | | | | | Same as #8421 but for `mllib.regression`. cc freeman-lab dbtsai Author: Xiangrui Meng <meng@databricks.com> Closes #8426 from mengxr/SPARK-10235 and squashes the following commits: 6cd28e4 [Xiangrui Meng] update since versions in mllib.regression (cherry picked from commit 4657fa1f37d41dd4c7240a960342b68c7c591f48) Signed-off-by: DB Tsai <dbt@netflix.com>
* [SPARK-10243] [MLLIB] update since versions in mllib.treeXiangrui Meng2015-08-2512-44/+57
| | | | | | | | | | | | | Same as #8421 but for `mllib.tree`. cc jkbradley Author: Xiangrui Meng <meng@databricks.com> Closes #8442 from mengxr/SPARK-10236. (cherry picked from commit fb7e12fe2e14af8de4c206ca8096b2e8113bfddc) Signed-off-by: Xiangrui Meng <meng@databricks.com>
* [SPARK-10234] [MLLIB] update since version in mllib.clusteringXiangrui Meng2015-08-257-23/+44
| | | | | | | | | | | | | Same as #8421 but for `mllib.clustering`. cc feynmanliang yu-iskw Author: Xiangrui Meng <meng@databricks.com> Closes #8435 from mengxr/SPARK-10234. (cherry picked from commit d703372f86d6a59383ba8569fcd9d379849cffbf) Signed-off-by: Xiangrui Meng <meng@databricks.com>
* [SPARK-10240] [SPARK-10242] [MLLIB] update since versions in mllib.random and mllib.statXiangrui Meng2015-08-254-25/+117
| | | | | | | | | | | | | | The same as #8241 but for `mllib.stat` and `mllib.random`. cc feynmanliang Author: Xiangrui Meng <meng@databricks.com> Closes #8439 from mengxr/SPARK-10242. (cherry picked from commit c3a54843c0c8a14059da4e6716c1ad45c69bbe6c) Signed-off-by: Xiangrui Meng <meng@databricks.com>