path: root/R/pkg/inst/tests/test_sparkSQL.R
Commit message | Author | Age | Files | Lines
* [SPARK-12025][SPARKR] Rename some window rank function names for SparkR | Yanbo Liang | 2015-11-27 | 1 | -2/+2
  Change `cumeDist` -> `cume_dist`, `denseRank` -> `dense_rank`, `percentRank` -> `percent_rank`,
  `rowNumber` -> `row_number` on the SparkR side. There are two reasons to make this change:
  * We should follow the [naming convention rule of R](http://www.inside-r.org/node/230645)
  * Spark DataFrame has deprecated the old convention (such as `cumeDist`) and will remove it in Spark 2.0.
  It is better to fix this issue before the 1.6 release; otherwise we would make a breaking API change.
  cc shivaram sun-rui
  Author: Yanbo Liang <ybliang8@gmail.com>
  Closes #10016 from yanboliang/SPARK-12025.
* [SPARK-11339][SPARKR] Document the list of functions in R base package that are masked by functions with same name in SparkR | felixcheung | 2015-11-18 | 1 | -1/+32
  Added tests for functions that are reported as masked, to make sure the base:: or stats::
  function can still be called. For those we can't call, added them to the SparkR programming
  guide. It would seem to me that `table, sample, subset, filter, cov` not working is not
  actually expected - I investigated/experimented with them but couldn't get them to work.
  It looks like, as they are defined in base or stats, they are missing the S3 generic, e.g.:
  ```
  > methods("transform")
  [1] transform,ANY-method       transform.data.frame
  [3] transform,DataFrame-method transform.default
  see '?methods' for accessing help and source code
  > methods("subset")
  [1] subset.data.frame       subset,DataFrame-method subset.default
  [4] subset.matrix
  see '?methods' for accessing help and source code
  Warning message:
  In .S3methods(generic.function, class, parent.frame()) :
    function 'subset' appears not to be S3 generic; found functions that look like S3 methods
  ```
  Any idea? More information on masking:
  http://www.ats.ucla.edu/stat/r/faq/referencing_objects.htm
  http://www.sfu.ca/~sweldon/howTo/guide4.pdf
  This is what the output doc looks like (minus css):
  ![image](https://cloud.githubusercontent.com/assets/8969467/11229714/2946e5de-8d4d-11e5-94b0-dda9696b6fdd.png)
  Author: felixcheung <felixcheung_m@hotmail.com>
  Closes #9785 from felixcheung/rmasked.
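  For illustration, a minimal sketch of the disambiguation pattern being documented (assumes a
  `sqlContext` created via `sparkRSQL.init(sc)`; the data is arbitrary):
  ```
  library(SparkR)  # masks e.g. sample, filter, cov from base/stats

  base::sample(1:10, 3)                            # call the masked base version explicitly
  stats::cov(faithful$eruptions, faithful$waiting) # same for stats

  df <- createDataFrame(sqlContext, faithful)
  count(sample(df, withReplacement = FALSE, fraction = 0.1))  # SparkR's sample()
  ```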
* [SPARK-11773][SPARKR] Implement collection functions in SparkR. | Sun Rui | 2015-11-18 | 1 | -0/+10
  Author: Sun Rui <rui.sun@intel.com>
  Closes #9764 from sun-rui/SPARK-11773.
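  A hedged usage sketch, assuming `array_contains()` and `sort_array()` are among the collection
  functions covered (the SQL literal is just a quick way to build an array column):
  ```
  df <- sql(sqlContext, "SELECT array(3, 1, 2) AS a")
  head(select(df, array_contains(df$a, 1), sort_array(df$a)))
  ```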
* [SPARK-11281][SPARKR] Add tests covering the issue. | zero323 | 2015-11-18 | 1 | -3/+7
  The goal of this PR is to add tests covering the issue, to ensure that it was resolved by
  [SPARK-11086](https://issues.apache.org/jira/browse/SPARK-11086).
  Author: zero323 <matthew.szymkiewicz@gmail.com>
  Closes #9743 from zero323/SPARK-11281-tests.
* [SPARK-11086][SPARKR] Use dropFactors column-wise instead of nested loop when createDataFrame | zero323 | 2015-11-15 | 1 | -0/+16
  Use `dropFactors` column-wise instead of a nested loop when `createDataFrame` is called on a
  local `data.frame`. At the moment SparkR's createDataFrame uses a nested loop to convert
  factors to character when called on a local data.frame. It works, but is incredibly slow,
  especially with data.table (~2 orders of magnitude slower than the PySpark / Pandas version
  on a DataFrame of size 1M rows x 2 columns). A simple improvement is to apply `dropFactors`
  column-wise and then reshape the output list. It should at least partially address
  [SPARK-8277](https://issues.apache.org/jira/browse/SPARK-8277).
  Author: zero323 <matthew.szymkiewicz@gmail.com>
  Closes #9099 from zero323/SPARK-11086.
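  A minimal base-R sketch of the column-wise idea, not the actual patch (names are illustrative):
  convert factors one column at a time, then reshape the columns into a list of rows.
  ```
  ldf <- data.frame(fac = factor(c("a", "b")), num = 1:2)
  # per-column conversion instead of a row-by-row loop
  cols <- lapply(ldf, function(col) if (is.factor(col)) as.character(col) else col)
  # reshape the list of columns into a list of rows
  rows <- do.call(mapply, c(list(FUN = list, SIMPLIFY = FALSE, USE.NAMES = FALSE), cols))
  ```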
* [SPARK-11420] Updating Stddev support via Imperative Aggregate | JihongMa | 2015-11-12 | 1 | -2/+2
  Switched stddev support from DeclarativeAggregate to ImperativeAggregate.
  Author: JihongMa <linlin200605@gmail.com>
  Closes #9380 from JihongMA/SPARK-11420.
* [SPARK-11468] [SPARKR] add stddev/variance agg functions for Column | felixcheung | 2015-11-10 | 1 | -16/+67
  Checked names; none of them should conflict with anything in base.
  shivaram davies rxin
  Author: felixcheung <felixcheung_m@hotmail.com>
  Closes #9489 from felixcheung/rstddev.
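  A hedged usage sketch of the new aggregates (assumes `sd()` and `var()` are among the exported
  names; `sqlContext` as usual):
  ```
  df <- createDataFrame(sqlContext, mtcars)
  collect(agg(df, m = mean(df$mpg), s = sd(df$mpg), v = var(df$mpg)))
  collect(agg(groupBy(df, df$cyl), s = sd(df$mpg)))   # grouped aggregation
  ```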
* [SPARK-10863][SPARKR] Method coltypes() (New version) | Oscar D. Lara Yejas | 2015-11-10 | 1 | -1/+23
  This is a follow-up on PR #8984, as the corresponding branch for that PR was damaged.
  Author: Oscar D. Lara Yejas <olarayej@mail.usf.edu>
  Closes #9579 from olarayej/SPARK-10863_NEW14.
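  A quick sketch of what `coltypes()` reports; the exact Catalyst-to-R type mapping is per the
  PR, so the output shown is only what one would expect for iris:
  ```
  irisDF <- createDataFrame(sqlContext, iris)
  coltypes(irisDF)
  # expected along the lines of: "numeric" "numeric" "numeric" "numeric" "character"
  ```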
* [SPARK-9865][SPARKR] Flaky SparkR test: test_sparkSQL.R: sample on a DataFrame | felixcheung | 2015-11-09 | 1 | -2/+2
  Make the sample test less flaky by setting the seed. Tested with:
  ```
  repeat {
    if (count(sample(df, FALSE, 0.1)) == 3) {
      break
    }
  }
  ```
  Author: felixcheung <felixcheung_m@hotmail.com>
  Closes #9549 from felixcheung/rsample.
* [SPARK-10116][CORE] XORShiftRandom.hashSeed is random in high bits | Imran Rashid | 2015-11-06 | 1 | -4/+4
  https://issues.apache.org/jira/browse/SPARK-10116
  This is really trivial, just happened to notice it -- if `XORShiftRandom.hashSeed` is really
  supposed to have random bits throughout (as the comment implies), it needs to do something
  for the conversion to `long`.
  mengxr mkolod
  Author: Imran Rashid <irashid@cloudera.com>
  Closes #8314 from squito/SPARK-10116.
* [SPARK-11260][SPARKR] with() function support | adrian555 | 2015-11-05 | 1 | -0/+9
  Author: adrian555 <wzhuang@us.ibm.com>
  Author: Adrian Zhuang <adrian555@users.noreply.github.com>
  Closes #9443 from adrian555/with.
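  A hedged sketch of `with()` on a DataFrame: inside the expression, column names should resolve
  without the `df$` prefix.
  ```
  df <- createDataFrame(sqlContext, mtcars)
  ratio <- with(df, mpg / wt)   # mpg and wt resolve to Columns of df
  head(select(df, ratio))
  ```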
* [SPARK-11210][SPARKR] Add window functions into SparkR [step 2]. | Sun Rui | 2015-10-30 | 1 | -0/+5
  Author: Sun Rui <rui.sun@intel.com>
  Closes #9196 from sun-rui/SPARK-11210.
* [SPARK-11209][SPARKR] Add window functions into SparkR [step 1]. | Sun Rui | 2015-10-26 | 1 | -0/+2
  Author: Sun Rui <rui.sun@intel.com>
  Closes #9193 from sun-rui/SPARK-11209.
* [SPARK-10979][SPARKR] Sparkrmerge: Add merge to DataFrame with R signature | Narine Kokhlikyan | 2015-10-26 | 1 | -4/+33
  Add a merge function to DataFrame, supporting the R signature:
  https://stat.ethz.ch/R-manual/R-devel/library/base/html/merge.html
  Author: Narine Kokhlikyan <narine.kokhlikyan@gmail.com>
  Closes #9012 from NarineK/sparkrmerge.
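  A usage sketch with the R-style arguments (toy data; assumes `by`, `all.x`, `all.y` behave as
  in `base::merge`):
  ```
  left  <- createDataFrame(sqlContext, data.frame(key = 1:3, x = c("a", "b", "c")))
  right <- createDataFrame(sqlContext, data.frame(key = 2:4, y = c(10, 20, 30)))
  head(merge(left, right, by = "key"))                 # inner join on "key"
  head(merge(left, right, by = "key", all.x = TRUE))   # left outer, as in base::merge
  ```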
* [SPARK-11197][SQL] run SQL on files directly | Davies Liu | 2015-10-21 | 1 | -1/+1
  This PR introduces a new feature to run SQL directly on files without creating a table,
  for example:
  ```
  select id from json.`path/to/json/files` as j
  ```
  Author: Davies Liu <davies@databricks.com>
  Closes #9173 from davies/source.
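  From SparkR, the same feature would presumably look like this (the path is illustrative):
  ```
  df <- sql(sqlContext, "SELECT * FROM json.`path/to/json/files`")
  ```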
* [SPARK-10996] [SPARKR] Implement sampleBy() in DataFrameStatFunctions. | Sun Rui | 2015-10-13 | 1 | -0/+10
  Author: Sun Rui <rui.sun@intel.com>
  Closes #9023 from sun-rui/SPARK-10996.
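  A hedged sketch of stratified sampling with `sampleBy()`; `fractions` is a named list keyed
  by stratum value (toy data):
  ```
  df <- createDataFrame(sqlContext, data.frame(key = rep(0:1, each = 50), value = 1:100))
  strat <- sampleBy(df, "key", fractions = list("0" = 0.1, "1" = 0.2), seed = 0)
  count(strat)   # roughly 5 + 10 rows
  ```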
* [SPARK-10981] [SPARKR] SparkR Join improvements | Monica Liu | 2015-10-13 | 1 | -2/+25
  I was having issues with collect() and orderBy() in Spark 1.5.0, so I used the DataFrame.R
  and test_sparkSQL.R files from the Spark 1.5.1 download. I only modified the join() function
  in DataFrame.R to include "full", "fullouter", "left", "right", and "leftsemi", and added
  corresponding test cases to the tests for join() and merge() in test_sparkSQL.R. This is a
  pull request because I filed this JIRA bug report:
  https://issues.apache.org/jira/browse/SPARK-10981
  Author: Monica Liu <liu.monica.f@gmail.com>
  Closes #9029 from mfliu/master.
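  A sketch of the expanded `joinType` argument (assumes `df1` and `df2` each have a column
  named `key`):
  ```
  inner  <- join(df1, df2, df1$key == df2$key)            # default inner join
  louter <- join(df1, df2, df1$key == df2$key, "left")    # also "right", "full", "fullouter", "leftsemi"
  ```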
* [SPARK-10913] [SPARKR] attach() function support | Adrian Zhuang | 2015-10-13 | 1 | -0/+20
  Bring the change code up to date.
  Author: Adrian Zhuang <adrian555@users.noreply.github.com>
  Author: adrian555 <wzhuang@us.ibm.com>
  Closes #9031 from adrian555/attach2.
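  A hedged sketch: after `attach()`, column names should resolve as Columns from the search path.
  ```
  df <- createDataFrame(sqlContext, mtcars)
  attach(df)
  head(select(df, mpg))        # `mpg` found via the attached DataFrame
  head(select(df, mpg / wt))
  ```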
* [SPARK-10888] [SPARKR] Added as.DataFrame as a synonym to createDataFrame | Narine Kokhlikyan | 2015-10-13 | 1 | -0/+15
  as.DataFrame is a more R-style-like signature. Also, I'd like to know if we could make the
  context (e.g. sqlContext) global, so that we do not have to specify it as an argument each
  time we create a DataFrame.
  Author: Narine Kokhlikyan <narine.kokhlikyan@gmail.com>
  Closes #8952 from NarineK/sparkrasDataFrame.
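  Side by side, the two spellings (same result):
  ```
  df1 <- createDataFrame(sqlContext, faithful)   # existing name
  df2 <- as.DataFrame(sqlContext, faithful)      # new R-style synonym
  ```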
* [SPARK-10051] [SPARKR] Support collecting data of StructType in DataFrame | Sun Rui | 2015-10-13 | 1 | -22/+29
  Two points in this PR:
  1. The original thought was that a named R list would be assumed to be a struct in SerDe.
     But this is problematic, because some R functions implicitly generate named lists that
     are not intended to be a struct when transferred by SerDe. So SerDe clients have to
     explicitly mark a named list as a struct by changing its class from "list" to "struct".
  2. SerDe is in the Spark Core module, and data of StructType is represented as GenericRow,
     which is defined in the Spark SQL module. SerDe can't import GenericRow, as in the Maven
     build the Spark SQL module depends on the Spark Core module. So this PR adds a
     registration hook in SerDe to allow SQLUtils in the Spark SQL module to register its
     functions for serialization and deserialization of StructType.
  Author: Sun Rui <rui.sun@intel.com>
  Closes #8794 from sun-rui/SPARK-10051.
* [SPARK-10079] [SPARKR] Make 'column' and 'col' functions be S4 functions. | Sun Rui | 2015-10-09 | 1 | -2/+2
  1. Add a "col" function into DataFrame.
  2. Move the current "col" function in Column.R to functions.R and convert it to an S4 function.
  3. Add an S4 "column" function in functions.R.
  4. Convert the "column" function in Column.R to an S4 function. This is for private use.
  Author: Sun Rui <rui.sun@intel.com>
  Closes #8864 from sun-rui/SPARK-10079.
* [SPARK-10905] [SPARKR] Export freqItems() for DataFrameStatFunctions | Rerngvit Yanggratoke | 2015-10-09 | 1 | -0/+21
  - Add the function (together with roxygen2 doc) to DataFrame.R and generics.R
  - Expose the function in NAMESPACE
  - Add a unit test for the function
  Author: Rerngvit Yanggratoke <rerngvit@kth.se>
  Closes #8962 from rerngvit/SPARK-10905.
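  A hedged usage sketch (toy data; `support` is the minimum frequency to report):
  ```
  df <- createDataFrame(sqlContext,
                        data.frame(a = c(1, 1, 2, 1), b = c("x", "x", "y", "x")))
  freqItems(df, c("a", "b"), support = 0.5)   # returns a local R data.frame
  ```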
* [SPARK-10836] [SPARKR] Added sort(x, decreasing, col, ...) method to DataFrame | Narine Kokhlikyan | 2015-10-08 | 1 | -1/+10
  The sort function can be used as an alternative to arrange(...). As arguments it accepts
  x - a DataFrame, decreasing - TRUE/FALSE or a vector of orderings for the columns, and the
  columns to sort by, given as string names. For example:
  ```
  sort(df, TRUE, "col1", "col2", "col3", "col5")   # sort the given columns in the same order
  sort(df, decreasing = TRUE, "col1")
  sort(df, decreasing = c(TRUE, FALSE), "col1", "col2")
  ```
  Author: Narine Kokhlikyan <narine.kokhlikyan@gmail.com>
  Closes #8920 from NarineK/sparkrsort.
* [SPARK-10752] [SPARKR] Implement corr() and cov() in DataFrameStatFunctions. | Sun Rui | 2015-10-07 | 1 | -0/+12
  Author: Sun Rui <rui.sun@intel.com>
  Closes #8869 from sun-rui/SPARK-10752.
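  A usage sketch of the two DataFrame-level statistics:
  ```
  df <- createDataFrame(sqlContext, mtcars)
  corr(df, "mpg", "wt")   # Pearson correlation of the two columns
  cov(df, "mpg", "wt")    # sample covariance
  ```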
* [SPARK-10904] [SPARKR] Fix to support `select(df, c("col1", "col2"))` | felixcheung | 2015-10-03 | 1 | -1/+8
  The fix is to coerce `c("a", "b")` into a list so that it can be serialized for the JVM call.
  Author: felixcheung <felixcheung_m@hotmail.com>
  Closes #8961 from felixcheung/rselect.
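  A minimal sketch of the now-supported call:
  ```
  df <- createDataFrame(sqlContext, mtcars)
  head(select(df, c("mpg", "cyl")))   # equivalent to select(df, "mpg", "cyl")
  ```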
* [SPARK-10807] [SPARKR] Added as.data.frame as a synonym for collect | Oscar D. Lara Yejas | 2015-09-30 | 1 | -1/+8
  Created the method as.data.frame as a synonym for collect().
  Author: Oscar D. Lara Yejas <olarayej@mail.usf.edu>
  Author: olarayej <oscar.lara.yejas@us.ibm.com>
  Author: Oscar D. Lara Yejas <oscar.lara.yejas@us.ibm.com>
  Closes #8908 from olarayej/SPARK-10807.
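  A usage sketch of the synonym:
  ```
  df  <- createDataFrame(sqlContext, faithful)
  ldf <- as.data.frame(df)   # same as collect(df)
  class(ldf)                 # "data.frame"
  ```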
* [SPARK-10050] [SPARKR] Support collecting data of MapType in DataFrame. | Sun Rui | 2015-09-16 | 1 | -12/+44
  1. Support collecting data of MapType from DataFrame.
  2. Support data of MapType in createDataFrame.
  Author: Sun Rui <rui.sun@intel.com>
  Closes #8711 from sun-rui/SPARK-10050.
* [SPARK-6548] Adding stddev to DataFrame functions | JihongMa | 2015-09-12 | 1 | -1/+1
  Adding STDDEV support for DataFrame using a one-pass online/parallel algorithm to compute
  variance. Please review the code change.
  Author: JihongMa <linlin200605@gmail.com>
  Author: Jihong MA <linlin200605@gmail.com>
  Author: Jihong MA <jihongma@jihongs-mbp.usca.ibm.com>
  Author: Jihong MA <jihongma@Jihongs-MacBook-Pro.local>
  Closes #6297 from JihongMA/SPARK-SQL.
* [SPARK-10049] [SPARKR] Support collecting data of ArrayType in DataFrame. | Sun Rui | 2015-09-10 | 1 | -7/+37
  This PR:
  1. Enhances reflection in RBackend, automatically matching a Java array to a Scala Seq when
     finding methods. Util functions like seq() and listToSeq() on the R side can be removed,
     as they would conflict with the SerDe logic that transfers a Scala Seq to the R side.
  2. Enhances the SerDe to support transferring a Scala Seq to the R side. Data of ArrayType
     in a DataFrame is observed to be of Scala Seq type after collection.
  3. Supports ArrayType in createDataFrame().
  Author: Sun Rui <rui.sun@intel.com>
  Closes #8458 from sun-rui/SPARK-10049.
* [SPARK-8951] [SPARKR] support Unicode characters in collect() | CHOIJAEHONG | 2015-09-03 | 1 | -0/+26
  Spark gives an error message and does not show the output when a field of the result
  DataFrame contains CJK characters. I changed SerDe.scala so that Spark supports Unicode
  characters when writing a string to R.
  Author: CHOIJAEHONG <redrock07@naver.com>
  Closes #7494 from CHOIJAEHONG1/SPARK-8951.
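  A hedged round-trip sketch (the Korean string is illustrative):
  ```
  ldf <- data.frame(name = c("스파크"), stringsAsFactors = FALSE)
  df  <- createDataFrame(sqlContext, ldf)
  collect(df)$name   # the UTF-8 string should survive the round trip
  ```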
* [SPARK-9803] [SPARKR] Add subset and transform + tests | felixcheung | 2015-08-28 | 1 | -1/+19
  Add subset and transform. Also reorganize `[` & `[[` to subset instead of select.
  Note: transform is very similar to mutate. Spark doesn't seem to replace an existing column
  with the same name in mutate (i.e. `mutate(df, age = df$age + 2)` returns a DataFrame with
  2 columns both named 'age'), so transform does not do that for now either, though it is
  clearly stated that transform should replace a column with a matching name (should I open
  a JIRA for mutate/transform?).
  Author: felixcheung <felixcheung_m@hotmail.com>
  Closes #8503 from felixcheung/rsubset_transform.
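  A hedged usage sketch (the derived column name `kpl` is illustrative):
  ```
  df <- createDataFrame(sqlContext, mtcars)
  head(subset(df, df$cyl == 6, c("mpg", "cyl")))   # row condition + column selection
  head(transform(df, kpl = df$mpg * 0.425))        # adds a derived column, like mutate()
  ```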
* [SPARK-10328] [SPARKR] Fix generic for na.omit | Shivaram Venkataraman | 2015-08-28 | 1 | -1/+22
  S3 function is at
  https://stat.ethz.ch/R-manual/R-patched/library/stats/html/na.fail.html
  Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu>
  Author: Shivaram Venkataraman <shivaram.venkataraman@gmail.com>
  Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com>
  Closes #8495 from shivaram/na-omit-fix.
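  A sketch of the fixed dispatch: both the SparkR method and the stats default should now resolve.
  ```
  na.omit(createDataFrame(sqlContext, faithful))  # SparkR method (equivalent to dropna)
  na.omit(data.frame(x = c(1, NA)))               # stats method still dispatches
  ```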
* [SPARK-10219] [SPARKR] Fix varargsToEnv and add test case | Shivaram Venkataraman | 2015-08-26 | 1 | -0/+6
  cc sun-rui davies
  Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu>
  Closes #8475 from shivaram/varargs-fix.
* [MINOR] [SPARKR] Fix some validation problems in SparkR | Yu ISHIKAWA | 2015-08-26 | 1 | -1/+1
  Getting rid of some validation problems in SparkR
  https://github.com/apache/spark/pull/7883
  cc shivaram
  ```
  inst/tests/test_Serde.R:26:1: style: Trailing whitespace is superfluous.
  inst/tests/test_Serde.R:34:1: style: Trailing whitespace is superfluous.
  inst/tests/test_Serde.R:37:38: style: Trailing whitespace is superfluous.
    expect_equal(class(x), "character")
  inst/tests/test_Serde.R:50:1: style: Trailing whitespace is superfluous.
  inst/tests/test_Serde.R:55:1: style: Trailing whitespace is superfluous.
  inst/tests/test_Serde.R:60:1: style: Trailing whitespace is superfluous.
  inst/tests/test_sparkSQL.R:611:1: style: Trailing whitespace is superfluous.
  R/DataFrame.R:664:1: style: Trailing whitespace is superfluous.
  R/DataFrame.R:670:55: style: Trailing whitespace is superfluous.
    df <- data.frame(row.names = 1 : nrow)
  R/DataFrame.R:672:1: style: Trailing whitespace is superfluous.
  R/DataFrame.R:686:49: style: Trailing whitespace is superfluous.
    df[[names[colIndex]]] <- vec
  ```
  Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com>
  Closes #8474 from yu-iskw/minor-fix-sparkr.
* [SPARK-9316] [SPARKR] Add support for filtering using `[` (synonym for filter / select) | felixcheung | 2015-08-25 | 1 | -0/+27
  Add support for:
  ```
  df[df$name == "Smith", c(1, 2)]
  df[df$age %in% c(19, 30), 1:2]
  ```
  shivaram
  Author: felixcheung <felixcheung_m@hotmail.com>
  Closes #8394 from felixcheung/rsubset.
* [SPARK-10106] [SPARKR] Add `ifelse` Column function to SparkR | Yu ISHIKAWA | 2015-08-19 | 1 | -1/+2
  ### JIRA
  [[SPARK-10106] Add `ifelse` Column function to SparkR - ASF JIRA](https://issues.apache.org/jira/browse/SPARK-10106)
  Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com>
  Closes #8303 from yu-iskw/SPARK-10106.
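  A minimal usage sketch:
  ```
  df <- createDataFrame(sqlContext, data.frame(age = c(15, 30, 45)))
  head(select(df, ifelse(df$age > 21, "adult", "minor")))
  ```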
* [SPARK-9856] [SPARKR] Add expression functions into SparkR whose params are complicated | Yu ISHIKAWA | 2015-08-19 | 1 | -6/+92
  I added lots of Column functions to SparkR. I also added `rand(seed: Int)` and
  `randn(seed: Int)` in Scala, since we need such APIs for the R integer type.
  ### JIRA
  [[SPARK-9856] Add expression functions into SparkR whose params are complicated - ASF JIRA](https://issues.apache.org/jira/browse/SPARK-9856)
  Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com>
  Closes #8264 from yu-iskw/SPARK-9856-3.
* [SPARK-10075] [SPARKR] Add `when` expression function in SparkR | Yu ISHIKAWA | 2015-08-18 | 1 | -0/+7
  - Add `when` and `otherwise` as `Column` methods
  - Add `When` as an expression function
  - Add `%otherwise%` infix as an alias of `otherwise`
  Since R doesn't support a feature like method chaining, the
  `otherwise(when(condition, value), value)` style is a little annoying for me. If
  `%otherwise%` looks strange to shivaram, I can remove it. What do you think?
  ### JIRA
  [[SPARK-10075] Add `when` expression function in SparkR - ASF JIRA](https://issues.apache.org/jira/browse/SPARK-10075)
  Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com>
  Closes #8266 from yu-iskw/SPARK-10075.
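  A minimal usage sketch of the nested style (the infix alias is not shown, since the PR
  discussion leaves it in question):
  ```
  df <- createDataFrame(sqlContext, data.frame(age = c(15, 30)))
  head(select(df, otherwise(when(df$age > 21, 1), 0)))   # CASE WHEN ... ELSE 0 END
  ```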
* [SPARK-9871] [SPARKR] Add expression functions into SparkR which have a variable parameter | Yu ISHIKAWA | 2015-08-16 | 1 | -0/+13
  ### Summary
  - Add `lit` function
  - Add `concat`, `greatest`, `least` functions
  I think we need to improve the `collect` function in order to implement the `struct`
  function, since `collect` doesn't work with arguments which include a nested `list`
  variable. It seems that a list against `struct` still has `jobj` classes, so it would be
  better to solve this problem in another issue.
  ### JIRA
  [[SPARK-9871] Add expression functions into SparkR which have a variable parameter - ASF JIRA](https://issues.apache.org/jira/browse/SPARK-9871)
  Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com>
  Closes #8194 from yu-iskw/SPARK-9856.
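  A hedged usage sketch of the variadic functions (toy data):
  ```
  df <- createDataFrame(sqlContext, data.frame(a = c(1, 5), b = c(3, 2), s = c("x", "y")))
  head(select(df, concat(df$s, lit("!")), greatest(df$a, df$b), least(df$a, df$b)))
  ```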
* [SPARK-8844] [SPARKR] head/collect is broken in SparkR. | Sun Rui | 2015-08-16 | 1 | -0/+20
  This is a WIP patch for SPARK-8844 for collecting reviews. This bug is about reading an
  empty DataFrame: in readCol(), `lapply(1:numRows, function(x) {...})` does not take into
  consideration the case where numRows = 0. Will add a unit test case.
  Author: Sun Rui <rui.sun@intel.com>
  Closes #7419 from sun-rui/SPARK-8844.
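  The underlying base-R pitfall, for reference:
  ```
  numRows <- 0
  1:numRows         # c(1, 0) -- iterates twice instead of zero times
  seq_len(numRows)  # integer(0) -- the safe way to iterate
  ```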
* [SPARK-9855] [SPARKR] Add expression functions into SparkR whose params are simple | Yu ISHIKAWA | 2015-08-12 | 1 | -9/+12
  I added lots of expression functions for SparkR. This PR includes only functions whose
  params are only `(Column)` or `(Column, Column)`. I also think we need to improve how we
  test those functions, but it would be better to work on that in another issue.
  ## Diff Summary
  - Add lots of functions in `functions.R` and their generics in `generics.R`
  - Add aliases for `ceiling` and `sign`
  - Move expression functions from `column.R` to `functions.R`
  - Modify `rdname` from `column` to `functions`
  I haven't supported the `not` function, because the name collides with the `testthat`
  package and I couldn't think of a way to define it.
  ## New Supported Functions
  ```
  approxCountDistinct ascii base64 bin bitwiseNOT ceil (alias: ceiling) crc32 dayofmonth
  dayofyear explode factorial hex hour initcap isNaN last_day length log2 ltrim md5 minute
  month negate quarter reverse round rtrim second sha1 signum (alias: sign) size soundex
  to_date trim unbase64 unhex weekofyear year datediff levenshtein months_between nanvl pmod
  ```
  ## JIRA
  [[SPARK-9855] Add expression functions into SparkR whose params are simple - ASF JIRA](https://issues.apache.org/jira/browse/SPARK-9855)
  Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com>
  Closes #8123 from yu-iskw/SPARK-9855.
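  A hedged sketch exercising a few of the `(Column)` functions from the list above:
  ```
  df <- createDataFrame(sqlContext,
                        data.frame(s = c("spark", "sql"), stringsAsFactors = FALSE))
  head(select(df, ascii(df$s), md5(df$s), reverse(df$s)))
  ```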
* [SPARK-9318] [SPARK-9320] [SPARKR] Aliases for merge and summary functions on DataFrames | Hossein | 2015-07-31 | 1 | -2/+12
  This PR adds synonyms for `merge` and `summary` in the SparkR DataFrame API.
  cc shivaram
  Author: Hossein <hossein@databricks.com>
  Closes #7806 from falaki/SPARK-9320 and squashes the following commits:
  72600f7 [Hossein] Updated docs
  92a6e75 [Hossein] Fixed merge generic signature issue
  4c2b051 [Hossein] Fixing naming with mllib summary
  0f3a64c [Hossein] Added ... to generic for merge
  30fbaf8 [Hossein] Merged master
  ae1a4cf [Hossein] Merge branch 'master' into SPARK-9320
  e8eb86f [Hossein] Add a generic for merge
  fc01f2d [Hossein] Added unit test
  8d92012 [Hossein] Added merge as an alias for join
  5b8bedc [Hossein] Added unit test
  632693d [Hossein] Added summary as an alias for describe for DataFrame
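  A hedged sketch of the new synonyms (`merge(df1, df2)` should likewise alias `join` at this
  point in the history):
  ```
  df <- createDataFrame(sqlContext, mtcars)
  s <- summary(df)   # alias for describe(df): count/mean/stddev/min/max per column
  collect(s)
  ```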
* [SPARK-9324] [SPARK-9322] [SPARK-9321] [SPARKR] Some aliases for R-like functions in DataFrames | Hossein | 2015-07-31 | 1 | -3/+19
  Adds the following aliases:
  * unique (distinct)
  * rbind (unionAll): accepts many DataFrames
  * nrow (count)
  * ncol
  * dim
  * names (columns): along with the replacement function to change names
  Author: Hossein <hossein@databricks.com>
  Closes #7764 from falaki/sparkR-alias and squashes the following commits:
  56016f5 [Hossein] Updated R documentation
  5e4a4d0 [Hossein] Removed extra code
  f51cbef [Hossein] Merge branch 'master' into sparkR-alias
  c1b88bd [Hossein] Moved setGeneric and other comments applied
  d9307f8 [Hossein] Added tests
  b5aa988 [Hossein] Added dim, ncol, nrow, names, rbind, and unique functions to DataFrames
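  A usage sketch of the aliases:
  ```
  df <- createDataFrame(sqlContext, mtcars)
  nrow(df); ncol(df); dim(df)   # row count / column count / both
  names(df)                     # same as columns(df)
  u <- unique(df)               # same as distinct(df)
  b <- rbind(df, df)            # same as unionAll(); accepts many DataFrames
  ```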
* [SPARK-9510] [SPARKR] Remaining SparkR style fixes | Shivaram Venkataraman | 2015-07-31 | 1 | -1/+3
  With the change in this patch, I get no more warnings from `./dev/lint-r` on my machine.
  Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu>
  Closes #7834 from shivaram/sparkr-style-fixes and squashes the following commits:
  716cd8e [Shivaram Venkataraman] Remaining SparkR style fixes
* [SPARK-9053] [SPARKR] Fix spaces around parens, infix operators etc. | Yu ISHIKAWA | 2015-07-31 | 1 | -1/+3
  ### JIRA
  [[SPARK-9053] Fix spaces around parens, infix operators etc. - ASF JIRA](https://issues.apache.org/jira/browse/SPARK-9053)
  ### The Result of `lint-r`
  [The result of lint-r at the revision:a4c83cb1e4b066cd60264b6572fd3e51d160d26a](https://gist.github.com/yu-iskw/d253d7f8ef351f86443d)
  Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com>
  Closes #7584 from yu-iskw/SPARK-9053 and squashes the following commits:
  613170f [Yu ISHIKAWA] Ignore a warning about a space before a left parenthesis
  ede61e1 [Yu ISHIKAWA] Ignores two warnings about a space before a left parenthesis. TODO: After updating `lintr`, we will remove the ignores
  de3e0db [Yu ISHIKAWA] Add '## nolint start' & '## nolint end' statements to ignore infix space warnings
  e233ea8 [Yu ISHIKAWA] [SPARK-9053][SparkR] Fix spaces around parens, infix operators etc.
* [SPARK-8742] [SPARKR] Improve SparkR error messages for DataFrame API | Hossein | 2015-07-30 | 1 | -0/+5
  This patch improves SparkR error message reporting, especially with the DataFrame API. When
  there is a user error (e.g., a malformed SQL query), the message of the cause is sent back
  through the RPC, and the R client reads it and returns it to the user.
  cc shivaram
  Author: Hossein <hossein@databricks.com>
  Closes #7742 from falaki/SPARK-8742 and squashes the following commits:
  4f643c9 [Hossein] Not logging exceptions in RBackendHandler
  4a8005c [Hossein] Returning stack trace of causing exception from RBackendHandler
  5cf17f0 [Hossein] Adding unit test for error messages from SQLContext
  2af75d5 [Hossein] Reading error message in case of failure and stopping with that message
  f479c99 [Hossein] Writing exception cause message in JVM
* [SPARK-9248] [SPARKR] Closing curly-braces should always be on their own line | Yuu ISHIKAWA | 2015-07-30 | 1 | -2/+4
  ### JIRA
  [[SPARK-9248] Closing curly-braces should always be on their own line - ASF JIRA](https://issues.apache.org/jira/browse/SPARK-9248)
  ## The result of `dev/lint-r`
  [The result of `dev/lint-r` for SPARK-9248 at the revision:6175d6cfe795fbd88e3ee713fac375038a3993a8](https://gist.github.com/yu-iskw/96cadcea4ce664c41f81)
  Author: Yuu ISHIKAWA <yuu.ishikawa@gmail.com>
  Closes #7795 from yu-iskw/SPARK-9248 and squashes the following commits:
  c8eccd3 [Yuu ISHIKAWA] [SPARK-9248][SparkR] Closing curly-braces should always be on their own line
* [SPARK-8364] [SPARKR] Add crosstab to SparkR DataFrames | Xiangrui Meng | 2015-07-22 | 1 | -0/+13
  Add `crosstab` to SparkR DataFrames, which takes two column names and returns a local R
  data.frame. This is similar to `table` in R. However, `table` in SparkR is used for loading
  SQL tables as DataFrames. The return type is data.frame instead of table, for `crosstab` to
  be compatible with Scala/Python. I couldn't run the R tests successfully on my machine;
  many unit tests failed, so let's try Jenkins.
  Author: Xiangrui Meng <meng@databricks.com>
  Closes #7318 from mengxr/SPARK-8364 and squashes the following commits:
  d75e894 [Xiangrui Meng] fix tests
  53f6ddd [Xiangrui Meng] fix tests
  f1348d6 [Xiangrui Meng] update test
  47cb088 [Xiangrui Meng] Merge remote-tracking branch 'apache/master' into SPARK-8364
  5621262 [Xiangrui Meng] first version without test
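  A minimal usage sketch (toy data):
  ```
  df <- createDataFrame(sqlContext,
                        data.frame(k1 = c("a", "a", "b"), k2 = c("x", "y", "x")))
  ct <- crosstab(df, "k1", "k2")   # local R data.frame contingency table
  ```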
* [SPARK-9093] [SPARKR] Fix single-quotes strings in SparkR | Yu ISHIKAWA | 2015-07-17 | 1 | -2/+2
  [[SPARK-9093] Fix single-quotes strings in SparkR - ASF JIRA](https://issues.apache.org/jira/browse/SPARK-9093)
  This is the result of lintr at the revision:011551620faa87107a787530f074af3d9be7e695
  [[SPARK-9093] The result of lintr at 011551620faa87107a787530f074af3d9be7e695](https://gist.github.com/yu-iskw/8c47acf3202796da4d01)
  Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com>
  Closes #7439 from yu-iskw/SPARK-9093 and squashes the following commits:
  61c391e [Yu ISHIKAWA] [SPARK-9093][SparkR] Fix single-quotes strings in SparkR
* [SPARK-8807] [SPARKR] Add between operator in SparkR | Liang-Chi Hsieh | 2015-07-15 | 1 | -0/+12
  JIRA: https://issues.apache.org/jira/browse/SPARK-8807
  Add between operator in SparkR.
  Author: Liang-Chi Hsieh <viirya@appier.com>
  Closes #7356 from viirya/add_r_between and squashes the following commits:
  7f51b44 [Liang-Chi Hsieh] Add test for non-numeric column.
  c6a25c5 [Liang-Chi Hsieh] Add between function.
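  A minimal usage sketch; the bounds are given as `c(lower, upper)`:
  ```
  df <- createDataFrame(sqlContext, data.frame(age = c(15, 25, 35)))
  head(filter(df, between(df$age, c(19, 30))))   # keeps rows with 19 <= age <= 30
  ```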