author     Dongjoon Hyun <dongjoon@apache.org>    2016-04-24 22:10:27 -0700
committer  Shivaram Venkataraman <shivaram@cs.berkeley.edu>    2016-04-24 22:10:27 -0700
commit     6ab4d9e0c76b69b4d6d5f39037a77bdfb042be19 (patch)
tree       494b601ba783d7b025b805504bde8f3f92b7667b /docs/sparkr.md
parent     35319d326488b3bf9235dfcf9ac4533ce846f21f (diff)
[SPARK-14883][DOCS] Fix wrong R examples and make them up-to-date
## What changes were proposed in this pull request?

This issue aims to fix some errors in R examples and make them up-to-date in docs and example modules.

- Remove the wrong usage of `map`. We need to use `lapply` in `sparkR` if needed. However, `lapply` is private so far. The corrected example will be added later.
- Fix the wrong example in Section `Generic Load/Save Functions` of `docs/sql-programming-guide.md` for consistency.
- Fix datatypes in `sparkr.md`.
- Update a data result in `sparkr.md`.
- Replace deprecated functions to remove warnings: `jsonFile` -> `read.json`, `parquetFile` -> `read.parquet`.
- Use up-to-date R-like functions: `loadDF` -> `read.df`, `saveDF` -> `write.df`, `saveAsParquetFile` -> `write.parquet`.
- Replace `SparkR DataFrame` with `SparkDataFrame` in `dataframe.R` and `data-manipulation.R`.
- Other minor syntax fixes and a typo.

## How was this patch tested?

Manual.

Author: Dongjoon Hyun <dongjoon@apache.org>

Closes #12649 from dongjoon-hyun/SPARK-14883.
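For reference, here is a minimal SparkR sketch contrasting the deprecated calls with the replacements listed above. This is an illustration only, not taken from the patch: the `sqlContext` session object and the file paths are assumed.

{% highlight r %}
# Deprecated readers/writers (these now emit deprecation warnings):
# people <- jsonFile(sqlContext, "examples/src/main/resources/people.json")
# pqt    <- parquetFile(sqlContext, "people.parquet")
# df     <- loadDF(sqlContext, "people.json", source = "json")
# saveDF(df, "people-out", source = "parquet")
# saveAsParquetFile(df, "people.parquet")

# Up-to-date equivalents used by the corrected docs:
people <- read.json(sqlContext, "examples/src/main/resources/people.json")
pqt    <- read.parquet(sqlContext, "people.parquet")
df     <- read.df(sqlContext, "examples/src/main/resources/people.json", source = "json")
write.df(df, path = "people-out", source = "parquet", mode = "overwrite")
write.parquet(df, "people.parquet")
{% endhighlight %}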
Diffstat (limited to 'docs/sparkr.md')
-rw-r--r--  docs/sparkr.md  11
1 file changed, 5 insertions(+), 6 deletions(-)
diff --git a/docs/sparkr.md b/docs/sparkr.md
index a0b4f93776..760534ae14 100644
--- a/docs/sparkr.md
+++ b/docs/sparkr.md
@@ -141,7 +141,7 @@ head(people)
# SparkR automatically infers the schema from the JSON file
printSchema(people)
# root
-# |-- age: integer (nullable = true)
+# |-- age: long (nullable = true)
# |-- name: string (nullable = true)
{% endhighlight %}
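The `integer` -> `long` fix in this hunk reflects Spark SQL's JSON schema inference, which maps whole-number JSON fields to `LongType`. A minimal sketch reproducing the corrected output, assuming a SparkR shell with `sqlContext` initialized and the `people.json` example file shipped with Spark:

{% highlight r %}
# JSON integers are inferred as long, so `age` prints as long, not integer
people <- read.json(sqlContext, "examples/src/main/resources/people.json")
printSchema(people)
# root
#  |-- age: long (nullable = true)
#  |-- name: string (nullable = true)
{% endhighlight %}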
@@ -195,7 +195,7 @@ df <- createDataFrame(sqlContext, faithful)
# Get basic information about the DataFrame
df
-## DataFrame[eruptions:double, waiting:double]
+## SparkDataFrame[eruptions:double, waiting:double]
# Select only the "eruptions" column
head(select(df, df$eruptions))
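The printed header changes here because SparkR's `DataFrame` class was renamed to `SparkDataFrame` during Spark 2.0 development, avoiding clashes with same-named classes in other R packages. A short sketch of the surrounding example, assuming a SparkR session with `sqlContext` (`faithful` is R's built-in Old Faithful dataset):

{% highlight r %}
# Convert a local R data.frame into a distributed SparkDataFrame
df <- createDataFrame(sqlContext, faithful)
df
## SparkDataFrame[eruptions:double, waiting:double]

# Fetch the first rows of a single column
head(select(df, df$eruptions))
{% endhighlight %}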
@@ -228,14 +228,13 @@ SparkR data frames support a number of commonly used functions to aggregate data
# We use the `n` operator to count the number of times each waiting time appears
head(summarize(groupBy(df, df$waiting), count = n(df$waiting)))
## waiting count
-##1 81 13
-##2 60 6
-##3 68 1
+##1 70 4
+##2 67 1
+##3 69 2
# We can also sort the output from the aggregation to get the most common waiting times
waiting_counts <- summarize(groupBy(df, df$waiting), count = n(df$waiting))
head(arrange(waiting_counts, desc(waiting_counts$count)))
-
## waiting count
##1 78 15
##2 83 14
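Pieced together, the corrected aggregation example reads as follows. This is a sketch assuming the `faithful` SparkDataFrame from the previous hunk; the sample outputs are the ones shown in this diff.

{% highlight r %}
# Count how many times each waiting time appears, using the `n` aggregate
waiting_counts <- summarize(groupBy(df, df$waiting), count = n(df$waiting))
head(waiting_counts)
## waiting count
##1 70 4
##2 67 1
##3 69 2

# Sort descending by count to surface the most common waiting times
head(arrange(waiting_counts, desc(waiting_counts$count)))
## waiting count
##1 78 15
##2 83 14
{% endhighlight %}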