From 10e1c0e638774f5d746771b6dd251de2480f94eb Mon Sep 17 00:00:00 2001
From: Cheng Lian
Date: Tue, 2 Aug 2016 15:02:40 +0800
Subject: [SPARK-16734][EXAMPLES][SQL] Revise examples of all language bindings

## What changes were proposed in this pull request?

This PR makes various minor updates to the examples of all language bindings to
keep them consistent with each other. It also fixes some typos and fills in
missing parts (the JDBC example in Scala/Java/Python).

## How was this patch tested?

Manually tested.

Author: Cheng Lian

Closes #14368 from liancheng/revise-examples.
---
 docs/sql-programming-guide.md | 56 +++++++++++--------------------------------
 1 file changed, 14 insertions(+), 42 deletions(-)

diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md
index d8c8698e31..5877f2b745 100644
--- a/docs/sql-programming-guide.md
+++ b/docs/sql-programming-guide.md
@@ -132,7 +132,7 @@ from a Hive table, or from [Spark data sources](#data-sources).
 
 As an example, the following creates a DataFrame based on the content of a JSON file:
 
-{% include_example create_DataFrames r/RSparkSQLExample.R %}
+{% include_example create_df r/RSparkSQLExample.R %}
 
 </div>
 </div>
@@ -180,7 +180,7 @@ In addition to simple column references and expressions, DataFrames also have a
 
 <div data-lang="r"  markdown="1">
 
-{% include_example dataframe_operations r/RSparkSQLExample.R %}
+{% include_example untyped_ops r/RSparkSQLExample.R %}
 
 For a complete list of the types of operations that can be performed on a DataFrame refer to the [API Documentation](api/R/index.html).
 
@@ -214,7 +214,7 @@ The `sql` function on a `SparkSession` enables applications to run SQL queries p
 <div data-lang="r"  markdown="1">
 The `sql` function enables applications to run SQL queries programmatically and returns the result as a `SparkDataFrame`.
 
-{% include_example sql_query r/RSparkSQLExample.R %}
+{% include_example run_sql r/RSparkSQLExample.R %}
 
 </div>
 </div>
@@ -377,7 +377,7 @@ In the simplest form, the default data source (`parquet` unless otherwise config
 
 <div data-lang="r"  markdown="1">
 
-{% include_example source_parquet r/RSparkSQLExample.R %}
+{% include_example generic_load_save_functions r/RSparkSQLExample.R %}
 
 </div>
 </div>
@@ -400,13 +400,11 @@ using this syntax.
 </div>
 
 <div data-lang="python"  markdown="1">
-
 {% include_example manual_load_options python/sql/datasource.py %}
 </div>
-
 <div data-lang="r"  markdown="1">
-
-{% include_example source_json r/RSparkSQLExample.R %}
+
+{% include_example manual_load_options r/RSparkSQLExample.R %}
 
 </div>
 </div>
@@ -425,13 +423,11 @@ file directly with SQL.
 </div>
 
 <div data-lang="python"  markdown="1">
-
 {% include_example direct_sql python/sql/datasource.py %}
 </div>
 
 <div data-lang="r"  markdown="1">
-
-{% include_example direct_query r/RSparkSQLExample.R %}
+{% include_example direct_sql r/RSparkSQLExample.R %}
 
 </div>
 </div>
@@ -523,7 +519,7 @@ Using the data from the above example:
 
 <div data-lang="r"  markdown="1">
 
-{% include_example load_programmatically r/RSparkSQLExample.R %}
+{% include_example basic_parquet_example r/RSparkSQLExample.R %}
 
 </div>
 </div>
@@ -839,7 +835,7 @@ Note that the file that is offered as _a json file_ is not a typical JSON file.
 line must contain a separate, self-contained valid JSON object. As a consequence,
 a regular multi-line JSON file will most often fail.
 
-{% include_example load_json_file r/RSparkSQLExample.R %}
+{% include_example json_dataset r/RSparkSQLExample.R %}
 
 </div>
 
@@ -925,7 +921,7 @@ You may need to grant write privilege to the user who starts the spark applicati
 <div data-lang="r"  markdown="1">
 When working with Hive one must instantiate `SparkSession` with Hive support. This adds support for finding tables in the MetaStore and writing queries using HiveQL.
 
-{% include_example hive_table r/RSparkSQLExample.R %}
+{% include_example spark_hive r/RSparkSQLExample.R %}
 
 </div>
 </div>
@@ -1067,43 +1063,19 @@ the Data Sources API. The following options are supported:
 <div class="codetabs">
 
 <div data-lang="scala"  markdown="1">
-
-{% highlight scala %}
-val jdbcDF = spark.read.format("jdbc").options(
-  Map("url" -> "jdbc:postgresql:dbserver",
-  "dbtable" -> "schema.tablename")).load()
-{% endhighlight %}
-
+{% include_example jdbc_dataset scala/org/apache/spark/examples/sql/SQLDataSourceExample.scala %}
 </div>
 
 <div data-lang="java"  markdown="1">
-
-{% highlight java %}
-
-Map<String, String> options = new HashMap<>();
-options.put("url", "jdbc:postgresql:dbserver");
-options.put("dbtable", "schema.tablename");
-
-Dataset<Row> jdbcDF = spark.read().format("jdbc"). options(options).load();
-{% endhighlight %}
-
-
+{% include_example jdbc_dataset java/org/apache/spark/examples/sql/JavaSQLDataSourceExample.java %}
 </div>
 
 <div data-lang="python"  markdown="1">
-
-{% highlight python %}
-
-df = spark.read.format('jdbc').options(url='jdbc:postgresql:dbserver', dbtable='schema.tablename').load()
-
-{% endhighlight %}
-
+{% include_example jdbc_dataset python/sql/datasource.py %}
 </div>
 
 <div data-lang="r"  markdown="1">
-
-{% include_example jdbc r/RSparkSQLExample.R %}
-
+{% include_example jdbc_dataset r/RSparkSQLExample.R %}
 </div>
 
 <div data-lang="sql"  markdown="1">
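
For orientation, the `jdbc_dataset` examples referenced above now live in compiled example files rather than inline in the guide. Below is a minimal sketch of what such an example does, reconstructed from the removed Scala snippet; the connection URL and table name are the same placeholders the old snippet used, and the actual code in `SQLDataSourceExample.scala` may differ:

{% highlight scala %}
import org.apache.spark.sql.SparkSession

// Placeholder session setup; the real example files manage their own SparkSession.
val spark = SparkSession.builder().appName("JDBC sketch").getOrCreate()

// Read a table through JDBC, passing each option with option() instead of
// the options(Map(...)) form used by the removed inline snippet.
val jdbcDF = spark.read
  .format("jdbc")
  .option("url", "jdbc:postgresql:dbserver")
  .option("dbtable", "schema.tablename")
  .load()
{% endhighlight %}

Sourcing snippets from compiled example files via `include_example` means they are built and run with the rest of the examples, so the documentation cannot silently drift out of date the way the removed inline blocks could.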