author: Cheng Lian <lian@databricks.com> 2016-08-02 15:02:40 +0800
committer: Wenchen Fan <wenchen@databricks.com> 2016-08-02 15:02:40 +0800
commit: 10e1c0e638774f5d746771b6dd251de2480f94eb
tree: 9aa40fef6c863aceb18243bc0ff8c7a824818cf7 /docs/sql-programming-guide.md
parent: 5184df06b347f86776c8ac87415b8002a5942a35
[SPARK-16734][EXAMPLES][SQL] Revise examples of all language bindings
## What changes were proposed in this pull request?

This PR makes various minor updates to examples of all language bindings to make sure they are consistent with each other. Some typos and missing parts (JDBC example in Scala/Java/Python) are also fixed.

## How was this patch tested?

Manually tested.

Author: Cheng Lian <lian@databricks.com>

Closes #14368 from liancheng/revise-examples.
Diffstat (limited to 'docs/sql-programming-guide.md')
-rw-r--r-- docs/sql-programming-guide.md 56
1 files changed, 14 insertions, 42 deletions
diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md
index d8c8698e31..5877f2b745 100644
--- a/docs/sql-programming-guide.md
+++ b/docs/sql-programming-guide.md
@@ -132,7 +132,7 @@ from a Hive table, or from [Spark data sources](#data-sources).
As an example, the following creates a DataFrame based on the content of a JSON file:
-{% include_example create_DataFrames r/RSparkSQLExample.R %}
+{% include_example create_df r/RSparkSQLExample.R %}
</div>
</div>
@@ -180,7 +180,7 @@ In addition to simple column references and expressions, DataFrames also have a
<div data-lang="r" markdown="1">
-{% include_example dataframe_operations r/RSparkSQLExample.R %}
+{% include_example untyped_ops r/RSparkSQLExample.R %}
For a complete list of the types of operations that can be performed on a DataFrame refer to the [API Documentation](api/R/index.html).
@@ -214,7 +214,7 @@ The `sql` function on a `SparkSession` enables applications to run SQL queries p
<div data-lang="r" markdown="1">
The `sql` function enables applications to run SQL queries programmatically and returns the result as a `SparkDataFrame`.
-{% include_example sql_query r/RSparkSQLExample.R %}
+{% include_example run_sql r/RSparkSQLExample.R %}
</div>
</div>
@@ -377,7 +377,7 @@ In the simplest form, the default data source (`parquet` unless otherwise config
<div data-lang="r" markdown="1">
-{% include_example source_parquet r/RSparkSQLExample.R %}
+{% include_example generic_load_save_functions r/RSparkSQLExample.R %}
</div>
</div>
@@ -400,13 +400,11 @@ using this syntax.
</div>
<div data-lang="python" markdown="1">
-
{% include_example manual_load_options python/sql/datasource.py %}
</div>
-<div data-lang="r" markdown="1">
-
-{% include_example source_json r/RSparkSQLExample.R %}
+<div data-lang="r" markdown="1">
+{% include_example manual_load_options r/RSparkSQLExample.R %}
</div>
</div>
@@ -425,13 +423,11 @@ file directly with SQL.
</div>
<div data-lang="python" markdown="1">
-
{% include_example direct_sql python/sql/datasource.py %}
</div>
<div data-lang="r" markdown="1">
-
-{% include_example direct_query r/RSparkSQLExample.R %}
+{% include_example direct_sql r/RSparkSQLExample.R %}
</div>
</div>
@@ -523,7 +519,7 @@ Using the data from the above example:
<div data-lang="r" markdown="1">
-{% include_example load_programmatically r/RSparkSQLExample.R %}
+{% include_example basic_parquet_example r/RSparkSQLExample.R %}
</div>
@@ -839,7 +835,7 @@ Note that the file that is offered as _a json file_ is not a typical JSON file.
line must contain a separate, self-contained valid JSON object. As a consequence,
a regular multi-line JSON file will most often fail.
-{% include_example load_json_file r/RSparkSQLExample.R %}
+{% include_example json_dataset r/RSparkSQLExample.R %}
</div>
@@ -925,7 +921,7 @@ You may need to grant write privilege to the user who starts the spark applicati
When working with Hive one must instantiate `SparkSession` with Hive support. This
adds support for finding tables in the MetaStore and writing queries using HiveQL.
-{% include_example hive_table r/RSparkSQLExample.R %}
+{% include_example spark_hive r/RSparkSQLExample.R %}
</div>
</div>
@@ -1067,43 +1063,19 @@ the Data Sources API. The following options are supported:
<div class="codetabs">
<div data-lang="scala" markdown="1">
-
-{% highlight scala %}
-val jdbcDF = spark.read.format("jdbc").options(
- Map("url" -> "jdbc:postgresql:dbserver",
- "dbtable" -> "schema.tablename")).load()
-{% endhighlight %}
-
+{% include_example jdbc_dataset scala/org/apache/spark/examples/sql/SQLDataSourceExample.scala %}
</div>
<div data-lang="java" markdown="1">
-
-{% highlight java %}
-
-Map<String, String> options = new HashMap<>();
-options.put("url", "jdbc:postgresql:dbserver");
-options.put("dbtable", "schema.tablename");
-
-Dataset<Row> jdbcDF = spark.read().format("jdbc"). options(options).load();
-{% endhighlight %}
-
-
+{% include_example jdbc_dataset java/org/apache/spark/examples/sql/JavaSQLDataSourceExample.java %}
</div>
<div data-lang="python" markdown="1">
-
-{% highlight python %}
-
-df = spark.read.format('jdbc').options(url='jdbc:postgresql:dbserver', dbtable='schema.tablename').load()
-
-{% endhighlight %}
-
+{% include_example jdbc_dataset python/sql/datasource.py %}
</div>
<div data-lang="r" markdown="1">
-
-{% include_example jdbc r/RSparkSQLExample.R %}
-
+{% include_example jdbc_dataset r/RSparkSQLExample.R %}
</div>
<div data-lang="sql" markdown="1">