Diffstat (limited to 'docs/sql-programming-guide.md')
-rw-r--r--  docs/sql-programming-guide.md | 56
1 file changed, 14 insertions(+), 42 deletions(-)
diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md
index d8c8698e31..5877f2b745 100644
--- a/docs/sql-programming-guide.md
+++ b/docs/sql-programming-guide.md
@@ -132,7 +132,7 @@ from a Hive table, or from [Spark data sources](#data-sources).
As an example, the following creates a DataFrame based on the content of a JSON file:
-{% include_example create_DataFrames r/RSparkSQLExample.R %}
+{% include_example create_df r/RSparkSQLExample.R %}
</div>
</div>
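For reference, the `create_df` snippet boils down to reading a line-delimited JSON file into a DataFrame. A minimal Scala sketch, assuming a `SparkSession` named `spark` and the `people.json` file bundled with the Spark examples:

{% highlight scala %}
// Create a DataFrame from the contents of a JSON file.
val df = spark.read.json("examples/src/main/resources/people.json")

// Display the contents of the DataFrame to stdout.
df.show()
{% endhighlight %}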
@@ -180,7 +180,7 @@ In addition to simple column references and expressions, DataFrames also have a
<div data-lang="r" markdown="1">
-{% include_example dataframe_operations r/RSparkSQLExample.R %}
+{% include_example untyped_ops r/RSparkSQLExample.R %}
For a complete list of the types of operations that can be performed on a DataFrame refer to the [API Documentation](api/R/index.html).
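A rough Scala analogue of the untyped operations that `untyped_ops` walks through; the column names assume the `people.json` sample data:

{% highlight scala %}
import spark.implicits._  // enables the $"colName" column syntax

df.select($"name", $"age" + 1).show()  // project a column and an expression
df.filter($"age" > 21).show()          // keep only rows matching a predicate
df.groupBy("age").count().show()       // aggregate row counts per age
{% endhighlight %}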
@@ -214,7 +214,7 @@ The `sql` function on a `SparkSession` enables applications to run SQL queries p
<div data-lang="r" markdown="1">
The `sql` function enables applications to run SQL queries programmatically and returns the result as a `SparkDataFrame`.
-{% include_example sql_query r/RSparkSQLExample.R %}
+{% include_example run_sql r/RSparkSQLExample.R %}
</div>
</div>
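A short Scala sketch of the same pattern, assuming the DataFrame `df` from above:

{% highlight scala %}
// Register the DataFrame as a temporary view so SQL can reference it by name.
df.createOrReplaceTempView("people")

// sql returns the query result as a DataFrame.
val sqlDF = spark.sql("SELECT name, age FROM people")
sqlDF.show()
{% endhighlight %}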
@@ -377,7 +377,7 @@ In the simplest form, the default data source (`parquet` unless otherwise config
<div data-lang="r" markdown="1">
-{% include_example source_parquet r/RSparkSQLExample.R %}
+{% include_example generic_load_save_functions r/RSparkSQLExample.R %}
</div>
</div>
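In Scala, the generic load/save flow looks roughly like the following; `users.parquet` ships with the Spark examples and the output name is a placeholder:

{% highlight scala %}
// With no format given, the default source (parquet unless
// spark.sql.sources.default says otherwise) is used for load and save.
val usersDF = spark.read.load("examples/src/main/resources/users.parquet")
usersDF.select("name", "favorite_color").write.save("namesAndFavColors.parquet")
{% endhighlight %}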
@@ -400,13 +400,11 @@ using this syntax.
</div>
<div data-lang="python" markdown="1">
-
{% include_example manual_load_options python/sql/datasource.py %}
</div>
-<div data-lang="r" markdown="1">
-
-{% include_example source_json r/RSparkSQLExample.R %}
+<div data-lang="r" markdown="1">
+{% include_example manual_load_options r/RSparkSQLExample.R %}
</div>
</div>
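A minimal Scala equivalent of the manual-options pattern, naming the source explicitly instead of relying on the default:

{% highlight scala %}
// Specify the data source by its short name.
val peopleDF = spark.read.format("json").load("examples/src/main/resources/people.json")
{% endhighlight %}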
@@ -425,13 +423,11 @@ file directly with SQL.
</div>
<div data-lang="python" markdown="1">
-
{% include_example direct_sql python/sql/datasource.py %}
</div>
<div data-lang="r" markdown="1">
-
-{% include_example direct_query r/RSparkSQLExample.R %}
+{% include_example direct_sql r/RSparkSQLExample.R %}
</div>
</div>
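The `direct_sql` examples query a file in place; a Scala sketch of the same idea against the bundled `users.parquet`:

{% highlight scala %}
// Run SQL over a file path directly, with no table registration step.
val sqlDF = spark.sql("SELECT * FROM parquet.`examples/src/main/resources/users.parquet`")
sqlDF.show()
{% endhighlight %}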
@@ -523,7 +519,7 @@ Using the data from the above example:
<div data-lang="r" markdown="1">
-{% include_example load_programmatically r/RSparkSQLExample.R %}
+{% include_example basic_parquet_example r/RSparkSQLExample.R %}
</div>
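A Scala sketch of the parquet round trip, reusing `peopleDF` from the sketch above; the output path is a placeholder:

{% highlight scala %}
// Write the DataFrame out as parquet files, preserving the schema.
peopleDF.write.parquet("people.parquet")

// Read the files back into a DataFrame with the original schema.
val parquetFileDF = spark.read.parquet("people.parquet")
parquetFileDF.show()
{% endhighlight %}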
@@ -839,7 +835,7 @@ Note that the file that is offered as _a json file_ is not a typical JSON file.
line must contain a separate, self-contained valid JSON object. As a consequence,
a regular multi-line JSON file will most often fail.
-{% include_example load_json_file r/RSparkSQLExample.R %}
+{% include_example json_dataset r/RSparkSQLExample.R %}
</div>
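To make the line-per-object constraint concrete, a Scala sketch:

{% highlight scala %}
// Every input line must be a separate, self-contained JSON object;
// a pretty-printed multi-line JSON document would fail to parse here.
val peopleDF = spark.read.json("examples/src/main/resources/people.json")

// The schema is inferred from the JSON records.
peopleDF.printSchema()
{% endhighlight %}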
@@ -925,7 +921,7 @@ You may need to grant write privilege to the user who starts the spark applicati
When working with Hive one must instantiate `SparkSession` with Hive support. This
adds support for finding tables in the MetaStore and writing queries using HiveQL.
-{% include_example hive_table r/RSparkSQLExample.R %}
+{% include_example spark_hive r/RSparkSQLExample.R %}
</div>
</div>
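In Scala, enabling Hive support when building the session looks like this sketch; the application name and the `src` table are placeholders:

{% highlight scala %}
import org.apache.spark.sql.SparkSession

// enableHiveSupport() turns on metastore lookup and HiveQL parsing.
val spark = SparkSession.builder()
  .appName("Spark Hive Example")
  .enableHiveSupport()
  .getOrCreate()

// Queries can now reference tables defined in the Hive metastore.
spark.sql("SELECT COUNT(*) FROM src").show()
{% endhighlight %}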
@@ -1067,43 +1063,19 @@ the Data Sources API. The following options are supported:
<div class="codetabs">
<div data-lang="scala" markdown="1">
-
-{% highlight scala %}
-val jdbcDF = spark.read.format("jdbc").options(
- Map("url" -> "jdbc:postgresql:dbserver",
- "dbtable" -> "schema.tablename")).load()
-{% endhighlight %}
-
+{% include_example jdbc_dataset scala/org/apache/spark/examples/sql/SQLDataSourceExample.scala %}
</div>
<div data-lang="java" markdown="1">
-
-{% highlight java %}
-
-Map<String, String> options = new HashMap<>();
-options.put("url", "jdbc:postgresql:dbserver");
-options.put("dbtable", "schema.tablename");
-
-Dataset<Row> jdbcDF = spark.read().format("jdbc"). options(options).load();
-{% endhighlight %}
-
-
+{% include_example jdbc_dataset java/org/apache/spark/examples/sql/JavaSQLDataSourceExample.java %}
</div>
<div data-lang="python" markdown="1">
-
-{% highlight python %}
-
-df = spark.read.format('jdbc').options(url='jdbc:postgresql:dbserver', dbtable='schema.tablename').load()
-
-{% endhighlight %}
-
+{% include_example jdbc_dataset python/sql/datasource.py %}
</div>
<div data-lang="r" markdown="1">
-
-{% include_example jdbc r/RSparkSQLExample.R %}
-
+{% include_example jdbc_dataset r/RSparkSQLExample.R %}
</div>
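The removed inline snippets all perform the same JDBC read; a consolidated Scala sketch, with placeholder URL and table name (a real database would usually also need user and password options):

{% highlight scala %}
// Load a database table as a DataFrame over JDBC.
val jdbcDF = spark.read.format("jdbc")
  .option("url", "jdbc:postgresql:dbserver")
  .option("dbtable", "schema.tablename")
  .load()
{% endhighlight %}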
<div data-lang="sql" markdown="1">