Diffstat (limited to 'docs/sql-programming-guide.md')
-rw-r--r--  docs/sql-programming-guide.md  56
1 file changed, 14 insertions, 42 deletions
diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md
index d8c8698e31..5877f2b745 100644
--- a/docs/sql-programming-guide.md
+++ b/docs/sql-programming-guide.md
@@ -132,7 +132,7 @@ from a Hive table, or from [Spark data sources](#data-sources).
 
 As an example, the following creates a DataFrame based on the content of a JSON file:
 
-{% include_example create_DataFrames r/RSparkSQLExample.R %}
+{% include_example create_df r/RSparkSQLExample.R %}
 
 </div>
 </div>
@@ -180,7 +180,7 @@ In addition to simple column references and expressions, DataFrames also have a
 <div data-lang="r" markdown="1">
 
-{% include_example dataframe_operations r/RSparkSQLExample.R %}
+{% include_example untyped_ops r/RSparkSQLExample.R %}
 
 For a complete list of the types of operations that can be performed on a DataFrame refer to the
 [API Documentation](api/R/index.html).
@@ -214,7 +214,7 @@ The `sql` function on a `SparkSession` enables applications to run SQL queries programmatically
 <div data-lang="r" markdown="1">
 The `sql` function enables applications to run SQL queries programmatically and returns the result as a `SparkDataFrame`.
 
-{% include_example sql_query r/RSparkSQLExample.R %}
+{% include_example run_sql r/RSparkSQLExample.R %}
 
 </div>
 </div>
@@ -377,7 +377,7 @@ In the simplest form, the default data source (`parquet` unless otherwise configured
 <div data-lang="r" markdown="1">
 
-{% include_example source_parquet r/RSparkSQLExample.R %}
+{% include_example generic_load_save_functions r/RSparkSQLExample.R %}
 
 </div>
 </div>
@@ -400,13 +400,11 @@ using this syntax.
 </div>
 
 <div data-lang="python" markdown="1">
-
 {% include_example manual_load_options python/sql/datasource.py %}
 </div>
 
-<div data-lang="r" markdown="1">
-
-{% include_example source_json r/RSparkSQLExample.R %}
+<div data-lang="r" markdown="1">
+{% include_example manual_load_options r/RSparkSQLExample.R %}
 </div>
 
 </div>
@@ -425,13 +423,11 @@ file directly with SQL.
 </div>
 
 <div data-lang="python" markdown="1">
-
 {% include_example direct_sql python/sql/datasource.py %}
 </div>
 
 <div data-lang="r" markdown="1">
-
-{% include_example direct_query r/RSparkSQLExample.R %}
+{% include_example direct_sql r/RSparkSQLExample.R %}
 </div>
 
 </div>
@@ -523,7 +519,7 @@ Using the data from the above example:
 <div data-lang="r" markdown="1">
 
-{% include_example load_programmatically r/RSparkSQLExample.R %}
+{% include_example basic_parquet_example r/RSparkSQLExample.R %}
 
 </div>
@@ -839,7 +835,7 @@ Note that the file that is offered as _a json file_ is not a typical JSON file. Each
 line must contain a separate, self-contained valid JSON object. As a consequence,
 a regular multi-line JSON file will most often fail.
 
-{% include_example load_json_file r/RSparkSQLExample.R %}
+{% include_example json_dataset r/RSparkSQLExample.R %}
 
 </div>
@@ -925,7 +921,7 @@ You may need to grant write privilege to the user who starts the spark application.
 When working with Hive one must instantiate `SparkSession` with Hive support. This
 adds support for finding tables in the MetaStore and writing queries using HiveQL.
 
-{% include_example hive_table r/RSparkSQLExample.R %}
+{% include_example spark_hive r/RSparkSQLExample.R %}
 
 </div>
 </div>
@@ -1067,43 +1063,19 @@ the Data Sources API. The following options are supported:
 
 <div class="codetabs">
 
 <div data-lang="scala" markdown="1">
-
-{% highlight scala %}
-val jdbcDF = spark.read.format("jdbc").options(
-  Map("url" -> "jdbc:postgresql:dbserver",
-  "dbtable" -> "schema.tablename")).load()
-{% endhighlight %}
-
+{% include_example jdbc_dataset scala/org/apache/spark/examples/sql/SQLDataSourceExample.scala %}
 </div>
 
 <div data-lang="java" markdown="1">
-
-{% highlight java %}
-
-Map<String, String> options = new HashMap<>();
-options.put("url", "jdbc:postgresql:dbserver");
-options.put("dbtable", "schema.tablename");
-
-Dataset<Row> jdbcDF = spark.read().format("jdbc").
-  options(options).load();
-{% endhighlight %}
-
-
+{% include_example jdbc_dataset java/org/apache/spark/examples/sql/JavaSQLDataSourceExample.java %}
 </div>
 
 <div data-lang="python" markdown="1">
-
-{% highlight python %}
-
-df = spark.read.format('jdbc').options(url='jdbc:postgresql:dbserver', dbtable='schema.tablename').load()
-
-{% endhighlight %}
-
+{% include_example jdbc_dataset python/sql/datasource.py %}
 </div>
 
 <div data-lang="r" markdown="1">
-
-{% include_example jdbc r/RSparkSQLExample.R %}
-
+{% include_example jdbc_dataset r/RSparkSQLExample.R %}
 </div>
 
 <div data-lang="sql" markdown="1">
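
One of the hunks above keeps the guide's note that Spark's JSON data source expects line-delimited JSON (one complete, self-contained object per line) rather than a single pretty-printed multi-line document. A minimal sketch of that constraint using only the Python standard library, not Spark itself; the sample records are illustrative, not taken from this commit:

```python
import json

# Hypothetical contents of a "people.json" in the line-delimited form the
# guide describes: each line is a complete JSON object on its own.
jsonl = """\
{"name": "Michael"}
{"name": "Andy", "age": 30}
{"name": "Justin", "age": 19}
"""

# Each line parses independently, which is why Spark can split such a file
# across tasks; a single object spread over several lines would not parse
# line by line and "will most often fail", as the guide puts it.
records = [json.loads(line) for line in jsonl.splitlines()]
print(len(records))        # 3
print(records[1]["name"])  # Andy
```

By contrast, feeding an entire pretty-printed document through the same line-by-line loop would raise `json.JSONDecodeError` on the first partial line.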