author     Wenchen Fan <wenchen@databricks.com>    2016-10-10 15:48:57 +0800
committer  Wenchen Fan <wenchen@databricks.com>    2016-10-10 15:48:57 +0800
commit     23ddff4b2b2744c3dc84d928e144c541ad5df376 (patch)
tree       f61b64ea46adbd1eb424a0bbb8e8e383d1ee4e3b /docs/sql-programming-guide.md
parent     16590030c15b32e83b584283697b6f783cffe043 (diff)
[SPARK-17338][SQL] add global temp view
## What changes were proposed in this pull request?

Global temporary view is a cross-session temporary view, which means it is shared among all sessions. Its lifetime is the lifetime of the Spark application, i.e. it will be automatically dropped when the application terminates. It is tied to a system-preserved database `global_temp` (configurable via SparkConf), and we must use the qualified name to refer to a global temp view, e.g. `SELECT * FROM global_temp.view1`.

Changes for `SessionCatalog`:

1. add a new field `globalTempViews: GlobalTempViewManager`, to access the shared global temp views and the global temp database name.
2. `createDatabase` will fail if users want to create `global_temp`, which is system preserved.
3. `setCurrentDatabase` will fail if users want to set `global_temp`, which is system preserved.
4. add `createGlobalTempView`, which is used in `CreateViewCommand` to create global temp views.
5. add `dropGlobalTempView`, which is used in `CatalogImpl` to drop global temp views.
6. add `alterTempViewDefinition`, which is used in `AlterViewAsCommand` to update the view definition for local/global temp views.
7. `renameTable`/`dropTable`/`isTemporaryTable`/`lookupRelation`/`getTempViewOrPermanentTableMetadata`/`refreshTable` will handle global temp views.

Changes for SQL commands:

1. `CreateViewCommand`/`AlterViewAsCommand` are updated to support global temp views.
2. `ShowTablesCommand` outputs a new column `database`, which is used to distinguish global and local temp views.
3. other commands can also handle global temp views if they call `SessionCatalog` APIs which accept global temp views, e.g. `DropTableCommand`, `AlterTableRenameCommand`, `ShowColumnsCommand`, etc.

Changes for other public APIs:

1. add a new method `dropGlobalTempView` in `Catalog`.
2. `Catalog.findTable` can find global temp views.
3. add a new method `createGlobalTempView` in `Dataset`.

## How was this patch tested?

New tests in `SQLViewSuite`.

Author: Wenchen Fan <wenchen@databricks.com>

Closes #14897 from cloud-fan/global-temp-view.
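A minimal Scala sketch of the new public API described above (view names are illustrative; assumes a running `SparkSession`):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("global-temp-view-demo").getOrCreate()
val df = spark.range(10).toDF("id")

// New Dataset method: registers the view in the system database `global_temp`.
df.createGlobalTempView("view1")

// The view must be referenced by its qualified name.
spark.sql("SELECT * FROM global_temp.view1").show()

// ShowTablesCommand now reports a `database` column, so global and
// local temp views can be told apart.
spark.sql("SHOW TABLES IN global_temp").show()

// New Catalog method: drops the global temp view.
spark.catalog.dropGlobalTempView("view1")
```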
Diffstat (limited to 'docs/sql-programming-guide.md')
-rw-r--r--  docs/sql-programming-guide.md | 45
1 file changed, 40 insertions(+), 5 deletions(-)
diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md
index 71bdd19c16..835cb6981f 100644
--- a/docs/sql-programming-guide.md
+++ b/docs/sql-programming-guide.md
@@ -220,6 +220,41 @@ The `sql` function enables applications to run SQL queries programmatically and
</div>
+## Global Temporary View
+
+Temporary views in Spark SQL are session-scoped and will disappear if the session that creates them
+terminates. If you want to have a temporary view that is shared among all sessions and kept alive
+until the Spark application terminates, you can create a global temporary view. A global temporary
+view is tied to a system-preserved database `global_temp`, and you must use the qualified name to
+refer to it, e.g. `SELECT * FROM global_temp.view1`.
+
+<div class="codetabs">
+<div data-lang="scala" markdown="1">
+{% include_example global_temp_view scala/org/apache/spark/examples/sql/SparkSQLExample.scala %}
+</div>
+
+<div data-lang="java" markdown="1">
+{% include_example global_temp_view java/org/apache/spark/examples/sql/JavaSparkSQLExample.java %}
+</div>
+
+<div data-lang="python" markdown="1">
+{% include_example global_temp_view python/sql/basic.py %}
+</div>
+
+<div data-lang="sql" markdown="1">
+
+{% highlight sql %}
+
+CREATE GLOBAL TEMPORARY VIEW temp_view AS SELECT a + 1, b * 2 FROM tbl
+
+SELECT * FROM global_temp.temp_view
+
+{% endhighlight %}
+
+</div>
+</div>
+
+
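For readers following along without the included example files, a minimal Scala sketch of the cross-session behavior described above (the view names and input path are illustrative; assumes `spark` is a `SparkSession`):

```scala
val df = spark.read.json("examples/src/main/resources/people.json")

// A regular temporary view is session-scoped and invisible to other sessions:
df.createOrReplaceTempView("people")
// spark.newSession().sql("SELECT * FROM people")  // fails: table not found

// A global temporary view lives in `global_temp` and is shared by all
// sessions until the application terminates.
df.createGlobalTempView("people_g")
spark.newSession().sql("SELECT * FROM global_temp.people_g").show()
```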
## Creating Datasets
Datasets are similar to RDDs, however, instead of using Java serialization or Kryo they use
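For context, Datasets rely on encoders rather than Java or Kryo serialization; a minimal Scala sketch of encoder-based Dataset creation (assumes `spark` is a `SparkSession`):

```scala
case class Person(name: String, age: Long)

// Encoders for case classes are created automatically via implicits.
import spark.implicits._

val ds = Seq(Person("Andy", 32)).toDS()
ds.show()

// Primitive types also have built-in encoders.
val primitiveDS = Seq(1, 2, 3).toDS()
primitiveDS.map(_ + 1).collect()  // Array(2, 3, 4)
```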
@@ -1058,14 +1093,14 @@ the Data Sources API. The following options are supported:
The JDBC fetch size, which determines how many rows to fetch per round trip. This can help performance on JDBC drivers which default to a low fetch size (e.g. Oracle with 10 rows).
</td>
</tr>
-
+
<tr>
<td><code>truncate</code></td>
<td>
- This is a JDBC writer related option. When <code>SaveMode.Overwrite</code> is enabled, this option causes Spark to truncate an existing table instead of dropping and recreating it. This can be more efficient, and prevents the table metadata (e.g. indices) from being removed. However, it will not work in some cases, such as when the new data has a different schema. It defaults to <code>false</code>.
+ This is a JDBC writer related option. When <code>SaveMode.Overwrite</code> is enabled, this option causes Spark to truncate an existing table instead of dropping and recreating it. This can be more efficient, and prevents the table metadata (e.g. indices) from being removed. However, it will not work in some cases, such as when the new data has a different schema. It defaults to <code>false</code>.
</td>
</tr>
-
+
<tr>
<td><code>createTableOptions</code></td>
<td>
@@ -1101,11 +1136,11 @@ USING org.apache.spark.sql.jdbc
OPTIONS (
url "jdbc:postgresql:dbserver",
dbtable "schema.tablename",
- user 'username',
+ user 'username',
password 'password'
)
-INSERT INTO TABLE jdbcTable
+INSERT INTO TABLE jdbcTable
SELECT * FROM resultTable
{% endhighlight %}
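The same options can also be set from the DataFrame reader/writer API; a minimal Scala sketch (connection details are placeholders; assumes `spark` is a `SparkSession`):

```scala
import org.apache.spark.sql.SaveMode

// Reading: `fetchsize` controls how many rows are fetched per round trip,
// useful for drivers with a low default (e.g. Oracle's 10 rows).
val jdbcDF = spark.read
  .format("jdbc")
  .option("url", "jdbc:postgresql:dbserver")
  .option("dbtable", "schema.tablename")
  .option("user", "username")
  .option("password", "password")
  .option("fetchsize", "1000")
  .load()

// Writing: with SaveMode.Overwrite, `truncate` keeps the existing table
// (and its metadata, e.g. indices) and truncates it instead of dropping
// and recreating it.
jdbcDF.write
  .format("jdbc")
  .option("url", "jdbc:postgresql:dbserver")
  .option("dbtable", "schema.othertable")
  .option("user", "username")
  .option("password", "password")
  .option("truncate", "true")
  .mode(SaveMode.Overwrite)
  .save()
```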