Updates to SQL page

author: Matei Alexandru Zaharia <matei@apache.org> 2015-07-25 23:10:48 +0000
committer: Matei Alexandru Zaharia <matei@apache.org> 2015-07-25 23:10:48 +0000
commit: f4fb827ef5aa831ace6f0ce21d6b02e83f409b63 (patch)
tree: 9c84b511d584f0f9cd3500f6a887fc92d8348955 /site/sql
parent: 2de4e60511dad1ec7e4ac3974b14dcf85faaad50 (diff)
1 files changed, 15 insertions, 18 deletions
diff --git a/site/sql/index.html b/site/sql/index.html
index e14fc7171..1f1b6ab0c 100644
--- a/site/sql/index.html
+++ b/site/sql/index.html
@@ -6,7 +6,7 @@
   <meta name="viewport" content="width=device-width, initial-scale=1.0">
 
   <title>
-     Spark SQL | Apache Spark
+     Spark SQL &amp; DataFrames | Apache Spark
     
   </title>
 
@@ -93,7 +93,7 @@
           Libraries <b class="caret"></b>
         </a>
         <ul class="dropdown-menu">
-          <li><a href="/sql/">Spark SQL</a></li>
+          <li><a href="/sql/">SQL and DataFrames</a></li>
           <li><a href="/streaming/">Spark Streaming</a></li>
           <li><a href="/mllib/">MLlib (machine learning)</a></li>
           <li><a href="/graphx/">GraphX (graph)</a></li>
@@ -160,7 +160,7 @@
         Built-in Libraries:
       </p>
       <ul class="list-none">
-        <li><a href="/sql/">Spark SQL</a></li>
+        <li><a href="/sql/">SQL and DataFrames</a></li>
         <li><a href="/streaming/">Spark Streaming</a></li>
         <li><a href="/mllib/">MLlib (machine learning)</a></li>
         <li><a href="/graphx/">GraphX (graph)</a></li>
@@ -181,8 +181,7 @@
 	  Seamlessly mix SQL queries with Spark programs.
     </p>
     <p>
-	  Spark SQL lets you query structured data as a distributed dataset (RDD) in Spark, with integrated APIs in Python, Scala and Java. 
-	  This tight integration makes it easy to run SQL queries alongside complex analytic algorithms.
+	  Spark SQL lets you query structured data inside Spark programs, using either SQL or a familiar <a href="/docs/latest/sql-programming-guide.html">DataFrame API</a>. Usable in Java, Scala, Python and R.
     </p>
   </div>
   <div class="col-md-5 col-sm-5 col-padded-top col-center">
@@ -200,19 +199,19 @@
 
 <div class="row row-padded">
   <div class="col-md-7 col-sm-7">
-    <h2>Unified Data Access</h2>
+    <h2>Uniform Data Access</h2>
     <p class="lead">
-      Load and query data from a variety of sources.
+      Connect to any data source the same way.
     </p>
     <p>
-      SchemaRDDs provide a single interface for efficiently working with structured data, including Apache Hive tables, parquet files and JSON files.
+      DataFrames and SQL provide a common way to access a variety of data sources, including Hive, Avro, Parquet, ORC, JSON, and JDBC. You can even join data across these sources.
     </p>
   </div>
   <div class="col-md-5 col-sm-5 col-padded-top col-center">
     <div style="margin-top: 15px; text-align: left; display: inline-block;">
       <div class="code">
 		sqlCtx.<span class="sparkop">jsonFile</span>(<span class="closure">"s3n://..."</span>)<br />&nbsp;&nbsp;.registerAsTable("json")<br />
-		schema_rdd = sqlCtx.<span class="sparkop">sql</span>(<span class="closure">"""<br />
+		results = sqlCtx.<span class="sparkop">sql</span>(<span class="closure">"""<br />
 			&nbsp;&nbsp;SELECT * <br />
 			&nbsp;&nbsp;FROM hiveTable<br />
 			&nbsp;&nbsp;JOIN json ..."""</span>)<br />
@@ -226,7 +225,7 @@
   <div class="col-md-7 col-sm-7">
     <h2>Hive Compatibility</h2>
     <p class="lead">
-      Run unmodified Hive queries on existing warehouses.
+      Run unmodified Hive queries on existing data.
     </p>
     <p>
       Spark SQL reuses the Hive frontend and metastore, giving you full compatibility with
@@ -248,7 +247,7 @@
       Connect through JDBC or ODBC.
     </p>
     <p>
-      Spark SQL includes a server mode with industry standard JDBC and ODBC connectivity.
+      A server mode provides industry standard JDBC and ODBC connectivity for business intelligence tools.
     </p>
   </div>
   <div class="col-md-5 col-sm-5 col-padded-top col-center">
@@ -288,13 +287,11 @@
   
 <div class="row">
   <div class="col-md-4 col-padded">
-    <h3>Scalability</h3>
+    <h3>Performance &amp; Scalability</h3>
     <p>
-  	  Use the same engine for both interactive and long queries.		
-    </p>
-	<p>
-      Spark SQL takes advantage of the RDD model to support mid-query fault tolerance, letting it scale to large jobs too.
-	  Don't worry about using a different engine for historical data.
+      Spark SQL includes a cost-based optimizer, columnar storage and code generation to make queries fast.
+      At the same time, it scales to thousands of nodes and multi hour queries using the Spark engine, which provides full mid-query fault tolerance.
+      Don't worry about using a different engine for historical data.
     </p>
   </div>
 
@@ -322,7 +319,7 @@
     </p>
     <ul class="list-narrow">
       <li><a href="/downloads.html">Download Spark</a>. It includes Spark SQL as a module.</li>
-      <li>Read the <a href="/docs/latest/sql-programming-guide.html">Spark SQL programming guide</a>, which includes a examples of common use cases.</li>
+      <li>Read the <a href="/docs/latest/sql-programming-guide.html">Spark SQL and DataFrame guide</a> to learn the API.</li>
     </ul>
   </div>
 </div>