Diffstat (limited to 'site/documentation.html')
-rw-r--r-- site/documentation.html 308
1 files changed, 175 insertions, 133 deletions
diff --git a/site/documentation.html b/site/documentation.html
index 697919801..a2aae3e54 100644
--- a/site/documentation.html
+++ b/site/documentation.html
@@ -1,27 +1,20 @@
<!DOCTYPE html>
-<!--[if IE 6]>
-<html id="ie6" dir="ltr" lang="en-US">
-<![endif]-->
-<!--[if IE 7]>
-<html id="ie7" dir="ltr" lang="en-US">
-<![endif]-->
-<!--[if IE 8]>
-<html id="ie8" dir="ltr" lang="en-US">
-<![endif]-->
-<!--[if !(IE 6) | !(IE 7) | !(IE 8) ]><!-->
-<html dir="ltr" lang="en-US">
-<!--<![endif]-->
+<html lang="en">
<head>
- <link rel="shortcut icon" href="/favicon.ico" />
- <meta charset="UTF-8" />
- <meta name="viewport" content="width=device-width" />
+ <meta charset="utf-8">
+ <meta http-equiv="X-UA-Compatible" content="IE=edge">
+ <meta name="viewport" content="width=device-width, initial-scale=1.0">
+
<title>
Documentation | Apache Spark
</title>
- <link rel="stylesheet" type="text/css" media="all" href="/css/style.css" />
- <link rel="stylesheet" href="/css/pygments-default.css">
+
+
+ <!-- Bootstrap core CSS -->
+ <link href="/css/cerulean.min.css" rel="stylesheet">
+ <link href="/css/custom.css" rel="stylesheet">
<script type="text/javascript">
<!-- Google Analytics initialization -->
@@ -46,102 +39,137 @@
}
</script>
- <link rel='canonical' href='/index.html' />
-
- <style type="text/css">
- #site-title,
- #site-description {
- position: absolute !important;
- clip: rect(1px 1px 1px 1px); /* IE6, IE7 */
- clip: rect(1px, 1px, 1px, 1px);
- }
- </style>
- <style type="text/css" id="custom-background-css">
- body.custom-background { background-color: #f1f1f1; }
- </style>
+ <!-- HTML5 shim and Respond.js IE8 support of HTML5 elements and media queries -->
+ <!--[if lt IE 9]>
+ <script src="https://oss.maxcdn.com/libs/html5shiv/3.7.0/html5shiv.js"></script>
+ <script src="https://oss.maxcdn.com/libs/respond.js/1.3.0/respond.min.js"></script>
+ <![endif]-->
</head>
-<!--body class="page singular"-->
-<body class="page singular">
-<div id="page" class="hfeed">
-
- <header id="branding" role="banner">
- <hgroup>
- <h1 id="site-title"><span><a href="/" title="Spark" rel="home">Spark</a></span></h1>
- <h2 id="site-description">Lightning-Fast Cluster Computing</h2>
- </hgroup>
-
- <a id="main-logo" href="/">
- <img style="height:175px; width:auto;" src="/images/spark-project-header1-cropped.png" alt="Spark: Lightning-Fast Cluster Computing" title="Spark: Lightning-Fast Cluster Computing" />
- </a>
- <div class="widget-summit">
- <a href="http://spark-summit.org"><img src="/images/Summit-Logo-FINALtr-150x150px.png" /></a>
- <div class="text">
- <a href="http://spark-summit.org/2013">
-
- <strong>Videos and Slides<br/>
- Available Now!</strong>
- </a>
- </div>
+<body>
+
+<div class="container" style="max-width: 1200px;">
+
+<div class="masthead">
+
+ <p class="lead">
+ <a href="/">
+ <img src="/images/spark-logo.png"
+ style="height:100px; width:auto; vertical-align: bottom; margin-top: 20px;"></a><span class="tagline">
+ Lightning-fast cluster computing
+ </span>
+ </p>
+
+</div>
+
+<nav class="navbar navbar-default" role="navigation">
+ <!-- Brand and toggle get grouped for better mobile display -->
+ <div class="navbar-header">
+ <button type="button" class="navbar-toggle" data-toggle="collapse"
+ data-target="#navbar-collapse-1">
+ <span class="sr-only">Toggle navigation</span>
+ <span class="icon-bar"></span>
+ <span class="icon-bar"></span>
+ <span class="icon-bar"></span>
+ </button>
</div>
- <nav id="access" role="navigation">
- <h3 class="assistive-text">Main menu</h3>
- <div class="menu-main-menu-container">
- <ul id="menu-main-menu" class="menu">
-
- <li class="menu-item menu-item-type-post_type menu-item-object-page ">
- <a href="/index.html">Home</a>
- </li>
-
- <li class="menu-item menu-item-type-post_type menu-item-object-page ">
- <a href="/downloads.html">Downloads</a>
- </li>
-
- <li class="menu-item menu-item-type-post_type menu-item-object-page current-menu-item">
- <a href="/documentation.html">Documentation</a>
- </li>
+ <!-- Collect the nav links, forms, and other content for toggling -->
+ <div class="collapse navbar-collapse" id="navbar-collapse-1">
+ <ul class="nav navbar-nav">
+ <li><a href="/downloads.html">Download</a></li>
+ <li class="dropdown">
+ <a href="#" class="dropdown-toggle" data-toggle="dropdown">
+ Related Projects <b class="caret"></b>
+ </a>
+ <ul class="dropdown-menu">
+ <li><a href="http://shark.cs.berkeley.edu">Shark (SQL)</a></li>
+ <li><a href="/streaming/">Spark Streaming</a></li>
+ <li><a href="/mllib/">MLlib (machine learning)</a></li>
+ <li><a href="http://amplab.github.io/graphx/">GraphX (graph)</a></li>
+ </ul>
+ </li>
+ <li class="dropdown">
+ <a href="#" class="dropdown-toggle" data-toggle="dropdown">
+ Documentation <b class="caret"></b>
+ </a>
+ <ul class="dropdown-menu">
+ <li><a href="/documentation.html">Overview</a></li>
+ <li><a href="/docs/latest/">Latest Release</a></li>
+ <li><a href="/examples.html">Examples</a></li>
+ </ul>
+ </li>
+ <li class="dropdown">
+ <a href="#" class="dropdown-toggle" data-toggle="dropdown">
+ Community <b class="caret"></b>
+ </a>
+ <ul class="dropdown-menu">
+ <li><a href="/community.html">Mailing Lists</a></li>
+ <li><a href="/community.html#events">Events and Meetups</a></li>
+ <li><a href="/community.html#history">Project History</a></li>
+ <li><a href="https://cwiki.apache.org/confluence/display/SPARK/Powered+By+Spark">Powered By</a></li>
+ </ul>
+ </li>
+ <li><a href="/faq.html">FAQ</a></li>
+ </ul>
+ </div>
+ <!-- /.navbar-collapse -->
+</nav>
+
+
+<div class="row">
+ <div class="col-md-3 col-md-push-9">
+ <div class="news" style="margin-bottom: 20px;">
+ <h5>Latest News</h5>
+ <ul class="list-unstyled">
- <li class="menu-item menu-item-type-post_type menu-item-object-page ">
- <a href="/examples.html">Examples</a>
- </li>
+ <li><a href="/news/spark-0-8-1-released.html">Spark 0.8.1 released</a>
+ <span class="small">(Dec 19, 2013)</span></li>
- <li class="menu-item menu-item-type-post_type menu-item-object-page ">
- <a href="/mailing-lists.html">Mailing Lists</a>
- </li>
+ <li><a href="/news/spark-summit-2013-is-a-wrap.html">Spark Summit 2013 is a Wrap</a>
+ <span class="small">(Dec 15, 2013)</span></li>
- <li class="menu-item menu-item-type-post_type menu-item-object-page ">
- <a href="/research.html">Research</a>
- </li>
+ <li><a href="/news/announcing-the-first-spark-summit.html">Announcing the first Spark Summit: December 2, 2013</a>
+ <span class="small">(Oct 08, 2013)</span></li>
- <li class="menu-item menu-item-type-post_type menu-item-object-page ">
- <a href="/faq.html">FAQ</a>
- </li>
+ <li><a href="/news/spark-0-8-0-released.html">Spark 0.8.0 released</a>
+ <span class="small">(Sep 25, 2013)</span></li>
- </ul></div>
- </nav><!-- #access -->
-</header><!-- #branding -->
-
-
+ </ul>
+ <p class="small" style="text-align: right;"><a href="/news/index.html">Archive</a></p>
+ </div>
+ <div class="hidden-xs hidden-sm">
+ <a href="/downloads.html" class="btn btn-success btn-lg btn-block" style="margin-bottom: 30px;">
+ Download Spark
+ </a>
+ <p style="font-size: 16px; font-weight: 500; color: #555;">
+ Related Projects:
+ </p>
+ <ul class="list-narrow">
+ <li><a href="http://shark.cs.berkeley.edu">Shark (SQL)</a></li>
+ <li><a href="/streaming/">Spark Streaming</a></li>
+ <li><a href="/mllib/">MLlib (machine learning)</a></li>
+ <li><a href="http://amplab.github.io/graphx/">GraphX (graph)</a></li>
+ </ul>
+ </div>
+ </div>
- <div id="main">
- <div id="primary">
- <div id="content" role="main">
-
- <article class="page type-page status-publish hentry">
- <h2>Spark Documentation</h2>
+ <div class="col-md-9 col-md-pull-3">
+ <h2>Spark Documentation</h2>
<p>Setup instructions, programming guides, and other documentation are available for each version of Spark below:</p>
<ul>
<li><a href="/docs/latest/">Spark 0.8.1 (latest release)</a></li>
- <li><a href="/docs/0.8.0/">Spark 0.8.0</a></li>
<li><a href="/docs/0.7.3/">Spark 0.7.3</a></li>
<li><a href="/docs/0.6.2/">Spark 0.6.2</a></li>
- <li><a href="https://github.com/mesos/spark/wiki/Spark-0.5-Documentation">Spark 0.5.x</a> (hosted on GitHub)</li>
</ul>
-<p>Read these documents to get started with Spark. In addition, this page lists some external resources for learning Spark.</p>
+<p>Read these documents to get started with Spark, as well as with the built-in components
+(<a href="/docs/latest/mllib-guide.html">MLlib</a> and
+<a href="/docs/latest/streaming-programming-guide.html">Spark Streaming</a>).</p>
+
+<p>In addition, this page lists some external resources for learning Spark.</p>
<h3>Video Tutorials</h3>
@@ -156,23 +184,37 @@
<h3>Hands-On Exercises</h3>
<ul>
- <li><a href="http://ampcamp.berkeley.edu/3/exercises/">Hands-on exercises</a> are available online. These exercises let you launch a small EC2 cluster, load a dataset, and query it with Spark, Shark, Spark Streaming, and MLLib.</li>
+ <li><a href="http://spark-summit.org/2013/exercises/">Hands-on exercises</a> are available online from Spark Summit 2013. These exercises let you launch a small EC2 cluster, load a dataset, and query it with Spark, Shark, Spark Streaming, and MLLib.</li>
</ul>
-<h3>Spark Summit Slides and Videos</h3>
+<p><a name="summit"></a></p>
+<h3>Training Materials</h3>
+<ul>
+ <li><a href="http://spark-summit.org/2013">Spark Summit 2013</a> contained a training session for which
+ <a href="http://spark-summit.org/summit-2013/#day2">slides and videos</a> are available for free online.
+ The session also included <a href="http://spark-summit.org/2013/exercises/">exercises</a> that you can run on Amazon EC2.</li>
+ <li>The <a href="https://amplab.cs.berkeley.edu/">UC Berkeley AMPLab</a> regularly hosts training camps on Spark and related projects.
+Slides and videos are available online:
<ul>
- <li><a href="http://spark-summit.org/2013">Spark Summit 2013</a> was held in downtown San Francisco in December 2013. Slides and Videos of all talks are available for free. Look for links next to talk titles on the event agenda.</li>
+ <li><a href="http://ampcamp.berkeley.edu/3/">AMP Camp Three</a> (Berkeley, CA, August 2013)</li>
+ <li><a href="http://ampcamp.berkeley.edu/amp-camp-two-strata-2013/">AMP Camp Two</a> (Strata Santa Clara, February 2013)</li>
+ <li><a href="http://ampcamp.berkeley.edu/agenda-2012/">AMP Camp One</a> (Berkeley, CA, August 2012)</li>
+ </ul>
+ </li>
</ul>
-<h3>AMP Camp Slides and Videos</h3>
+<h3>External Tutorials, Blog Posts, and Talks</h3>
<ul>
- <li>The <a href="https://amplab.cs.berkeley.edu/">UC Berkeley AMPLab</a> regularly hosts two-day training camps on Spark and related "big data" components.
-Slides and videos from each camp are posted online:
- <br /><a href="http://ampcamp.berkeley.edu/3/">AMP Camp Three</a> <em>Big Data Bootcamp Berkeley</em> (August 2013)
- <br /><a href="http://ampcamp.berkeley.edu/amp-camp-two-strata-2013/">AMP Camp Two</a> <em>Big Data Bootcamp Strata</em> (February 2013)
- <br /><a href="http://ampcamp.berkeley.edu/agenda-2012/">AMP Camp One</a> <em>Big Data Bootcamp Berkeley</em> (August 2012)
- </li>
+ <li><a href="http://spark-summit.org/2013">Spark Summit 2013</a> &mdash; contained 30 talks about Spark use cases, available as slides and videos</li>
+ <li><a href="http://www.pwendell.com/2013/09/28/declarative-streams.html">Sampling Twitter Using Declarative Streams</a> &mdash; Spark Streaming tutorial by Patrick Wendell</li>
+ <li><a href="http://zenfractal.com/2013/08/21/a-powerful-big-data-trio/">A Powerful Big Data Trio: Spark, Parquet and Avro</a> &mdash; Using Parquet in Spark by Matt Massie</li>
+ <li><a href="http://www.slideshare.net/EvanChan2/cassandra2013-spark-talk-final">Real-time Analytics with Cassandra, Spark, and Shark</a> &mdash; Presentation by Evan Chan from Ooyala at 2013 Cassandra Summit</li>
+ <li><a href="http://syndeticlogic.net/?p=311">Getting Spark Setup in Eclipse</a> &mdash; Developer blog post by James Percent</li>
+ <li><a href="http://aws.amazon.com/articles/Elastic-MapReduce/4926593393724923">Run Spark and Shark on Amazon Elastic MapReduce</a> &mdash; Article by Amazon Elastic MapReduce team member Parviz Deyhim</li>
+ <li><a href="http://blog.quantifind.com/posts/spark-unit-test/">Unit testing with Spark</a> &mdash; Quantifind tech blog post by Imran Rashid</li>
+ <li><a href="http://blog.quantifind.com/posts/logging-post/">Configuring Spark logs</a> &mdash; Quantifind tech blog by Imran Rashid</li>
+ <li><a href="http://www.ibm.com/developerworks/library/os-spark/">Spark, an alternative for fast data analytics</a> &mdash; IBM Developer Works article by M. Tim Jones</li>
</ul>
<h3>Books</h3>
@@ -181,27 +223,27 @@ Slides and videos from each camp are posted online:
<li><a href="http://www.packtpub.com/fast-data-processing-with-spark/book">Fast Data Processing with Spark</a>, by Holden Karau (Packt Publishing)</li>
</ul>
-<h3>External Tutorials, Development Blogs, and Talks</h3>
+<h3>Examples</h3>
<ul>
- <li><a href="http://www.pwendell.com/2013/09/28/declarative-streams.html">Sampling Twitter Using Declarative Streams</a> -- Spark Streaming tutorial by Patrick Wendell</li>
- <li><a href="http://zenfractal.com/2013/08/21/a-powerful-big-data-trio/">A Powerful Big Data Trio: Spark, Parquet and Avro</a> -- Using Parquet in Spark by Matt Massie</li>
- <li><a href="http://www.slideshare.net/EvanChan2/cassandra2013-spark-talk-final">Real-time Analytics with Cassandra, Spark, and Shark</a> -- Presentation by Evan Chan from Ooyala at the 2013 Cassandra Summit</li>
- <li><a href="http://syndeticlogic.net/?p=311">Getting Spark Setup in Eclipse</a> -- Developer blog post by James Percent</li>
- <li><a href="http://aws.amazon.com/articles/Elastic-MapReduce/4926593393724923">Run Spark and Shark on Amazon Elastic MapReduce</a> -- Article by Amazon AWS Elastic MapReduce team member Parviz Deyhim</li>
- <li><a href="http://blog.quantifind.com/posts/spark-unit-test/">Unit testing with Spark</a> -- Quantifind tech blog post by Imran Rashid</li>
- <li><a href="http://blog.quantifind.com/posts/logging-post/">Configuring Spark logs</a> -- Quantifind tech blog by Imran Rashid</li>
- <li><a href="http://www.ibm.com/developerworks/library/os-spark/">Spark, an alternative for fast data analytics</a> -- IBM Developer Works article by M. Tim Jones</li>
+ <li>The <a href="/examples.html">Spark examples page</a> shows the basic API in Scala, Java and Python.</li>
</ul>
-<h3>Spark Internals</h3>
+<h3>Wiki</h3>
-<ul>
- <li><a href="http://www.youtube.com/watch?v=49Hr5xZyTEA">Overview of Spark Internals [advanced]</a> (<a href="/talks/dev-meetup-dec-2012.pptx">pptx</a>) (<a href="http://www.youtube.com/watch?v=49Hr5xZyTEA">video</a>)</li>
-</ul>
+<ul><li>
+The <a href="https://cwiki.apache.org/confluence/display/SPARK/Wiki+Homepage">Spark wiki</a> contains
+information for developers, such as architecture documents and how to <a href="https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark">contribute</a> to Spark.
+</li></ul>
<h3>Research Papers</h3>
+<p>
+Spark was initially developed as a UC Berkeley research project, and much of the design is documented in papers.
+The <a href="/research.html">research page</a> lists some of the original motivation and direction.
+The following papers have been published about Spark and related projects.
+</p>
+
<ul>
<li>
<a href="http://www.eecs.berkeley.edu/Pubs/TechRpts/2012/EECS-2012-214.pdf">Shark: SQL and Rich Analytics at Scale</a>. Reynold Xin, Joshua Rosen, Matei Zaharia, Michael J. Franklin, Scott Shenker, Ion Stoica. <em>Technical Report UCB/EECS-2012-214</em>. November 2012.
@@ -222,25 +264,25 @@ Slides and videos from each camp are posted online:
</li>
</ul>
- </article><!-- #post -->
-
- </div><!-- #content -->
-
- <footer id="colophon" role="contentinfo">
- <div id="site-generator">
- <p style="padding-top: 0; padding-bottom: 15px;">
- Apache Spark is an effort undergoing incubation at The Apache Software Foundation.
- <a href="http://incubator.apache.org/" style="border: none;">
- <img style="vertical-align: middle; border: none;" src="/images/incubator-logo.png" alt="Apache Incubator" title="Apache Incubator" />
- </a>
- </p>
</div>
-</footer><!-- #colophon -->
+</div>
+
+
+
+<footer class="small">
+ <hr>
+ Apache Spark is an effort undergoing incubation at The Apache Software Foundation.
+ <a href="http://incubator.apache.org/" style="border: none;">
+ <img style="vertical-align: middle; float: right; margin-bottom: 15px;"
+ src="/images/incubator-logo.png" alt="Apache Incubator" title="Apache Incubator" />
+ </a>
+</footer>
- </div><!-- #primary -->
- </div><!-- #main -->
-</div><!-- #page -->
+</div>
+<script src="https://code.jquery.com/jquery.js"></script>
+<script src="//netdna.bootstrapcdn.com/bootstrap/3.0.3/js/bootstrap.min.js"></script>
+<script src="/js/lang-tabs.js"></script>
</body>
</html>