-rw-r--r--  index.md                                                    8
-rw-r--r--  site/downloads.html                                         2
-rw-r--r--  site/index.html                                             8
-rw-r--r--  site/news/amp-camp-2013-registration-ope.html               2
-rw-r--r--  site/news/index.html                                       36
-rw-r--r--  site/news/run-spark-and-shark-on-amazon-emr.html            2
-rw-r--r--  site/news/spark-0-6-1-and-0-5-2-released.html               2
-rw-r--r--  site/news/spark-0-7-0-released.html                         2
-rw-r--r--  site/news/spark-0-7-2-released.html                         2
-rw-r--r--  site/news/spark-0-7-3-released.html                         2
-rw-r--r--  site/news/spark-0-8-0-released.html                         2
-rw-r--r--  site/news/spark-and-shark-in-the-news.html                  4
-rw-r--r--  site/news/spark-meetups.html                                2
-rw-r--r--  site/news/spark-user-survey-and-powered-by-page.html        4
-rw-r--r--  site/news/strata-exercises-now-available-online.html        2
-rw-r--r--  site/news/video-from-first-spark-development-meetup.html    2
-rw-r--r--  site/releases/spark-release-0-3.html                        2
-rw-r--r--  site/releases/spark-release-0-5-0.html                      8
-rw-r--r--  site/releases/spark-release-0-5-1.html                      2
-rw-r--r--  site/releases/spark-release-0-6-0.html                      6
-rw-r--r--  site/releases/spark-release-0-7-0.html                      4
-rw-r--r--  site/releases/spark-release-0-8-0.html                    138
-rw-r--r--  site/screencasts/3-transformations-and-caching.html       147
-rw-r--r--  site/screencasts/4-a-standalone-job-in-spark.html         147
-rw-r--r--  site/screencasts/index.html                                 4

25 files changed, 125 insertions, 415 deletions
diff --git a/index.md b/index.md
index cc7ca6220..128933cc9 100644
--- a/index.md
+++ b/index.md
@@ -28,9 +28,11 @@ Spark is also the engine behind <a href="http://shark.cs.berkeley.edu" onclick="
While Spark is a new engine, it can access any data source supported by Hadoop, making it easy to run over existing data.
## Who uses it?
-Spark was initially developed in the <a href="https://amplab.cs.berkeley.edu" onclick="javascript:_gaq.push(['_trackEvent','outbound-article','http://amplab.cs.berkeley.edu']);">UC Berkeley AMPLab</a>, but is now being used and developed at a wide array of companies, including <a href="http://www.yahoo.com" onclick="javascript:_gaq.push(['_trackEvent','outbound-article','http://www.yahoo.com']);">Yahoo!</a>, <a href="http://www.conviva.com" onclick="javascript:_gaq.push(['_trackEvent','outbound-article','http://www.conviva.com']);">Conviva</a>, and <a href="http://www.quantifind.com" onclick="javascript:_gaq.push(['_trackEvent','outbound-article','http://www.quantifind.com']);">Quantifind</a>.
-In total, over 20 companies have contributed code to Spark.
-Spark is <a href="https://github.com/mesos/spark" onclick="javascript:_gaq.push(['_trackEvent','outbound-article','http://github.com']);">open source</a> under an Apache license, so <a href="{{site.url}}downloads.html" >download</a> it to check it out.
+Spark was initially created in the <a href="https://amplab.cs.berkeley.edu" onclick="javascript:_gaq.push(['_trackEvent','outbound-article','http://amplab.cs.berkeley.edu']);">UC Berkeley AMPLab</a>, but is now being used and developed at a wide array of companies.
+See our <a href="https://cwiki.apache.org/confluence/display/SPARK/Powered+By+Spark">powered by page</a> for a list of users,
+and our <a href="https://cwiki.apache.org/confluence/display/SPARK/Committers">list of committers</a>.
+In total, over 25 companies have contributed code to Spark.
+Spark is <a href="https://github.com/apache/incubator-spark" onclick="javascript:_gaq.push(['_trackEvent','outbound-article','http://github.com']);">open source</a> under an Apache license, so <a href="{{site.url}}downloads.html" >download</a> it to try it out.
## Apache Incubator notice
Apache Spark is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the Apache Incubator. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.
diff --git a/site/downloads.html b/site/downloads.html
index 30528f597..5bb126baa 100644
--- a/site/downloads.html
+++ b/site/downloads.html
@@ -149,7 +149,7 @@ version: 0.8.0-incubating
<h3 id="development-version">Development Version</h3>
<p>If you are interested in working with the newest under-development code or contributing to Spark development, you can also check out the master branch from Git: <tt>git clone git://github.com/apache/incubator-spark.git</tt>.</p>
-<p>Once you&#8217;ve downloaded Spark, you can find instructions for installing and building it on the <a href="/documentation.html">documentation page</a>.</p>
+<p>Once you’ve downloaded Spark, you can find instructions for installing and building it on the <a href="/documentation.html">documentation page</a>.</p>
<h3 id="previous-releases">Previous Releases</h3>
<ul>
diff --git a/site/index.html b/site/index.html
index c0f3e73c6..3930dbeb7 100644
--- a/site/index.html
+++ b/site/index.html
@@ -140,9 +140,11 @@ You can also use Spark interactively from the Scala and Python shells to rapidly
<p>While Spark is a new engine, it can access any data source supported by Hadoop, making it easy to run over existing data.</p>
<h2 id="who-uses-it">Who uses it?</h2>
-<p>Spark was initially developed in the <a href="https://amplab.cs.berkeley.edu" onclick="javascript:_gaq.push(['_trackEvent','outbound-article','http://amplab.cs.berkeley.edu']);">UC Berkeley AMPLab</a>, but is now being used and developed at a wide array of companies, including <a href="http://www.yahoo.com" onclick="javascript:_gaq.push(['_trackEvent','outbound-article','http://www.yahoo.com']);">Yahoo!</a>, <a href="http://www.conviva.com" onclick="javascript:_gaq.push(['_trackEvent','outbound-article','http://www.conviva.com']);">Conviva</a>, and <a href="http://www.quantifind.com" onclick="javascript:_gaq.push(['_trackEvent','outbound-article','http://www.quantifind.com']);">Quantifind</a>.
-In total, over 20 companies have contributed code to Spark.
-Spark is <a href="https://github.com/mesos/spark" onclick="javascript:_gaq.push(['_trackEvent','outbound-article','http://github.com']);">open source</a> under an Apache license, so <a href="/downloads.html">download</a> it to check it out.</p>
+<p>Spark was initially created in the <a href="https://amplab.cs.berkeley.edu" onclick="javascript:_gaq.push(['_trackEvent','outbound-article','http://amplab.cs.berkeley.edu']);">UC Berkeley AMPLab</a>, but is now being used and developed at a wide array of companies.
+See our <a href="https://cwiki.apache.org/confluence/display/SPARK/Powered+By+Spark">powered by page</a> for a list of users,
+and our <a href="https://cwiki.apache.org/confluence/display/SPARK/Committers">list of committers</a>.
+In total, over 25 companies have contributed code to Spark.
+Spark is <a href="https://github.com/apache/incubator-spark" onclick="javascript:_gaq.push(['_trackEvent','outbound-article','http://github.com']);">open source</a> under an Apache license, so <a href="/downloads.html">download</a> it to try it out.</p>
<h2 id="apache-incubator-notice">Apache Incubator notice</h2>
<p>Apache Spark is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the Apache Incubator. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.</p>
diff --git a/site/news/amp-camp-2013-registration-ope.html b/site/news/amp-camp-2013-registration-ope.html
index 1e217db4f..8584fff1f 100644
--- a/site/news/amp-camp-2013-registration-ope.html
+++ b/site/news/amp-camp-2013-registration-ope.html
@@ -122,7 +122,7 @@
<h2>Registration open for AMP Camp training camp in Berkeley</h2>
-<p>Want to learn how to use Spark, Shark, GraphX, and related technologies in person? The AMP Lab is hosting a two-day training workshop for them on August 29th and 30th in Berkeley. The workshop will include tutorials, talks from users, and over four hours of hands-on exercises. <a href="http://ampcamp.berkeley.edu/amp-camp-three-berkeley-2013/">Registration is now open on the AMP Camp website</a>, for a price of $250 per person. We recommend signing up early because last year&#8217;s workshop was sold out.</p>
+<p>Want to learn how to use Spark, Shark, GraphX, and related technologies in person? The AMP Lab is hosting a two-day training workshop for them on August 29th and 30th in Berkeley. The workshop will include tutorials, talks from users, and over four hours of hands-on exercises. <a href="http://ampcamp.berkeley.edu/amp-camp-three-berkeley-2013/">Registration is now open on the AMP Camp website</a>, for a price of $250 per person. We recommend signing up early because last year’s workshop was sold out.</p>
</article><!-- #post -->
diff --git a/site/news/index.html b/site/news/index.html
index 4cadf5aad..43dc0dc5d 100644
--- a/site/news/index.html
+++ b/site/news/index.html
@@ -125,7 +125,7 @@
<h1 class="entry-title"><a href="/news/spark-0-8-0-released.html">Spark 0.8.0 released</a></h1>
<div class="entry-meta">September 25, 2013</div>
</header>
- <div class="entry-content"><p>We&#8217;re proud to announce the release of <a href="{{site.url}}releases/spark-release-0-8-0.html" title="Spark Release 0.8.0">Apache Spark 0.8.0</a>. Spark 0.8.0 is a major release that includes many new capabilities and usability improvements. It’s also our first release under the Apache incubator. It is the largest Spark release yet, with contributions from 67 developers and 24 companies. Major new features include an expanded monitoring framework and UI, a machine learning library, and support for running Spark inside of YARN.</p>
+ <div class="entry-content"><p>We’re proud to announce the release of <a href="/releases/spark-release-0-8-0.html" title="Spark Release 0.8.0">Apache Spark 0.8.0</a>. Spark 0.8.0 is a major release that includes many new capabilities and usability improvements. It’s also our first release under the Apache incubator. It is the largest Spark release yet, with contributions from 67 developers and 24 companies. Major new features include an expanded monitoring framework and UI, a machine learning library, and support for running Spark inside of YARN.</p>
</div>
</article>
@@ -135,7 +135,7 @@
<h1 class="entry-title"><a href="/news/spark-user-survey-and-powered-by-page.html">Spark user survey and "Powered By" page</a></h1>
<div class="entry-meta">September 05, 2013</div>
</header>
- <div class="entry-content"><p>As we continue developing Spark, we would love to get feedback from users and hear what you&#8217;d like us to work on next. We&#8217;ve decided that a good way to do that is a survey &#8211; we hope to run this at regular intervals. If you have a few minutes to participate, <a href="https://docs.google.com/forms/d/1eMXp4GjcIXglxJe5vYYBzXKVm-6AiYt1KThJwhCjJiY/viewform">fill in the survey here</a>. Your time is greatly appreciated.</p>
+ <div class="entry-content"><p>As we continue developing Spark, we would love to get feedback from users and hear what you’d like us to work on next. We’ve decided that a good way to do that is a survey – we hope to run this at regular intervals. If you have a few minutes to participate, <a href="https://docs.google.com/forms/d/1eMXp4GjcIXglxJe5vYYBzXKVm-6AiYt1KThJwhCjJiY/viewform">fill in the survey here</a>. Your time is greatly appreciated.</p>
</div>
</article>
@@ -145,7 +145,7 @@
<h1 class="entry-title"><a href="/news/fourth-spark-screencast-published.html">Fourth Spark screencast released</a></h1>
<div class="entry-meta">August 27, 2013</div>
</header>
- <div class="entry-content"><p>We have released the next screencast, <a href="{{site.url}}screencasts/4-a-standalone-job-in-spark.html">A Standalone Job in Scala</a> that takes you beyond the Spark shell, helping you write your first standalone Spark job.</p>
+ <div class="entry-content"><p>We have released the next screencast, <a href="/screencasts/4-a-standalone-job-in-spark.html">A Standalone Job in Scala</a> that takes you beyond the Spark shell, helping you write your first standalone Spark job.</p>
</div>
</article>
@@ -155,7 +155,7 @@
<h1 class="entry-title"><a href="/news/amp-camp-2013-registration-ope.html">Registration open for AMP Camp training camp in Berkeley</a></h1>
<div class="entry-meta">July 23, 2013</div>
</header>
- <div class="entry-content"><p>Want to learn how to use Spark, Shark, GraphX, and related technologies in person? The AMP Lab is hosting a two-day training workshop for them on August 29th and 30th in Berkeley. The workshop will include tutorials, talks from users, and over four hours of hands-on exercises. <a href="http://ampcamp.berkeley.edu/amp-camp-three-berkeley-2013/">Registration is now open on the AMP Camp website</a>, for a price of $250 per person. We recommend signing up early because last year&#8217;s workshop was sold out.</p>
+ <div class="entry-content"><p>Want to learn how to use Spark, Shark, GraphX, and related technologies in person? The AMP Lab is hosting a two-day training workshop for them on August 29th and 30th in Berkeley. The workshop will include tutorials, talks from users, and over four hours of hands-on exercises. <a href="http://ampcamp.berkeley.edu/amp-camp-three-berkeley-2013/">Registration is now open on the AMP Camp website</a>, for a price of $250 per person. We recommend signing up early because last year’s workshop was sold out.</p>
</div>
</article>
@@ -186,7 +186,7 @@
<h1 class="entry-title"><a href="/news/spark-0-7-3-released.html">Spark 0.7.3 released</a></h1>
<div class="entry-meta">July 16, 2013</div>
</header>
- <div class="entry-content"><p>We&#8217;ve just posted <a href="{{site.url}}releases/spark-release-0-7-3.html" title="Spark Release 0.7.3">Spark Release 0.7.3</a>, a maintenance release that contains several fixes, including streaming API updates and new functionality for adding JARs to a <code>spark-shell</code> session. We recommend that all users update to this release. Visit the <a href="{{site.url}}releases/spark-release-0-7-3.html" title="Spark Release 0.7.3">release notes</a> to read about the new features, or <a href="{{site.url}}downloads.html">download</a> the release today.</p>
+ <div class="entry-content"><p>We’ve just posted <a href="/releases/spark-release-0-7-3.html" title="Spark Release 0.7.3">Spark Release 0.7.3</a>, a maintenance release that contains several fixes, including streaming API updates and new functionality for adding JARs to a <code>spark-shell</code> session. We recommend that all users update to this release. Visit the <a href="/releases/spark-release-0-7-3.html" title="Spark Release 0.7.3">release notes</a> to read about the new features, or <a href="/downloads.html">download</a> the release today.</p>
</div>
</article>
@@ -216,7 +216,7 @@
<h1 class="entry-title"><a href="/news/spark-0-7-2-released.html">Spark 0.7.2 released</a></h1>
<div class="entry-meta">June 02, 2013</div>
</header>
- <div class="entry-content"><p>We&#8217;re happy to announce the release of <a href="{{site.url}}releases/spark-release-0-7-2.html" title="Spark Release 0.7.2">Spark 0.7.2</a>, a new maintenance release that includes several bug fixes and improvements, as well as new code examples and API features. We recommend that all users update to this release. Head over to the <a href="{{site.url}}releases/spark-release-0-7-2.html" title="Spark Release 0.7.2">release notes</a> to read about the new features, or <a href="{{site.url}}downloads.html">download</a> the release today.</p>
+ <div class="entry-content"><p>We’re happy to announce the release of <a href="/releases/spark-release-0-7-2.html" title="Spark Release 0.7.2">Spark 0.7.2</a>, a new maintenance release that includes several bug fixes and improvements, as well as new code examples and API features. We recommend that all users update to this release. Head over to the <a href="/releases/spark-release-0-7-2.html" title="Spark Release 0.7.2">release notes</a> to read about the new features, or <a href="/downloads.html">download</a> the release today.</p>
</div>
</article>
@@ -228,9 +228,9 @@
</header>
<div class="entry-content"><p>We have released the first two screencasts in a series of short hands-on video training courses we will be publishing to help new users get up and running with Spark in minutes.</p>
-<p>The first Spark screencast is called <a href="{{site.url}}screencasts/1-first-steps-with-spark.html">First Steps With Spark</a> and walks you through downloading and building Spark, as well as using the Spark shell, all in less than 10 minutes!</p>
+<p>The first Spark screencast is called <a href="/screencasts/1-first-steps-with-spark.html">First Steps With Spark</a> and walks you through downloading and building Spark, as well as using the Spark shell, all in less than 10 minutes!</p>
-<p>The second screencast is a 2 minute <a href="{{site.url}}screencasts/2-spark-documentation-overview.html">overview of the Spark documentation</a>.</p>
+<p>The second screencast is a 2 minute <a href="/screencasts/2-spark-documentation-overview.html">overview of the Spark documentation</a>.</p>
<p>We hope you find these screencasts useful.</p>
@@ -242,7 +242,7 @@
<h1 class="entry-title"><a href="/news/strata-exercises-now-available-online.html">Strata exercises now available online</a></h1>
<div class="entry-meta">March 17, 2013</div>
</header>
- <div class="entry-content"><p>At this year&#8217;s <a href="http://strataconf.com/strata2013">Strata</a> conference, the AMP Lab hosted a full day of tutorials on Spark, Shark, and Spark Streaming, including online exercises on Amazon EC2. Those exercises are now <a href="http://ampcamp.berkeley.edu/big-data-mini-course/">available online</a>, letting you learn Spark and Shark at your own pace on an EC2 cluster with real data. They are a great resource for learning the systems. You can also find <a href="http://ampcamp.berkeley.edu/amp-camp-two-strata-2013/">slides</a> from the Strata tutorials online, as well as <a href="http://ampcamp.berkeley.edu/amp-camp-one-berkeley-2012/">videos</a> from the AMP Camp workshop we held at Berkeley in August.</p>
+ <div class="entry-content"><p>At this year’s <a href="http://strataconf.com/strata2013">Strata</a> conference, the AMP Lab hosted a full day of tutorials on Spark, Shark, and Spark Streaming, including online exercises on Amazon EC2. Those exercises are now <a href="http://ampcamp.berkeley.edu/big-data-mini-course/">available online</a>, letting you learn Spark and Shark at your own pace on an EC2 cluster with real data. They are a great resource for learning the systems. You can also find <a href="http://ampcamp.berkeley.edu/amp-camp-two-strata-2013/">slides</a> from the Strata tutorials online, as well as <a href="http://ampcamp.berkeley.edu/amp-camp-one-berkeley-2012/">videos</a> from the AMP Camp workshop we held at Berkeley in August.</p>
</div>
</article>
@@ -252,7 +252,7 @@
<h1 class="entry-title"><a href="/news/spark-0-7-0-released.html">Spark 0.7.0 released</a></h1>
<div class="entry-meta">February 27, 2013</div>
</header>
- <div class="entry-content"><p>We&#8217;re proud to announce the release of <a href="{{site.url}}releases/spark-release-0-7-0.html" title="Spark Release 0.7.0">Spark 0.7.0</a>, a new major version of Spark that adds several key features, including a <a href="{{site.url}}docs/latest/python-programming-guide.html">Python API</a> for Spark and an <a href="{{site.url}}docs/latest/streaming-programming-guide.html">alpha of Spark Streaming</a>. This release is the result of the largest group of contributors yet behind a Spark release &#8211; 31 contributors from inside and outside Berkeley. Head over to the <a href="{{site.url}}releases/spark-release-0-7-0.html" title="Spark Release 0.7.0">release notes</a> to read more about the new features, or <a href="{{site.url}}downloads.html">download</a> the release today.</p>
+ <div class="entry-content"><p>We’re proud to announce the release of <a href="/releases/spark-release-0-7-0.html" title="Spark Release 0.7.0">Spark 0.7.0</a>, a new major version of Spark that adds several key features, including a <a href="/docs/latest/python-programming-guide.html">Python API</a> for Spark and an <a href="/docs/latest/streaming-programming-guide.html">alpha of Spark Streaming</a>. This release is the result of the largest group of contributors yet behind a Spark release – 31 contributors from inside and outside Berkeley. Head over to the <a href="/releases/spark-release-0-7-0.html" title="Spark Release 0.7.0">release notes</a> to read more about the new features, or <a href="/downloads.html">download</a> the release today.</p>
</div>
</article>
@@ -262,7 +262,7 @@
<h1 class="entry-title"><a href="/news/run-spark-and-shark-on-amazon-emr.html">Spark/Shark Tutorial for Amazon EMR</a></h1>
<div class="entry-meta">February 24, 2013</div>
</header>
- <div class="entry-content"><p>This weekend, Amazon posted an <a href="http://aws.amazon.com/articles/Elastic-MapReduce/4926593393724923">article</a> and code that make it easy to launch Spark and Shark on Elastic MapReduce. The article includes examples of how to run both interactive Scala commands and SQL queries from Shark on data in S3. Head over to the <a href="http://aws.amazon.com/articles/Elastic-MapReduce/4926593393724923">Amazon article</a> for details. We&#8217;re very excited because, to our knowledge, this makes Spark the first non-Hadoop engine that you can launch with EMR.</p>
+ <div class="entry-content"><p>This weekend, Amazon posted an <a href="http://aws.amazon.com/articles/Elastic-MapReduce/4926593393724923">article</a> and code that make it easy to launch Spark and Shark on Elastic MapReduce. The article includes examples of how to run both interactive Scala commands and SQL queries from Shark on data in S3. Head over to the <a href="http://aws.amazon.com/articles/Elastic-MapReduce/4926593393724923">Amazon article</a> for details. We’re very excited because, to our knowledge, this makes Spark the first non-Hadoop engine that you can launch with EMR.</p>
</div>
</article>
@@ -272,7 +272,7 @@
<h1 class="entry-title"><a href="/news/spark-0-6-2-released.html">Spark 0.6.2 released</a></h1>
<div class="entry-meta">February 07, 2013</div>
</header>
- <div class="entry-content"><p>We recently released <a href="{{site.url}}releases/spark-release-0-6-2.html" title="Spark Release 0.6.2">Spark 0.6.2</a>, a new version of Spark. This is a maintenance release that includes several bug fixes and usability improvements (see the <a href="{{site.url}}releases/spark-release-0-6-2.html" title="Spark Release 0.6.2">release notes</a>). We recommend that all users upgrade to this release.</p>
+ <div class="entry-content"><p>We recently released <a href="/releases/spark-release-0-6-2.html" title="Spark Release 0.6.2">Spark 0.6.2</a>, a new version of Spark. This is a maintenance release that includes several bug fixes and usability improvements (see the <a href="/releases/spark-release-0-6-2.html" title="Spark Release 0.6.2">release notes</a>). We recommend that all users upgrade to this release.</p>
</div>
</article>
@@ -297,7 +297,7 @@
<h1 class="entry-title"><a href="/news/video-from-first-spark-development-meetup.html">Video up from first Spark development meetup</a></h1>
<div class="entry-meta">December 21, 2012</div>
</header>
- <div class="entry-content"><p>On December 18th, we held the first of a series of Spark development meetups, for people interested in learning the Spark codebase and contributing to the project. There was quite a bit more demand than we anticipated, with over 80 people signing up and 64 attending. The first meetup was an <a href="http://www.meetup.com/spark-users/events/94101942/">introduction to Spark internals</a>. Thanks to one of the attendees, there&#8217;s now a <a href="http://www.youtube.com/watch?v=49Hr5xZyTEA">video of the meetup</a> on YouTube. We&#8217;ve also posted the <a href="http://files.meetup.com/3138542/dev-meetup-dec-2012.pptx">slides</a>. Look to see more development meetups on Spark and Shark in the future.</p>
+ <div class="entry-content"><p>On December 18th, we held the first of a series of Spark development meetups, for people interested in learning the Spark codebase and contributing to the project. There was quite a bit more demand than we anticipated, with over 80 people signing up and 64 attending. The first meetup was an <a href="http://www.meetup.com/spark-users/events/94101942/">introduction to Spark internals</a>. Thanks to one of the attendees, there’s now a <a href="http://www.youtube.com/watch?v=49Hr5xZyTEA">video of the meetup</a> on YouTube. We’ve also posted the <a href="http://files.meetup.com/3138542/dev-meetup-dec-2012.pptx">slides</a>. Look to see more development meetups on Spark and Shark in the future.</p>
</div>
</article>
@@ -307,7 +307,7 @@
<h1 class="entry-title"><a href="/news/spark-and-shark-in-the-news.html">Spark and Shark in the news</a></h1>
<div class="entry-meta">December 21, 2012</div>
</header>
- <div class="entry-content"><p>Recently, we&#8217;ve seen quite a bit of coverage of both Spark and <a href="http://shark.cs.berkeley.edu">Shark</a> in the news. I wanted to list some of the more recent articles, for readers interested in learning more.</p>
+ <div class="entry-content"><p>Recently, we’ve seen quite a bit of coverage of both Spark and <a href="http://shark.cs.berkeley.edu">Shark</a> in the news. I wanted to list some of the more recent articles, for readers interested in learning more.</p>
<ul>
<li>Curt Monash, editor of the popular DBMS2 blog, wrote a great <a href="http://www.dbms2.com/2012/12/13/introduction-to-spark-shark-bdas-and-amplab/">introduction to Spark and Shark</a>, as well as a more detailed <a href="http://www.dbms2.com/2012/12/13/spark-shark-and-rdds-technology-notes/">technical overview</a>.</li>
@@ -317,7 +317,7 @@
<li><a href="http://data-informed.com/spark-an-open-source-engine-for-iterative-data-mining/">DataInformed</a> interviewed two Spark users and wrote about their applications in anomaly detection, predictive analytics and data mining.</li>
</ul>
-<p>In other news, there will be a full day of tutorials on Spark and Shark at the <a href="http://strataconf.com/strata2013">O&#8217;Reilly Strata conference</a> in February. They include a three-hour <a href="http://strataconf.com/strata2013/public/schedule/detail/27438">introduction to Spark, Shark and BDAS</a> Tuesday morning, and a three-hour <a href="http://strataconf.com/strata2013/public/schedule/detail/27440">hands-on exercise session</a>. </p>
+<p>In other news, there will be a full day of tutorials on Spark and Shark at the <a href="http://strataconf.com/strata2013">O’Reilly Strata conference</a> in February. They include a three-hour <a href="http://strataconf.com/strata2013/public/schedule/detail/27438">introduction to Spark, Shark and BDAS</a> Tuesday morning, and a three-hour <a href="http://strataconf.com/strata2013/public/schedule/detail/27440">hands-on exercise session</a>. </p>
</div>
</article>
@@ -327,7 +327,7 @@
<h1 class="entry-title"><a href="/news/spark-0-6-1-and-0-5-2-released.html">Spark 0.6.1 and 0.5.2 out</a></h1>
<div class="entry-meta">November 22, 2012</div>
</header>
- <div class="entry-content"><p>Today we&#8217;ve made available two maintenance releases for Spark: <a href="{{site.url}}releases/spark-release-0-6-1.html" title="Spark Release 0.6.1">0.6.1</a> and <a href="{{site.url}}releases/spark-release-0-5-2.html" title="Spark Release 0.5.2">0.5.2</a>. They both contain important bug fixes as well as some new features, such as the ability to build against Hadoop 2 distributions. We recommend that users update to the latest version for their branch; for new users, we recommend <a href="{{site.url}}releases/spark-release-0-6-1.html" title="Spark Release 0.6.1">0.6.1</a>.</p>
+ <div class="entry-content"><p>Today we’ve made available two maintenance releases for Spark: <a href="/releases/spark-release-0-6-1.html" title="Spark Release 0.6.1">0.6.1</a> and <a href="/releases/spark-release-0-5-2.html" title="Spark Release 0.5.2">0.5.2</a>. They both contain important bug fixes as well as some new features, such as the ability to build against Hadoop 2 distributions. We recommend that users update to the latest version for their branch; for new users, we recommend <a href="/releases/spark-release-0-6-1.html" title="Spark Release 0.6.1">0.6.1</a>.</p>
</div>
</article>
@@ -337,7 +337,7 @@
<h1 class="entry-title"><a href="/news/spark-version-0-6-0-released.html">Spark version 0.6.0 released</a></h1>
<div class="entry-meta">October 15, 2012</div>
</header>
- <div class="entry-content"><p><a href="{{site.url}}releases/spark-release-0-6-0.html">Spark version 0.6.0</a> was released today, a major release that brings a wide range of performance improvements and new features, including a simpler standalone deploy mode and a Java API. Read more about it in the <a href="{{site.url}}releases/spark-release-0-6-0.html">release notes</a>.</p>
+ <div class="entry-content"><p><a href="/releases/spark-release-0-6-0.html">Spark version 0.6.0</a> was released today, a major release that brings a wide range of performance improvements and new features, including a simpler standalone deploy mode and a Java API. Read more about it in the <a href="/releases/spark-release-0-6-0.html">release notes</a>.</p>
</div>
</article>
@@ -357,7 +357,7 @@
<h1 class="entry-title"><a href="/news/spark-meetups.html">We've started hosting a Bay Area Spark User Meetup</a></h1>
<div class="entry-meta">January 10, 2012</div>
</header>
- <div class="entry-content"><p>We&#8217;ve started hosting a regular <a href="http://www.meetup.com/spark-users/">Bay Area Spark User Meetup</a>. Sign up on the meetup.com page to be notified about events and meet other Spark developers and users.</p>
+ <div class="entry-content"><p>We’ve started hosting a regular <a href="http://www.meetup.com/spark-users/">Bay Area Spark User Meetup</a>. Sign up on the meetup.com page to be notified about events and meet other Spark developers and users.</p>
</div>
</article>
diff --git a/site/news/run-spark-and-shark-on-amazon-emr.html b/site/news/run-spark-and-shark-on-amazon-emr.html
index 0d032d38f..b29e3b9a8 100644
--- a/site/news/run-spark-and-shark-on-amazon-emr.html
+++ b/site/news/run-spark-and-shark-on-amazon-emr.html
@@ -122,7 +122,7 @@
<h2>Spark/Shark Tutorial for Amazon EMR</h2>
-<p>This weekend, Amazon posted an <a href="http://aws.amazon.com/articles/Elastic-MapReduce/4926593393724923">article</a> and code that make it easy to launch Spark and Shark on Elastic MapReduce. The article includes examples of how to run both interactive Scala commands and SQL queries from Shark on data in S3. Head over to the <a href="http://aws.amazon.com/articles/Elastic-MapReduce/4926593393724923">Amazon article</a> for details. We&#8217;re very excited because, to our knowledge, this makes Spark the first non-Hadoop engine that you can launch with EMR.</p>
+<p>This weekend, Amazon posted an <a href="http://aws.amazon.com/articles/Elastic-MapReduce/4926593393724923">article</a> and code that make it easy to launch Spark and Shark on Elastic MapReduce. The article includes examples of how to run both interactive Scala commands and SQL queries from Shark on data in S3. Head over to the <a href="http://aws.amazon.com/articles/Elastic-MapReduce/4926593393724923">Amazon article</a> for details. We’re very excited because, to our knowledge, this makes Spark the first non-Hadoop engine that you can launch with EMR.</p>
</article><!-- #post -->
diff --git a/site/news/spark-0-6-1-and-0-5-2-released.html b/site/news/spark-0-6-1-and-0-5-2-released.html
index 551d8867f..26a2317aa 100644
--- a/site/news/spark-0-6-1-and-0-5-2-released.html
+++ b/site/news/spark-0-6-1-and-0-5-2-released.html
@@ -122,7 +122,7 @@
<h2>Spark 0.6.1 and 0.5.2 out</h2>
-<p>Today we&#8217;ve made available two maintenance releases for Spark: <a href="/releases/spark-release-0-6-1.html" title="Spark Release 0.6.1">0.6.1</a> and <a href="/releases/spark-release-0-5-2.html" title="Spark Release 0.5.2">0.5.2</a>. They both contain important bug fixes as well as some new features, such as the ability to build against Hadoop 2 distributions. We recommend that users update to the latest version for their branch; for new users, we recommend <a href="/releases/spark-release-0-6-1.html" title="Spark Release 0.6.1">0.6.1</a>.</p>
+<p>Today we’ve made available two maintenance releases for Spark: <a href="/releases/spark-release-0-6-1.html" title="Spark Release 0.6.1">0.6.1</a> and <a href="/releases/spark-release-0-5-2.html" title="Spark Release 0.5.2">0.5.2</a>. They both contain important bug fixes as well as some new features, such as the ability to build against Hadoop 2 distributions. We recommend that users update to the latest version for their branch; for new users, we recommend <a href="/releases/spark-release-0-6-1.html" title="Spark Release 0.6.1">0.6.1</a>.</p>
</article><!-- #post -->
diff --git a/site/news/spark-0-7-0-released.html b/site/news/spark-0-7-0-released.html
index c2d209e21..5ee78012c 100644
--- a/site/news/spark-0-7-0-released.html
+++ b/site/news/spark-0-7-0-released.html
@@ -122,7 +122,7 @@
<h2>Spark 0.7.0 released</h2>
-<p>We&#8217;re proud to announce the release of <a href="/releases/spark-release-0-7-0.html" title="Spark Release 0.7.0">Spark 0.7.0</a>, a new major version of Spark that adds several key features, including a <a href="/docs/latest/python-programming-guide.html">Python API</a> for Spark and an <a href="/docs/latest/streaming-programming-guide.html">alpha of Spark Streaming</a>. This release is the result of the largest group of contributors yet behind a Spark release &#8211; 31 contributors from inside and outside Berkeley. Head over to the <a href="/releases/spark-release-0-7-0.html" title="Spark Release 0.7.0">release notes</a> to read more about the new features, or <a href="/downloads.html">download</a> the release today.</p>
+<p>We’re proud to announce the release of <a href="/releases/spark-release-0-7-0.html" title="Spark Release 0.7.0">Spark 0.7.0</a>, a new major version of Spark that adds several key features, including a <a href="/docs/latest/python-programming-guide.html">Python API</a> for Spark and an <a href="/docs/latest/streaming-programming-guide.html">alpha of Spark Streaming</a>. This release is the result of the largest group of contributors yet behind a Spark release – 31 contributors from inside and outside Berkeley. Head over to the <a href="/releases/spark-release-0-7-0.html" title="Spark Release 0.7.0">release notes</a> to read more about the new features, or <a href="/downloads.html">download</a> the release today.</p>
</article><!-- #post -->
diff --git a/site/news/spark-0-7-2-released.html b/site/news/spark-0-7-2-released.html
index 14b5abaf5..a5663701d 100644
--- a/site/news/spark-0-7-2-released.html
+++ b/site/news/spark-0-7-2-released.html
@@ -122,7 +122,7 @@
<h2>Spark 0.7.2 released</h2>
-<p>We&#8217;re happy to announce the release of <a href="/releases/spark-release-0-7-2.html" title="Spark Release 0.7.2">Spark 0.7.2</a>, a new maintenance release that includes several bug fixes and improvements, as well as new code examples and API features. We recommend that all users update to this release. Head over to the <a href="/releases/spark-release-0-7-2.html" title="Spark Release 0.7.2">release notes</a> to read about the new features, or <a href="/downloads.html">download</a> the release today.</p>
+<p>We’re happy to announce the release of <a href="/releases/spark-release-0-7-2.html" title="Spark Release 0.7.2">Spark 0.7.2</a>, a new maintenance release that includes several bug fixes and improvements, as well as new code examples and API features. We recommend that all users update to this release. Head over to the <a href="/releases/spark-release-0-7-2.html" title="Spark Release 0.7.2">release notes</a> to read about the new features, or <a href="/downloads.html">download</a> the release today.</p>
</article><!-- #post -->
diff --git a/site/news/spark-0-7-3-released.html b/site/news/spark-0-7-3-released.html
index b9cabc73e..2470d3cdc 100644
--- a/site/news/spark-0-7-3-released.html
+++ b/site/news/spark-0-7-3-released.html
@@ -122,7 +122,7 @@
<h2>Spark 0.7.3 released</h2>
-<p>We&#8217;ve just posted <a href="/releases/spark-release-0-7-3.html" title="Spark Release 0.7.3">Spark Release 0.7.3</a>, a maintenance release that contains several fixes, including streaming API updates and new functionality for adding JARs to a <code>spark-shell</code> session. We recommend that all users update to this release. Visit the <a href="/releases/spark-release-0-7-3.html" title="Spark Release 0.7.3">release notes</a> to read about the new features, or <a href="/downloads.html">download</a> the release today.</p>
+<p>We’ve just posted <a href="/releases/spark-release-0-7-3.html" title="Spark Release 0.7.3">Spark Release 0.7.3</a>, a maintenance release that contains several fixes, including streaming API updates and new functionality for adding JARs to a <code>spark-shell</code> session. We recommend that all users update to this release. Visit the <a href="/releases/spark-release-0-7-3.html" title="Spark Release 0.7.3">release notes</a> to read about the new features, or <a href="/downloads.html">download</a> the release today.</p>
</article><!-- #post -->
diff --git a/site/news/spark-0-8-0-released.html b/site/news/spark-0-8-0-released.html
index dc619ed41..cc3352ee5 100644
--- a/site/news/spark-0-8-0-released.html
+++ b/site/news/spark-0-8-0-released.html
@@ -122,7 +122,7 @@
<h2>Spark 0.8.0 released</h2>
-<p>We&#8217;re proud to announce the release of <a href="/releases/spark-release-0-8-0.html" title="Spark Release 0.8.0">Apache Spark 0.8.0</a>. Spark 0.8.0 is a major release that includes many new capabilities and usability improvements. It’s also our first release under the Apache incubator. It is the largest Spark release yet, with contributions from 67 developers and 24 companies. Major new features include an expanded monitoring framework and UI, a machine learning library, and support for running Spark inside of YARN.</p>
+<p>We’re proud to announce the release of <a href="/releases/spark-release-0-8-0.html" title="Spark Release 0.8.0">Apache Spark 0.8.0</a>. Spark 0.8.0 is a major release that includes many new capabilities and usability improvements. It’s also our first release under the Apache Incubator. It is the largest Spark release yet, with contributions from 67 developers and 24 companies. Major new features include an expanded monitoring framework and UI, a machine learning library, and support for running Spark inside of YARN.</p>
</article><!-- #post -->
diff --git a/site/news/spark-and-shark-in-the-news.html b/site/news/spark-and-shark-in-the-news.html
index 624a82412..fb5e04a03 100644
--- a/site/news/spark-and-shark-in-the-news.html
+++ b/site/news/spark-and-shark-in-the-news.html
@@ -122,7 +122,7 @@
<h2>Spark and Shark in the news</h2>
-<p>Recently, we&#8217;ve seen quite a bit of coverage of both Spark and <a href="http://shark.cs.berkeley.edu">Shark</a> in the news. I wanted to list some of the more recent articles, for readers interested in learning more.</p>
+<p>Recently, we’ve seen quite a bit of coverage of both Spark and <a href="http://shark.cs.berkeley.edu">Shark</a> in the news. I wanted to list some of the more recent articles, for readers interested in learning more.</p>
<ul>
<li>Curt Monash, editor of the popular DBMS2 blog, wrote a great <a href="http://www.dbms2.com/2012/12/13/introduction-to-spark-shark-bdas-and-amplab/">introduction to Spark and Shark</a>, as well as a more detailed <a href="http://www.dbms2.com/2012/12/13/spark-shark-and-rdds-technology-notes/">technical overview</a>.</li>
@@ -132,7 +132,7 @@
<li><a href="http://data-informed.com/spark-an-open-source-engine-for-iterative-data-mining/">DataInformed</a> interviewed two Spark users and wrote about their applications in anomaly detection, predictive analytics and data mining.</li>
</ul>
-<p>In other news, there will be a full day of tutorials on Spark and Shark at the <a href="http://strataconf.com/strata2013">O&#8217;Reilly Strata conference</a> in February. They include a three-hour <a href="http://strataconf.com/strata2013/public/schedule/detail/27438">introduction to Spark, Shark and BDAS</a> Tuesday morning, and a three-hour <a href="http://strataconf.com/strata2013/public/schedule/detail/27440">hands-on exercise session</a>. </p>
+<p>In other news, there will be a full day of tutorials on Spark and Shark at the <a href="http://strataconf.com/strata2013">O’Reilly Strata conference</a> in February. They include a three-hour <a href="http://strataconf.com/strata2013/public/schedule/detail/27438">introduction to Spark, Shark and BDAS</a> Tuesday morning, and a three-hour <a href="http://strataconf.com/strata2013/public/schedule/detail/27440">hands-on exercise session</a>. </p>
</article><!-- #post -->
diff --git a/site/news/spark-meetups.html b/site/news/spark-meetups.html
index 552260714..c9af65749 100644
--- a/site/news/spark-meetups.html
+++ b/site/news/spark-meetups.html
@@ -122,7 +122,7 @@
<h2>We've started hosting a Bay Area Spark User Meetup</h2>
-<p>We&#8217;ve started hosting a regular <a href="http://www.meetup.com/spark-users/">Bay Area Spark User Meetup</a>. Sign up on the meetup.com page to be notified about events and meet other Spark developers and users.</p>
+<p>We’ve started hosting a regular <a href="http://www.meetup.com/spark-users/">Bay Area Spark User Meetup</a>. Sign up on the meetup.com page to be notified about events and meet other Spark developers and users.</p>
</article><!-- #post -->
diff --git a/site/news/spark-user-survey-and-powered-by-page.html b/site/news/spark-user-survey-and-powered-by-page.html
index 267fd9c07..640a2f470 100644
--- a/site/news/spark-user-survey-and-powered-by-page.html
+++ b/site/news/spark-user-survey-and-powered-by-page.html
@@ -122,9 +122,9 @@
<h2>Spark user survey and "Powered By" page</h2>
-<p>As we continue developing Spark, we would love to get feedback from users and hear what you&#8217;d like us to work on next. We&#8217;ve decided that a good way to do that is a survey &#8211; we hope to run this at regular intervals. If you have a few minutes to participate, <a href="https://docs.google.com/forms/d/1eMXp4GjcIXglxJe5vYYBzXKVm-6AiYt1KThJwhCjJiY/viewform">fill in the survey here</a>. Your time is greatly appreciated.</p>
+<p>As we continue developing Spark, we would love to get feedback from users and hear what you’d like us to work on next. We’ve decided that a good way to do that is a survey – we hope to run this at regular intervals. If you have a few minutes to participate, <a href="https://docs.google.com/forms/d/1eMXp4GjcIXglxJe5vYYBzXKVm-6AiYt1KThJwhCjJiY/viewform">fill in the survey here</a>. Your time is greatly appreciated.</p>
-<p>In parallel, we are starting a <a href="https://cwiki.apache.org/confluence/display/SPARK/Powered+By+Spark">&#8220;powered by&#8221; page</a> on the Apache Spark wiki for organizations that are using, or contributing to, Spark. Sign up if you&#8217;d like to support the project! This is a great way to let the world know you&#8217;re using Spark, and can also be helpful to generate leads for recruiting. You can also add yourself when you fill the survey.</p>
+<p>In parallel, we are starting a <a href="https://cwiki.apache.org/confluence/display/SPARK/Powered+By+Spark">“powered by” page</a> on the Apache Spark wiki for organizations that are using, or contributing to, Spark. Sign up if you’d like to support the project! This is a great way to let the world know you’re using Spark, and can also be helpful to generate leads for recruiting. You can also add yourself when you fill the survey.</p>
<p>Thanks for taking the time to give feedback.</p>
diff --git a/site/news/strata-exercises-now-available-online.html b/site/news/strata-exercises-now-available-online.html
index 20a096492..5ea61a2bd 100644
--- a/site/news/strata-exercises-now-available-online.html
+++ b/site/news/strata-exercises-now-available-online.html
@@ -122,7 +122,7 @@
<h2>Strata exercises now available online</h2>
-<p>At this year&#8217;s <a href="http://strataconf.com/strata2013">Strata</a> conference, the AMP Lab hosted a full day of tutorials on Spark, Shark, and Spark Streaming, including online exercises on Amazon EC2. Those exercises are now <a href="http://ampcamp.berkeley.edu/big-data-mini-course/">available online</a>, letting you learn Spark and Shark at your own pace on an EC2 cluster with real data. They are a great resource for learning the systems. You can also find <a href="http://ampcamp.berkeley.edu/amp-camp-two-strata-2013/">slides</a> from the Strata tutorials online, as well as <a href="http://ampcamp.berkeley.edu/amp-camp-one-berkeley-2012/">videos</a> from the AMP Camp workshop we held at Berkeley in August.</p>
+<p>At this year’s <a href="http://strataconf.com/strata2013">Strata</a> conference, the AMP Lab hosted a full day of tutorials on Spark, Shark, and Spark Streaming, including online exercises on Amazon EC2. Those exercises are now <a href="http://ampcamp.berkeley.edu/big-data-mini-course/">available online</a>, letting you learn Spark and Shark at your own pace on an EC2 cluster with real data. They are a great resource for learning the systems. You can also find <a href="http://ampcamp.berkeley.edu/amp-camp-two-strata-2013/">slides</a> from the Strata tutorials online, as well as <a href="http://ampcamp.berkeley.edu/amp-camp-one-berkeley-2012/">videos</a> from the AMP Camp workshop we held at Berkeley in August.</p>
</article><!-- #post -->
diff --git a/site/news/video-from-first-spark-development-meetup.html b/site/news/video-from-first-spark-development-meetup.html
index 3a3f84369..ecab29435 100644
--- a/site/news/video-from-first-spark-development-meetup.html
+++ b/site/news/video-from-first-spark-development-meetup.html
@@ -122,7 +122,7 @@
<h2>Video up from first Spark development meetup</h2>
-<p>On December 18th, we held the first of a series of Spark development meetups, for people interested in learning the Spark codebase and contributing to the project. There was quite a bit more demand than we anticipated, with over 80 people signing up and 64 attending. The first meetup was an <a href="http://www.meetup.com/spark-users/events/94101942/">introduction to Spark internals</a>. Thanks to one of the attendees, there&#8217;s now a <a href="http://www.youtube.com/watch?v=49Hr5xZyTEA">video of the meetup</a> on YouTube. We&#8217;ve also posted the <a href="http://files.meetup.com/3138542/dev-meetup-dec-2012.pptx">slides</a>. Look to see more development meetups on Spark and Shark in the future.</p>
+<p>On December 18th, we held the first of a series of Spark development meetups, for people interested in learning the Spark codebase and contributing to the project. There was quite a bit more demand than we anticipated, with over 80 people signing up and 64 attending. The first meetup was an <a href="http://www.meetup.com/spark-users/events/94101942/">introduction to Spark internals</a>. Thanks to one of the attendees, there’s now a <a href="http://www.youtube.com/watch?v=49Hr5xZyTEA">video of the meetup</a> on YouTube. We’ve also posted the <a href="http://files.meetup.com/3138542/dev-meetup-dec-2012.pptx">slides</a>. Look to see more development meetups on Spark and Shark in the future.</p>
</article><!-- #post -->
diff --git a/site/releases/spark-release-0-3.html b/site/releases/spark-release-0-3.html
index a494c5051..ee8632a01 100644
--- a/site/releases/spark-release-0-3.html
+++ b/site/releases/spark-release-0-3.html
@@ -138,7 +138,7 @@
<h3>Native Types for SequenceFiles</h3>
-<p>In working with SequenceFiles, which store objects that implement Hadoop&#8217;s Writable interface, Spark will now let you use native types for certain common Writable types, like IntWritable and Text. For example:</p>
+<p>In working with SequenceFiles, which store objects that implement Hadoop’s Writable interface, Spark will now let you use native types for certain common Writable types, like IntWritable and Text. For example:</p>
<div class="code">
<span class="comment">// Will read a SequenceFile of (IntWritable, Text)</span><br />
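As a companion to the native-Writable support this hunk documents, the call can be sketched in Scala. This is a minimal sketch, not taken from the release notes: it assumes a live `SparkContext` named `sc`, a placeholder HDFS path, and relies on Spark's implicit conversions from `IntWritable`/`Text` to `Int`/`String`.

```scala
// Hypothetical sketch: read a SequenceFile of (IntWritable, Text) using
// native Scala types instead of raw Writables. Assumes a SparkContext `sc`
// and a Spark deployment; the path below is a placeholder.
val pairs = sc.sequenceFile[Int, String]("hdfs://namenode/data/part-00000")
```

The type parameters tell Spark which Writable wrappers to expect on disk; the resulting RDD yields plain `(Int, String)` pairs.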
diff --git a/site/releases/spark-release-0-5-0.html b/site/releases/spark-release-0-5-0.html
index 405a9cbbc..1bd5aeb85 100644
--- a/site/releases/spark-release-0-5-0.html
+++ b/site/releases/spark-release-0-5-0.html
@@ -126,10 +126,10 @@
<h3>Mesos 0.9 Support</h3>
-<p>This release runs on <a href="http://www.mesosproject.org/">Apache Mesos 0.9</a>, the first Apache Incubator release of Mesos, which contains significant usability and stability improvements. Most notable are better memory accounting for applications with long-term memory use, easier access of old jobs&#8217; traces and logs (by keeping a history of executed tasks on the web UI), and simpler installation.</p>
+<p>This release runs on <a href="http://www.mesosproject.org/">Apache Mesos 0.9</a>, the first Apache Incubator release of Mesos, which contains significant usability and stability improvements. Most notable are better memory accounting for applications with long-term memory use, easier access of old jobs’ traces and logs (by keeping a history of executed tasks on the web UI), and simpler installation.</p>
<h3>Performance Improvements</h3>
-<p>Spark&#8217;s scheduling is more communication-efficient when sending out operations on RDDs with large lineage graphs. In addition, the cache replacement policy has been improved to more smartly replace data when an RDD does not fit in the cache, shuffles are more efficient, and the serializer used for shipping closures is now configurable, making it possible to use faster libraries than Java serialization there.</p>
+<p>Spark’s scheduling is more communication-efficient when sending out operations on RDDs with large lineage graphs. In addition, the cache replacement policy has been improved to more smartly replace data when an RDD does not fit in the cache, shuffles are more efficient, and the serializer used for shipping closures is now configurable, making it possible to use faster libraries than Java serialization there.</p>
<h3>Debug Improvements</h3>
@@ -141,11 +141,11 @@
<h3>EC2 Launch Script Improvements</h3>
-<p>Spark&#8217;s EC2 launch scripts are now included in the main package, and have the ability to discover and use the latest Spark AMI automatically instead of launching a hardcoded machine image ID.</p>
+<p>Spark’s EC2 launch scripts are now included in the main package, and have the ability to discover and use the latest Spark AMI automatically instead of launching a hardcoded machine image ID.</p>
<h3>New Hadoop API Support</h3>
-<p>You can now use Spark to read and write data to storage formats in the new <tt>org.apache.mapreduce</tt> packages (the &#8220;new Hadoop&#8221; API). In addition, this release fixes an issue caused by a HDFS initialization bug in some recent versions of HDFS.</p>
+<p>You can now use Spark to read and write data to storage formats in the new <tt>org.apache.mapreduce</tt> packages (the “new Hadoop” API). In addition, this release fixes an issue caused by an HDFS initialization bug in some recent versions of HDFS.</p>
</article><!-- #post -->
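The “new Hadoop” API support described in these notes can be sketched as follows. This is a hedged example, not from the release notes: it assumes a `SparkContext` named `sc`, a placeholder input path, and a Spark deployment with the Hadoop client libraries on the classpath.

```scala
// Hypothetical sketch: reading input through the "new Hadoop" API
// (org.apache.hadoop.mapreduce) rather than the classic mapred API.
// Assumes a SparkContext `sc`; the path is a placeholder.
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat

val lines = sc.newAPIHadoopFile[LongWritable, Text, TextInputFormat](
  "hdfs://namenode/input")
```

The third type parameter names the new-API `InputFormat`; the same pattern works for writing via new-API `OutputFormat`s.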
diff --git a/site/releases/spark-release-0-5-1.html b/site/releases/spark-release-0-5-1.html
index f460287e6..956b6a997 100644
--- a/site/releases/spark-release-0-5-1.html
+++ b/site/releases/spark-release-0-5-1.html
@@ -155,7 +155,7 @@
<h3>EC2 Improvements</h3>
-<p>Spark&#8217;s EC2 launch script now configures Spark&#8217;s memory limit automatically based on the machine&#8217;s available RAM.</p>
+<p>Spark’s EC2 launch script now configures Spark’s memory limit automatically based on the machine’s available RAM.</p>
</article><!-- #post -->
diff --git a/site/releases/spark-release-0-6-0.html b/site/releases/spark-release-0-6-0.html
index 150205250..e1ea2accb 100644
--- a/site/releases/spark-release-0-6-0.html
+++ b/site/releases/spark-release-0-6-0.html
@@ -134,11 +134,11 @@
<h3>Java API</h3>
-<p>Java programmers can now use Spark through a new <a href="/docs/0.6.0/java-programming-guide.html">Java API layer</a>. This layer makes available all of Spark&#8217;s features, including parallel transformations, distributed datasets, broadcast variables, and accumulators, in a Java-friendly manner.</p>
+<p>Java programmers can now use Spark through a new <a href="/docs/0.6.0/java-programming-guide.html">Java API layer</a>. This layer makes available all of Spark’s features, including parallel transformations, distributed datasets, broadcast variables, and accumulators, in a Java-friendly manner.</p>
<h3>Expanded Documentation</h3>
-<p>Spark&#8217;s <a href="/docs/0.6.0/">documentation</a> has been expanded with a new <a href="/docs/0.6.0/quick-start.html">quick start guide</a>, additional deployment instructions, configuration guide, tuning guide, and improved <a href="/docs/0.6.0/api/core">Scaladoc</a> API documentation.</p>
+<p>Spark’s <a href="/docs/0.6.0/">documentation</a> has been expanded with a new <a href="/docs/0.6.0/quick-start.html">quick start guide</a>, additional deployment instructions, configuration guide, tuning guide, and improved <a href="/docs/0.6.0/api/core">Scaladoc</a> API documentation.</p>
<h3>Engine Changes</h3>
@@ -161,7 +161,7 @@
<h3>Enhanced Debugging</h3>
-<p>Spark&#8217;s log now prints which operation in your program each RDD and job described in your logs belongs to, making it easier to tie back to which parts of your code experience problems.</p>
+<p>Spark’s log now prints which operation in your program each RDD and job described in your logs belongs to, making it easier to tie back to which parts of your code experience problems.</p>
<h3>Maven Artifacts</h3>
diff --git a/site/releases/spark-release-0-7-0.html b/site/releases/spark-release-0-7-0.html
index 30631fb54..fee9c63a7 100644
--- a/site/releases/spark-release-0-7-0.html
+++ b/site/releases/spark-release-0-7-0.html
@@ -148,7 +148,7 @@
<h3>New Operations</h3>
-<p>This release adds several RDD transformations, including <tt>keys</tt>, <tt>values</tt>, <tt>keyBy</tt>, <tt>subtract</tt>, <tt>coalesce</tt>, <tt>zip</tt>. It also adds <tt>SparkContext.hadoopConfiguration</tt> to allow programs to configure Hadoop input/output settings globally across operations. Finally, it adds the <tt>RDD.toDebugString()</tt> method, which can be used to print an RDD&#8217;s lineage graph for troubleshooting.</p>
+<p>This release adds several RDD transformations, including <tt>keys</tt>, <tt>values</tt>, <tt>keyBy</tt>, <tt>subtract</tt>, <tt>coalesce</tt>, and <tt>zip</tt>. It also adds <tt>SparkContext.hadoopConfiguration</tt> to allow programs to configure Hadoop input/output settings globally across operations. Finally, it adds the <tt>RDD.toDebugString()</tt> method, which can be used to print an RDD’s lineage graph for troubleshooting.</p>
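A few of the operations named in this paragraph can be sketched together. This is a minimal, hypothetical example rather than code from the release notes; it assumes a `SparkContext` named `sc`.

```scala
// Hypothetical sketch of the new 0.7.0 operations; assumes a SparkContext `sc`.
val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("c", 3)))

val ks = pairs.keys              // RDD of "a", "b", "c"
val vs = pairs.values            // RDD of 1, 2, 3
val byParity = vs.keyBy(_ % 2)   // key each value by a derived key
val zipped = ks.zip(vs)          // pair elements of two RDDs positionally

// Print the lineage graph of the derived RDD for troubleshooting.
println(zipped.toDebugString)
```

Note that `zip` pairs elements positionally, so both RDDs must have the same number of partitions and elements per partition; here both derive from `pairs`, so that holds.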
<h3>EC2 Improvements</h3>
@@ -185,7 +185,7 @@
<h3>Credits</h3>
-<p>Spark 0.7 was the work of many contributors from Berkeley and outside&#8212;in total, 31 different contributors, of which 20 were from outside Berkeley. Here are the people who contributed, along with areas they worked on:</p>
+<p>Spark 0.7 was the work of many contributors from Berkeley and outside—in total, 31 different contributors, of which 20 were from outside Berkeley. Here are the people who contributed, along with areas they worked on:</p>
<ul>
<li>Mikhail Bautin -- Maven build</li>
diff --git a/site/releases/spark-release-0-8-0.html b/site/releases/spark-release-0-8-0.html
index aac3aed44..5c405c1c8 100644
--- a/site/releases/spark-release-0-8-0.html
+++ b/site/releases/spark-release-0-8-0.html
@@ -172,13 +172,13 @@
<li>The examples build has been isolated from the core build, substantially reducing the potential for dependency conflicts.</li>
<li>The Spark Streaming Twitter API has been updated to use OAuth authentication instead of the deprecated username/password authentication in Spark 0.7.0.</li>
<li>Several new example jobs have been added, including PageRank implementations in Java, Scala and Python, examples for accessing HBase and Cassandra, and MLlib examples.</li>
- <li>Support for running on Mesos has been improved &#8211; now you can deploy a Spark assembly JAR as part of the Mesos job, instead of having Spark pre-installed on each machine. The default Mesos version has also been updated to 0.13.</li>
+ <li>Support for running on Mesos has been improved – now you can deploy a Spark assembly JAR as part of the Mesos job, instead of having Spark pre-installed on each machine. The default Mesos version has also been updated to 0.13.</li>
<li>This release includes various optimizations to PySpark and to the job scheduler.</li>
</ul>
<h3 id="compatibility">Compatibility</h3>
<ul>
- <li><strong>This release changes Spark’s package name to &#8216;org.apache.spark&#8217;</strong>, so those upgrading from Spark 0.7 will need to adjust their imports accordingly. In addition, we’ve moved the <code>RDD</code> class to the org.apache.spark.rdd package (it was previously in the top-level package). The Spark artifacts published through Maven have also changed to the new package name.</li>
+ <li><strong>This release changes Spark’s package name to ‘org.apache.spark’</strong>, so those upgrading from Spark 0.7 will need to adjust their imports accordingly. In addition, we’ve moved the <code>RDD</code> class to the org.apache.spark.rdd package (it was previously in the top-level package). The Spark artifacts published through Maven have also changed to the new package name.</li>
<li>In the Java API, use of Scala’s <code>Option</code> class has been replaced with <code>Optional</code> from the Guava library.</li>
<li>Linking against Spark for arbitrary Hadoop versions is now possible by specifying a dependency on <code>hadoop-client</code>, instead of rebuilding <code>spark-core</code> against your version of Hadoop. See the documentation <a href="http://spark.incubator.apache.org/docs/0.8.0/scala-programming-guide.html#linking-with-spark">here</a> for details.</li>
<li>If you are building Spark, you’ll now need to run <code>sbt/sbt assembly</code> instead of <code>package</code>.</li>
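The package rename and `RDD` move described in this compatibility list amount to an import change for upgrading applications. The before/after below is an illustrative sketch of that change, not text from the release notes.

```scala
// Upgrading an application from Spark 0.7 to 0.8:
//
// Spark 0.7 (old top-level `spark` package):
//   import spark.SparkContext
//   import spark.RDD
//
// Spark 0.8 (renamed package; RDD moved into the rdd subpackage):
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD
```

Maven coordinates changed to match, so dependency declarations need the same rename as the imports.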
@@ -188,73 +188,73 @@
<p>Spark 0.8.0 was the result of the largest team of contributors yet. The following developers contributed to this release:</p>
<ul>
- <li>Andrew Ash &#8211; documentation, code cleanup and logging improvements</li>
- <li>Mikhail Bautin &#8211; bug fix</li>
- <li>Konstantin Boudnik &#8211; Maven build, bug fixes, and documentation</li>
- <li>Ian Buss &#8211; sbt configuration improvement</li>
- <li>Evan Chan &#8211; API improvement, bug fix, and documentation</li>
- <li>Lian Cheng &#8211; bug fix</li>
- <li>Tathagata Das &#8211; performance improvement in streaming receiver and streaming bug fix</li>
- <li>Aaron Davidson &#8211; Python improvements, bug fix, and unit tests</li>
- <li>Giovanni Delussu &#8211; coalesced RDD feature</li>
- <li>Joseph E. Gonzalez &#8211; improvement to zipPartitions</li>
- <li>Karen Feng &#8211; several improvements to web UI</li>
- <li>Andy Feng &#8211; HDFS metrics</li>
- <li>Ali Ghodsi &#8211; configuration improvements and locality-aware coalesce</li>
- <li>Christoph Grothaus &#8211; bug fix</li>
- <li>Thomas Graves &#8211; support for secure YARN cluster and various YARN-related improvements</li>
- <li>Stephen Haberman &#8211; bug fix, documentation, and code cleanup</li>
- <li>Mark Hamstra &#8211; bug fixes and Maven build</li>
- <li>Benjamin Hindman &#8211; Mesos compatibility and documentation</li>
- <li>Liang-Chi Hsieh &#8211; bug fixes in build and in YARN mode</li>
- <li>Shane Huang &#8211; shuffle improvements, bug fix</li>
- <li>Ethan Jewett &#8211; Spark/HBase example</li>
- <li>Holden Karau &#8211; bug fix and EC2 improvement</li>
- <li>Kody Koeniger &#8211; JDBV RDD implementation</li>
- <li>Andy Konwinski &#8211; documentation</li>
- <li>Jey Kottalam &#8211; PySpark optimizations, Hadoop agnostic build (lead), and bug fixes</li>
- <li>Andrey Kouznetsov &#8211; Bug fix</li>
- <li>S. Kumar &#8211; Spark Streaming example</li>
- <li>Ryan LeCompte &#8211; topK method optimization and serialization improvements</li>
- <li>Gavin Li &#8211; compression codecs and pipe support</li>
- <li>Harold Lim &#8211; fair scheduler</li>
- <li>Dmitriy Lyubimov &#8211; bug fix</li>
- <li>Chris Mattmann &#8211; Apache mentor</li>
- <li>David McCauley &#8211; JSON API improvement</li>
- <li>Sean McNamara &#8211; added <code>takeOrdered</code> function, bug fixes, and a build fix</li>
- <li>Mridul Muralidharan &#8211; YARN integration (lead) and scheduler improvements</li>
- <li>Marc Mercer &#8211; improvements to UI json output</li>
- <li>Christopher Nguyen &#8211; bug fixes</li>
- <li>Erik van Oosten &#8211; example fix</li>
- <li>Kay Ousterhout &#8211; fix for scheduler regression and bug fixes</li>
- <li>Xinghao Pan &#8211; MLLib contributions</li>
- <li>Hiral Patel &#8211; bug fix</li>
- <li>James Phillpotts &#8211; updated Twitter API for Spark streaming</li>
- <li>Nick Pentreath &#8211; scala pageRank example, bagel improvement, and several Java examples</li>
- <li>Alexander Pivovarov &#8211; logging improvement and Maven build</li>
- <li>Mike Potts &#8211; configuration improvement</li>
- <li>Rohit Rai &#8211; Spark/Cassandra example</li>
- <li>Imran Rashid &#8211; bug fixes and UI improvement</li>
- <li>Charles Reiss &#8211; bug fixes, code cleanup, performance improvements</li>
- <li>Josh Rosen &#8211; Python API improvements, Java API improvements, EC2 scripts and bug fixes</li>
- <li>Henry Saputra &#8211; Apache mentor</li>
- <li>Jerry Shao &#8211; bug fixes, metrics system</li>
- <li>Prashant Sharma &#8211; documentation</li>
- <li>Mingfei Shi &#8211; joblogger and bug fix</li>
- <li>Andre Shumacher &#8211; several PySpark features</li>
- <li>Ginger Smith &#8211; MLLib contribution</li>
- <li>Evan Sparks &#8211; contributions to MLLib</li>
- <li>Ram Sriharsha &#8211; bug fix and RDD removal feature</li>
- <li>Ameet Talwalkar &#8211; MLlib contributions</li>
- <li>Roman Tkalenko &#8211; code refactoring and cleanup</li>
- <li>Chu Tong &#8211; Java PageRank algorithm and bug fix in bash scripts</li>
- <li>Shivaram Venkataraman &#8211; bug fixes, contributions to MLLib, netty shuffle fixes, and Java API additions</li>
- <li>Patrick Wendell &#8211; release manager, bug fixes, documentation, metrics system, and web UI</li>
- <li>Andrew Xia &#8211; fair scheduler (lead), metrics system, and ui improvements</li>
- <li>Reynold Xin &#8211; shuffle improvements, bug fixes, code refactoring, usability improvements, MLLib contributions</li>
- <li>Matei Zaharia &#8211; MLLib contributions, documentation, examples, UI improvements, PySpark improvements, and bug fixes</li>
- <li>Wu Zeming &#8211; bug fix in scheduler</li>
- <li>Bill Zhao &#8211; log message improvement</li>
+ <li>Andrew Ash – documentation, code cleanup and logging improvements</li>
+ <li>Mikhail Bautin – bug fix</li>
+ <li>Konstantin Boudnik – Maven build, bug fixes, and documentation</li>
+ <li>Ian Buss – sbt configuration improvement</li>
+ <li>Evan Chan – API improvement, bug fix, and documentation</li>
+ <li>Lian Cheng – bug fix</li>
+ <li>Tathagata Das – performance improvement in streaming receiver and streaming bug fix</li>
+ <li>Aaron Davidson – Python improvements, bug fix, and unit tests</li>
+ <li>Giovanni Delussu – coalesced RDD feature</li>
+ <li>Joseph E. Gonzalez – improvement to zipPartitions</li>
+ <li>Karen Feng – several improvements to web UI</li>
+ <li>Andy Feng – HDFS metrics</li>
+ <li>Ali Ghodsi – configuration improvements and locality-aware coalesce</li>
+ <li>Christoph Grothaus – bug fix</li>
+ <li>Thomas Graves – support for secure YARN cluster and various YARN-related improvements</li>
+ <li>Stephen Haberman – bug fix, documentation, and code cleanup</li>
+ <li>Mark Hamstra – bug fixes and Maven build</li>
+ <li>Benjamin Hindman – Mesos compatibility and documentation</li>
+ <li>Liang-Chi Hsieh – bug fixes in build and in YARN mode</li>
+ <li>Shane Huang – shuffle improvements, bug fix</li>
+ <li>Ethan Jewett – Spark/HBase example</li>
+ <li>Holden Karau – bug fix and EC2 improvement</li>
+ <li>Kody Koeniger – JDBC RDD implementation</li>
+ <li>Andy Konwinski – documentation</li>
+ <li>Jey Kottalam – PySpark optimizations, Hadoop agnostic build (lead), and bug fixes</li>
+ <li>Andrey Kouznetsov – bug fix</li>
+ <li>S. Kumar – Spark Streaming example</li>
+ <li>Ryan LeCompte – topK method optimization and serialization improvements</li>
+ <li>Gavin Li – compression codecs and pipe support</li>
+ <li>Harold Lim – fair scheduler</li>
+ <li>Dmitriy Lyubimov – bug fix</li>
+ <li>Chris Mattmann – Apache mentor</li>
+ <li>David McCauley – JSON API improvement</li>
+ <li>Sean McNamara – added <code>takeOrdered</code> function, bug fixes, and a build fix</li>
+ <li>Mridul Muralidharan – YARN integration (lead) and scheduler improvements</li>
+ <li>Marc Mercer – improvements to UI json output</li>
+ <li>Christopher Nguyen – bug fixes</li>
+ <li>Erik van Oosten – example fix</li>
+ <li>Kay Ousterhout – fix for scheduler regression and bug fixes</li>
+ <li>Xinghao Pan – MLlib contributions</li>
+ <li>Hiral Patel – bug fix</li>
+ <li>James Phillpotts – updated Twitter API for Spark streaming</li>
+ <li>Nick Pentreath – Scala PageRank example, Bagel improvement, and several Java examples</li>
+ <li>Alexander Pivovarov – logging improvement and Maven build</li>
+ <li>Mike Potts – configuration improvement</li>
+ <li>Rohit Rai – Spark/Cassandra example</li>
+ <li>Imran Rashid – bug fixes and UI improvement</li>
+ <li>Charles Reiss – bug fixes, code cleanup, performance improvements</li>
+ <li>Josh Rosen – Python API improvements, Java API improvements, EC2 scripts and bug fixes</li>
+ <li>Henry Saputra – Apache mentor</li>
+ <li>Jerry Shao – bug fixes, metrics system</li>
+ <li>Prashant Sharma – documentation</li>
+ <li>Mingfei Shi – JobLogger and bug fix</li>
+ <li>Andre Shumacher – several PySpark features</li>
+ <li>Ginger Smith – MLlib contribution</li>
+ <li>Evan Sparks – contributions to MLlib</li>
+ <li>Ram Sriharsha – bug fix and RDD removal feature</li>
+ <li>Ameet Talwalkar – MLlib contributions</li>
+ <li>Roman Tkalenko – code refactoring and cleanup</li>
+ <li>Chu Tong – Java PageRank algorithm and bug fix in bash scripts</li>
+ <li>Shivaram Venkataraman – bug fixes, contributions to MLlib, Netty shuffle fixes, and Java API additions</li>
+ <li>Patrick Wendell – release manager, bug fixes, documentation, metrics system, and web UI</li>
+ <li>Andrew Xia – fair scheduler (lead), metrics system, and UI improvements</li>
+ <li>Reynold Xin – shuffle improvements, bug fixes, code refactoring, usability improvements, and MLlib contributions</li>
+ <li>Matei Zaharia – MLlib contributions, documentation, examples, UI improvements, PySpark improvements, and bug fixes</li>
+ <li>Wu Zeming – bug fix in scheduler</li>
+ <li>Bill Zhao – log message improvement</li>
</ul>
<p>Thanks to everyone who contributed!
diff --git a/site/screencasts/3-transformations-and-caching.html b/site/screencasts/3-transformations-and-caching.html
index 52f039de7..e4010148c 100644
--- a/site/screencasts/3-transformations-and-caching.html
+++ b/site/screencasts/3-transformations-and-caching.html
@@ -1,127 +1,3 @@
-<!DOCTYPE html>
-<!--[if IE 6]>
-<html id="ie6" dir="ltr" lang="en-US">
-<![endif]-->
-<!--[if IE 7]>
-<html id="ie7" dir="ltr" lang="en-US">
-<![endif]-->
-<!--[if IE 8]>
-<html id="ie8" dir="ltr" lang="en-US">
-<![endif]-->
-<!--[if !(IE 6) | !(IE 7) | !(IE 8) ]><!-->
-<html dir="ltr" lang="en-US">
-<!--<![endif]-->
-<head>
- <link rel="shortcut icon" href="/favicon.ico" />
- <meta charset="UTF-8" />
- <meta name="viewport" content="width=device-width" />
- <title>
- Transformations and Caching - Spark Screencast #3 | Apache Spark
-
- </title>
-
- <link rel="stylesheet" type="text/css" media="all" href="/css/style.css" />
- <link rel="stylesheet" href="/css/pygments-default.css">
-
- <script type="text/javascript">
- <!-- Google Analytics initialization -->
- var _gaq = _gaq || [];
- _gaq.push(['_setAccount', 'UA-32518208-2']);
- _gaq.push(['_trackPageview']);
- (function() {
- var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
- ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
- var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
- })();
-
- <!-- Adds slight delay to links to allow async reporting -->
- function trackOutboundLink(link, category, action) {
- try {
- _gaq.push(['_trackEvent', category , action]);
- } catch(err){}
-
- setTimeout(function() {
- document.location.href = link.href;
- }, 100);
- }
- </script>
-
- <link rel='canonical' href='/index.html' />
-
- <style type="text/css">
- #site-title,
- #site-description {
- position: absolute !important;
- clip: rect(1px 1px 1px 1px); /* IE6, IE7 */
- clip: rect(1px, 1px, 1px, 1px);
- }
- </style>
- <style type="text/css" id="custom-background-css">
- body.custom-background { background-color: #f1f1f1; }
- </style>
-</head>
-
-<!--body class="page singular"-->
-<body class="singular">
-<div id="page" class="hfeed">
-
- <header id="branding" role="banner">
- <hgroup>
- <h1 id="site-title"><span><a href="/" title="Spark" rel="home">Spark</a></span></h1>
- <h2 id="site-description">Lightning-Fast Cluster Computing</h2>
- </hgroup>
-
- <a href="/">
- <img src="/images/spark-project-header1.png" width="1000" height="220" alt="Spark: Lightning-Fast Cluster Computing" title="Spark: Lightning-Fast Cluster Computing" />
- </a>
-
- <nav id="access" role="navigation">
- <h3 class="assistive-text">Main menu</h3>
- <div class="menu-main-menu-container">
- <ul id="menu-main-menu" class="menu">
-
- <li class="menu-item menu-item-type-post_type menu-item-object-page ">
- <a href="/index.html">Home</a>
- </li>
-
- <li class="menu-item menu-item-type-post_type menu-item-object-page ">
- <a href="/downloads.html">Downloads</a>
- </li>
-
- <li class="menu-item menu-item-type-post_type menu-item-object-page ">
- <a href="/documentation.html">Documentation</a>
- </li>
-
- <li class="menu-item menu-item-type-post_type menu-item-object-page ">
- <a href="/examples.html">Examples</a>
- </li>
-
- <li class="menu-item menu-item-type-post_type menu-item-object-page ">
- <a href="/mailing-lists.html">Mailing Lists</a>
- </li>
-
- <li class="menu-item menu-item-type-post_type menu-item-object-page ">
- <a href="/research.html">Research</a>
- </li>
-
- <li class="menu-item menu-item-type-post_type menu-item-object-page ">
- <a href="/faq.html">FAQ</a>
- </li>
-
- </ul></div>
- </nav><!-- #access -->
-</header><!-- #branding -->
-
-
-
- <div id="main">
- <div id="primary">
- <div id="content" role="main">
-
- <article class="page type-page status-publish hentry">
- <h2>Transformations and Caching - Spark Screencast #3</h2>
-
-
<p>In this third Spark screencast, we demonstrate more advanced use of RDD actions and transformations, as well as caching RDDs in memory.</p>
<div class="video-container video-square shadow"><iframe width="755" height="705" src="http://www.youtube.com/embed/T1lZcimvL18?autohide=0&amp;showinfo=0" frameborder="0" allowfullscreen=""></iframe></div>
@@ -129,26 +5,3 @@
<p>Check out the next Spark screencast in the series, <a href="/screencasts/4-a-standalone-job-in-spark.html">Spark Screencast #4 - A Standalone Job in Scala</a>.</p>
<p>For more information and links to other Spark screencasts, check out the <a href="/documentation.html">Spark documentation page</a>.</p>
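The transformations-and-caching workflow this screencast describes can be sketched as a short Spark-shell session. This is an illustrative example, not a transcript of the video: the file name is a stand-in, and `sc` is the `SparkContext` that the Spark shell predefines.

```scala
// Illustrative Spark-shell session (Spark 0.8-era API); `sc` is predefined by the shell.
// "README.md" is a hypothetical stand-in for any local text file.
val lines = sc.textFile("README.md")        // RDD[String]
val words = lines.flatMap(_.split(" "))     // transformation: lazy, nothing runs yet
val nonEmpty = words.filter(_.nonEmpty)     // another lazy transformation
nonEmpty.cache()                            // mark the RDD to be kept in memory
println(nonEmpty.count())                   // action: triggers computation and caching
println(nonEmpty.count())                   // action: served from the in-memory cache
```

The point of caching shows in the last two lines: the second `count()` is fast because `cache()` kept the computed partitions in memory instead of recomputing them from the file.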
-
- </article><!-- #post -->
-
- </div><!-- #content -->
-
- <footer id="colophon" role="contentinfo">
- <div id="site-generator">
- <p style="padding-top: 0; padding-bottom: 15px;">
- Apache Spark is an effort undergoing incubation at The Apache Software Foundation.
- <a href="http://incubator.apache.org/" style="border: none;">
- <img style="vertical-align: middle; border: none;" src="/images/incubator-logo.png" alt="Apache Incubator" title="Apache Incubator" />
- </a>
- </p>
- </div>
-</footer><!-- #colophon -->
-
- </div><!-- #primary -->
- </div><!-- #main -->
-</div><!-- #page -->
-
-
-</body>
-</html>
diff --git a/site/screencasts/4-a-standalone-job-in-spark.html b/site/screencasts/4-a-standalone-job-in-spark.html
index 5c84da4af..2ee574040 100644
--- a/site/screencasts/4-a-standalone-job-in-spark.html
+++ b/site/screencasts/4-a-standalone-job-in-spark.html
@@ -1,152 +1,5 @@
-<!DOCTYPE html>
-<!--[if IE 6]>
-<html id="ie6" dir="ltr" lang="en-US">
-<![endif]-->
-<!--[if IE 7]>
-<html id="ie7" dir="ltr" lang="en-US">
-<![endif]-->
-<!--[if IE 8]>
-<html id="ie8" dir="ltr" lang="en-US">
-<![endif]-->
-<!--[if !(IE 6) | !(IE 7) | !(IE 8) ]><!-->
-<html dir="ltr" lang="en-US">
-<!--<![endif]-->
-<head>
- <link rel="shortcut icon" href="/favicon.ico" />
- <meta charset="UTF-8" />
- <meta name="viewport" content="width=device-width" />
- <title>
- A Standalone Job in Scala - Spark Screencast #4 | Apache Spark
-
- </title>
-
- <link rel="stylesheet" type="text/css" media="all" href="/css/style.css" />
- <link rel="stylesheet" href="/css/pygments-default.css">
-
- <script type="text/javascript">
- <!-- Google Analytics initialization -->
- var _gaq = _gaq || [];
- _gaq.push(['_setAccount', 'UA-32518208-2']);
- _gaq.push(['_trackPageview']);
- (function() {
- var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
- ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
- var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
- })();
-
- <!-- Adds slight delay to links to allow async reporting -->
- function trackOutboundLink(link, category, action) {
- try {
- _gaq.push(['_trackEvent', category , action]);
- } catch(err){}
-
- setTimeout(function() {
- document.location.href = link.href;
- }, 100);
- }
- </script>
-
- <link rel='canonical' href='/index.html' />
-
- <style type="text/css">
- #site-title,
- #site-description {
- position: absolute !important;
- clip: rect(1px 1px 1px 1px); /* IE6, IE7 */
- clip: rect(1px, 1px, 1px, 1px);
- }
- </style>
- <style type="text/css" id="custom-background-css">
- body.custom-background { background-color: #f1f1f1; }
- </style>
-</head>
-
-<!--body class="page singular"-->
-<body class="singular">
-<div id="page" class="hfeed">
-
- <header id="branding" role="banner">
- <hgroup>
- <h1 id="site-title"><span><a href="/" title="Spark" rel="home">Spark</a></span></h1>
- <h2 id="site-description">Lightning-Fast Cluster Computing</h2>
- </hgroup>
-
- <a href="/">
- <img src="/images/spark-project-header1.png" width="1000" height="220" alt="Spark: Lightning-Fast Cluster Computing" title="Spark: Lightning-Fast Cluster Computing" />
- </a>
-
- <nav id="access" role="navigation">
- <h3 class="assistive-text">Main menu</h3>
- <div class="menu-main-menu-container">
- <ul id="menu-main-menu" class="menu">
-
- <li class="menu-item menu-item-type-post_type menu-item-object-page ">
- <a href="/index.html">Home</a>
- </li>
-
- <li class="menu-item menu-item-type-post_type menu-item-object-page ">
- <a href="/downloads.html">Downloads</a>
- </li>
-
- <li class="menu-item menu-item-type-post_type menu-item-object-page ">
- <a href="/documentation.html">Documentation</a>
- </li>
-
- <li class="menu-item menu-item-type-post_type menu-item-object-page ">
- <a href="/examples.html">Examples</a>
- </li>
-
- <li class="menu-item menu-item-type-post_type menu-item-object-page ">
- <a href="/mailing-lists.html">Mailing Lists</a>
- </li>
-
- <li class="menu-item menu-item-type-post_type menu-item-object-page ">
- <a href="/research.html">Research</a>
- </li>
-
- <li class="menu-item menu-item-type-post_type menu-item-object-page ">
- <a href="/faq.html">FAQ</a>
- </li>
-
- </ul></div>
- </nav><!-- #access -->
-</header><!-- #branding -->
-
-
-
- <div id="main">
- <div id="primary">
- <div id="content" role="main">
-
- <article class="page type-page status-publish hentry">
- <h2>A Standalone Job in Scala - Spark Screencast #4</h2>
-
-
<p>In this Spark screencast, we create a standalone Apache Spark job in Scala. In the job, we create a spark context and read a file into an RDD of strings; then apply transformations and actions to the RDD and print out the results.</p>
<div class="video-container video-16x9 shadow"><iframe width="755" height="425" src="http://www.youtube.com/embed/GaBn-YjlR8Q?autohide=0&amp;showinfo=0" frameborder="0" allowfullscreen=""></iframe></div>
<p>For more information and links to other Spark screencasts, check out the <a href="/documentation.html">Spark documentation page</a>.</p>
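A standalone job of the kind described above might look roughly like the following. This is a hedged sketch, not the code from the video: the object name, file name, and `"local"` master URL are illustrative, and it assumes the Spark 0.8-era Scala API in which a job constructs its own `SparkContext`.

```scala
// Illustrative standalone Spark job (0.8-era API); names and paths are hypothetical.
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._

object SimpleJob {
  def main(args: Array[String]) {
    // "local" runs Spark in-process; a cluster master URL would go here instead
    val sc = new SparkContext("local", "Simple Job")
    val lines = sc.textFile("data.txt")          // read a file into an RDD of strings
    val lineLengths = lines.map(_.length)        // transformation: lazy
    val totalChars = lineLengths.reduce(_ + _)   // action: computes the result
    println("Total characters: " + totalChars)   // print the result
  }
}
```

Such a job would typically be packaged and launched with a build tool such as sbt rather than typed into the shell.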
-
- </article><!-- #post -->
-
- </div><!-- #content -->
-
- <footer id="colophon" role="contentinfo">
- <div id="site-generator">
- <p style="padding-top: 0; padding-bottom: 15px;">
- Apache Spark is an effort undergoing incubation at The Apache Software Foundation.
- <a href="http://incubator.apache.org/" style="border: none;">
- <img style="vertical-align: middle; border: none;" src="/images/incubator-logo.png" alt="Apache Incubator" title="Apache Incubator" />
- </a>
- </p>
- </div>
-</footer><!-- #colophon -->
-
- </div><!-- #primary -->
- </div><!-- #main -->
-</div><!-- #page -->
-
-
-</body>
-</html>
diff --git a/site/screencasts/index.html b/site/screencasts/index.html
index c2ad85b99..e05eb0791 100644
--- a/site/screencasts/index.html
+++ b/site/screencasts/index.html
@@ -122,7 +122,7 @@
<article class="hentry">
<header class="entry-header">
- <h1 class="entry-title"><a href="/screencasts/4-a-standalone-job-in-spark.html">A Standalone Job in Scala - Spark Screencast #4</a></h1>
+ <h1 class="entry-title"><a href="/screencasts/4-a-standalone-job-in-spark.html">4 A Standalone Job In Spark</a></h1>
<div class="entry-meta">August 26, 2013</div>
</header>
<div class="entry-content"><p>In this Spark screencast, we create a standalone Apache Spark job in Scala. In the job, we create a spark context and read a file into an RDD of strings; then apply transformations and actions to the RDD and print out the results.</p>
@@ -132,7 +132,7 @@
<article class="hentry">
<header class="entry-header">
- <h1 class="entry-title"><a href="/screencasts/3-transformations-and-caching.html">Transformations and Caching - Spark Screencast #3</a></h1>
+ <h1 class="entry-title"><a href="/screencasts/3-transformations-and-caching.html">3 Transformations And Caching</a></h1>
<div class="entry-meta">April 16, 2013</div>
</header>
<div class="entry-content"><p>In this third Spark screencast, we demonstrate more advanced use of RDD actions and transformations, as well as caching RDDs in memory.</p>