path: root/site/releases
author    Patrick Wendell <pwendell@apache.org>  2014-02-03 08:21:13 +0000
committer Patrick Wendell <pwendell@apache.org>  2014-02-03 08:21:13 +0000
commit    4d3c3750df6af496afbc5f8ec2a23514327e1f8d (patch)
tree      fed6724bf02ef2eae916111ad68eb2df34d26799 /site/releases
parent    ffa89cfbb0887db1e53b4e774359003daae7a62c (diff)
download  spark-website-4d3c3750df6af496afbc5f8ec2a23514327e1f8d.tar.gz
          spark-website-4d3c3750df6af496afbc5f8ec2a23514327e1f8d.tar.bz2
          spark-website-4d3c3750df6af496afbc5f8ec2a23514327e1f8d.zip
Updates for Spark 0.9.0 release.
Diffstat (limited to 'site/releases')
-rw-r--r--  site/releases/spark-release-0-3.html      8
-rw-r--r--  site/releases/spark-release-0-5-0.html   14
-rw-r--r--  site/releases/spark-release-0-5-1.html    8
-rw-r--r--  site/releases/spark-release-0-5-2.html    6
-rw-r--r--  site/releases/spark-release-0-6-0.html   12
-rw-r--r--  site/releases/spark-release-0-6-1.html    6
-rw-r--r--  site/releases/spark-release-0-6-2.html    6
-rw-r--r--  site/releases/spark-release-0-7-0.html   10
-rw-r--r--  site/releases/spark-release-0-7-2.html    6
-rw-r--r--  site/releases/spark-release-0-7-3.html    6
-rw-r--r--  site/releases/spark-release-0-8-0.html  144
-rw-r--r--  site/releases/spark-release-0-8-1.html   92
-rw-r--r--  site/releases/spark-release-0-9-0.html  450
13 files changed, 609 insertions(+), 159 deletions(-)
diff --git a/site/releases/spark-release-0-3.html b/site/releases/spark-release-0-3.html
index 6519b0218..6ad2ff970 100644
--- a/site/releases/spark-release-0-3.html
+++ b/site/releases/spark-release-0-3.html
@@ -124,6 +124,9 @@
<h5>Latest News</h5>
<ul class="list-unstyled">
+ <li><a href="/news/spark-0-9-0-released.html">Spark 0.9.0 released</a>
+ <span class="small">(Feb 02, 2014)</span></li>
+
<li><a href="/news/spark-0-8-1-released.html">Spark 0.8.1 released</a>
<span class="small">(Dec 19, 2013)</span></li>
@@ -133,9 +136,6 @@
<li><a href="/news/announcing-the-first-spark-summit.html">Announcing the first Spark Summit: December 2, 2013</a>
<span class="small">(Oct 08, 2013)</span></li>
- <li><a href="/news/spark-0-8-0-released.html">Spark 0.8.0 released</a>
- <span class="small">(Sep 25, 2013)</span></li>
-
</ul>
<p class="small" style="text-align: right;"><a href="/news/index.html">Archive</a></p>
</div>
@@ -176,7 +176,7 @@
<h3>Native Types for SequenceFiles</h3>
-<p>In working with SequenceFiles, which store objects that implement Hadoop’s Writable interface, Spark will now let you use native types for certain common Writable types, like IntWritable and Text. For example:</p>
+<p>In working with SequenceFiles, which store objects that implement Hadoop&#8217;s Writable interface, Spark will now let you use native types for certain common Writable types, like IntWritable and Text. For example:</p>
<div class="code">
<span class="comment">// Will read a SequenceFile of (IntWritable, Text)</span><br />
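The snippet quoted in this hunk is truncated by the diff context. A minimal sketch of what the native-type SequenceFile read looks like (an assumption, not taken from the page: the path and the `local` master are placeholders, and this uses the post-0.8 `org.apache.spark` package name rather than the 0.3-era one):

```scala
// Sketch only: implicit conversions in SparkContext let you request
// native Int/String instead of IntWritable/Text when reading a
// SequenceFile. The path below is a placeholder.
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._

val sc = new SparkContext("local", "sequencefile-example")
// Will read a SequenceFile of (IntWritable, Text) as (Int, String)
val pairs = sc.sequenceFile[Int, String]("hdfs://example/path")
```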
diff --git a/site/releases/spark-release-0-5-0.html b/site/releases/spark-release-0-5-0.html
index ce64ef7bd..16a1044f1 100644
--- a/site/releases/spark-release-0-5-0.html
+++ b/site/releases/spark-release-0-5-0.html
@@ -124,6 +124,9 @@
<h5>Latest News</h5>
<ul class="list-unstyled">
+ <li><a href="/news/spark-0-9-0-released.html">Spark 0.9.0 released</a>
+ <span class="small">(Feb 02, 2014)</span></li>
+
<li><a href="/news/spark-0-8-1-released.html">Spark 0.8.1 released</a>
<span class="small">(Dec 19, 2013)</span></li>
@@ -133,9 +136,6 @@
<li><a href="/news/announcing-the-first-spark-summit.html">Announcing the first Spark Summit: December 2, 2013</a>
<span class="small">(Oct 08, 2013)</span></li>
- <li><a href="/news/spark-0-8-0-released.html">Spark 0.8.0 released</a>
- <span class="small">(Sep 25, 2013)</span></li>
-
</ul>
<p class="small" style="text-align: right;"><a href="/news/index.html">Archive</a></p>
</div>
@@ -164,10 +164,10 @@
<h3>Mesos 0.9 Support</h3>
-<p>This release runs on <a href="http://www.mesosproject.org/">Apache Mesos 0.9</a>, the first Apache Incubator release of Mesos, which contains significant usability and stability improvements. Most notable are better memory accounting for applications with long-term memory use, easier access of old jobs’ traces and logs (by keeping a history of executed tasks on the web UI), and simpler installation.</p>
+<p>This release runs on <a href="http://www.mesosproject.org/">Apache Mesos 0.9</a>, the first Apache Incubator release of Mesos, which contains significant usability and stability improvements. Most notable are better memory accounting for applications with long-term memory use, easier access of old jobs&#8217; traces and logs (by keeping a history of executed tasks on the web UI), and simpler installation.</p>
<h3>Performance Improvements</h3>
-<p>Spark’s scheduling is more communication-efficient when sending out operations on RDDs with large lineage graphs. In addition, the cache replacement policy has been improved to more smartly replace data when an RDD does not fit in the cache, shuffles are more efficient, and the serializer used for shipping closures is now configurable, making it possible to use faster libraries than Java serialization there.</p>
+<p>Spark&#8217;s scheduling is more communication-efficient when sending out operations on RDDs with large lineage graphs. In addition, the cache replacement policy has been improved to more smartly replace data when an RDD does not fit in the cache, shuffles are more efficient, and the serializer used for shipping closures is now configurable, making it possible to use faster libraries than Java serialization there.</p>
<h3>Debug Improvements</h3>
@@ -179,11 +179,11 @@
<h3>EC2 Launch Script Improvements</h3>
-<p>Spark’s EC2 launch scripts are now included in the main package, and have the ability to discover and use the latest Spark AMI automatically instead of launching a hardcoded machine image ID.</p>
+<p>Spark&#8217;s EC2 launch scripts are now included in the main package, and have the ability to discover and use the latest Spark AMI automatically instead of launching a hardcoded machine image ID.</p>
<h3>New Hadoop API Support</h3>
-<p>You can now use Spark to read and write data to storage formats in the new <tt>org.apache.mapreduce</tt> packages (the “new Hadoop” API). In addition, this release fixes an issue caused by a HDFS initialization bug in some recent versions of HDFS.</p>
+<p>You can now use Spark to read and write data to storage formats in the new <tt>org.apache.mapreduce</tt> packages (the &#8220;new Hadoop&#8221; API). In addition, this release fixes an issue caused by a HDFS initialization bug in some recent versions of HDFS.</p>
<p>
diff --git a/site/releases/spark-release-0-5-1.html b/site/releases/spark-release-0-5-1.html
index 56957b368..d3c33b39a 100644
--- a/site/releases/spark-release-0-5-1.html
+++ b/site/releases/spark-release-0-5-1.html
@@ -124,6 +124,9 @@
<h5>Latest News</h5>
<ul class="list-unstyled">
+ <li><a href="/news/spark-0-9-0-released.html">Spark 0.9.0 released</a>
+ <span class="small">(Feb 02, 2014)</span></li>
+
<li><a href="/news/spark-0-8-1-released.html">Spark 0.8.1 released</a>
<span class="small">(Dec 19, 2013)</span></li>
@@ -133,9 +136,6 @@
<li><a href="/news/announcing-the-first-spark-summit.html">Announcing the first Spark Summit: December 2, 2013</a>
<span class="small">(Oct 08, 2013)</span></li>
- <li><a href="/news/spark-0-8-0-released.html">Spark 0.8.0 released</a>
- <span class="small">(Sep 25, 2013)</span></li>
-
</ul>
<p class="small" style="text-align: right;"><a href="/news/index.html">Archive</a></p>
</div>
@@ -193,7 +193,7 @@
<h3>EC2 Improvements</h3>
-<p>Spark’s EC2 launch script now configures Spark’s memory limit automatically based on the machine’s available RAM.</p>
+<p>Spark&#8217;s EC2 launch script now configures Spark&#8217;s memory limit automatically based on the machine&#8217;s available RAM.</p>
<p>
diff --git a/site/releases/spark-release-0-5-2.html b/site/releases/spark-release-0-5-2.html
index c61ed4653..b3eefbfb4 100644
--- a/site/releases/spark-release-0-5-2.html
+++ b/site/releases/spark-release-0-5-2.html
@@ -124,6 +124,9 @@
<h5>Latest News</h5>
<ul class="list-unstyled">
+ <li><a href="/news/spark-0-9-0-released.html">Spark 0.9.0 released</a>
+ <span class="small">(Feb 02, 2014)</span></li>
+
<li><a href="/news/spark-0-8-1-released.html">Spark 0.8.1 released</a>
<span class="small">(Dec 19, 2013)</span></li>
@@ -133,9 +136,6 @@
<li><a href="/news/announcing-the-first-spark-summit.html">Announcing the first Spark Summit: December 2, 2013</a>
<span class="small">(Oct 08, 2013)</span></li>
- <li><a href="/news/spark-0-8-0-released.html">Spark 0.8.0 released</a>
- <span class="small">(Sep 25, 2013)</span></li>
-
</ul>
<p class="small" style="text-align: right;"><a href="/news/index.html">Archive</a></p>
</div>
diff --git a/site/releases/spark-release-0-6-0.html b/site/releases/spark-release-0-6-0.html
index 3c047ff8c..c5e36a6b5 100644
--- a/site/releases/spark-release-0-6-0.html
+++ b/site/releases/spark-release-0-6-0.html
@@ -124,6 +124,9 @@
<h5>Latest News</h5>
<ul class="list-unstyled">
+ <li><a href="/news/spark-0-9-0-released.html">Spark 0.9.0 released</a>
+ <span class="small">(Feb 02, 2014)</span></li>
+
<li><a href="/news/spark-0-8-1-released.html">Spark 0.8.1 released</a>
<span class="small">(Dec 19, 2013)</span></li>
@@ -133,9 +136,6 @@
<li><a href="/news/announcing-the-first-spark-summit.html">Announcing the first Spark Summit: December 2, 2013</a>
<span class="small">(Oct 08, 2013)</span></li>
- <li><a href="/news/spark-0-8-0-released.html">Spark 0.8.0 released</a>
- <span class="small">(Sep 25, 2013)</span></li>
-
</ul>
<p class="small" style="text-align: right;"><a href="/news/index.html">Archive</a></p>
</div>
@@ -172,11 +172,11 @@
<h3>Java API</h3>
-<p>Java programmers can now use Spark through a new <a href="/docs/0.6.0/java-programming-guide.html">Java API layer</a>. This layer makes available all of Spark’s features, including parallel transformations, distributed datasets, broadcast variables, and accumulators, in a Java-friendly manner.</p>
+<p>Java programmers can now use Spark through a new <a href="/docs/0.6.0/java-programming-guide.html">Java API layer</a>. This layer makes available all of Spark&#8217;s features, including parallel transformations, distributed datasets, broadcast variables, and accumulators, in a Java-friendly manner.</p>
<h3>Expanded Documentation</h3>
-<p>Spark’s <a href="/docs/0.6.0/">documentation</a> has been expanded with a new <a href="/docs/0.6.0/quick-start.html">quick start guide</a>, additional deployment instructions, configuration guide, tuning guide, and improved <a href="/docs/0.6.0/api/core">Scaladoc</a> API documentation.</p>
+<p>Spark&#8217;s <a href="/docs/0.6.0/">documentation</a> has been expanded with a new <a href="/docs/0.6.0/quick-start.html">quick start guide</a>, additional deployment instructions, configuration guide, tuning guide, and improved <a href="/docs/0.6.0/api/core">Scaladoc</a> API documentation.</p>
<h3>Engine Changes</h3>
@@ -199,7 +199,7 @@
<h3>Enhanced Debugging</h3>
-<p>Spark’s log now prints which operation in your program each RDD and job described in your logs belongs to, making it easier to tie back to which parts of your code experience problems.</p>
+<p>Spark&#8217;s log now prints which operation in your program each RDD and job described in your logs belongs to, making it easier to tie back to which parts of your code experience problems.</p>
<h3>Maven Artifacts</h3>
diff --git a/site/releases/spark-release-0-6-1.html b/site/releases/spark-release-0-6-1.html
index 293612b03..13436907b 100644
--- a/site/releases/spark-release-0-6-1.html
+++ b/site/releases/spark-release-0-6-1.html
@@ -124,6 +124,9 @@
<h5>Latest News</h5>
<ul class="list-unstyled">
+ <li><a href="/news/spark-0-9-0-released.html">Spark 0.9.0 released</a>
+ <span class="small">(Feb 02, 2014)</span></li>
+
<li><a href="/news/spark-0-8-1-released.html">Spark 0.8.1 released</a>
<span class="small">(Dec 19, 2013)</span></li>
@@ -133,9 +136,6 @@
<li><a href="/news/announcing-the-first-spark-summit.html">Announcing the first Spark Summit: December 2, 2013</a>
<span class="small">(Oct 08, 2013)</span></li>
- <li><a href="/news/spark-0-8-0-released.html">Spark 0.8.0 released</a>
- <span class="small">(Sep 25, 2013)</span></li>
-
</ul>
<p class="small" style="text-align: right;"><a href="/news/index.html">Archive</a></p>
</div>
diff --git a/site/releases/spark-release-0-6-2.html b/site/releases/spark-release-0-6-2.html
index 624ac16b7..29e59dbc4 100644
--- a/site/releases/spark-release-0-6-2.html
+++ b/site/releases/spark-release-0-6-2.html
@@ -124,6 +124,9 @@
<h5>Latest News</h5>
<ul class="list-unstyled">
+ <li><a href="/news/spark-0-9-0-released.html">Spark 0.9.0 released</a>
+ <span class="small">(Feb 02, 2014)</span></li>
+
<li><a href="/news/spark-0-8-1-released.html">Spark 0.8.1 released</a>
<span class="small">(Dec 19, 2013)</span></li>
@@ -133,9 +136,6 @@
<li><a href="/news/announcing-the-first-spark-summit.html">Announcing the first Spark Summit: December 2, 2013</a>
<span class="small">(Oct 08, 2013)</span></li>
- <li><a href="/news/spark-0-8-0-released.html">Spark 0.8.0 released</a>
- <span class="small">(Sep 25, 2013)</span></li>
-
</ul>
<p class="small" style="text-align: right;"><a href="/news/index.html">Archive</a></p>
</div>
diff --git a/site/releases/spark-release-0-7-0.html b/site/releases/spark-release-0-7-0.html
index 41f62b97f..d88b2859e 100644
--- a/site/releases/spark-release-0-7-0.html
+++ b/site/releases/spark-release-0-7-0.html
@@ -124,6 +124,9 @@
<h5>Latest News</h5>
<ul class="list-unstyled">
+ <li><a href="/news/spark-0-9-0-released.html">Spark 0.9.0 released</a>
+ <span class="small">(Feb 02, 2014)</span></li>
+
<li><a href="/news/spark-0-8-1-released.html">Spark 0.8.1 released</a>
<span class="small">(Dec 19, 2013)</span></li>
@@ -133,9 +136,6 @@
<li><a href="/news/announcing-the-first-spark-summit.html">Announcing the first Spark Summit: December 2, 2013</a>
<span class="small">(Oct 08, 2013)</span></li>
- <li><a href="/news/spark-0-8-0-released.html">Spark 0.8.0 released</a>
- <span class="small">(Sep 25, 2013)</span></li>
-
</ul>
<p class="small" style="text-align: right;"><a href="/news/index.html">Archive</a></p>
</div>
@@ -186,7 +186,7 @@
<h3>New Operations</h3>
-<p>This release adds several RDD transformations, including <tt>keys</tt>, <tt>values</tt>, <tt>keyBy</tt>, <tt>subtract</tt>, <tt>coalesce</tt>, <tt>zip</tt>. It also adds <tt>SparkContext.hadoopConfiguration</tt> to allow programs to configure Hadoop input/output settings globally across operations. Finally, it adds the <tt>RDD.toDebugString()</tt> method, which can be used to print an RDD’s lineage graph for troubleshooting.</p>
+<p>This release adds several RDD transformations, including <tt>keys</tt>, <tt>values</tt>, <tt>keyBy</tt>, <tt>subtract</tt>, <tt>coalesce</tt>, <tt>zip</tt>. It also adds <tt>SparkContext.hadoopConfiguration</tt> to allow programs to configure Hadoop input/output settings globally across operations. Finally, it adds the <tt>RDD.toDebugString()</tt> method, which can be used to print an RDD&#8217;s lineage graph for troubleshooting.</p>
<h3>EC2 Improvements</h3>
@@ -223,7 +223,7 @@
<h3>Credits</h3>
-<p>Spark 0.7 was the work of many contributors from Berkeley and outside—in total, 31 different contributors, of which 20 were from outside Berkeley. Here are the people who contributed, along with areas they worked on:</p>
+<p>Spark 0.7 was the work of many contributors from Berkeley and outside&#8212;in total, 31 different contributors, of which 20 were from outside Berkeley. Here are the people who contributed, along with areas they worked on:</p>
<ul>
<li>Mikhail Bautin -- Maven build</li>
diff --git a/site/releases/spark-release-0-7-2.html b/site/releases/spark-release-0-7-2.html
index 19d55ece2..8099a0add 100644
--- a/site/releases/spark-release-0-7-2.html
+++ b/site/releases/spark-release-0-7-2.html
@@ -124,6 +124,9 @@
<h5>Latest News</h5>
<ul class="list-unstyled">
+ <li><a href="/news/spark-0-9-0-released.html">Spark 0.9.0 released</a>
+ <span class="small">(Feb 02, 2014)</span></li>
+
<li><a href="/news/spark-0-8-1-released.html">Spark 0.8.1 released</a>
<span class="small">(Dec 19, 2013)</span></li>
@@ -133,9 +136,6 @@
<li><a href="/news/announcing-the-first-spark-summit.html">Announcing the first Spark Summit: December 2, 2013</a>
<span class="small">(Oct 08, 2013)</span></li>
- <li><a href="/news/spark-0-8-0-released.html">Spark 0.8.0 released</a>
- <span class="small">(Sep 25, 2013)</span></li>
-
</ul>
<p class="small" style="text-align: right;"><a href="/news/index.html">Archive</a></p>
</div>
diff --git a/site/releases/spark-release-0-7-3.html b/site/releases/spark-release-0-7-3.html
index 20390155d..d97412f88 100644
--- a/site/releases/spark-release-0-7-3.html
+++ b/site/releases/spark-release-0-7-3.html
@@ -124,6 +124,9 @@
<h5>Latest News</h5>
<ul class="list-unstyled">
+ <li><a href="/news/spark-0-9-0-released.html">Spark 0.9.0 released</a>
+ <span class="small">(Feb 02, 2014)</span></li>
+
<li><a href="/news/spark-0-8-1-released.html">Spark 0.8.1 released</a>
<span class="small">(Dec 19, 2013)</span></li>
@@ -133,9 +136,6 @@
<li><a href="/news/announcing-the-first-spark-summit.html">Announcing the first Spark Summit: December 2, 2013</a>
<span class="small">(Oct 08, 2013)</span></li>
- <li><a href="/news/spark-0-8-0-released.html">Spark 0.8.0 released</a>
- <span class="small">(Sep 25, 2013)</span></li>
-
</ul>
<p class="small" style="text-align: right;"><a href="/news/index.html">Archive</a></p>
</div>
diff --git a/site/releases/spark-release-0-8-0.html b/site/releases/spark-release-0-8-0.html
index 1bb741885..b1abc7b57 100644
--- a/site/releases/spark-release-0-8-0.html
+++ b/site/releases/spark-release-0-8-0.html
@@ -124,6 +124,9 @@
<h5>Latest News</h5>
<ul class="list-unstyled">
+ <li><a href="/news/spark-0-9-0-released.html">Spark 0.9.0 released</a>
+ <span class="small">(Feb 02, 2014)</span></li>
+
<li><a href="/news/spark-0-8-1-released.html">Spark 0.8.1 released</a>
<span class="small">(Dec 19, 2013)</span></li>
@@ -133,9 +136,6 @@
<li><a href="/news/announcing-the-first-spark-summit.html">Announcing the first Spark Summit: December 2, 2013</a>
<span class="small">(Oct 08, 2013)</span></li>
- <li><a href="/news/spark-0-8-0-released.html">Spark 0.8.0 released</a>
- <span class="small">(Sep 25, 2013)</span></li>
-
</ul>
<p class="small" style="text-align: right;"><a href="/news/index.html">Archive</a></p>
</div>
@@ -204,13 +204,13 @@
<li>The examples build has been isolated from the core build, substantially reducing the potential for dependency conflicts.</li>
<li>The Spark Streaming Twitter API has been updated to use OAuth authentication instead of the deprecated username/password authentication in Spark 0.7.0.</li>
<li>Several new example jobs have been added, including PageRank implementations in Java, Scala and Python, examples for accessing HBase and Cassandra, and MLlib examples.</li>
- <li>Support for running on Mesos has been improved – now you can deploy a Spark assembly JAR as part of the Mesos job, instead of having Spark pre-installed on each machine. The default Mesos version has also been updated to 0.13.</li>
+ <li>Support for running on Mesos has been improved &#8211; now you can deploy a Spark assembly JAR as part of the Mesos job, instead of having Spark pre-installed on each machine. The default Mesos version has also been updated to 0.13.</li>
<li>This release includes various optimizations to PySpark and to the job scheduler.</li>
</ul>
<h3 id="compatibility">Compatibility</h3>
<ul>
- <li><strong>This release changes Spark’s package name to ‘org.apache.spark’</strong>, so those upgrading from Spark 0.7 will need to adjust their imports accordingly. In addition, we’ve moved the <code>RDD</code> class to the org.apache.spark.rdd package (it was previously in the top-level package). The Spark artifacts published through Maven have also changed to the new package name.</li>
+ <li><strong>This release changes Spark’s package name to &#8216;org.apache.spark&#8217;</strong>, so those upgrading from Spark 0.7 will need to adjust their imports accordingly. In addition, we’ve moved the <code>RDD</code> class to the org.apache.spark.rdd package (it was previously in the top-level package). The Spark artifacts published through Maven have also changed to the new package name.</li>
<li>In the Java API, use of Scala’s <code>Option</code> class has been replaced with <code>Optional</code> from the Guava library.</li>
<li>Linking against Spark for arbitrary Hadoop versions is now possible by specifying a dependency on <code>hadoop-client</code>, instead of rebuilding <code>spark-core</code> against your version of Hadoop. See the documentation <a href="http://spark.incubator.apache.org/docs/0.8.0/scala-programming-guide.html#linking-with-spark">here</a> for details.</li>
<li>If you are building Spark, you’ll now need to run <code>sbt/sbt assembly</code> instead of <code>package</code>.</li>
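The package rename described in the compatibility notes amounts to an import change for applications upgrading from 0.7, sketched here (comment-only, since the exact imports depend on what an application uses):

```scala
// Upgrading imports from Spark 0.7 to 0.8 (per the notes above):
//
//   Before (0.7)               After (0.8)
//   import spark.SparkContext  import org.apache.spark.SparkContext
//   import spark.RDD           import org.apache.spark.rdd.RDD
//
// RDD moved from the top-level package into org.apache.spark.rdd.
```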
@@ -220,73 +220,73 @@
<p>Spark 0.8.0 was the result of the largest team of contributors yet. The following developers contributed to this release:</p>
<ul>
- <li>Andrew Ash – documentation, code cleanup and logging improvements</li>
- <li>Mikhail Bautin – bug fix</li>
- <li>Konstantin Boudnik – Maven build, bug fixes, and documentation</li>
- <li>Ian Buss – sbt configuration improvement</li>
- <li>Evan Chan – API improvement, bug fix, and documentation</li>
- <li>Lian Cheng – bug fix</li>
- <li>Tathagata Das – performance improvement in streaming receiver and streaming bug fix</li>
- <li>Aaron Davidson – Python improvements, bug fix, and unit tests</li>
- <li>Giovanni Delussu – coalesced RDD feature</li>
- <li>Joseph E. Gonzalez – improvement to zipPartitions</li>
- <li>Karen Feng – several improvements to web UI</li>
- <li>Andy Feng – HDFS metrics</li>
- <li>Ali Ghodsi – configuration improvements and locality-aware coalesce</li>
- <li>Christoph Grothaus – bug fix</li>
- <li>Thomas Graves – support for secure YARN cluster and various YARN-related improvements</li>
- <li>Stephen Haberman – bug fix, documentation, and code cleanup</li>
- <li>Mark Hamstra – bug fixes and Maven build</li>
- <li>Benjamin Hindman – Mesos compatibility and documentation</li>
- <li>Liang-Chi Hsieh – bug fixes in build and in YARN mode</li>
- <li>Shane Huang – shuffle improvements, bug fix</li>
- <li>Ethan Jewett – Spark/HBase example</li>
- <li>Holden Karau – bug fix and EC2 improvement</li>
- <li>Kody Koeniger – JDBV RDD implementation</li>
- <li>Andy Konwinski – documentation</li>
- <li>Jey Kottalam – PySpark optimizations, Hadoop agnostic build (lead), and bug fixes</li>
- <li>Andrey Kouznetsov – Bug fix</li>
- <li>S. Kumar – Spark Streaming example</li>
- <li>Ryan LeCompte – topK method optimization and serialization improvements</li>
- <li>Gavin Li – compression codecs and pipe support</li>
- <li>Harold Lim – fair scheduler</li>
- <li>Dmitriy Lyubimov – bug fix</li>
- <li>Chris Mattmann – Apache mentor</li>
- <li>David McCauley – JSON API improvement</li>
- <li>Sean McNamara – added <code>takeOrdered</code> function, bug fixes, and a build fix</li>
- <li>Mridul Muralidharan – YARN integration (lead) and scheduler improvements</li>
- <li>Marc Mercer – improvements to UI json output</li>
- <li>Christopher Nguyen – bug fixes</li>
- <li>Erik van Oosten – example fix</li>
- <li>Kay Ousterhout – fix for scheduler regression and bug fixes</li>
- <li>Xinghao Pan – MLLib contributions</li>
- <li>Hiral Patel – bug fix</li>
- <li>James Phillpotts – updated Twitter API for Spark streaming</li>
- <li>Nick Pentreath – scala pageRank example, bagel improvement, and several Java examples</li>
- <li>Alexander Pivovarov – logging improvement and Maven build</li>
- <li>Mike Potts – configuration improvement</li>
- <li>Rohit Rai – Spark/Cassandra example</li>
- <li>Imran Rashid – bug fixes and UI improvement</li>
- <li>Charles Reiss – bug fixes, code cleanup, performance improvements</li>
- <li>Josh Rosen – Python API improvements, Java API improvements, EC2 scripts and bug fixes</li>
- <li>Henry Saputra – Apache mentor</li>
- <li>Jerry Shao – bug fixes, metrics system</li>
- <li>Prashant Sharma – documentation</li>
- <li>Mingfei Shi – joblogger and bug fix</li>
- <li>Andre Schumacher – several PySpark features</li>
- <li>Ginger Smith – MLLib contribution</li>
- <li>Evan Sparks – contributions to MLLib</li>
- <li>Ram Sriharsha – bug fix and RDD removal feature</li>
- <li>Ameet Talwalkar – MLlib contributions</li>
- <li>Roman Tkalenko – code refactoring and cleanup</li>
- <li>Chu Tong – Java PageRank algorithm and bug fix in bash scripts</li>
- <li>Shivaram Venkataraman – bug fixes, contributions to MLLib, netty shuffle fixes, and Java API additions</li>
- <li>Patrick Wendell – release manager, bug fixes, documentation, metrics system, and web UI</li>
- <li>Andrew Xia – fair scheduler (lead), metrics system, and ui improvements</li>
- <li>Reynold Xin – shuffle improvements, bug fixes, code refactoring, usability improvements, MLLib contributions</li>
- <li>Matei Zaharia – MLLib contributions, documentation, examples, UI improvements, PySpark improvements, and bug fixes</li>
- <li>Wu Zeming – bug fix in scheduler</li>
- <li>Bill Zhao – log message improvement</li>
+ <li>Andrew Ash &#8211; documentation, code cleanup and logging improvements</li>
+ <li>Mikhail Bautin &#8211; bug fix</li>
+ <li>Konstantin Boudnik &#8211; Maven build, bug fixes, and documentation</li>
+ <li>Ian Buss &#8211; sbt configuration improvement</li>
+ <li>Evan Chan &#8211; API improvement, bug fix, and documentation</li>
+ <li>Lian Cheng &#8211; bug fix</li>
+ <li>Tathagata Das &#8211; performance improvement in streaming receiver and streaming bug fix</li>
+ <li>Aaron Davidson &#8211; Python improvements, bug fix, and unit tests</li>
+ <li>Giovanni Delussu &#8211; coalesced RDD feature</li>
+ <li>Joseph E. Gonzalez &#8211; improvement to zipPartitions</li>
+ <li>Karen Feng &#8211; several improvements to web UI</li>
+ <li>Andy Feng &#8211; HDFS metrics</li>
+ <li>Ali Ghodsi &#8211; configuration improvements and locality-aware coalesce</li>
+ <li>Christoph Grothaus &#8211; bug fix</li>
+ <li>Thomas Graves &#8211; support for secure YARN cluster and various YARN-related improvements</li>
+ <li>Stephen Haberman &#8211; bug fix, documentation, and code cleanup</li>
+ <li>Mark Hamstra &#8211; bug fixes and Maven build</li>
+ <li>Benjamin Hindman &#8211; Mesos compatibility and documentation</li>
+ <li>Liang-Chi Hsieh &#8211; bug fixes in build and in YARN mode</li>
+ <li>Shane Huang &#8211; shuffle improvements, bug fix</li>
+ <li>Ethan Jewett &#8211; Spark/HBase example</li>
+ <li>Holden Karau &#8211; bug fix and EC2 improvement</li>
+ <li>Kody Koeniger &#8211; JDBV RDD implementation</li>
+ <li>Andy Konwinski &#8211; documentation</li>
+ <li>Jey Kottalam &#8211; PySpark optimizations, Hadoop agnostic build (lead), and bug fixes</li>
+ <li>Andrey Kouznetsov &#8211; Bug fix</li>
+ <li>S. Kumar &#8211; Spark Streaming example</li>
+ <li>Ryan LeCompte &#8211; topK method optimization and serialization improvements</li>
+ <li>Gavin Li &#8211; compression codecs and pipe support</li>
+ <li>Harold Lim &#8211; fair scheduler</li>
+ <li>Dmitriy Lyubimov &#8211; bug fix</li>
+ <li>Chris Mattmann &#8211; Apache mentor</li>
+ <li>David McCauley &#8211; JSON API improvement</li>
+ <li>Sean McNamara &#8211; added <code>takeOrdered</code> function, bug fixes, and a build fix</li>
+ <li>Mridul Muralidharan &#8211; YARN integration (lead) and scheduler improvements</li>
+ <li>Marc Mercer &#8211; improvements to UI json output</li>
+ <li>Christopher Nguyen &#8211; bug fixes</li>
+ <li>Erik van Oosten &#8211; example fix</li>
+ <li>Kay Ousterhout &#8211; fix for scheduler regression and bug fixes</li>
+ <li>Xinghao Pan &#8211; MLLib contributions</li>
+ <li>Hiral Patel &#8211; bug fix</li>
+ <li>James Phillpotts &#8211; updated Twitter API for Spark streaming</li>
+ <li>Nick Pentreath &#8211; scala pageRank example, bagel improvement, and several Java examples</li>
+ <li>Alexander Pivovarov &#8211; logging improvement and Maven build</li>
+ <li>Mike Potts &#8211; configuration improvement</li>
+ <li>Rohit Rai &#8211; Spark/Cassandra example</li>
+ <li>Imran Rashid &#8211; bug fixes and UI improvement</li>
+ <li>Charles Reiss &#8211; bug fixes, code cleanup, performance improvements</li>
+ <li>Josh Rosen &#8211; Python API improvements, Java API improvements, EC2 scripts and bug fixes</li>
+ <li>Henry Saputra &#8211; Apache mentor</li>
+ <li>Jerry Shao &#8211; bug fixes, metrics system</li>
+ <li>Prashant Sharma &#8211; documentation</li>
+ <li>Mingfei Shi &#8211; joblogger and bug fix</li>
+ <li>Andre Schumacher &#8211; several PySpark features</li>
+ <li>Ginger Smith &#8211; MLLib contribution</li>
+ <li>Evan Sparks &#8211; contributions to MLLib</li>
+ <li>Ram Sriharsha &#8211; bug fix and RDD removal feature</li>
+ <li>Ameet Talwalkar &#8211; MLlib contributions</li>
+ <li>Roman Tkalenko &#8211; code refactoring and cleanup</li>
+ <li>Chu Tong &#8211; Java PageRank algorithm and bug fix in bash scripts</li>
+ <li>Shivaram Venkataraman &#8211; bug fixes, contributions to MLLib, netty shuffle fixes, and Java API additions</li>
+ <li>Patrick Wendell &#8211; release manager, bug fixes, documentation, metrics system, and web UI</li>
+ <li>Andrew Xia &#8211; fair scheduler (lead), metrics system, and ui improvements</li>
+ <li>Reynold Xin &#8211; shuffle improvements, bug fixes, code refactoring, usability improvements, MLLib contributions</li>
+ <li>Matei Zaharia &#8211; MLLib contributions, documentation, examples, UI improvements, PySpark improvements, and bug fixes</li>
+ <li>Wu Zeming &#8211; bug fix in scheduler</li>
+ <li>Bill Zhao &#8211; log message improvement</li>
</ul>
<p>Thanks to everyone who contributed!
diff --git a/site/releases/spark-release-0-8-1.html b/site/releases/spark-release-0-8-1.html
index 56c1c1bc5..4f42ceacd 100644
--- a/site/releases/spark-release-0-8-1.html
+++ b/site/releases/spark-release-0-8-1.html
@@ -124,6 +124,9 @@
<h5>Latest News</h5>
<ul class="list-unstyled">
+ <li><a href="/news/spark-0-9-0-released.html">Spark 0.9.0 released</a>
+ <span class="small">(Feb 02, 2014)</span></li>
+
<li><a href="/news/spark-0-8-1-released.html">Spark 0.8.1 released</a>
<span class="small">(Dec 19, 2013)</span></li>
@@ -133,9 +136,6 @@
<li><a href="/news/announcing-the-first-spark-summit.html">Announcing the first Spark Summit: December 2, 2013</a>
<span class="small">(Oct 08, 2013)</span></li>
- <li><a href="/news/spark-0-8-0-released.html">Spark 0.8.0 released</a>
- <span class="small">(Sep 25, 2013)</span></li>
-
</ul>
<p class="small" style="text-align: right;"><a href="/news/index.html">Archive</a></p>
</div>
@@ -163,7 +163,7 @@
<p>Apache Spark 0.8.1 is a maintenance and performance release for the Scala 2.9 version of Spark. It also adds several new features, such as standalone mode high availability, that will appear in Spark 0.9 but developers wanted to have in Scala 2.9. Contributions to 0.8.1 came from 41 developers.</p>
<h3 id="yarn-22-support">YARN 2.2 Support</h3>
-<p>Support has been added for running Spark on YARN 2.2 and newer. Due to a change in the YARN API between previous versions and 2.2+, this was not supported in Spark 0.8.0. See the <a href="/docs/0.8.1/running-on-yarn.html">YARN documentation</a> for specific instructions on how to build Spark for YARN 2.2+. We’ve also included a pre-compiled binary for YARN 2.2.</p>
+<p>Support has been added for running Spark on YARN 2.2 and newer. Due to a change in the YARN API between previous versions and 2.2+, this was not supported in Spark 0.8.0. See the <a href="/docs/0.8.1/running-on-yarn.html">YARN documentation</a> for specific instructions on how to build Spark for YARN 2.2+. We&#8217;ve also included a pre-compiled binary for YARN 2.2.</p>
<h3 id="high-availability-mode-for-standalone-cluster-manager">High Availability Mode for Standalone Cluster Manager</h3>
<p>The standalone cluster manager now has a high availability (H/A) mode which can tolerate master failures. This is particularly useful for long-running applications such as streaming jobs and the shark server, where the scheduler master previously represented a single point of failure. Instructions for deploying H/A mode are included <a href="/docs/0.8.1/spark-standalone.html#high-availability">in the documentation</a>. The current implementation uses Zookeeper for coordination.</p>
@@ -174,7 +174,7 @@
<ul>
<li>Optimized hashtables for shuffle data - reduces memory and CPU consumption</li>
<li>Efficient encoding for JobConfs - improves latency for stages reading large numbers of blocks from HDFS, S3, and HBase</li>
- <li>Shuffle file consolidation (off by default) - reduces the number of files created in large shuffles for better filesystem performance. This change works best on filesystems newer than ext3 (we recommend ext4 or XFS), and it will be the default in Spark 0.9, but we’ve left it off by default for compatibility. We recommend users turn this on unless they are using ext3 by setting <code>spark.shuffle.consolidateFiles</code> to “true”.</li>
+ <li>Shuffle file consolidation (off by default) - reduces the number of files created in large shuffles for better filesystem performance. This change works best on filesystems newer than ext3 (we recommend ext4 or XFS), and it will be the default in Spark 0.9, but we’ve left it off by default for compatibility. We recommend users turn this on unless they are using ext3 by setting <code>spark.shuffle.consolidateFiles</code> to &#8220;true&#8221;.</li>
<li>Torrent broadcast (off by default) - a faster broadcast implementation for large objects.</li>
<li>Support for fetching large result sets - allows tasks to return large results without tuning Akka buffer sizes.</li>
</ul>
@@ -211,47 +211,47 @@
<h3 id="credits">Credits</h3>
<ul>
- <li>Michael Armbrust – build fix</li>
- <li>Pierre Borckmans – typo fix in documentation</li>
- <li>Evan Chan – <code>local://</code> scheme for dependency jars</li>
- <li>Ewen Cheslack-Postava – <code>add</code> method for python accumulators, support for setting config properties in python</li>
- <li>Mosharaf Chowdhury – optimized broadcast implementation</li>
- <li>Frank Dai – documentation fix</li>
- <li>Aaron Davidson – shuffle file consolidation, H/A mode for standalone scheduler, cleaned up representation of block IDs, several improvements and bug fixes</li>
- <li>Tathagata Das – new streaming operators, fix for kafka concurrency bug</li>
- <li>Ankur Dave – support for pausing spot clusters on EC2</li>
- <li>Harvey Feng – optimization to JobConf broadcasts, bug fixes, YARN 2.2 build</li>
- <li>Ali Ghodsi – YARN 2.2 build</li>
- <li>Thomas Graves – Spark YARN integration including secure HDFS access over YARN</li>
- <li>Li Guoqiang – fix for Maven build</li>
- <li>Stephen Haberman – bug fix</li>
- <li>Haidar Hadi – documentation fix</li>
- <li>Nathan Howell – bug fix relating to YARN</li>
- <li>Holden Karau – Java version of <code>mapPartitionsWithIndex</code></li>
- <li>Du Li – bug fix in make-distrubion.sh</li>
- <li>Raymond Liu – work on YARN 2.2 build</li>
- <li>Xi Liu – bug fix and code clean-up</li>
- <li>David McCauley – bug fix in standalone mode JSON output</li>
- <li>Michael (wannabeast) – bug fix in memory store</li>
- <li>Fabrizio Milo – typos in documentation, clean-up in DAGScheduler, typo in scaladoc</li>
- <li>Mridul Muralidharan – fixes to metadata cleaner and speculative execution</li>
- <li>Sundeep Narravula – build fix, bug fixes in scheduler and tests, code clean-up</li>
- <li>Kay Ousterhout – optimized result fetching, new information in UI, scheduler clean-up and bug fixes</li>
- <li>Nick Pentreath – implicit feedback variant of ALS algorithm</li>
- <li>Imran Rashid – improvement to executor launch</li>
- <li>Ahir Reddy – spark support for SIMR</li>
- <li>Josh Rosen – memory use optimization, clean up of BlockManager code, Java and Python clean-up/fixes</li>
- <li>Henry Saputra – build fix</li>
- <li>Jerry Shao – refactoring of fair scheduler, support for running Spark as a specific user, bug fix</li>
- <li>Mingfei Shi – documentation for JobLogger</li>
- <li>Andre Schumacher – sortByKey in PySpark and associated changes</li>
- <li>Karthik Tunga – bug fix in launch script</li>
- <li>Patrick Wendell – <code>repartition</code> operator, shuffle write metrics, various fixes and release management</li>
- <li>Neal Wiggins – import clean-up, documentation fixes</li>
- <li>Andrew Xia – bug fix in UI</li>
- <li>Reynold Xin – task killing, support for setting job properties in Spark shell, logging improvements, Kryo improvements, several bug fixes</li>
- <li>Matei Zaharia – optimized hashmap for shuffle data, PySpark documentation, optimizations to Kryo serializer</li>
- <li>Wu Zeming – bug fix in executors UI</li>
+ <li>Michael Armbrust &#8211; build fix</li>
+ <li>Pierre Borckmans &#8211; typo fix in documentation</li>
+ <li>Evan Chan &#8211; <code>local://</code> scheme for dependency jars</li>
+ <li>Ewen Cheslack-Postava &#8211; <code>add</code> method for python accumulators, support for setting config properties in python</li>
+ <li>Mosharaf Chowdhury &#8211; optimized broadcast implementation</li>
+ <li>Frank Dai &#8211; documentation fix</li>
+ <li>Aaron Davidson &#8211; shuffle file consolidation, H/A mode for standalone scheduler, cleaned up representation of block IDs, several improvements and bug fixes</li>
+ <li>Tathagata Das &#8211; new streaming operators, fix for kafka concurrency bug</li>
+ <li>Ankur Dave &#8211; support for pausing spot clusters on EC2</li>
+ <li>Harvey Feng &#8211; optimization to JobConf broadcasts, bug fixes, YARN 2.2 build</li>
+ <li>Ali Ghodsi &#8211; YARN 2.2 build</li>
+ <li>Thomas Graves &#8211; Spark YARN integration including secure HDFS access over YARN</li>
+ <li>Li Guoqiang &#8211; fix for Maven build</li>
+ <li>Stephen Haberman &#8211; bug fix</li>
+ <li>Haidar Hadi &#8211; documentation fix</li>
+ <li>Nathan Howell &#8211; bug fix relating to YARN</li>
+ <li>Holden Karau &#8211; Java version of <code>mapPartitionsWithIndex</code></li>
+ <li>Du Li &#8211; bug fix in make-distrubion.sh</li>
+ <li>Raymond Liu &#8211; work on YARN 2.2 build</li>
+ <li>Xi Liu &#8211; bug fix and code clean-up</li>
+ <li>David McCauley &#8211; bug fix in standalone mode JSON output</li>
+ <li>Michael (wannabeast) &#8211; bug fix in memory store</li>
+ <li>Fabrizio Milo &#8211; typos in documentation, clean-up in DAGScheduler, typo in scaladoc</li>
+ <li>Mridul Muralidharan &#8211; fixes to metadata cleaner and speculative execution</li>
+ <li>Sundeep Narravula &#8211; build fix, bug fixes in scheduler and tests, code clean-up</li>
+ <li>Kay Ousterhout &#8211; optimized result fetching, new information in UI, scheduler clean-up and bug fixes</li>
+ <li>Nick Pentreath &#8211; implicit feedback variant of ALS algorithm</li>
+ <li>Imran Rashid &#8211; improvement to executor launch</li>
+ <li>Ahir Reddy &#8211; spark support for SIMR</li>
+ <li>Josh Rosen &#8211; memory use optimization, clean up of BlockManager code, Java and Python clean-up/fixes</li>
+ <li>Henry Saputra &#8211; build fix</li>
+ <li>Jerry Shao &#8211; refactoring of fair scheduler, support for running Spark as a specific user, bug fix</li>
+ <li>Mingfei Shi &#8211; documentation for JobLogger</li>
+ <li>Andre Schumacher &#8211; sortByKey in PySpark and associated changes</li>
+ <li>Karthik Tunga &#8211; bug fix in launch script</li>
+ <li>Patrick Wendell &#8211; <code>repartition</code> operator, shuffle write metrics, various fixes and release management</li>
+ <li>Neal Wiggins &#8211; import clean-up, documentation fixes</li>
+ <li>Andrew Xia &#8211; bug fix in UI</li>
+ <li>Reynold Xin &#8211; task killing, support for setting job properties in Spark shell, logging improvements, Kryo improvements, several bug fixes</li>
+ <li>Matei Zaharia &#8211; optimized hashmap for shuffle data, PySpark documentation, optimizations to Kryo serializer</li>
+ <li>Wu Zeming &#8211; bug fix in executors UI</li>
</ul>
<p>Thanks to everyone who contributed!</p>
diff --git a/site/releases/spark-release-0-9-0.html b/site/releases/spark-release-0-9-0.html
new file mode 100644
index 000000000..06bb365b8
--- /dev/null
+++ b/site/releases/spark-release-0-9-0.html
@@ -0,0 +1,450 @@
+<!DOCTYPE html>
+<html lang="en">
+<head>
+ <meta charset="utf-8">
+ <meta http-equiv="X-UA-Compatible" content="IE=edge">
+ <meta name="viewport" content="width=device-width, initial-scale=1.0">
+
+ <title>
+ Spark Release 0.9.0 | Apache Spark
+
+ </title>
+
+
+
+ <!-- Bootstrap core CSS -->
+ <link href="/css/cerulean.min.css" rel="stylesheet">
+ <link href="/css/custom.css" rel="stylesheet">
+
+ <script type="text/javascript">
+ <!-- Google Analytics initialization -->
+ var _gaq = _gaq || [];
+ _gaq.push(['_setAccount', 'UA-32518208-2']);
+ _gaq.push(['_trackPageview']);
+ (function() {
+ var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
+ ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
+ var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
+ })();
+
+ <!-- Adds slight delay to links to allow async reporting -->
+ function trackOutboundLink(link, category, action) {
+ try {
+ _gaq.push(['_trackEvent', category , action]);
+ } catch(err){}
+
+ setTimeout(function() {
+ document.location.href = link.href;
+ }, 100);
+ }
+ </script>
+
+ <!-- HTML5 shim and Respond.js IE8 support of HTML5 elements and media queries -->
+ <!--[if lt IE 9]>
+ <script src="https://oss.maxcdn.com/libs/html5shiv/3.7.0/html5shiv.js"></script>
+ <script src="https://oss.maxcdn.com/libs/respond.js/1.3.0/respond.min.js"></script>
+ <![endif]-->
+</head>
+
+<body>
+
+<div class="container" style="max-width: 1200px;">
+
+<div class="masthead">
+
+ <p class="lead">
+ <a href="/">
+ <img src="/images/spark-logo.png"
+ style="height:100px; width:auto; vertical-align: bottom; margin-top: 20px;"></a><span class="tagline">
+ Lightning-fast cluster computing
+ </span>
+ </p>
+
+</div>
+
+<nav class="navbar navbar-default" role="navigation">
+ <!-- Brand and toggle get grouped for better mobile display -->
+ <div class="navbar-header">
+ <button type="button" class="navbar-toggle" data-toggle="collapse"
+ data-target="#navbar-collapse-1">
+ <span class="sr-only">Toggle navigation</span>
+ <span class="icon-bar"></span>
+ <span class="icon-bar"></span>
+ <span class="icon-bar"></span>
+ </button>
+ </div>
+
+ <!-- Collect the nav links, forms, and other content for toggling -->
+ <div class="collapse navbar-collapse" id="navbar-collapse-1">
+ <ul class="nav navbar-nav">
+ <li><a href="/downloads.html">Download</a></li>
+ <li class="dropdown">
+ <a href="#" class="dropdown-toggle" data-toggle="dropdown">
+ Related Projects <b class="caret"></b>
+ </a>
+ <ul class="dropdown-menu">
+
+ <li><a href="http://shark.cs.berkeley.edu">Shark (SQL)</a></li>
+ <li><a href="/streaming/">Spark Streaming</a></li>
+ <li><a href="/mllib/">MLlib (machine learning)</a></li>
+ <li><a href="http://amplab.github.io/graphx/">GraphX (graph)</a></li>
+ </ul>
+ </li>
+ <li class="dropdown">
+ <a href="#" class="dropdown-toggle" data-toggle="dropdown">
+ Documentation <b class="caret"></b>
+ </a>
+ <ul class="dropdown-menu">
+ <li><a href="/documentation.html">Overview</a></li>
+ <li><a href="/docs/latest/">Latest Release</a></li>
+ <li><a href="/examples.html">Examples</a></li>
+ </ul>
+ </li>
+ <li class="dropdown">
+ <a href="#" class="dropdown-toggle" data-toggle="dropdown">
+ Community <b class="caret"></b>
+ </a>
+ <ul class="dropdown-menu">
+ <li><a href="/community.html">Mailing Lists</a></li>
+ <li><a href="/community.html#events">Events and Meetups</a></li>
+ <li><a href="/community.html#history">Project History</a></li>
+ <li><a href="https://cwiki.apache.org/confluence/display/SPARK/Powered+By+Spark">Powered By</a></li>
+ </ul>
+ </li>
+ <li><a href="/faq.html">FAQ</a></li>
+ </ul>
+ </div>
+ <!-- /.navbar-collapse -->
+</nav>
+
+
+<div class="row">
+ <div class="col-md-3 col-md-push-9">
+ <div class="news" style="margin-bottom: 20px;">
+ <h5>Latest News</h5>
+ <ul class="list-unstyled">
+
+ <li><a href="/news/spark-0-9-0-released.html">Spark 0.9.0 released</a>
+ <span class="small">(Feb 02, 2014)</span></li>
+
+ <li><a href="/news/spark-0-8-1-released.html">Spark 0.8.1 released</a>
+ <span class="small">(Dec 19, 2013)</span></li>
+
+ <li><a href="/news/spark-summit-2013-is-a-wrap.html">Spark Summit 2013 is a Wrap</a>
+ <span class="small">(Dec 15, 2013)</span></li>
+
+ <li><a href="/news/announcing-the-first-spark-summit.html">Announcing the first Spark Summit: December 2, 2013</a>
+ <span class="small">(Oct 08, 2013)</span></li>
+
+ </ul>
+ <p class="small" style="text-align: right;"><a href="/news/index.html">Archive</a></p>
+ </div>
+ <div class="hidden-xs hidden-sm">
+ <a href="/downloads.html" class="btn btn-success btn-lg btn-block" style="margin-bottom: 30px;">
+ Download Spark
+ </a>
+ <p style="font-size: 16px; font-weight: 500; color: #555;">
+ Related Projects:
+ </p>
+ <ul class="list-narrow">
+
+ <li><a href="http://shark.cs.berkeley.edu">Shark (SQL)</a></li>
+ <li><a href="/streaming/">Spark Streaming</a></li>
+ <li><a href="/mllib/">MLlib (machine learning)</a></li>
+ <li><a href="http://amplab.github.io/graphx/">GraphX (graph)</a></li>
+ </ul>
+ </div>
+ </div>
+
+ <div class="col-md-9 col-md-pull-3">
+ <h2>Spark Release 0.9.0</h2>
+
+
+<p>Spark 0.9.0 is a major release that adds significant new features. It updates Spark to Scala 2.10, simplifies high availability, and updates numerous components of the project. This release includes a first version of <a href="http://amplab.github.io/graphx/">GraphX</a>, a powerful new framework for graph processing that comes with a library of standard algorithms. In addition, <a href="/streaming/">Spark Streaming</a> is now out of alpha, and includes significant optimizations and simplified high availability deployment.</p>
+
+<h3 id="scala-210-support">Scala 2.10 Support</h3>
+
+<p>Spark now runs on Scala 2.10, letting users benefit from the language and library improvements in this version.</p>
+
+<h3 id="configuration-system">Configuration System</h3>
+
+<p>The new <a href="/docs/latest/api/core/index.html#org.apache.spark.SparkConf">SparkConf</a> class is now the preferred way to configure advanced settings on your SparkContext, though the previous Java system property method still works. SparkConf is especially useful in tests to make sure properties don’t stay set across tests.</p>
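As a sketch of the new configuration style (class and method names per the 0.9 <code>SparkConf</code> scaladoc; the master URL, application name, and property value below are placeholders):

```scala
// Minimal sketch of configuring Spark through SparkConf instead of
// Java system properties (Spark 0.9 API). Values are illustrative.
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setMaster("local[2]")                 // placeholder master URL
  .setAppName("ConfExample")             // placeholder app name
  .set("spark.executor.memory", "1g")    // example property

val sc = new SparkContext(conf)
```

Because each test can build its own <code>SparkConf</code>, settings no longer leak between tests the way global system properties did.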
+
+<h3 id="spark-streaming-improvements">Spark Streaming Improvements</h3>
+
+<p>Spark Streaming is now out of alpha, and comes with simplified high availability and several optimizations.</p>
+
+<ul>
+ <li>When running on a Spark standalone cluster with the <a href="/docs/0.9.0/spark-standalone.html#high-availability">standalone cluster high availability mode</a>, you can submit a Spark Streaming driver application to the cluster and have it automatically recovered if either the driver or the cluster master crashes.</li>
+ <li>Windowed operators have been sped up by 30-50%.</li>
+ <li>Spark Streaming’s input source plugins (e.g. for Twitter, Kafka and Flume) are now separate Maven modules, making it easier to pull in only the dependencies you need.</li>
+ <li>A new <a href="/docs/0.9.0/api/streaming/index.html#org.apache.spark.streaming.scheduler.StreamingListener">StreamingListener</a> interface has been added for monitoring statistics about the streaming computation.</li>
+ <li>A few aspects of the API have been improved:
+ <ul>
+ <li><code>DStream</code> and <code>PairDStream</code> classes have been moved from <code>org.apache.spark.streaming</code> to <code>org.apache.spark.streaming.dstream</code> to keep it consistent with <code>org.apache.spark.rdd.RDD</code>.</li>
+          <li><code>DStream.foreach</code> has been renamed to <code>foreachRDD</code> to make it explicit that it works for every RDD, not every element.</li>
+          <li><code>StreamingContext.awaitTermination()</code> allows you to wait for context shutdown and catch any exception that occurs in the streaming computation.</li>
+          <li><code>StreamingContext.stop()</code> now allows stopping a StreamingContext without stopping the underlying SparkContext.</li>
+ </ul>
+ </li>
+</ul>
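The renamed and added calls above can be sketched as follows (Spark 0.9 streaming API; the socket source, host, port, and batch interval are illustrative):

```scala
// Sketch of the updated StreamingContext lifecycle (Spark 0.9).
import org.apache.spark.streaming.{Seconds, StreamingContext}

val ssc = new StreamingContext("local[2]", "StreamingExample", Seconds(1))
val lines = ssc.socketTextStream("localhost", 9999)  // placeholder source

// foreachRDD (formerly DStream.foreach) runs once per generated RDD
lines.foreachRDD(rdd => println("batch size: " + rdd.count()))

ssc.start()
ssc.awaitTermination()  // blocks and rethrows errors from the computation
// ssc.stop(stopSparkContext = false)  // stop streaming, keep the SparkContext
```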
+
+<h3 id="graphx-alpha">GraphX Alpha</h3>
+
+<p><a href="http://amplab.github.io/graphx/">GraphX</a> is a new framework for graph processing that uses recent advances in graph-parallel computation. It lets you build a graph within a Spark program using the standard Spark operators, then process it with new graph operators that are optimized for distributed computation. It includes <a href="/docs/0.9.0/api/graphx/index.html#org.apache.spark.graphx.Graph">basic transformations</a>, a <a href="/docs/0.9.0/api/graphx/index.html#org.apache.spark.graphx.Pregel$">Pregel API</a> for iterative computation, and a standard library of <a href="/docs/0.9.0/api/graphx/index.html#org.apache.spark.graphx.util.GraphGenerators$">graph loaders</a> and <a href="/docs/0.9.0/api/graphx/index.html#org.apache.spark.graphx.lib.package">analytics algorithms</a>. By offering these features <em>within</em> the Spark engine, GraphX can significantly speed up processing pipelines compared to workflows that use different engines.</p>
+
+<p>GraphX features in this release include:</p>
+
+<ul>
+ <li>Building graphs from arbitrary Spark RDDs</li>
+ <li>Basic operations to transform graphs or extract subgraphs</li>
+ <li>An optimized Pregel API that takes advantage of graph partitioning and indexing</li>
+ <li>Standard algorithms including <a href="/docs/0.9.0/api/graphx/index.html#org.apache.spark.graphx.lib.PageRank$">PageRank</a>, <a href="/docs/0.9.0/api/graphx/index.html#org.apache.spark.graphx.lib.ConnectedComponents$">connected components</a>, <a href="/docs/0.9.0/api/graphx/index.html#org.apache.spark.graphx.lib.StronglyConnectedComponents$">strongly connected components</a>, <a href="/docs/0.9.0/api/graphx/index.html#org.apache.spark.graphx.lib.SVDPlusPlus$">SVD++</a>, and <a href="/docs/0.9.0/api/graphx/index.html#org.apache.spark.graphx.lib.TriangleCount$">triangle counting</a></li>
+ <li>Interactive use from the Spark shell</li>
+</ul>
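A minimal sketch of building a graph from RDDs and running one of the standard algorithms (GraphX alpha API in 0.9; the vertex and edge data here are invented for illustration):

```scala
// Build a tiny graph from Spark RDDs and run PageRank (GraphX alpha).
import org.apache.spark.SparkContext
import org.apache.spark.graphx._
import org.apache.spark.rdd.RDD

val sc = new SparkContext("local[2]", "GraphXExample")

val vertices: RDD[(Long, String)] =
  sc.parallelize(Seq((1L, "a"), (2L, "b"), (3L, "c")))
val edges: RDD[Edge[Int]] =
  sc.parallelize(Seq(Edge(1L, 2L, 1), Edge(2L, 3L, 1)))

val graph = Graph(vertices, edges)
val ranks = graph.pageRank(0.001).vertices  // run until convergence
ranks.collect().foreach(println)
```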
+
+<p>GraphX is still marked as alpha in this first release, but we recommend that new users use it instead of the more limited Bagel API.</p>
+
+<h3 id="mllib-improvements">MLlib Improvements</h3>
+
+<ul>
+ <li>Spark’s machine learning library (MLlib) is now <a href="/docs/0.9.0/mllib-guide.html#using-mllib-in-python">available in Python</a>, where it operates on NumPy data (currently requires Python 2.7 and NumPy 1.7)</li>
+ <li>A new algorithm has been added for <a href="/docs/0.9.0/api/mllib/index.html#org.apache.spark.mllib.classification.NaiveBayes">Naive Bayes classification</a></li>
+ <li>Alternating Least Squares models can now be used to predict ratings for multiple items in parallel</li>
+ <li>MLlib’s documentation was expanded to include more examples in Scala, Java and Python</li>
+</ul>
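The new Naive Bayes classifier can be sketched as below (MLlib in 0.9; the two-point training set is invented for illustration):

```scala
// Sketch of training the new Naive Bayes classifier (MLlib, Spark 0.9).
import org.apache.spark.SparkContext
import org.apache.spark.mllib.classification.NaiveBayes
import org.apache.spark.mllib.regression.LabeledPoint

val sc = new SparkContext("local[2]", "NaiveBayesExample")

// In 0.9, LabeledPoint takes a label and a raw Array[Double] of features.
val training = sc.parallelize(Seq(
  LabeledPoint(0.0, Array(1.0, 0.0)),
  LabeledPoint(1.0, Array(0.0, 1.0))))

val model = NaiveBayes.train(training, 1.0)  // 1.0 = smoothing parameter
println(model.predict(Array(0.0, 1.0)))
```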
+
+<h3 id="python-changes">Python Changes</h3>
+
+<ul>
+ <li>Python users can now use MLlib (requires Python 2.7 and NumPy 1.7)</li>
+  <li>PySpark now shows the call sites of running jobs in the Spark application UI (http://&lt;driver&gt;:4040), making it easy to see which part of your code is running.</li>
+ <li>IPython integration has been updated to work with newer versions</li>
+</ul>
+
+<h3 id="packaging">Packaging</h3>
+
+<ul>
+ <li>Spark’s scripts have been organized into “bin” and “sbin” directories to make it easier to separate admin scripts from user ones and install Spark on standard Linux paths.</li>
+ <li>Log configuration has been improved so that Spark finds a default log4j.properties file if you don’t specify one.</li>
+</ul>
+
+<h3 id="core-engine">Core Engine</h3>
+
+<ul>
+ <li>Spark’s standalone mode now supports submitting a driver program to run on the cluster instead of on the external machine submitting it. You can access this functionality through the <a href="/docs/0.9.0/spark-standalone.html#launching-applications-inside-the-cluster">org.apache.spark.deploy.Client</a> class.</li>
+ <li>Large reduce operations now automatically spill data to disk if it does not fit in memory.</li>
+  <li>Users of standalone mode can now set a default limit on the number of cores an application will use when the application itself doesn&#8217;t configure one. Previously, such applications took all available cores on the cluster.</li>
+ <li><code>spark-shell</code> now supports the <code>-i</code> option to run a script on startup.</li>
+  <li>New <code>histogram</code> and <code>countApproxDistinct</code> operators have been added for working with numerical data.</li>
+ <li>YARN mode now supports distributing extra files with the application, and several bugs have been fixed.</li>
+</ul>
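The new numerical operators can be sketched as follows (Spark 0.9 core API; the sample data is illustrative):

```scala
// Sketch of the new numerical RDD operators (Spark 0.9).
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._  // implicits for RDD[Double] functions

val sc = new SparkContext("local[2]", "NumericOpsExample")
val data = sc.parallelize(Seq(1.0, 2.0, 2.5, 3.0, 9.0))

// histogram(n) returns (bucket boundaries, counts per bucket)
val (buckets, counts) = data.histogram(3)

// countApproxDistinct gives an approximate distinct count; the argument
// controls the relative accuracy of the estimate
val approx = data.countApproxDistinct(0.05)
```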
+
+<h3 id="compatibility">Compatibility</h3>
+
+<p>This release is compatible with the previous APIs in stable components, but several language versions and script locations have changed.</p>
+
+<ul>
+ <li>Scala programs now need to use Scala 2.10 instead of 2.9.</li>
+ <li>Scripts such as <code>spark-shell</code> and <code>pyspark</code> have been moved into the <code>bin</code> folder, while administrative scripts to start and stop standalone clusters have been moved into <code>sbin</code>.</li>
+  <li>Spark Streaming&#8217;s API has been changed to move external input sources into separate modules; <code>DStream</code> and <code>PairDStream</code> have been moved to the package <code>org.apache.spark.streaming.dstream</code>, and <code>DStream.foreach</code> has been renamed to <code>foreachRDD</code>. We expect the current API to be stable now that Spark Streaming is out of alpha.</li>
+  <li>While the old method of configuring Spark through Java system properties still works, we recommend that users update to the new <a href="/docs/latest/api/core/index.html#org.apache.spark.SparkConf">SparkConf</a> class, which is easier to inspect and use.</li>
+</ul>
+
+<p>We expect all of the current APIs and script locations in Spark 0.9 to remain stable when we release Spark 1.0. We wanted to make these updates early to give users a chance to switch to the new API.</p>
+
+<h3 id="contributors">Contributors</h3>
+<p>The following developers contributed to this release:</p>
+
+<p>Andrew Ash &#8211; documentation improvements</p>
+
+<p>Pierre Borckmans &#8211; documentation fix</p>
+
+<p>Russell Cardullo &#8211; graphite sink for metrics</p>
+
+<p>Evan Chan &#8211; local:// URI feature</p>
+
+<p>Vadim Chekan &#8211; bug fix</p>
+
+<p>Lian Cheng &#8211; refactoring and code clean-up in several locations, bug fixes</p>
+
+<p>Ewen Cheslack-Postava &#8211; Spark EC2 and PySpark improvements</p>
+
+<p>Mosharaf Chowdhury &#8211; optimized broadcast</p>
+
+<p>CodingCat &#8211; documentation improvements</p>
+
+<p>Dan Crankshaw &#8211; GraphX contributions</p>
+
+<p>Haider Haidi &#8211; documentation fix</p>
+
+<p>Frank Dai &#8211; Naive Bayes classifier in MLlib, documentation improvements</p>
+
+<p>Tathagata Das &#8211; new operators, fixes, and improvements to Spark Streaming (lead)</p>
+
+<p>Ankur Dave &#8211; GraphX contributions</p>
+
+<p>Henry Davidge &#8211; warning for large tasks</p>
+
+<p>Aaron Davidson &#8211; shuffle file consolidation, H/A mode for standalone scheduler, various improvements and fixes</p>
+
+<p>Kyle Ellrott &#8211; GraphX contributions</p>
+
+<p>Hossein Falaki &#8211; new statistical operators, Scala and Python examples in MLlib</p>
+
+<p>Harvey Feng &#8211; hadoop file optimizations and YARN integration</p>
+
+<p>Ali Ghodsi &#8211; support for SIMR</p>
+
+<p>Joseph E. Gonzalez &#8211; GraphX contributions</p>
+
+<p>Thomas Graves &#8211; fixes and improvements for YARN support (lead)</p>
+
+<p>Rong Gu &#8211; documentation fix</p>
+
+<p>Stephen Haberman &#8211; bug fixes</p>
+
+<p>Walker Hamilton &#8211; bug fix</p>
+
+<p>Mark Hamstra &#8211; scheduler improvements and fixes, build fixes</p>
+
+<p>Damien Hardy &#8211; Debian build fix</p>
+
+<p>Nathan Howell &#8211; sbt upgrade</p>
+
+<p>Grace Huang &#8211; improvements to metrics code</p>
+
+<p>Shane Huang &#8211; separation of admin and user scripts</p>
+
+<p>Prabeesh K &#8211; MQTT example</p>
+
+<p>Holden Karau &#8211; sbt build improvements and Java API extensions</p>
+
+<p>KarthikTunga &#8211; bug fix</p>
+
+<p>Grega Kespret &#8211; bug fix</p>
+
+<p>Marek Kolodziej &#8211; optimized random number generator</p>
+
+<p>Jey Kottalam &#8211; EC2 script improvements</p>
+
+<p>Du Li &#8211; bug fixes</p>
+
+<p>Haoyuan Li &#8211; tachyon support in EC2</p>
+
+<p>LiGuoqiang &#8211; fixes to build and YARN integration</p>
+
+<p>Raymond Liu &#8211; build improvement and various fixes for YARN support</p>
+
+<p>George Loentiev &#8211; Maven build fixes</p>
+
+<p>Akihiro Matsukawa &#8211; GraphX contributions</p>
+
+<p>David McCauley &#8211; improvements to json endpoint</p>
+
+<p>Mike &#8211; bug fixes</p>
+
+<p>Fabrizio (Misto) Milo &#8211; bug fix</p>
+
+<p>Mridul Muralidharan &#8211; speculation improvements, several bug fixes</p>
+
+<p>Tor Myklebust &#8211; Python mllib bindings, instrumentation for task serialization</p>
+
+<p>Sundeep Narravula &#8211; bug fix</p>
+
+<p>Binh Nguyen &#8211; Java API improvements and version upgrades</p>
+
+<p>Adam Novak &#8211; bug fix</p>
+
+<p>Andrew Or &#8211; external sorting</p>
+
+<p>Kay Ousterhout &#8211; several bug fixes and improvements to Spark scheduler</p>
+
+<p>Sean Owen &#8211; style fixes</p>
+
+<p>Nick Pentreath &#8211; ALS implicit feedback algorithm</p>
+
+<p>Pillis &#8211; <code>Vector.random()</code> method</p>
+
+<p>Imran Rashid &#8211; bug fix</p>
+
+<p>Ahir Reddy &#8211; support for SIMR</p>
+
+<p>Luca Rosellini &#8211; script loading for Scala shell</p>
+
+<p>Josh Rosen &#8211; fixes, clean-up, and extensions to the Scala and Java APIs</p>
+
+<p>Henry Saputra &#8211; style improvements and clean-up</p>
+
+<p>Andre Schumacher &#8211; Python improvements and bug fixes</p>
+
+<p>Jerry Shao &#8211; multi-user support, various fixes and improvements</p>
+
+<p>Prashant Sharma &#8211; Scala 2.10 support, configuration system, several smaller fixes</p>
+
+<p>Shiyun &#8211; style fix</p>
+
+<p>Wangda Tan &#8211; UI improvement and bug fixes</p>
+
+<p>Matthew Taylor &#8211; bug fix</p>
+
+<p>Jyun-Fan Tsai &#8211; documentation fix</p>
+
+<p>Takuya Ueshin &#8211; bug fix</p>
+
+<p>Shivaram Venkataraman &#8211; sbt build optimization, EC2 improvements, Java and Python API</p>
+
+<p>Jianping J Wang &#8211; GraphX contributions</p>
+
+<p>Martin Weindel &#8211; build fix</p>
+
+<p>Patrick Wendell &#8211; standalone driver submission, various fixes, release manager</p>
+
+<p>Neal Wiggins &#8211; bug fix</p>
+
+<p>Reynold Xin &#8211; GraphX contributions, task killing, various fixes, improvements and optimizations</p>
+
+<p>Haitao Yao &#8211; bug fix</p>
+
+<p>Xusen Yin &#8211; bug fix</p>
+
+<p>Fengdong Yu &#8211; documentation fixes</p>
+
+<p>Matei Zaharia &#8211; new configuration system, Python MLlib bindings, scheduler improvements, various fixes and optimizations</p>
+
+<p>Wu Zeming &#8211; bug fix</p>
+
+<p>Andrew Xia &#8211; bug fixes and code cleanup</p>
+
+<p>Dong Yan &#8211; bug fix</p>
+
+<p><em>Thanks to everyone who contributed!</em></p>
+
+
+<p>
+<br/>
+<a href="/news/">Spark News Archive</a>
+</p>
+
+ </div>
+</div>
+
+
+
+<footer class="small">
+ <hr>
+ Apache Spark is an effort undergoing incubation at The Apache Software Foundation.
+ <a href="http://incubator.apache.org/" style="border: none;">
+ <img style="vertical-align: middle; float: right; margin-bottom: 15px;"
+ src="/images/incubator-logo.png" alt="Apache Incubator" title="Apache Incubator" />
+ </a>
+</footer>
+
+</div>
+
+<script src="https://code.jquery.com/jquery.js"></script>
+<script src="//netdna.bootstrapcdn.com/bootstrap/3.0.3/js/bootstrap.min.js"></script>
+<script src="/js/lang-tabs.js"></script>
+
+</body>
+</html>