author     Sean Owen <sowen@cloudera.com>    2016-11-11 19:56:10 +0000
committer  Sean Owen <sowen@cloudera.com>    2016-11-15 17:56:22 +0100
commit     d82e3722043aa2c2c2d5af6d1e68f16a83101d73
tree       3520325adf57b6265ecf3f676544a19acb1a1813
parent     4e10a1ac10fa773f891422c7c1a3727e47feca8e
Use site.baseurl, not site.url, to work with Jekyll 3.3. Require Jekyll 3.3. Again commit HTML consistent with Jekyll 3.3 output. Fix date problem with news posts that set date: by removing date:.
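Background: starting with Jekyll 3.3, `jekyll serve` overrides `site.url` with the local development server address (e.g. http://localhost:4000), so links assembled from `site.url` break in local previews; `site.baseurl` holds only the site's base path and is stable across environments, which is why every link below switches to it. A minimal sketch of the pattern this commit applies, using illustrative config values rather than spark-website's actual configuration:

    # _config.yml (illustrative values, not spark-website's actual config)
    url: "https://spark.apache.org"
    baseurl: ""

    <!-- before: depends on site.url, which Jekyll 3.3 rewrites during local serve,
         and on the configured url ending in a slash -->
    <img src="{{site.url}}images/0.8.0-ui-screenshot.png">

    <!-- after: site.baseurl plus an explicit leading slash -->
    <img src="{{site.baseurl}}/images/0.8.0-ui-screenshot.png">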
Diffstat (limited to 'releases')
 releases/_posts/2013-09-25-spark-release-0-8-0.md |  2 +-
 releases/_posts/2013-12-19-spark-release-0-8-1.md |  4 ++--
 releases/_posts/2014-02-02-spark-release-0-9-0.md | 18 +++++++++---------
 releases/_posts/2014-05-30-spark-release-1-0-0.md | 14 +++++++-------
 releases/_posts/2014-09-11-spark-release-1-1-0.md |  2 +-
 releases/_posts/2014-11-26-spark-release-1-1-1.md |  2 +-
 releases/_posts/2014-12-18-spark-release-1-2-0.md |  2 +-
 releases/_posts/2015-02-09-spark-release-1-2-1.md |  2 +-
 releases/_posts/2015-03-13-spark-release-1-3-0.md |  2 +-
 releases/_posts/2015-04-17-spark-release-1-2-2.md |  2 +-
 releases/_posts/2015-04-17-spark-release-1-3-1.md |  2 +-
 releases/_posts/2015-06-11-spark-release-1-4-0.md |  2 +-
 releases/_posts/2015-07-15-spark-release-1-4-1.md |  2 +-
 releases/_posts/2015-09-09-spark-release-1-5-0.md |  2 +-
 14 files changed, 29 insertions(+), 29 deletions(-)
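The commit message also mentions removing explicit `date:` keys from news posts; those files are outside the releases/ filter of this view, so they do not appear in the diff below. The shape of that fix, sketched with hypothetical front matter (Jekyll already derives a post's date from its filename, and the commit message reports that an explicit `date:` caused a date problem under Jekyll 3.3):

    ---
    layout: post
    title: A hypothetical news post
    date: 2016-11-11   # removed: the filename already supplies the date
    ---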
diff --git a/releases/_posts/2013-09-25-spark-release-0-8-0.md b/releases/_posts/2013-09-25-spark-release-0-8-0.md
index 6ca6ecb37..d74806a0e 100644
--- a/releases/_posts/2013-09-25-spark-release-0-8-0.md
+++ b/releases/_posts/2013-09-25-spark-release-0-8-0.md
@@ -19,7 +19,7 @@ You can download Spark 0.8.0 as either a <a href="http://spark-project.org/downl
Spark now displays a variety of monitoring data in a web UI (by default at port 4040 on the driver node). A new job dashboard contains information about running, succeeded, and failed jobs, including percentile statistics covering task runtime, shuffled data, and garbage collection. The existing storage dashboard has been extended, and additional pages have been added to display total storage and task information per-executor. Finally, a new metrics library exposes internal Spark metrics through various API’s including JMX and Ganglia.
<p style="text-align: center;">
-<img src="{{site.url}}images/0.8.0-ui-screenshot.png" style="width:90%;">
+<img src="{{site.baseurl}}/images/0.8.0-ui-screenshot.png" style="width:90%;">
</p>
### Machine Learning Library
diff --git a/releases/_posts/2013-12-19-spark-release-0-8-1.md b/releases/_posts/2013-12-19-spark-release-0-8-1.md
index 89248d9b6..4dbe34c88 100644
--- a/releases/_posts/2013-12-19-spark-release-0-8-1.md
+++ b/releases/_posts/2013-12-19-spark-release-0-8-1.md
@@ -15,10 +15,10 @@ meta:
Apache Spark 0.8.1 is a maintenance and performance release for the Scala 2.9 version of Spark. It also adds several new features, such as standalone mode high availability, that will appear in Spark 0.9 but developers wanted to have in Scala 2.9. Contributions to 0.8.1 came from 41 developers.
### YARN 2.2 Support
-Support has been added for running Spark on YARN 2.2 and newer. Due to a change in the YARN API between previous versions and 2.2+, this was not supported in Spark 0.8.0. See the <a href="{{site.url}}docs/0.8.1/running-on-yarn.html">YARN documentation</a> for specific instructions on how to build Spark for YARN 2.2+. We've also included a pre-compiled binary for YARN 2.2.
+Support has been added for running Spark on YARN 2.2 and newer. Due to a change in the YARN API between previous versions and 2.2+, this was not supported in Spark 0.8.0. See the <a href="{{site.baseurl}}/docs/0.8.1/running-on-yarn.html">YARN documentation</a> for specific instructions on how to build Spark for YARN 2.2+. We've also included a pre-compiled binary for YARN 2.2.
### High Availability Mode for Standalone Cluster Manager
-The standalone cluster manager now has a high availability (H/A) mode which can tolerate master failures. This is particularly useful for long-running applications such as streaming jobs and the shark server, where the scheduler master previously represented a single point of failure. Instructions for deploying H/A mode are included <a href="{{site.url}}docs/0.8.1/spark-standalone.html#high-availability">in the documentation</a>. The current implementation uses Zookeeper for coordination.
+The standalone cluster manager now has a high availability (H/A) mode which can tolerate master failures. This is particularly useful for long-running applications such as streaming jobs and the shark server, where the scheduler master previously represented a single point of failure. Instructions for deploying H/A mode are included <a href="{{site.baseurl}}/docs/0.8.1/spark-standalone.html#high-availability">in the documentation</a>. The current implementation uses Zookeeper for coordination.
### Performance Optimizations
This release adds several performance optimizations:
diff --git a/releases/_posts/2014-02-02-spark-release-0-9-0.md b/releases/_posts/2014-02-02-spark-release-0-9-0.md
index edcce3a27..7f9e10767 100644
--- a/releases/_posts/2014-02-02-spark-release-0-9-0.md
+++ b/releases/_posts/2014-02-02-spark-release-0-9-0.md
@@ -11,7 +11,7 @@ meta:
_wpas_done_all: '1'
---
-Spark 0.9.0 is a major release that adds significant new features. It updates Spark to Scala 2.10, simplifies high availability, and updates numerous components of the project. This release includes a first version of [GraphX]({{site.url}}graphx/), a powerful new framework for graph processing that comes with a library of standard algorithms. In addition, [Spark Streaming]({{site.url}}streaming/) is now out of alpha, and includes significant optimizations and simplified high availability deployment.
+Spark 0.9.0 is a major release that adds significant new features. It updates Spark to Scala 2.10, simplifies high availability, and updates numerous components of the project. This release includes a first version of [GraphX]({{site.baseurl}}/graphx/), a powerful new framework for graph processing that comes with a library of standard algorithms. In addition, [Spark Streaming]({{site.baseurl}}/streaming/) is now out of alpha, and includes significant optimizations and simplified high availability deployment.
You can download Spark 0.9.0 as either a
<a href="http://d3kbcqa49mib13.cloudfront.net/spark-0.9.0-incubating.tgz" onClick="trackOutboundLink(this, 'Release Download Links', 'cloudfront_spark-0.9.0-incubating.tgz'); return false;">source package</a>
@@ -27,16 +27,16 @@ Spark now runs on Scala 2.10, letting users benefit from the language and librar
### Configuration System
-The new [SparkConf]({{site.url}}docs/latest/api/core/index.html#org.apache.spark.SparkConf) class is now the preferred way to configure advanced settings on your SparkContext, though the previous Java system property method still works. SparkConf is especially useful in tests to make sure properties don’t stay set across tests.
+The new [SparkConf]({{site.baseurl}}/docs/latest/api/core/index.html#org.apache.spark.SparkConf) class is now the preferred way to configure advanced settings on your SparkContext, though the previous Java system property method still works. SparkConf is especially useful in tests to make sure properties don’t stay set across tests.
### Spark Streaming Improvements
Spark Streaming is now out of alpha, and comes with simplified high availability and several optimizations.
-* When running on a Spark standalone cluster with the [standalone cluster high availability mode]({{site.url}}docs/0.9.0/spark-standalone.html#high-availability), you can submit a Spark Streaming driver application to the cluster and have it automatically recovered if either the driver or the cluster master crashes.
+* When running on a Spark standalone cluster with the [standalone cluster high availability mode]({{site.baseurl}}/docs/0.9.0/spark-standalone.html#high-availability), you can submit a Spark Streaming driver application to the cluster and have it automatically recovered if either the driver or the cluster master crashes.
* Windowed operators have been sped up by 30-50%.
* Spark Streaming’s input source plugins (e.g. for Twitter, Kafka and Flume) are now separate Maven modules, making it easier to pull in only the dependencies you need.
-* A new [StreamingListener]({{site.url}}docs/0.9.0/api/streaming/index.html#org.apache.spark.streaming.scheduler.StreamingListener) interface has been added for monitoring statistics about the streaming computation.
+* A new [StreamingListener]({{site.baseurl}}/docs/0.9.0/api/streaming/index.html#org.apache.spark.streaming.scheduler.StreamingListener) interface has been added for monitoring statistics about the streaming computation.
* A few aspects of the API have been improved:
* `DStream` and `PairDStream` classes have been moved from `org.apache.spark.streaming` to `org.apache.spark.streaming.dstream` to keep it consistent with `org.apache.spark.rdd.RDD`.
* `DStream.foreach` has been renamed to `foreachRDD` to make it explicit that it works for every RDD, not every element
@@ -45,22 +45,22 @@ Spark Streaming is now out of alpha, and comes with simplified high availability
### GraphX Alpha
-[GraphX]({{site.url}}graphx/) is a new framework for graph processing that uses recent advances in graph-parallel computation. It lets you build a graph within a Spark program using the standard Spark operators, then process it with new graph operators that are optimized for distributed computation. It includes [basic transformations]({{site.url}}docs/0.9.0/api/graphx/index.html#org.apache.spark.graphx.Graph), a [Pregel API]({{site.url}}docs/0.9.0/api/graphx/index.html#org.apache.spark.graphx.Pregel$) for iterative computation, and a standard library of [graph loaders]({{site.url}}docs/0.9.0/api/graphx/index.html#org.apache.spark.graphx.util.GraphGenerators$) and [analytics algorithms]({{site.url}}docs/0.9.0/api/graphx/index.html#org.apache.spark.graphx.lib.package). By offering these features *within* the Spark engine, GraphX can significantly speed up processing pipelines compared to workflows that use different engines.
+[GraphX]({{site.baseurl}}/graphx/) is a new framework for graph processing that uses recent advances in graph-parallel computation. It lets you build a graph within a Spark program using the standard Spark operators, then process it with new graph operators that are optimized for distributed computation. It includes [basic transformations]({{site.baseurl}}/docs/0.9.0/api/graphx/index.html#org.apache.spark.graphx.Graph), a [Pregel API]({{site.baseurl}}/docs/0.9.0/api/graphx/index.html#org.apache.spark.graphx.Pregel$) for iterative computation, and a standard library of [graph loaders]({{site.baseurl}}/docs/0.9.0/api/graphx/index.html#org.apache.spark.graphx.util.GraphGenerators$) and [analytics algorithms]({{site.baseurl}}/docs/0.9.0/api/graphx/index.html#org.apache.spark.graphx.lib.package). By offering these features *within* the Spark engine, GraphX can significantly speed up processing pipelines compared to workflows that use different engines.
GraphX features in this release include:
* Building graphs from arbitrary Spark RDDs
* Basic operations to transform graphs or extract subgraphs
* An optimized Pregel API that takes advantage of graph partitioning and indexing
-* Standard algorithms including [PageRank]({{site.url}}docs/0.9.0/api/graphx/index.html#org.apache.spark.graphx.lib.PageRank$), [connected components]({{site.url}}docs/0.9.0/api/graphx/index.html#org.apache.spark.graphx.lib.ConnectedComponents$), [strongly connected components]({{site.url}}docs/0.9.0/api/graphx/index.html#org.apache.spark.graphx.lib.StronglyConnectedComponents$), [SVD++]({{site.url}}docs/0.9.0/api/graphx/index.html#org.apache.spark.graphx.lib.SVDPlusPlus$), and [triangle counting]({{site.url}}docs/0.9.0/api/graphx/index.html#org.apache.spark.graphx.lib.TriangleCount$)
+* Standard algorithms including [PageRank]({{site.baseurl}}/docs/0.9.0/api/graphx/index.html#org.apache.spark.graphx.lib.PageRank$), [connected components]({{site.baseurl}}/docs/0.9.0/api/graphx/index.html#org.apache.spark.graphx.lib.ConnectedComponents$), [strongly connected components]({{site.baseurl}}/docs/0.9.0/api/graphx/index.html#org.apache.spark.graphx.lib.StronglyConnectedComponents$), [SVD++]({{site.baseurl}}/docs/0.9.0/api/graphx/index.html#org.apache.spark.graphx.lib.SVDPlusPlus$), and [triangle counting]({{site.baseurl}}/docs/0.9.0/api/graphx/index.html#org.apache.spark.graphx.lib.TriangleCount$)
* Interactive use from the Spark shell
GraphX is still marked as alpha in this first release, but we recommend for new users to use it instead of the more limited Bagel API.
### MLlib Improvements
-* Spark’s machine learning library (MLlib) is now [available in Python]({{site.url}}docs/0.9.0/mllib-guide.html#using-mllib-in-python), where it operates on NumPy data (currently requires Python 2.7 and NumPy 1.7)
-* A new algorithm has been added for [Naive Bayes classification]({{site.url}}docs/0.9.0/api/mllib/index.html#org.apache.spark.mllib.classification.NaiveBayes)
+* Spark’s machine learning library (MLlib) is now [available in Python]({{site.baseurl}}/docs/0.9.0/mllib-guide.html#using-mllib-in-python), where it operates on NumPy data (currently requires Python 2.7 and NumPy 1.7)
+* A new algorithm has been added for [Naive Bayes classification]({{site.baseurl}}/docs/0.9.0/api/mllib/index.html#org.apache.spark.mllib.classification.NaiveBayes)
* Alternating Least Squares models can now be used to predict ratings for multiple items in parallel
* MLlib’s documentation was expanded to include more examples in Scala, Java and Python
@@ -77,7 +77,7 @@ GraphX is still marked as alpha in this first release, but we recommend for new
### Core Engine
-* Spark’s standalone mode now supports submitting a driver program to run on the cluster instead of on the external machine submitting it. You can access this functionality through the [org.apache.spark.deploy.Client]({{site.url}}docs/0.9.0/spark-standalone.html#launching-applications-inside-the-cluster) class.
+* Spark’s standalone mode now supports submitting a driver program to run on the cluster instead of on the external machine submitting it. You can access this functionality through the [org.apache.spark.deploy.Client]({{site.baseurl}}/docs/0.9.0/spark-standalone.html#launching-applications-inside-the-cluster) class.
* Large reduce operations now automatically spill data to disk if it does not fit in memory.
* Users of standalone mode can now limit how many cores an application will use by default if the application writer didn’t configure its size. Previously, such applications took all available cores on the cluster.
* `spark-shell` now supports the `-i` option to run a script on startup.
diff --git a/releases/_posts/2014-05-30-spark-release-1-0-0.md b/releases/_posts/2014-05-30-spark-release-1-0-0.md
index acb6b3e61..22d59f684 100644
--- a/releases/_posts/2014-05-30-spark-release-1-0-0.md
+++ b/releases/_posts/2014-05-30-spark-release-1-0-0.md
@@ -11,7 +11,7 @@ meta:
_wpas_done_all: '1'
---
-Spark 1.0.0 is a major release marking the start of the 1.X line. This release brings both a variety of new features and strong API compatibility guarantees throughout the 1.X line. Spark 1.0 adds a new major component, [Spark SQL]({{site.url}}docs/latest/sql-programming-guide.html), for loading and manipulating structured data in Spark. It includes major extensions to all of Spark’s existing standard libraries ([ML]({{site.url}}docs/latest/mllib-guide.html), [Streaming]({{site.url}}docs/latest/streaming-programming-guide.html), and [GraphX]({{site.url}}docs/latest/graphx-programming-guide.html)) while also enhancing language support in Java and Python. Finally, Spark 1.0 brings operational improvements including full support for the Hadoop/YARN security model and a unified submission process for all supported cluster managers.
+Spark 1.0.0 is a major release marking the start of the 1.X line. This release brings both a variety of new features and strong API compatibility guarantees throughout the 1.X line. Spark 1.0 adds a new major component, [Spark SQL]({{site.baseurl}}/docs/latest/sql-programming-guide.html), for loading and manipulating structured data in Spark. It includes major extensions to all of Spark’s existing standard libraries ([ML]({{site.baseurl}}/docs/latest/mllib-guide.html), [Streaming]({{site.baseurl}}/docs/latest/streaming-programming-guide.html), and [GraphX]({{site.baseurl}}/docs/latest/graphx-programming-guide.html)) while also enhancing language support in Java and Python. Finally, Spark 1.0 brings operational improvements including full support for the Hadoop/YARN security model and a unified submission process for all supported cluster managers.
You can download Spark 1.0.0 as either a
<a href="http://d3kbcqa49mib13.cloudfront.net/spark-1.0.0.tgz" onClick="trackOutboundLink(this, 'Release Download Links', 'cloudfront_spark-1.0.0.tgz'); return false;">source package</a>
@@ -28,13 +28,13 @@ Spark 1.0.0 is the first release in the 1.X major line. Spark is guaranteeing st
For users running in secured Hadoop environments, Spark now integrates with the Hadoop/YARN security model. Spark will authenticate job submission, securely transfer HDFS credentials, and authenticate communication between components.
### Operational and Packaging Improvements
-This release significantly simplifies the process of bundling and submitting a Spark application. A new [spark-submit tool]({{site.url}}docs/latest/submitting-applications.html) allows users to submit an application to any Spark cluster, including local clusters, Mesos, or YARN, through a common process. The documentation for bundling Spark applications has been substantially expanded. We’ve also added a history server for Spark’s web UI, allowing users to view Spark application data after individual applications are finished.
+This release significantly simplifies the process of bundling and submitting a Spark application. A new [spark-submit tool]({{site.baseurl}}/docs/latest/submitting-applications.html) allows users to submit an application to any Spark cluster, including local clusters, Mesos, or YARN, through a common process. The documentation for bundling Spark applications has been substantially expanded. We’ve also added a history server for Spark’s web UI, allowing users to view Spark application data after individual applications are finished.
### Spark SQL
-This release introduces [Spark SQL]({{site.url}}docs/latest/sql-programming-guide.html) as a new alpha component. Spark SQL provides support for loading and manipulating structured data in Spark, either from external structured data sources (currently Hive and Parquet) or by adding a schema to an existing RDD. Spark SQL’s API interoperates with the RDD data model, allowing users to interleave Spark code with SQL statements. Under the hood, Spark SQL uses the Catalyst optimizer to choose an efficient execution plan, and can automatically push predicates into storage formats like Parquet. In future releases, Spark SQL will also provide a common API to other storage systems.
+This release introduces [Spark SQL]({{site.baseurl}}/docs/latest/sql-programming-guide.html) as a new alpha component. Spark SQL provides support for loading and manipulating structured data in Spark, either from external structured data sources (currently Hive and Parquet) or by adding a schema to an existing RDD. Spark SQL’s API interoperates with the RDD data model, allowing users to interleave Spark code with SQL statements. Under the hood, Spark SQL uses the Catalyst optimizer to choose an efficient execution plan, and can automatically push predicates into storage formats like Parquet. In future releases, Spark SQL will also provide a common API to other storage systems.
### MLlib Improvements
-In 1.0.0, Spark’s MLlib adds support for sparse feature vectors in Scala, Java, and Python. It takes advantage of sparsity in both storage and computation in linear methods, k-means, and naive Bayes. In addition, this release adds several new algorithms: scalable decision trees for both classification and regression, distributed matrix algorithms including SVD and PCA, model evaluation functions, and L-BFGS as an optimization primitive. The [MLlib programming guide]({{site.url}}docs/latest/mllib-guide.html) and code examples have also been greatly expanded.
+In 1.0.0, Spark’s MLlib adds support for sparse feature vectors in Scala, Java, and Python. It takes advantage of sparsity in both storage and computation in linear methods, k-means, and naive Bayes. In addition, this release adds several new algorithms: scalable decision trees for both classification and regression, distributed matrix algorithms including SVD and PCA, model evaluation functions, and L-BFGS as an optimization primitive. The [MLlib programming guide]({{site.baseurl}}/docs/latest/mllib-guide.html) and code examples have also been greatly expanded.
### GraphX and Streaming Improvements
In addition to usability and maintainability improvements, GraphX in Spark 1.0 brings substantial performance boosts in graph loading, edge reversal, and neighborhood computation. These operations now require less communication and produce simpler RDD graphs. Spark’s Streaming module has added performance optimizations for stateful stream transformations, along with improved Flume support, and automated state cleanup for long running jobs.
@@ -43,7 +43,7 @@ In addition to usability and maintainability improvements, GraphX in Spark 1.0 b
Spark 1.0 adds support for Java 8 [new lambda syntax](http://docs.oracle.com/javase/tutorial/java/javaOO/lambdaexpressions.html) in its Java bindings. Java 8 supports a concise syntax for writing anonymous functions, similar to the closure syntax in Scala and Python. This change requires small changes for users of the current Java API, which are noted in the documentation. Spark’s Python API has been extended to support several new functions. We’ve also included several stability improvements in the Python API, particularly for large datasets. PySpark now supports running on YARN as well.
### Documentation
-Spark's [programming guide]({{site.url}}docs/latest/programming-guide.html) has been significantly expanded to centrally cover all supported languages and discuss more operators and aspects of the development life cycle. The [MLlib guide]({{site.url}}docs/latest/mllib-guide.html) has also been expanded with significantly more detail and examples for each algorithm, while documents on configuration, YARN and Mesos have also been revamped.
+Spark's [programming guide]({{site.baseurl}}/docs/latest/programming-guide.html) has been significantly expanded to centrally cover all supported languages and discuss more operators and aspects of the development life cycle. The [MLlib guide]({{site.baseurl}}/docs/latest/mllib-guide.html) has also been expanded with significantly more detail and examples for each algorithm, while documents on configuration, YARN and Mesos have also been revamped.
### Smaller Changes
- PySpark now works with more Python versions than before -- Python 2.6+ instead of 2.7+, and NumPy 1.4+ instead of 1.7+.
@@ -52,12 +52,12 @@ Spark's [programming guide]({{site.url}}docs/latest/programming-guide.html) has
- Support for off-heap storage in Tachyon has been added via a special build target.
- Datasets persisted with `DISK_ONLY` now write directly to disk, significantly improving memory usage for large datasets.
- Intermediate state created during a Spark job is now garbage collected when the corresponding RDDs become unreferenced, improving performance.
-- Spark now includes a [Javadoc version]({{site.url}}docs/latest/api/java/index.html) of all its API docs and a [unified Scaladoc]({{site.url}}docs/latest/api/scala/index.html) for all modules.
+- Spark now includes a [Javadoc version]({{site.baseurl}}/docs/latest/api/java/index.html) of all its API docs and a [unified Scaladoc]({{site.baseurl}}/docs/latest/api/scala/index.html) for all modules.
- A new SparkContext.wholeTextFiles method lets you operate on small text files as individual records.
### Migrating to Spark 1.0
-While most of the Spark API remains the same as in 0.x versions, a few changes have been made for long-term flexibility, especially in the Java API (to support Java 8 lambdas). The documentation includes [migration information]({{site.url}}docs/latest/programming-guide.html#migrating-from-pre-10-versions-of-spark) to upgrade your applications.
+While most of the Spark API remains the same as in 0.x versions, a few changes have been made for long-term flexibility, especially in the Java API (to support Java 8 lambdas). The documentation includes [migration information]({{site.baseurl}}/docs/latest/programming-guide.html#migrating-from-pre-10-versions-of-spark) to upgrade your applications.
### Contributors
The following developers contributed to this release:
diff --git a/releases/_posts/2014-09-11-spark-release-1-1-0.md b/releases/_posts/2014-09-11-spark-release-1-1-0.md
index f4878a688..b12a72785 100644
--- a/releases/_posts/2014-09-11-spark-release-1-1-0.md
+++ b/releases/_posts/2014-09-11-spark-release-1-1-0.md
@@ -13,7 +13,7 @@ meta:
Spark 1.1.0 is the first minor release on the 1.X line. This release brings operational and performance improvements in Spark core along with significant extensions to Spark’s newest libraries: MLlib and Spark SQL. It also builds out Spark’s Python support and adds new components to the Spark Streaming module. Spark 1.1 represents the work of 171 contributors, the most to ever contribute to a Spark release!
-To download Spark 1.1 visit the <a href="{{site.url}}downloads.html">downloads</a> page.
+To download Spark 1.1 visit the <a href="{{site.baseurl}}/downloads.html">downloads</a> page.
### Performance and Usability Improvements
Across the board, Spark 1.1 adds features for improved stability and performance, particularly for large-scale workloads. Spark now performs [disk spilling for skewed blocks](https://issues.apache.org/jira/browse/SPARK-1777) during cache operations, guarding against memory overflows if a single RDD partition is large. Disk spilling during aggregations, introduced in Spark 1.0, has been [ported to PySpark](https://issues.apache.org/jira/browse/SPARK-2538). This release introduces a [new shuffle implementation](https://issues.apache.org/jira/browse/SPARK-2045) optimized for very large scale shuffles. This “sort-based shuffle” will be become the default in the next release, and is now available to users. For jobs with large numbers of reducers, we recommend turning this on. This release also adds several usability improvements for monitoring the performance of long running or complex jobs. Among the changes are better [named accumulators](https://issues.apache.org/jira/browse/SPARK-2380) that display in Spark’s UI, [dynamic updating of metrics](https://issues.apache.org/jira/browse/SPARK-2099) for progress tasks, and [reporting of input metrics](https://issues.apache.org/jira/browse/SPARK-1683) for tasks that read input data.
diff --git a/releases/_posts/2014-11-26-spark-release-1-1-1.md b/releases/_posts/2014-11-26-spark-release-1-1-1.md
index 415394204..ab067ee9b 100644
--- a/releases/_posts/2014-11-26-spark-release-1-1-1.md
+++ b/releases/_posts/2014-11-26-spark-release-1-1-1.md
@@ -13,7 +13,7 @@ meta:
Spark 1.1.1 is a maintenance release with bug fixes. This release is based on the [branch-1.1](https://github.com/apache/spark/tree/branch-1.1) maintenance branch of Spark. We recommend all 1.1.0 users to upgrade to this stable release. Contributions to this release came from 55 developers.
-To download Spark 1.1.1 visit the <a href="{{site.url}}downloads.html">downloads</a> page.
+To download Spark 1.1.1 visit the <a href="{{site.baseurl}}/downloads.html">downloads</a> page.
### Fixes
Spark 1.1.1 contains bug fixes in several components. Some of the more important fixes are highlighted below. You can visit the [Spark issue tracker](http://s.apache.org/z9h) for the full list of fixes.
diff --git a/releases/_posts/2014-12-18-spark-release-1-2-0.md b/releases/_posts/2014-12-18-spark-release-1-2-0.md
index d9dab5ccb..bb9a01ce9 100644
--- a/releases/_posts/2014-12-18-spark-release-1-2-0.md
+++ b/releases/_posts/2014-12-18-spark-release-1-2-0.md
@@ -13,7 +13,7 @@ meta:
Spark 1.2.0 is the third release on the 1.X line. This release brings performance and usability improvements in Spark’s core engine, a major new API for MLlib, expanded ML support in Python, a fully H/A mode in Spark Streaming, and much more. GraphX has seen major performance and API improvements and graduates from an alpha component. Spark 1.2 represents the work of 172 contributors from more than 60 institutions in more than 1000 individual patches.
-To download Spark 1.2 visit the <a href="{{site.url}}downloads.html">downloads</a> page.
+To download Spark 1.2 visit the <a href="{{site.baseurl}}/downloads.html">downloads</a> page.
### Spark Core
In 1.2 Spark core upgrades two major subsystems to improve the performance and stability of very large scale shuffles. The first is Spark’s communication manager used during bulk transfers, which upgrades to a [netty-based implementation](https://issues.apache.org/jira/browse/SPARK-2468). The second is Spark’s shuffle mechanism, which upgrades to the [“sort based” shuffle initially released in Spark 1.1](https://issues.apache.org/jira/browse/SPARK-3280). These both improve the performance and stability of very large scale shuffles. Spark also adds an [elastic scaling mechanism](https://issues.apache.org/jira/browse/SPARK-3174) designed to improve cluster utilization during long running ETL-style jobs. This is currently supported on YARN and will make its way to other cluster managers in future versions. Finally, Spark 1.2 adds support for Scala 2.11. For instructions on building for Scala 2.11 see the [build documentation](/docs/1.2.0/building-spark.html#building-for-scala-211).
diff --git a/releases/_posts/2015-02-09-spark-release-1-2-1.md b/releases/_posts/2015-02-09-spark-release-1-2-1.md
index 8bd5aef8e..3f5c57967 100644
--- a/releases/_posts/2015-02-09-spark-release-1-2-1.md
+++ b/releases/_posts/2015-02-09-spark-release-1-2-1.md
@@ -13,7 +13,7 @@ meta:
Spark 1.2.1 is a maintenance release containing stability fixes. This release is based on the [branch-1.2](https://github.com/apache/spark/tree/branch-1.2) maintenance branch of Spark. We recommend all 1.2.0 users to upgrade to this stable release. Contributions to this release came from 69 developers.
-To download Spark 1.2.1 visit the <a href="{{site.url}}downloads.html">downloads</a> page.
+To download Spark 1.2.1 visit the <a href="{{site.baseurl}}/downloads.html">downloads</a> page.
### Fixes
Spark 1.2.1 contains bug fixes in several components. Some of the more important fixes are highlighted below. You can visit the [Spark issue tracker](http://s.apache.org/Mpn) for the full list of fixes.
diff --git a/releases/_posts/2015-03-13-spark-release-1-3-0.md b/releases/_posts/2015-03-13-spark-release-1-3-0.md
index bc9c4db84..03230fac0 100644
--- a/releases/_posts/2015-03-13-spark-release-1-3-0.md
+++ b/releases/_posts/2015-03-13-spark-release-1-3-0.md
@@ -13,7 +13,7 @@ meta:
Spark 1.3.0 is the fourth release on the 1.X line. This release brings a new DataFrame API alongside the graduation of Spark SQL from an alpha project. It also brings usability improvements in Spark’s core engine and expansion of MLlib and Spark Streaming. Spark 1.3 represents the work of 174 contributors from more than 60 institutions in more than 1000 individual patches.
-To download Spark 1.3 visit the <a href="{{site.url}}downloads.html">downloads</a> page.
+To download Spark 1.3 visit the <a href="{{site.baseurl}}/downloads.html">downloads</a> page.
### Spark Core
Spark 1.3 sees a handful of usability improvements in the core engine. The core API now supports [multi level aggregation trees](https://issues.apache.org/jira/browse/SPARK-5430) to help speed up expensive reduce operations. [Improved error reporting](https://issues.apache.org/jira/browse/SPARK-5063) has been added for certain gotcha operations. Spark's Jetty dependency is [now shaded](https://issues.apache.org/jira/browse/SPARK-3996) to help avoid conflicts with user programs. Spark now supports [SSL encryption](https://issues.apache.org/jira/browse/SPARK-3883) for some communication endpoints. Finaly, realtime [GC metrics](https://issues.apache.org/jira/browse/SPARK-3428) and [record counts](https://issues.apache.org/jira/browse/SPARK-4874) have been added to the UI.
diff --git a/releases/_posts/2015-04-17-spark-release-1-2-2.md b/releases/_posts/2015-04-17-spark-release-1-2-2.md
index e118849be..2bc397456 100644
--- a/releases/_posts/2015-04-17-spark-release-1-2-2.md
+++ b/releases/_posts/2015-04-17-spark-release-1-2-2.md
@@ -13,7 +13,7 @@ meta:
Spark 1.2.2 is a maintenance release containing stability fixes. This release is based on the [branch-1.2](https://github.com/apache/spark/tree/branch-1.2) maintenance branch of Spark. We recommend all 1.2.1 users to upgrade to this stable release. Contributions to this release came from 39 developers.
-To download Spark 1.2.2 visit the <a href="{{site.url}}downloads.html">downloads</a> page.
+To download Spark 1.2.2 visit the <a href="{{site.baseurl}}/downloads.html">downloads</a> page.
### Fixes
Spark 1.2.2 contains bug fixes in several components. Some of the more important fixes are highlighted below. You can visit the [Spark issue tracker](https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20fixVersion%20%3D%201.2.2%20ORDER%20BY%20priority%2C%20component) for the full list of fixes.
diff --git a/releases/_posts/2015-04-17-spark-release-1-3-1.md b/releases/_posts/2015-04-17-spark-release-1-3-1.md
index dc7c5d423..40ce957ab 100644
--- a/releases/_posts/2015-04-17-spark-release-1-3-1.md
+++ b/releases/_posts/2015-04-17-spark-release-1-3-1.md
@@ -13,7 +13,7 @@ meta:
Spark 1.3.1 is a maintenance release containing stability fixes. This release is based on the [branch-1.3](https://github.com/apache/spark/tree/branch-1.3) maintenance branch of Spark. We recommend all 1.3.0 users to upgrade to this stable release. Contributions to this release came from 60 developers.
-To download Spark 1.3.1 visit the <a href="{{site.url}}downloads.html">downloads</a> page.
+To download Spark 1.3.1 visit the <a href="{{site.baseurl}}/downloads.html">downloads</a> page.
### Fixes
Spark 1.3.1 contains several bug fixes in Spark SQL and assorted fixes in other components. Some of the more important fixes are highlighted below. You can visit the [Spark issue tracker](https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20fixVersion%20%3D%201.3.1%20ORDER%20BY%20priority%2C%20component) for the full list of fixes.
diff --git a/releases/_posts/2015-06-11-spark-release-1-4-0.md b/releases/_posts/2015-06-11-spark-release-1-4-0.md
index b7c315a35..e02310fab 100644
--- a/releases/_posts/2015-06-11-spark-release-1-4-0.md
+++ b/releases/_posts/2015-06-11-spark-release-1-4-0.md
@@ -13,7 +13,7 @@ meta:
Spark 1.4.0 is the fifth release on the 1.X line. This release brings an R API to Spark. It also brings usability improvements in Spark’s core engine and expansion of MLlib and Spark Streaming. Spark 1.4 represents the work of more than 210 contributors from more than 70 institutions in more than 1000 individual patches.
-To download Spark 1.4 visit the <a href="{{site.url}}downloads.html">downloads</a> page.
+To download Spark 1.4 visit the <a href="{{site.baseurl}}/downloads.html">downloads</a> page.
### SparkR
Spark 1.4 is the first release to package SparkR, an R binding for Spark based
diff --git a/releases/_posts/2015-07-15-spark-release-1-4-1.md b/releases/_posts/2015-07-15-spark-release-1-4-1.md
index 58b53b820..766435564 100644
--- a/releases/_posts/2015-07-15-spark-release-1-4-1.md
+++ b/releases/_posts/2015-07-15-spark-release-1-4-1.md
@@ -13,7 +13,7 @@ meta:
Spark 1.4.1 is a maintenance release containing stability fixes. This release is based on the [branch-1.4](https://github.com/apache/spark/tree/branch-1.4) maintenance branch of Spark. We recommend all 1.4.0 users to upgrade to this stable release. 85 developers contributed to this release.
-To download Spark 1.4.1 visit the <a href="{{site.url}}downloads.html">downloads</a> page.
+To download Spark 1.4.1 visit the <a href="{{site.baseurl}}/downloads.html">downloads</a> page.
### Fixes
Spark 1.4.1 contains several bug fixes in Spark's DataFrame and data source support and assorted fixes in other components. Some of the more important fixes are highlighted below. You can visit the [Spark issue tracker](https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20fixVersion%20%3D%201.4.1%20ORDER%20BY%20priority%2C%20component) for the full list of fixes.
diff --git a/releases/_posts/2015-09-09-spark-release-1-5-0.md b/releases/_posts/2015-09-09-spark-release-1-5-0.md
index b527f7f9d..70d336850 100644
--- a/releases/_posts/2015-09-09-spark-release-1-5-0.md
+++ b/releases/_posts/2015-09-09-spark-release-1-5-0.md
@@ -11,7 +11,7 @@ meta:
_wpas_done_all: '1'
---
-Spark 1.5.0 is the sixth release on the 1.x line. This release represents 1400+ patches from 230+ contributors and 80+ institutions. To download Spark 1.5.0 visit the <a href="{{site.url}}downloads.html">downloads</a> page.
+Spark 1.5.0 is the sixth release on the 1.x line. This release represents 1400+ patches from 230+ contributors and 80+ institutions. To download Spark 1.5.0 visit the <a href="{{site.baseurl}}/downloads.html">downloads</a> page.
You can consult JIRA for the [detailed changes](https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315420&version=12332078). We have curated a list of high level changes here: