path: root/releases/_posts/2014-09-11-spark-release-1-1-0.md
author Patrick Wendell <pwendell@apache.org> 2014-09-17 21:21:22 +0000
committer Patrick Wendell <pwendell@apache.org> 2014-09-17 21:21:22 +0000
commit 4f85fb7842a7a7da32706f0f0bdab4d57db925f4
tree a35c18ab19b508dc8d659e4f00f4ea236d0541f1 /releases/_posts/2014-09-11-spark-release-1-1-0.md
parent 46b88d22e7dccacf73b9c53c8e5a1016a8593a08
Typos and omissions in 1.1.0 release notes
Diffstat (limited to 'releases/_posts/2014-09-11-spark-release-1-1-0.md')
-rw-r--r-- releases/_posts/2014-09-11-spark-release-1-1-0.md 5
1 files changed, 3 insertions, 2 deletions
diff --git a/releases/_posts/2014-09-11-spark-release-1-1-0.md b/releases/_posts/2014-09-11-spark-release-1-1-0.md
index c559abde5..f4878a688 100644
--- a/releases/_posts/2014-09-11-spark-release-1-1-0.md
+++ b/releases/_posts/2014-09-11-spark-release-1-1-0.md
@@ -22,7 +22,7 @@ Across the board, Spark 1.1 adds features for improved stability and performance
Spark SQL adds a number of new features and performance improvements in this release. A [JDBC/ODBC server](http://spark.apache.org/docs/1.1.0/sql-programming-guide.html#running-the-thrift-jdbc-server) allows users to connect to SparkSQL from many different applications and provides shared access to cached tables. A new module provides [support for loading JSON data](http://spark.apache.org/docs/1.1.0/sql-programming-guide.html#json-datasets) directly into Spark’s SchemaRDD format, including automatic schema inference. Spark SQL introduces [dynamic bytecode generation](http://spark.apache.org/docs/1.1.0/sql-programming-guide.html#other-configuration-options) in this release, a technique which significantly speeds up execution for queries that perform complex expression evaluation. This release also adds support for registering Python, Scala, and Java lambda functions as UDFs, which can then be called directly in SQL. Spark 1.1 adds a [public types API to allow users to create SchemaRDD’s from custom data sources](http://spark.apache.org/docs/1.1.0/sql-programming-guide.html#programmatically-specifying-the-schema). Finally, many optimizations have been added to the native Parquet support as well as throughout the engine.
### MLlib
-MLlib adds several new algorithms and optimizations in this release. 1.1 introduces a [new library of statistical packages](https://issues.apache.org/jira/browse/SPARK-2359) which provides exploratory analytic functions. These include stratified sampling, correlations, chi-squared tests and support for creating random datasets. This release adds utilities for feature extraction ([Word2Vec](https://issues.apache.org/jira/browse/SPARK-2510) and [TF-IDF](https://issues.apache.org/jira/browse/SPARK-2511)) and feature transformation ([normalization and standard scaling](https://issues.apache.org/jira/browse/SPARK-2272)). Also new are support for [nonnegative matrix factorization](https://issues.apache.org/jira/browse/SPARK-1553) and [SVG via Lanczos](https://issues.apache.org/jira/browse/SPARK-1782). The decision tree algorithm has been [added in Python and Java](https://issues.apache.org/jira/browse/SPARK-2478). A tree aggregation primitive has been added to help optimize many existing algorithms. Performance improves across the board in MLlib 1.1, with improvements of around 2-3X for many algorithms and up to 5X for large scale decision tree problems.
+MLlib adds several new algorithms and optimizations in this release. 1.1 introduces a [new library of statistical packages](https://issues.apache.org/jira/browse/SPARK-2359) which provides exploratory analytic functions. These include stratified sampling, correlations, chi-squared tests and support for creating random datasets. This release adds utilities for feature extraction ([Word2Vec](https://issues.apache.org/jira/browse/SPARK-2510) and [TF-IDF](https://issues.apache.org/jira/browse/SPARK-2511)) and feature transformation ([normalization and standard scaling](https://issues.apache.org/jira/browse/SPARK-2272)). Also new are support for [nonnegative matrix factorization](https://issues.apache.org/jira/browse/SPARK-1553) and [SVD via Lanczos](https://issues.apache.org/jira/browse/SPARK-1782). The decision tree algorithm has been [added in Python and Java](https://issues.apache.org/jira/browse/SPARK-2478). A tree aggregation primitive has been added to help optimize many existing algorithms. Performance improves across the board in MLlib 1.1, with improvements of around 2-3X for many algorithms and up to 5X for large scale decision tree problems.
### GraphX and Spark Streaming
Spark streaming adds a new data source [Amazon Kinesis](https://issues.apache.org/jira/browse/SPARK-1981). For the Apache Flume, a new mode is supported which [pulls data from Flume](https://issues.apache.org/jira/browse/SPARK-1729), simplifying deployment and providing high availability. The first of a set of [streaming machine learning algorithms](https://issues.apache.org/jira/browse/SPARK-2438) is introduced with streaming linear regression. Finally, [rate limiting](https://issues.apache.org/jira/browse/SPARK-1341) has been added for streaming inputs. GraphX adds [custom storage levels for vertices and edges](https://issues.apache.org/jira/browse/SPARK-1991) along with [improved numerical precision](https://issues.apache.org/jira/browse/SPARK-2748) across the board. Finally, GraphX adds a new label propagation algorithm.
@@ -58,7 +58,7 @@ Spark 1.1.0 is backwards compatible with Spark 1.0.X. Some configuration option
* Alex Liu -- bug fix
* Ali Ghodsi -- doc fix
* Allan Douglas R. de Oliveira -- EC2 script enhancements and coGroup fix
- * Ameet Talwalker -- MLlib docs
+ * Ameet Talwalkar -- MLlib docs
* Anand Avati -- build and SQL fixes
* Anant -- Python and doc fixes
* Anatoli Fomenko -- MLlib doc fix
@@ -102,6 +102,7 @@ Spark 1.1.0 is backwards compatible with Spark 1.0.X. Some configuration option
* Gabriel Nizzoli -- bug fix in Spark Streaming
* Gang Bai -- MLlib fix
* Gera Shegalov -- bug fix
+ * Gil Vernik -- OpenStack Swift documentation
* Guancheng Chen -- doc fix
* Guillaume Ballet -- build fix
* GuoQiang Li -- bug fixes in Spark core and MLlib