<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>
Spark Release 1.0.1 | Apache Spark
</title>
<!-- Bootstrap core CSS -->
<link href="/css/cerulean.min.css" rel="stylesheet">
<link href="/css/custom.css" rel="stylesheet">
<!-- Code highlighter CSS -->
<link href="/css/pygments-default.css" rel="stylesheet">
<script type="text/javascript">
<!-- Google Analytics initialization -->
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-32518208-2']);
_gaq.push(['_trackPageview']);
(function() {
var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
})();
<!-- Adds slight delay to links to allow async reporting -->
function trackOutboundLink(link, category, action) {
try {
_gaq.push(['_trackEvent', category , action]);
} catch(err){}
setTimeout(function() {
document.location.href = link.href;
}, 100);
}
</script>
<!-- HTML5 shim and Respond.js IE8 support of HTML5 elements and media queries -->
<!--[if lt IE 9]>
<script src="https://oss.maxcdn.com/libs/html5shiv/3.7.0/html5shiv.js"></script>
<script src="https://oss.maxcdn.com/libs/respond.js/1.3.0/respond.min.js"></script>
<![endif]-->
</head>
<body>
<script src="https://code.jquery.com/jquery.js"></script>
<script src="//netdna.bootstrapcdn.com/bootstrap/3.0.3/js/bootstrap.min.js"></script>
<script src="/js/lang-tabs.js"></script>
<script src="/js/downloads.js"></script>
<div class="container" style="max-width: 1200px;">
<div class="masthead">
<p class="lead">
<a href="/">
<img src="/images/spark-logo-trademark.png"
style="height:100px; width:auto; vertical-align: bottom; margin-top: 20px;"></a><span class="tagline">
Lightning-fast cluster computing
</span>
</p>
</div>
<nav class="navbar navbar-default" role="navigation">
<!-- Brand and toggle get grouped for better mobile display -->
<div class="navbar-header">
<button type="button" class="navbar-toggle" data-toggle="collapse"
data-target="#navbar-collapse-1">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
</div>
<!-- Collect the nav links, forms, and other content for toggling -->
<div class="collapse navbar-collapse" id="navbar-collapse-1">
<ul class="nav navbar-nav">
<li><a href="/downloads.html">Download</a></li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">
Libraries <b class="caret"></b>
</a>
<ul class="dropdown-menu">
<li><a href="/sql/">SQL and DataFrames</a></li>
<li><a href="/streaming/">Spark Streaming</a></li>
<li><a href="/mllib/">MLlib (machine learning)</a></li>
<li><a href="/graphx/">GraphX (graph)</a></li>
<li class="divider"></li>
<li><a href="https://cwiki.apache.org/confluence/display/SPARK/Third+Party+Projects">Third-Party Packages</a></li>
</ul>
</li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">
Documentation <b class="caret"></b>
</a>
<ul class="dropdown-menu">
<li><a href="/docs/latest/">Latest Release (Spark 2.0.1)</a></li>
<li><a href="/documentation.html">Older Versions and Other Resources</a></li>
</ul>
</li>
<li><a href="/examples.html">Examples</a></li>
<li class="dropdown">
<a href="/community.html" class="dropdown-toggle" data-toggle="dropdown">
Community <b class="caret"></b>
</a>
<ul class="dropdown-menu">
<li><a href="/community.html">Mailing Lists</a></li>
<li><a href="/community.html#events">Events and Meetups</a></li>
<li><a href="/community.html#history">Project History</a></li>
<li><a href="https://cwiki.apache.org/confluence/display/SPARK/Powered+By+Spark">Powered By</a></li>
<li><a href="https://cwiki.apache.org/confluence/display/SPARK/Committers">Project Committers</a></li>
<li><a href="https://issues.apache.org/jira/browse/SPARK">Issue Tracker</a></li>
</ul>
</li>
<li><a href="/faq.html">FAQ</a></li>
</ul>
<ul class="nav navbar-nav navbar-right">
<li class="dropdown">
<a href="http://www.apache.org/" class="dropdown-toggle" data-toggle="dropdown">
Apache Software Foundation <b class="caret"></b></a>
<ul class="dropdown-menu">
<li><a href="http://www.apache.org/">Apache Homepage</a></li>
<li><a href="http://www.apache.org/licenses/">License</a></li>
<li><a href="http://www.apache.org/foundation/sponsorship.html">Sponsorship</a></li>
<li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li>
<li><a href="http://www.apache.org/security/">Security</a></li>
</ul>
</li>
</ul>
</div>
<!-- /.navbar-collapse -->
</nav>
<div class="row">
<div class="col-md-3 col-md-push-9">
<div class="news" style="margin-bottom: 20px;">
<h5>Latest News</h5>
<ul class="list-unstyled">
<li><a href="/news/spark-2-0-1-released.html">Spark 2.0.1 released</a>
<span class="small">(Oct 03, 2016)</span></li>
<li><a href="/news/spark-2-0-0-released.html">Spark 2.0.0 released</a>
<span class="small">(Jul 26, 2016)</span></li>
<li><a href="/news/spark-1-6-2-released.html">Spark 1.6.2 released</a>
<span class="small">(Jun 25, 2016)</span></li>
<li><a href="/news/submit-talks-to-spark-summit-eu-2016.html">Call for Presentations for Spark Summit EU is Open</a>
<span class="small">(Jun 16, 2016)</span></li>
</ul>
<p class="small" style="text-align: right;"><a href="/news/index.html">Archive</a></p>
</div>
<div class="hidden-xs hidden-sm">
<a href="/downloads.html" class="btn btn-success btn-lg btn-block" style="margin-bottom: 30px;">
Download Spark
</a>
<p style="font-size: 16px; font-weight: 500; color: #555;">
Built-in Libraries:
</p>
<ul class="list-none">
<li><a href="/sql/">SQL and DataFrames</a></li>
<li><a href="/streaming/">Spark Streaming</a></li>
<li><a href="/mllib/">MLlib (machine learning)</a></li>
<li><a href="/graphx/">GraphX (graph)</a></li>
</ul>
<a href="https://cwiki.apache.org/confluence/display/SPARK/Third+Party+Projects">Third-Party Packages</a>
</div>
</div>
<div class="col-md-9 col-md-pull-3">
<h2>Spark Release 1.0.1</h2>
<p>Spark 1.0.1 is a maintenance release with several stability fixes and a few new features in Spark’s SQL (alpha) library. This release is based on the <a href="https://github.com/apache/spark/tree/branch-1.0">branch-1.0</a> maintenance branch of Spark. We recommend users follow the head of this branch to get the most recent stable version of Spark.</p>
<p>You can download Spark 1.0.1 as either a
<a href="http://d3kbcqa49mib13.cloudfront.net/spark-1.0.1.tgz" onclick="trackOutboundLink(this, 'Release Download Links', 'cloudfront_spark-1.0.1.tgz'); return false;">source package</a>
(5 MB tgz) or a prebuilt package for
<a href="http://d3kbcqa49mib13.cloudfront.net/spark-1.0.1-bin-hadoop1.tgz" onclick="trackOutboundLink(this, 'Release Download Links', 'cloudfront_spark-1.0.1-bin-hadoop1.tgz'); return false;">Hadoop 1 / CDH3</a>,
<a href="http://d3kbcqa49mib13.cloudfront.net/spark-1.0.1-bin-cdh4.tgz" onclick="trackOutboundLink(this, 'Release Download Links', 'cloudfront_spark-1.0.1-bin-cdh4.tgz'); return false;">CDH4</a>, or
<a href="http://d3kbcqa49mib13.cloudfront.net/spark-1.0.1-bin-hadoop2.tgz" onclick="trackOutboundLink(this, 'Release Download Links', 'cloudfront_spark-1.0.1-bin-hadoop2.tgz'); return false;">Hadoop 2 / CDH5 / HDP2</a>
(160 MB tgz). Release signatures and checksums are available at the official <a href="http://www.apache.org/dist/spark/spark-1.0.1/">Apache download site</a>.</p>
<h3 id="fixes">Fixes</h3>
<p>Spark 1.0.1 contains stability fixes in several components. Some of the more important fixes are highlighted below. You can visit the <a href="http://s.apache.org/5zh">Spark issue tracker</a> for an exhaustive list of fixes.</p>
<h4 id="spark-core">Spark Core</h4>
<ul>
<li>Issue with missing keys during external aggregations (<a href="https://issues.apache.org/jira/browse/SPARK-2043">SPARK-2043</a>)</li>
<li>Issue during job failures in Mesos mode (<a href="https://issues.apache.org/jira/browse/SPARK-1749">SPARK-1749</a>)</li>
<li>Error when defining case classes in Scala shell (<a href="https://issues.apache.org/jira/browse/SPARK-1199">SPARK-1199</a>)</li>
<li>Proper support for r3.xlarge instances on AWS (<a href="https://issues.apache.org/jira/browse/SPARK-1790">SPARK-1790</a>)</li>
</ul>
<h4 id="pyspark">PySpark</h4>
<ul>
<li>Issue causing crashes when large numbers of tasks finish quickly (<a href="https://issues.apache.org/jira/browse/SPARK-2282">SPARK-2282</a>)</li>
<li>Issue importing MLlib in YARN-client mode (<a href="https://issues.apache.org/jira/browse/SPARK-2172">SPARK-2172</a>)</li>
<li>Incorrect behavior when hashing None (<a href="https://issues.apache.org/jira/browse/SPARK-1468">SPARK-1468</a>)</li>
</ul>
<h4 id="mllib">MLlib</h4>
<ul>
<li>Added compatibility with NumPy 1.4 (<a href="https://issues.apache.org/jira/browse/SPARK-2091">SPARK-2091</a>)</li>
<li>Concurrency issue in random sampler (<a href="https://issues.apache.org/jira/browse/SPARK-2251">SPARK-2251</a>)</li>
<li>NotSerializable exception in ALS (<a href="https://issues.apache.org/jira/browse/SPARK-1977">SPARK-1977</a>)</li>
</ul>
<h4 id="streaming">Streaming</h4>
<ul>
<li>Key not found when slow receiver starts (<a href="https://issues.apache.org/jira/browse/SPARK-2009">SPARK-2009</a>)</li>
<li>Resource clean-up with KafkaInputDStream (<a href="https://issues.apache.org/jira/browse/SPARK-2034">SPARK-2034</a>)</li>
<li>Issue with Flume events larger than 1020 bytes (<a href="https://issues.apache.org/jira/browse/SPARK-1916">SPARK-1916</a>)</li>
</ul>
<h3 id="sparksql-features">SparkSQL Features</h3>
<ul>
<li>Support for querying JSON datasets (<a href="https://issues.apache.org/jira/browse/SPARK-2060">SPARK-2060</a>).</li>
<li>Improved reading and writing of Parquet data, including support for nested records and arrays (<a href="https://issues.apache.org/jira/browse/SPARK-1293">SPARK-1293</a>, <a href="https://issues.apache.org/jira/browse/SPARK-2195">SPARK-2195</a>, <a href="https://issues.apache.org/jira/browse/SPARK-1913">SPARK-1913</a>, and <a href="https://issues.apache.org/jira/browse/SPARK-1487">SPARK-1487</a>).</li>
<li>Improved support for SQL commands (<code>CACHE TABLE</code>, <code>DESCRIBE</code>, <code>SHOW TABLES</code>) (<a href="https://issues.apache.org/jira/browse/SPARK-1968">SPARK-1968</a>, <a href="https://issues.apache.org/jira/browse/SPARK-2128">SPARK-2128</a>, and <a href="https://issues.apache.org/jira/browse/SPARK-1704">SPARK-1704</a>).</li>
<li>Support for SQL-specific configuration (initially used for setting the number of partitions) (<a href="https://issues.apache.org/jira/browse/SPARK-1508">SPARK-1508</a>).</li>
<li>Idempotence for DDL operations (<a href="https://issues.apache.org/jira/browse/SPARK-2191">SPARK-2191</a>).</li>
</ul>
<h3 id="known-issues">Known Issues</h3>
<p>This release contains one known issue: multi-statement lines in the REPL with internal references (<code>&gt; val x = 10; val y = x + 10</code>) produce exceptions (<a href="https://issues.apache.org/jira/browse/SPARK-2452">SPARK-2452</a>). This will be fixed shortly on the 1.0 branch; the fix will be included in the 1.0.2 release.</p>
<h3 id="contributors">Contributors</h3>
<p>The following developers contributed to this release:</p>
<ul>
<li>Aaron Davidson – bug fixes in PySpark and Spark Core</li>
<li>Ali Ghodsi – documentation update</li>
<li>Anant – compatibility fix for spark-ec2 script</li>
<li>Anatoli Fomenko – MLlib doc fix</li>
<li>Andre Schumacher – nested Parquet data</li>
<li>Andrew Ash – documentation</li>
<li>Andrew Or – bug fixes and documentation</li>
<li>Ankur Dave – bug fixes</li>
<li>Arkadiusz Komarzewski – doc fix</li>
<li>Baishuo – SQL fix</li>
<li>Chen Chao – comment fix and bug fix</li>
<li>Cheng Hao – SQL features</li>
<li>Cheng Lian – SQL features</li>
<li>Christian Tzolov – build improvement</li>
<li>Clément MATHIEU – doc updates </li>
<li>CodingCat – doc updates and bug fix </li>
<li>Colin McCabe – bug fix</li>
<li>Daoyuan – SQL joins</li>
<li>David Lemieux – bug fix</li>
<li>Derek Ma – bug fix</li>
<li>Doris Xin – bug fix</li>
<li>Erik Selin – PySpark fix</li>
<li>Gang Bai – bug fix</li>
<li>Guoqiang Li – bug fixes</li>
<li>Henry Saputra – documentation</li>
<li>Jiang – doc fix</li>
<li>Joy Yoj – bug fix</li>
<li>Jyotiska NK – test improvement</li>
<li>Kan Zhang – PySpark SQL features</li>
<li>Kay Ousterhout – documentation fix</li>
<li>LY Lai – bug fix</li>
<li>Lars Albertsson – bug fix </li>
<li>Lei Zhang – SQL fix and feature</li>
<li>Mark Hamstra – bug fix</li>
<li>Matei Zaharia – doc updates and bug fix</li>
<li>Matthew Farrellee – bug fixes</li>
<li>Michael Armbrust – SQL features and fixes</li>
<li>Neville Li – bug fix</li>
<li>Nick Chammas – doc fix</li>
<li>Ori Kremer – bug fix</li>
<li>Patrick Wendell – documentation and release manager</li>
<li>Prashant Sharma – bug and doc fixes</li>
<li>Qiuzhuang.Lian – bug fix</li>
<li>Raymond Liu – bug fix</li>
<li>Ravikanth Nawada – bug fixes</li>
<li>Reynold Xin – bug and doc fixes</li>
<li>Sameer Agarwal – optimization</li>
<li>Sandy Ryza – doc fix</li>
<li>Sean Owen – bug fix</li>
<li>Sebastien Rainville – bug fix</li>
<li>Shixiong Zhu – code clean-up</li>
<li>Szul, Piotr – bug fix</li>
<li>Takuya UESHIN – bug fixes and SQL features</li>
<li>Thomas Graves – bug fix </li>
<li>Uri Laserson – bug fix</li>
<li>Vadim Chekan – bug fix</li>
<li>Varakhedi Sujeet – ec2 r3 support</li>
<li>Vlad – doc fix</li>
<li>Wang Lianhui – bug fix</li>
<li>Wenchen Fan – optimization</li>
<li>William Benton – SQL feature</li>
<li>Xi Liu – SQL feature</li>
<li>Xiangrui Meng – bug fix</li>
<li>Ximo Guanter Gonzalbez – SQL feature</li>
<li>Yadid Ayzenberg – doc fix</li>
<li>Yijie Shen – bug fix</li>
<li>Yin Huai – JSON support and bug fixes</li>
<li>Zhen Peng – bug fix</li>
<li>Zichuan Ye – ec2 fixes</li>
<li>Zongheng Yang – SQL fixes</li>
</ul>
<p><em>Thanks to everyone who contributed!</em></p>
<p>
<br/>
<a href="/news/">Spark News Archive</a>
</p>
</div>
</div>
<footer class="small">
<hr>
Apache Spark, Spark, Apache, and the Spark logo are <a href="/trademarks.html">trademarks</a> of
<a href="http://www.apache.org">The Apache Software Foundation</a>.
</footer>
</div>
</body>
</html>