Diffstat (limited to 'site/releases/spark-release-1-0-0.html')
-rw-r--r-- site/releases/spark-release-1-0-0.html 238
1 files changed, 119 insertions, 119 deletions
diff --git a/site/releases/spark-release-1-0-0.html b/site/releases/spark-release-1-0-0.html
index 5862e2113..76649de87 100644
--- a/site/releases/spark-release-1-0-0.html
+++ b/site/releases/spark-release-1-0-0.html
@@ -192,11 +192,11 @@
<p>Spark 1.0 adds support for Java 8 <a href="http://docs.oracle.com/javase/tutorial/java/javaOO/lambdaexpressions.html">new lambda syntax</a> in its Java bindings. Java 8 supports a concise syntax for writing anonymous functions, similar to the closure syntax in Scala and Python. This change requires small changes for users of the current Java API, which are noted in the documentation. Spark’s Python API has been extended to support several new functions. We’ve also included several stability improvements in the Python API, particularly for large datasets. PySpark now supports running on YARN as well.</p>
<h3 id="documentation">Documentation</h3>
-<p>Spark&#8217;s <a href="/docs/latest/programming-guide.html">programming guide</a> has been significantly expanded to centrally cover all supported languages and discuss more operators and aspects of the development life cycle. The <a href="/docs/latest/mllib-guide.html">MLlib guide</a> has also been expanded with significantly more detail and examples for each algorithm, while documents on configuration, YARN and Mesos have also been revamped.</p>
+<p>Spark’s <a href="/docs/latest/programming-guide.html">programming guide</a> has been significantly expanded to centrally cover all supported languages and discuss more operators and aspects of the development life cycle. The <a href="/docs/latest/mllib-guide.html">MLlib guide</a> has also been expanded with significantly more detail and examples for each algorithm, while documents on configuration, YARN and Mesos have also been revamped.</p>
<h3 id="smaller-changes">Smaller Changes</h3>
<ul>
- <li>PySpark now works with more Python versions than before &#8211; Python 2.6+ instead of 2.7+, and NumPy 1.4+ instead of 1.7+.</li>
+ <li>PySpark now works with more Python versions than before – Python 2.6+ instead of 2.7+, and NumPy 1.4+ instead of 1.7+.</li>
<li>Spark has upgraded to Avro 1.7.6, adding support for Avro specific types.</li>
<li>Internal instrumentation has been added to allow applications to monitor and instrument Spark jobs.</li>
<li>Support for off-heap storage in Tachyon has been added via a special build target.</li>
@@ -213,123 +213,123 @@
<p>The following developers contributed to this release:</p>
<ul>
- <li>Aaron Davidson &#8211; packaging and deployment improvements, several bug fixes, local[*] mode</li>
- <li>Aaron Kimball &#8211; documentation improvements</li>
- <li>Abhishek Kumar &#8211; Python configuration fixes</li>
- <li>Ahir Reddy &#8211; PySpark build, fixes, and cancellation support</li>
- <li>Allan Douglas R. de Oliveira &#8211; Improvements to spark-ec2 scripts</li>
- <li>Andre Schumacher &#8211; Parquet support and optimizations</li>
- <li>Andrew Ash &#8211; Mesos documentation and other doc improvements, bug fixes</li>
- <li>Andrew Or &#8211; history server (lead), garbage collection (lead), spark-submit, PySpark and YARN improvements</li>
- <li>Andrew Tulloch &#8211; MLlib contributions and code clean-up</li>
- <li>Andy Konwinski &#8211; documentation fix</li>
- <li>Anita Tailor &#8211; Cassandra example</li>
- <li>Ankur Dave &#8211; GraphX (lead) optimizations, documentation, and usability</li>
- <li>Archer Shao &#8211; bug fixes</li>
- <li>Arun Ramakrishnan &#8211; improved random sampling</li>
- <li>Baishuo &#8211; test improvements</li>
- <li>Bernardo Gomez Palacio &#8211; spark-shell improvements and Mesos updates</li>
- <li>Bharath Bhushan &#8211; bug fix</li>
- <li>Bijay Bisht &#8211; bug fixes</li>
- <li>Binh Nguyen &#8211; dependency fix</li>
- <li>Bouke van der Bijl &#8211; fixes for PySpark on Mesos and other Mesos fixes</li>
- <li>Bryn Keller &#8211; improvement to HBase support and unit tests</li>
- <li>Chen Chao &#8211; documentation, bug fix, and code clean-up</li>
- <li>Cheng Hao &#8211; performance and feature improvements in Spark SQL</li>
- <li>Cheng Lian &#8211; column storage and other improvements in Spark SQL</li>
- <li>Christian Lundgren &#8211; improvement to spark-ec2 scripts</li>
- <li>DB Tsai &#8211; L-BGFS optimizer in MLlib, MLlib documentation and fixes</li>
- <li>Dan McClary &#8211; Improvement to stats counter</li>
- <li>Daniel Darabos &#8211; GraphX performance improvement</li>
- <li>Davis Shepherd &#8211; bug fix</li>
- <li>Diana Carroll &#8211; documentation and bug fix</li>
- <li>Egor Pakhomov &#8211; local iterator for RDD’s</li>
- <li>Emtiaz Ahmed &#8211; bug fix</li>
- <li>Erik Selin &#8211; bug fix</li>
- <li>Ethan Jewett &#8211; documentation improvement</li>
- <li>Evan Chan &#8211; automatic clean-up of application data</li>
- <li>Evan Sparks &#8211; MLlib optimizations and doc improvement</li>
- <li>Frank Dai &#8211; code clean-up in MLlib</li>
- <li>Guoquiang Li &#8211; build improvements and several bug fixes</li>
- <li>Ghidireac &#8211; bug fix</li>
- <li>Haoyuan Li &#8211; Tachyon storage level for RDD’s</li>
- <li>Harvey Feng &#8211; spark-ec2 update</li>
- <li>Henry Saputra &#8211; code clean-up</li>
- <li>Henry Cook &#8211; Spark SQL improvements</li>
- <li>Holden Karau &#8211; cross validation in MLlib, Python and core engine improvements</li>
- <li>Ivan Wick &#8211; Mesos bug fix</li>
- <li>Jey Kottalam &#8211; sbt build improvement</li>
- <li>Jerry Shao &#8211; Spark metrics and Spark SQL improvements</li>
- <li>Jiacheng Guo &#8211; bug fix</li>
- <li>Jianghan &#8211; bug fix</li>
- <li>Jianping J Wang &#8211; JBLAS support in MLlib</li>
- <li>Joseph E. Gonzalez &#8211; GraphX improvements, fixes, and documentation</li>
- <li>Josh Rosen &#8211; PySpark improvements and bug fixes</li>
- <li>Jyotiska NK &#8211; documentation, test improvements, and bug fix</li>
- <li>Kan Zhang &#8211; bug fixes in Spark core, SQL, and PySpark</li>
- <li>Kay Ousterhout &#8211; bug fixes and code refactoring in scheduler</li>
- <li>Kelvin Chu &#8211; automatic clean-up of application data</li>
- <li>Kevin Mader &#8211; example fix</li>
- <li>Koert Kuipers &#8211; code visibility fix</li>
- <li>Kousuke Saruta &#8211; documentation and build fixes</li>
- <li>Kyle Ellrott &#8211; improved memory usage for DISK_ONLY persistence</li>
- <li>Larva Boy &#8211; approximate counts in Spark SQL</li>
- <li>Madhu Siddalingaiah &#8211; ec2 fixes</li>
- <li>Manish Amde &#8211; decision trees in MLlib</li>
- <li>Marcelo Vanzin &#8211; improvements and fixes to YARN support, dependency clean-up</li>
- <li>Mark Grover &#8211; build fixes</li>
- <li>Mark Hamstra &#8211; build and dependency improvements, scheduler bug fixes</li>
- <li>Margin Jaggi &#8211; MLlib documentation improvements</li>
- <li>Matei Zaharia &#8211; Python versions of several MLlib algorithms, spark-submit improvements, bug fixes, and documentation improvements</li>
- <li>Michael Armbrust &#8211; Spark SQL (lead), including schema support for RDD’s, catalyst optimizer, and Hive support</li>
- <li>Mridul Muralidharan &#8211; code visibility changes and bug fixes</li>
- <li>Nan Zhu &#8211; bug and stability fixes, code clean-up, documentation, and new features</li>
- <li>Neville Li &#8211; bug fix</li>
- <li>Nick Lanham &#8211; Tachyon bundling in distribution script</li>
- <li>Nirmal Reddy &#8211; code clean-up</li>
- <li>OuYang Jin &#8211; local mode and json improvements</li>
- <li>Patrick Wendell &#8211; release manager, build improvements, bug fixes, and code clean-up</li>
- <li>Petko Nikolov &#8211; new utility functions</li>
- <li>Prabeesh K &#8211; typo fix</li>
- <li>Prabin Banka &#8211; new PySpark API’s</li>
- <li>Prashant Sharma &#8211; PySpark improvements, Java 8 lambda support, and build improvements</li>
- <li>Punya Biswal &#8211; Java API improvements</li>
- <li>Qiuzhuang Lian &#8211; bug fixes</li>
- <li>Rahul Singhal &#8211; build improvements, bug fixes</li>
- <li>Raymond Liu &#8211; YARN build fixes and UI improvements</li>
- <li>Reynold Xin &#8211; bug fixes, internal changes, Spark SQL improvements, build fixes, and style improvements</li>
- <li>Reza Zadeh &#8211; SVD implementation in MLlib and other MLlib contributions</li>
- <li>Roman Pastukhov &#8211; clean-up of broadcast files</li>
- <li>Rong Gu &#8211; Tachyon storage level for RDD’s</li>
- <li>Sandeep Sing &#8211; several bug fixes, MLLib improvements and fixes to Spark examples</li>
- <li>Sandy Ryza &#8211; spark-submit script and several YARN improvements</li>
- <li>Saurabh Rawat &#8211; Java API improvements</li>
- <li>Sean Owen &#8211; several build improvements, code clean-up, and MLlib fixes</li>
- <li>Semih Salihoglu &#8211; GraphX improvements</li>
- <li>Shaocun Tian &#8211; bug fix in MLlib</li>
- <li>Shivaram Venkataraman &#8211; bug fixes</li>
- <li>Shixiong Zhu &#8211; code style and correctness fixes</li>
- <li>Shiyun Wxm &#8211; typo fix</li>
- <li>Stevo Slavic &#8211; bug fix</li>
- <li>Sumedh Mungee &#8211; documentation fix</li>
- <li>Sundeep Narravula &#8211; “cancel” button in Spark UI</li>
- <li>Takayu Ueshin &#8211; bug fixes and improvements to Spark SQL</li>
- <li>Tathagata Das &#8211; web UI and other improvements to Spark Streaming (lead), bug fixes, state clean-up, and release manager</li>
- <li>Timothy Chen &#8211; Spark SQL improvements</li>
- <li>Ted Malaska &#8211; improved Flume support</li>
- <li>Tom Graves &#8211; Hadoop security integration (lead) and YARN support</li>
- <li>Tianshuo Deng &#8211; Bug fix</li>
- <li>Tor Myklebust &#8211; improvements to ALS</li>
- <li>Wangfei &#8211; Spark SQL docs</li>
- <li>Wang Tao &#8211; code clean-up</li>
- <li>William Bendon &#8211; JSON support changes and bug fixes</li>
- <li>Xiangrui Meng &#8211; several improvements to MLlib (lead)</li>
- <li>Xuan Nguyen &#8211; build fix</li>
- <li>Xusen Yin &#8211; MLlib contributions and bug fix</li>
- <li>Ye Xianjin &#8211; test fixes</li>
- <li>Yinan Li &#8211; addFile improvement</li>
- <li>Yin Hua &#8211; Spark SQL improvements</li>
- <li>Zheng Peng &#8211; bug fixes</li>
+ <li>Aaron Davidson – packaging and deployment improvements, several bug fixes, local[*] mode</li>
+ <li>Aaron Kimball – documentation improvements</li>
+ <li>Abhishek Kumar – Python configuration fixes</li>
+ <li>Ahir Reddy – PySpark build, fixes, and cancellation support</li>
+ <li>Allan Douglas R. de Oliveira – Improvements to spark-ec2 scripts</li>
+ <li>Andre Schumacher – Parquet support and optimizations</li>
+ <li>Andrew Ash – Mesos documentation and other doc improvements, bug fixes</li>
+ <li>Andrew Or – history server (lead), garbage collection (lead), spark-submit, PySpark and YARN improvements</li>
+ <li>Andrew Tulloch – MLlib contributions and code clean-up</li>
+ <li>Andy Konwinski – documentation fix</li>
+ <li>Anita Tailor – Cassandra example</li>
+ <li>Ankur Dave – GraphX (lead) optimizations, documentation, and usability</li>
+ <li>Archer Shao – bug fixes</li>
+ <li>Arun Ramakrishnan – improved random sampling</li>
+ <li>Baishuo – test improvements</li>
+ <li>Bernardo Gomez Palacio – spark-shell improvements and Mesos updates</li>
+ <li>Bharath Bhushan – bug fix</li>
+ <li>Bijay Bisht – bug fixes</li>
+ <li>Binh Nguyen – dependency fix</li>
+ <li>Bouke van der Bijl – fixes for PySpark on Mesos and other Mesos fixes</li>
+ <li>Bryn Keller – improvement to HBase support and unit tests</li>
+ <li>Chen Chao – documentation, bug fix, and code clean-up</li>
+ <li>Cheng Hao – performance and feature improvements in Spark SQL</li>
+ <li>Cheng Lian – column storage and other improvements in Spark SQL</li>
+ <li>Christian Lundgren – improvement to spark-ec2 scripts</li>
+ <li>DB Tsai – L-BGFS optimizer in MLlib, MLlib documentation and fixes</li>
+ <li>Dan McClary – Improvement to stats counter</li>
+ <li>Daniel Darabos – GraphX performance improvement</li>
+ <li>Davis Shepherd – bug fix</li>
+ <li>Diana Carroll – documentation and bug fix</li>
+ <li>Egor Pakhomov – local iterator for RDD’s</li>
+ <li>Emtiaz Ahmed – bug fix</li>
+ <li>Erik Selin – bug fix</li>
+ <li>Ethan Jewett – documentation improvement</li>
+ <li>Evan Chan – automatic clean-up of application data</li>
+ <li>Evan Sparks – MLlib optimizations and doc improvement</li>
+ <li>Frank Dai – code clean-up in MLlib</li>
+ <li>Guoquiang Li – build improvements and several bug fixes</li>
+ <li>Ghidireac – bug fix</li>
+ <li>Haoyuan Li – Tachyon storage level for RDD’s</li>
+ <li>Harvey Feng – spark-ec2 update</li>
+ <li>Henry Saputra – code clean-up</li>
+ <li>Henry Cook – Spark SQL improvements</li>
+ <li>Holden Karau – cross validation in MLlib, Python and core engine improvements</li>
+ <li>Ivan Wick – Mesos bug fix</li>
+ <li>Jey Kottalam – sbt build improvement</li>
+ <li>Jerry Shao – Spark metrics and Spark SQL improvements</li>
+ <li>Jiacheng Guo – bug fix</li>
+ <li>Jianghan – bug fix</li>
+ <li>Jianping J Wang – JBLAS support in MLlib</li>
+ <li>Joseph E. Gonzalez – GraphX improvements, fixes, and documentation</li>
+ <li>Josh Rosen – PySpark improvements and bug fixes</li>
+ <li>Jyotiska NK – documentation, test improvements, and bug fix</li>
+ <li>Kan Zhang – bug fixes in Spark core, SQL, and PySpark</li>
+ <li>Kay Ousterhout – bug fixes and code refactoring in scheduler</li>
+ <li>Kelvin Chu – automatic clean-up of application data</li>
+ <li>Kevin Mader – example fix</li>
+ <li>Koert Kuipers – code visibility fix</li>
+ <li>Kousuke Saruta – documentation and build fixes</li>
+ <li>Kyle Ellrott – improved memory usage for DISK_ONLY persistence</li>
+ <li>Larva Boy – approximate counts in Spark SQL</li>
+ <li>Madhu Siddalingaiah – ec2 fixes</li>
+ <li>Manish Amde – decision trees in MLlib</li>
+ <li>Marcelo Vanzin – improvements and fixes to YARN support, dependency clean-up</li>
+ <li>Mark Grover – build fixes</li>
+ <li>Mark Hamstra – build and dependency improvements, scheduler bug fixes</li>
+ <li>Margin Jaggi – MLlib documentation improvements</li>
+ <li>Matei Zaharia – Python versions of several MLlib algorithms, spark-submit improvements, bug fixes, and documentation improvements</li>
+ <li>Michael Armbrust – Spark SQL (lead), including schema support for RDD’s, catalyst optimizer, and Hive support</li>
+ <li>Mridul Muralidharan – code visibility changes and bug fixes</li>
+ <li>Nan Zhu – bug and stability fixes, code clean-up, documentation, and new features</li>
+ <li>Neville Li – bug fix</li>
+ <li>Nick Lanham – Tachyon bundling in distribution script</li>
+ <li>Nirmal Reddy – code clean-up</li>
+ <li>OuYang Jin – local mode and json improvements</li>
+ <li>Patrick Wendell – release manager, build improvements, bug fixes, and code clean-up</li>
+ <li>Petko Nikolov – new utility functions</li>
+ <li>Prabeesh K – typo fix</li>
+ <li>Prabin Banka – new PySpark API’s</li>
+ <li>Prashant Sharma – PySpark improvements, Java 8 lambda support, and build improvements</li>
+ <li>Punya Biswal – Java API improvements</li>
+ <li>Qiuzhuang Lian – bug fixes</li>
+ <li>Rahul Singhal – build improvements, bug fixes</li>
+ <li>Raymond Liu – YARN build fixes and UI improvements</li>
+ <li>Reynold Xin – bug fixes, internal changes, Spark SQL improvements, build fixes, and style improvements</li>
+ <li>Reza Zadeh – SVD implementation in MLlib and other MLlib contributions</li>
+ <li>Roman Pastukhov – clean-up of broadcast files</li>
+ <li>Rong Gu – Tachyon storage level for RDD’s</li>
+ <li>Sandeep Sing – several bug fixes, MLLib improvements and fixes to Spark examples</li>
+ <li>Sandy Ryza – spark-submit script and several YARN improvements</li>
+ <li>Saurabh Rawat – Java API improvements</li>
+ <li>Sean Owen – several build improvements, code clean-up, and MLlib fixes</li>
+ <li>Semih Salihoglu – GraphX improvements</li>
+ <li>Shaocun Tian – bug fix in MLlib</li>
+ <li>Shivaram Venkataraman – bug fixes</li>
+ <li>Shixiong Zhu – code style and correctness fixes</li>
+ <li>Shiyun Wxm – typo fix</li>
+ <li>Stevo Slavic – bug fix</li>
+ <li>Sumedh Mungee – documentation fix</li>
+ <li>Sundeep Narravula – “cancel” button in Spark UI</li>
+ <li>Takayu Ueshin – bug fixes and improvements to Spark SQL</li>
+ <li>Tathagata Das – web UI and other improvements to Spark Streaming (lead), bug fixes, state clean-up, and release manager</li>
+ <li>Timothy Chen – Spark SQL improvements</li>
+ <li>Ted Malaska – improved Flume support</li>
+ <li>Tom Graves – Hadoop security integration (lead) and YARN support</li>
+ <li>Tianshuo Deng – Bug fix</li>
+ <li>Tor Myklebust – improvements to ALS</li>
+ <li>Wangfei – Spark SQL docs</li>
+ <li>Wang Tao – code clean-up</li>
+ <li>William Bendon – JSON support changes and bug fixes</li>
+ <li>Xiangrui Meng – several improvements to MLlib (lead)</li>
+ <li>Xuan Nguyen – build fix</li>
+ <li>Xusen Yin – MLlib contributions and bug fix</li>
+ <li>Ye Xianjin – test fixes</li>
+ <li>Yinan Li – addFile improvement</li>
+ <li>Yin Hua – Spark SQL improvements</li>
+ <li>Zheng Peng – bug fixes</li>
</ul>
<p><em>Thanks to everyone who contributed!</em></p>