---
layout: post
title: Spark wins CloudSort Benchmark as the most efficient engine
categories:
- News
tags: []
status: publish
type: post
published: true
meta:
  _edit_last: '4'
  _wpas_done_all: '1'
---

We are proud to announce that Apache Spark won the <a href="http://sortbenchmark.org/">2016 CloudSort Benchmark</a> (in both the Daytona and Indy categories). A joint team from Nanjing University, Alibaba Group, and Databricks Inc. entered the competition using NADSort, a distributed sorting program built on top of Spark, and set a new world record as the most cost-efficient way to sort 100TB of data.

They sorted 100TB of data using only $144 worth of public cloud resources, beating the previous record of $451 set by the University of California, San Diego.

This adds to the 2014 GraySort record Spark won, and further validates Spark as the most efficient data processing engine.

For more information, see the <a href="https://databricks.com/blog/2016/11/14/setting-new-world-record-apache-spark.html">Databricks blog article (in English)</a> written by Spark committer Reynold Xin, or the Nanjing University <a href="http://scit.nju.edu.cn/Item/1193.aspx">press release (in Chinese)</a>.