SPARK-1767: Prefer HDFS-cached replicas when scheduling data-local tasks

This change reorders the replicas returned by HadoopRDD#getPreferredLocations so that replicas cached by HDFS are at the start of the list. This requires Hadoop 2.5 or higher; previous versions of Hadoop do not expose the information needed to determine whether a replica is cached.

Author: Colin Patrick Mccabe <cmccabe@cloudera.com>

Closes #1486 from cmccabe/SPARK-1767 and squashes the following commits:

338d4f8 [Colin Patrick Mccabe] SPARK-1767: Prefer HDFS-cached replicas when scheduling data-local tasks
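
The reordering described in the commit message can be sketched roughly as follows. This is a minimal illustration and not the code from this commit: `Replica`, `PreferCached`, and `preferCached` are hypothetical names introduced here, and the sketch assumes the cached/uncached flag for each replica is already known (the commit message notes that only Hadoop 2.5 or higher exposes it).

```scala
// Hypothetical model of one block replica; not a Spark or Hadoop type.
final case class Replica(host: String, cached: Boolean)

object PreferCached {
  // Move HDFS-cached replicas to the front of the list.
  // partition preserves the relative order within each group, so the
  // original ordering among cached (and among uncached) replicas survives.
  def preferCached(replicas: Seq[Replica]): Seq[Replica] = {
    val (cached, uncached) = replicas.partition(_.cached)
    cached ++ uncached
  }
}
```

Because the result keeps every replica and only changes the order, a scheduler that walks the list front to back would try cached hosts first and still fall back to the uncached ones.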
author: Colin Patrick Mccabe <cmccabe@cloudera.com> 2014-10-02 00:29:31 -0700
committer: Patrick Wendell <pwendell@gmail.com> 2014-10-02 00:29:31 -0700
commit: 6e27cb630de69fa5acb510b4e2f6b980742b1957 (patch)
tree: 720a0c40776c9829a761022e0a9a6da502667ebb /project
parent: bbdf1de84ffdd3bd172f17975d2f1422a9bcf2c6 (diff)
download: spark-6e27cb630de69fa5acb510b4e2f6b980742b1957.tar.gz
spark-6e27cb630de69fa5acb510b4e2f6b980742b1957.tar.bz2
spark-6e27cb630de69fa5acb510b4e2f6b980742b1957.zip
1 files changed, 2 insertions, 0 deletions
diff --git a/project/MimaExcludes.scala b/project/MimaExcludes.scala
index 4076ebc6fc..d499302124 100644
--- a/project/MimaExcludes.scala
+++ b/project/MimaExcludes.scala
@@ -41,6 +41,8 @@ object MimaExcludes {
           MimaBuild.excludeSparkClass("mllib.linalg.Matrix") ++
           MimaBuild.excludeSparkClass("mllib.linalg.Vector") ++
           Seq(
+            ProblemFilters.exclude[IncompatibleTemplateDefProblem](
+              "org.apache.spark.scheduler.TaskLocation"),
             // Added normL1 and normL2 to trait MultivariateStatisticalSummary
             ProblemFilters.exclude[MissingMethodProblem](
               "org.apache.spark.mllib.stat.MultivariateStatisticalSummary.normL1"),