aboutsummaryrefslogtreecommitdiff
path: root/python/pyspark/rdd.py
diff options
context:
space:
mode:
authorPatrick Wendell <pwendell@gmail.com>2013-12-14 00:22:45 -0800
committerPatrick Wendell <pwendell@gmail.com>2013-12-14 00:22:45 -0800
commit97ac06018206b593600594605be241d0cd706e08 (patch)
treea956d15687b75e4613bf6b2a6e3c20b335061cf7 /python/pyspark/rdd.py
parentd2efe13574090e93c600adeacc7f6356bc196e6c (diff)
parent7ac944fc27805e2a76285ce3a31f3b2ecf4a7b78 (diff)
downloadspark-97ac06018206b593600594605be241d0cd706e08.tar.gz
spark-97ac06018206b593600594605be241d0cd706e08.tar.bz2
spark-97ac06018206b593600594605be241d0cd706e08.zip
Merge pull request #259 from pwendell/scala-2.10
Migration to Scala 2.10 == Below description was written by Prashant Sharma == This PR migrates spark to scala 2.10. Summary of changes apart from scala 2.10 migration: (has no implications for user.) 1. Migrated Akka to 2.2.3. Does not use remote death watch for it has a bug, where it tries to send message to dead node infinitely. Uses an indestructible actorsystem which tolerates errors only on executors. (Might be useful for user.) 4. New configuration settings introduced: System.getProperty("spark.akka.heartbeat.pauses", "600") System.getProperty("spark.akka.failure-detector.threshold", "300.0") System.getProperty("spark.akka.heartbeat.interval", "1000") Defaults for these are fairly large to only disable Failure detector that comes with akka. The reason for doing so is we have our own failure detector like mechanism in place and then this is just an overhead on top of that + it leads to a lot of false positives. But with these properties it is possible to enable them. A good use case for enabling it could be when someone wants spark to be sensitive (in a controllable manner ofc.) to GC pauses/Network lags and quickly evict executors that experienced it. More information is included in configuration.md Once we have the SPARK-544 merged, I had like to deprecate atleast these akka properties and may be others too. This PR is duplicate of #221(where all the discussion happened.) for that one pointed to master this one points to scala-2.10 branch.
Diffstat (limited to 'python/pyspark/rdd.py')
-rw-r--r--python/pyspark/rdd.py4
1 files changed, 2 insertions, 2 deletions
diff --git a/python/pyspark/rdd.py b/python/pyspark/rdd.py
index d8da02072c..61720dcf1a 100644
--- a/python/pyspark/rdd.py
+++ b/python/pyspark/rdd.py
@@ -978,7 +978,7 @@ class PipelinedRDD(RDD):
[x._jbroadcast for x in self.ctx._pickled_broadcast_vars],
self.ctx._gateway._gateway_client)
self.ctx._pickled_broadcast_vars.clear()
- class_manifest = self._prev_jrdd.classManifest()
+ class_tag = self._prev_jrdd.classTag()
env = MapConverter().convert(self.ctx.environment,
self.ctx._gateway._gateway_client)
includes = ListConverter().convert(self.ctx._python_includes,
@@ -986,7 +986,7 @@ class PipelinedRDD(RDD):
python_rdd = self.ctx._jvm.PythonRDD(self._prev_jrdd.rdd(),
bytearray(pickled_command), env, includes, self.preservesPartitioning,
self.ctx.pythonExec, broadcast_vars, self.ctx._javaAccumulator,
- class_manifest)
+ class_tag)
self._jrdd_val = python_rdd.asJavaRDD()
return self._jrdd_val