[SPARK-3796] Create external service which can serve shuffle files

This patch introduces the tooling necessary to construct an external shuffle service which is independent of Spark executors, and then use this service inside Spark. An example (just for the sake of this PR) of the service creation can be found in Worker, and the service itself is used by plugging in the StandaloneShuffleClient as Spark's ShuffleClient (set up in BlockManager).

This PR continues the work from #2753, which extracted the transport layer of Spark's block transfer into an independent package within Spark. A new package was created which contains the Spark business logic necessary to retrieve the actual shuffle data, and which is completely independent of the transport layer introduced in the previous patch. Like the transport layer, this package must not depend on Spark, as we anticipate plugging this service in as a lightweight process within, say, the YARN NodeManager, and do not wish to include Spark's dependencies (including Scala itself).

There are several outstanding tasks which must be completed before this PR can be merged:
- [x] Complete unit testing of network/shuffle package.
- [x] Performance and correctness testing on a real cluster.
- [x] Remove example service instantiation from Worker.scala.

There are even more shortcomings of this PR which should be addressed in followup patches:
- Don't use Java serializer for RPC layer! It is not cross-version compatible.
- Handle shuffle file cleanup for dead executors once the application terminates or the ContextCleaner triggers.
- Documentation of the feature in the Spark docs.
- Improve behavior if the shuffle service itself goes down (right now we don't blacklist it, and new executors cannot spawn on that machine).
- SSL and SASL integration.
- Nice to have: Handle shuffle file consolidation (this would require changes to Spark's implementation).

Author: Aaron Davidson <aaron@databricks.com>

Closes #3001 from aarondav/shuffle-service and squashes the following commits:

4d1f8c1 [Aaron Davidson] Remove changes to Worker
705748f [Aaron Davidson] Rename Standalone* to External*
fd3928b [Aaron Davidson] Do not unregister executor outputs unduly
9883918 [Aaron Davidson] Make suggested build changes
3d62679 [Aaron Davidson] Add Spark integration test
7fe51d5 [Aaron Davidson] Fix SBT integration
56caa50 [Aaron Davidson] Address comments
c8d1ac3 [Aaron Davidson] Add unit tests
2f70c0c [Aaron Davidson] Fix unit tests
5483e96 [Aaron Davidson] Fix unit tests
46a70bf [Aaron Davidson] Whoops, bracket
5ea4df6 [Aaron Davidson] [SPARK-3796] Create external service which can serve shuffle files
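
The commit message describes a service that serves shuffle files independently of the executors that wrote them. The following is a toy Python model of that idea only, not the code in network/shuffle: the class names, the registration step, and the one-file-per-block layout are all assumptions made for illustration. It shows a block remaining fetchable after the object that wrote it is gone.

```python
import os
import tempfile


class ToyShuffleService:
    """Hypothetical stand-in for an external shuffle service (not Spark's API)."""

    def __init__(self):
        # (app_id, exec_id) -> directory holding that executor's shuffle files
        self._executor_dirs = {}

    def register_executor(self, app_id, exec_id, local_dir):
        # Assumed step: the executor tells the service where its files live.
        self._executor_dirs[(app_id, exec_id)] = local_dir

    def get_block(self, app_id, exec_id, block_id):
        # Read the block straight from disk; no executor process is involved.
        local_dir = self._executor_dirs.get((app_id, exec_id))
        if local_dir is None:
            raise KeyError(f"executor {exec_id} of app {app_id} is not registered")
        with open(os.path.join(local_dir, block_id), "rb") as f:
            return f.read()


class ToyExecutor:
    """Hypothetical executor that writes one file per shuffle block."""

    def __init__(self, app_id, exec_id, service):
        self.local_dir = tempfile.mkdtemp(prefix=f"toy-shuffle-{exec_id}-")
        service.register_executor(app_id, exec_id, self.local_dir)

    def write_shuffle_block(self, block_id, data):
        with open(os.path.join(self.local_dir, block_id), "wb") as f:
            f.write(data)


service = ToyShuffleService()
executor = ToyExecutor("app-1", "exec-0", service)
executor.write_shuffle_block("shuffle_0_0_0", b"map output")  # illustrative block id

del executor  # the "executor" goes away; its files stay on disk

fetched = service.get_block("app-1", "exec-0", "shuffle_0_0_0")
```

Because the service keeps only a mapping from executor identity to a directory, it needs nothing from the executor at fetch time. That is the property that would let such a service run inside a separate process such as the YARN NodeManager.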
author: Aaron Davidson <aaron@databricks.com> 2014-11-01 14:37:45 -0700
committer: Reynold Xin <rxin@databricks.com> 2014-11-01 14:37:45 -0700
commit: f55218aeb1e9d638df6229b36a59a15ce5363482 (patch)
tree: 84e4454c224b3f14b7fcbe8259c90d06b6fd969b /pom.xml
parent: 1d4f3552037cb667971bea2e5078d8b3ce6c2eae (diff)
download: spark-f55218aeb1e9d638df6229b36a59a15ce5363482.tar.gz
spark-f55218aeb1e9d638df6229b36a59a15ce5363482.tar.bz2
spark-f55218aeb1e9d638df6229b36a59a15ce5363482.zip
1 files changed, 1 insertions, 0 deletions
diff --git a/pom.xml b/pom.xml
index 4c7806c416..61a508a0ea 100644
--- a/pom.xml
+++ b/pom.xml
@@ -92,6 +92,7 @@
     <module>mllib</module>
     <module>tools</module>
     <module>network/common</module>
+    <module>network/shuffle</module>
     <module>streaming</module>
     <module>sql/catalyst</module>
     <module>sql/core</module>