author | Li Zhihui <zhihui.li@intel.com> | 2014-10-24 13:01:36 -0700
---|---|---
committer | Andrew Or <andrew@databricks.com> | 2014-10-24 13:01:36 -0700
commit | 7aacb7bfad4ec73fd8f18555c72ef6962c14358f (patch) |
tree | 27d2484547f3ae665baf6d2ce67829c54ff96b74 /sbin/spark-config.sh |
parent | 6a40a76848203d7266c134a26191579138c76903 (diff) |
[SPARK-2713] Executors of same application in same host should only download files & jars once
If Spark launched multiple executors in one host for one application, every executor would download its dependent files and jars (if not using a local: URL) independently. This could result in significant latency. In my case, it resulted in 20 seconds of latency to download dependent jars (about 17M in size) when I launched 32 executors on every host (4 hosts total).
This patch caches downloaded files and jars so that executors on the same host share them, reducing network traffic and download latency. In my case, the latency was reduced from 20 seconds to less than 1 second.
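The mechanism described above can be sketched as follows. This is a simplified illustration, not Spark's actual implementation: the names (`CachedFetch`, `fetchCached`) and the exact cache layout are hypothetical, but it follows the ideas visible in the commit log: a cache file named from `url.hashCode` plus a timestamp, a `FileLock` so that only the first executor on a host performs the download, and a final copy from the host-local cache into the executor's working directory.

```scala
import java.io.{File, RandomAccessFile}
import java.nio.file.{Files, StandardCopyOption}

// Hypothetical sketch of the host-local download cache idea in this patch.
object CachedFetch {
  def fetchCached(url: String, timestamp: Long, cacheDir: File, dest: File)
                 (download: (String, File) => Unit): Unit = {
    // Cache file name derived from url.hashCode + timestamp, as in the patch.
    val cachedFile = new File(cacheDir, s"${url.hashCode}${timestamp}_cache")
    val lockFile   = new File(cacheDir, s"${url.hashCode}${timestamp}_lock")
    val raf  = new RandomAccessFile(lockFile, "rw")
    val lock = raf.getChannel.lock() // blocks until the lock is acquired
    try {
      if (!cachedFile.exists()) {
        // Only the first executor on this host actually downloads.
        download(url, cachedFile)
      }
    } finally {
      // Release the lock before copying, so other executors can proceed.
      lock.release()
      raf.close()
    }
    // Copy from the shared host-local cache into this executor's directory.
    Files.copy(cachedFile.toPath, dest.toPath,
               StandardCopyOption.REPLACE_EXISTING)
  }
}
```

With this scheme, 32 executors on one host trigger a single download: the first to acquire the lock fetches the jar, and the rest find the cache file already present and only pay the cost of a local copy.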
Author: Li Zhihui <zhihui.li@intel.com>
Author: li-zhihui <zhihui.li@intel.com>
Closes #1616 from li-zhihui/cachefiles and squashes the following commits:
36940df [Li Zhihui] Close cache for local mode
935fed6 [Li Zhihui] Clean code.
f9330d4 [Li Zhihui] Clean code again
7050d46 [Li Zhihui] Clean code
074a422 [Li Zhihui] Fix: deal with spark.files.overwrite
03ed3a8 [li-zhihui] rename cache file name as XXXXXXXXX_cache
2766055 [li-zhihui] Use url.hashCode + timestamp as cachedFileName
76a7b66 [Li Zhihui] Clean code & use application work directory as cache directory
3510eb0 [Li Zhihui] Keep fetchFile private
2ffd742 [Li Zhihui] add comment for FileLock
e0ebd48 [Li Zhihui] Try and finally lock.release
7fb7c0b [Li Zhihui] Release lock before copy files
6b997bf [Li Zhihui] Executors of same application in same host should only download files & jars once
Diffstat (limited to 'sbin/spark-config.sh')
0 files changed, 0 insertions, 0 deletions