author | Sandy Ryza <sandy@cloudera.com> | 2015-02-05 10:15:55 -0800 |
---|---|---|
committer | Josh Rosen <joshrosen@databricks.com> | 2015-02-05 10:15:55 -0800 |
commit | c4b1108c3f9658adebbdf8508d325528c3206f16 (patch) | |
tree | a68e76a3f0ca50d2599d7047ed33537d1e150a51 /build/mvn | |
parent | 6580929fa029c4010dd4170de9be9f18516f8e5a (diff) | |
SPARK-4687. Add a recursive option to the addFile API
This adds a recursive option to the addFile API to satisfy Hive's needs. It only allows specifying HDFS dirs that will be copied down on every executor.
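As a rough illustration of what an opt-in recursive flag means for callers, here is a minimal sketch on the local filesystem. It is not Spark's code: `files_to_ship` is a hypothetical helper, and it only models the idea that a directory is rejected unless the caller explicitly asks for recursion.

```python
import os


def files_to_ship(path, recursive=False):
    """Return the sorted list of files that adding `path` would cover.

    Hypothetical helper: models only the opt-in flag, on the local filesystem.
    """
    if os.path.isdir(path):
        if not recursive:
            # Directories are rejected unless the caller opts in explicitly.
            raise ValueError(f"{path} is a directory and recursive is not set")
        found = []
        for root, _dirs, names in os.walk(path):
            for name in names:
                found.append(os.path.join(root, name))
        return sorted(found)
    if not os.path.isfile(path):
        raise FileNotFoundError(path)
    # A plain file is accepted with or without the flag.
    return [path]
```

Keeping the flag off by default means existing single-file callers see no change in behavior, while passing a directory by accident fails loudly instead of silently shipping a whole tree.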
There are a couple of outstanding questions.
* Should we allow specifying local dirs as well? The best way to do this would probably be to archive them. The drawback is that it would require a fair bit of code that I don't know of any current use cases for.
* The addFiles implementation has a caching component that I don't entirely understand. What events are we caching between? AFAICT it's users calling addFile on the same file in the same app at different times? Do we want/need to add something similar for addDirectory?
* The addFiles implementation will check to see if an added file already exists and has the same contents. I imagine we want the same behavior, so I'm planning to add this unless people think otherwise.
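The "already exists and has the same contents" check from the last question could be extended to directories roughly as follows. This is a sketch under assumptions, using only the Python standard library: `copy_if_absent_or_same` and `copy_tree_checked` are hypothetical names, and the choice to raise an error on a mismatch is an assumption rather than something this message specifies.

```python
import filecmp
import os
import shutil


def copy_if_absent_or_same(src, dest):
    """Copy src to dest; tolerate an existing dest only if contents match.

    Returns True if a copy happened, False if dest already matched.
    """
    if os.path.exists(dest):
        # shallow=False forces a content comparison, not just a stat check.
        if filecmp.cmp(src, dest, shallow=False):
            return False
        raise FileExistsError(f"{dest} exists with different contents than {src}")
    os.makedirs(os.path.dirname(dest) or ".", exist_ok=True)
    shutil.copyfile(src, dest)
    return True


def copy_tree_checked(src_dir, dest_dir):
    """Apply the same check to every file under src_dir; return files copied."""
    copied = []
    for root, _dirs, names in os.walk(src_dir):
        for name in names:
            src = os.path.join(root, name)
            rel = os.path.relpath(src, src_dir)
            if copy_if_absent_or_same(src, os.path.join(dest_dir, rel)):
                copied.append(rel)
    return sorted(copied)
```

Under these assumptions, adding the same directory twice is a no-op the second time, and a stale or conflicting copy is reported rather than silently overwritten.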
I plan to add some tests if people are OK with the approach.
Author: Sandy Ryza <sandy@cloudera.com>
Closes #3670 from sryza/sandy-spark-4687 and squashes the following commits:
f9fc77f [Sandy Ryza] Josh's comments
70cd24d [Sandy Ryza] Add another test
13da824 [Sandy Ryza] Revert executor changes
38bf94d [Sandy Ryza] Marcelo's comments
ca83849 [Sandy Ryza] Add addFile test
1941be3 [Sandy Ryza] Fix test and avoid HTTP server in local mode
31f15a9 [Sandy Ryza] Use cache recursively and fix some compile errors
0239c3d [Sandy Ryza] Change addDirectory to addFile with recursive
46fe70a [Sandy Ryza] SPARK-4687. Add a addDirectory API