[SPARK-3595] Respect configured OutputCommitters when calling saveAsHadoopFile - spark

author	Ian Hummel <ian@themodernlife.net>	2014-09-21 13:04:36 -0700
committer	Patrick Wendell <pwendell@gmail.com>	2014-09-21 13:04:36 -0700
commit	a0454efe21e5c7ffe1b9bb7b18021a5580952e69 (patch)
tree	9c7df79201b003b81e0c54cb07283a69088860dd /.gitignore
parent	d112a6c79dee7b5d8459696f97d329190e8d09a5 (diff)

[SPARK-3595] Respect configured OutputCommitters when calling saveAsHadoopFile

Addresses the issue in https://issues.apache.org/jira/browse/SPARK-3595, namely saveAsHadoopFile hardcoding the OutputCommitter. This is not ideal when running Spark jobs that write to S3, especially when running them from an EMR cluster where the default OutputCommitter is a DirectOutputCommitter.

Author: Ian Hummel <ian@themodernlife.net>

Closes #2450 from themodernlife/spark-3595 and squashes the following commits:

f37a0e5 [Ian Hummel] Update based on comments from pwendell
a11d9f3 [Ian Hummel] Fix formatting
4359664 [Ian Hummel] Add an example showing usage
8b6be94 [Ian Hummel] Add ability to specify OutputCommitter, especially useful when writing to an S3 bucket from an EMR cluster
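
For illustration only, here is a minimal sketch of how a caller might pass a configured OutputCommitter to saveAsHadoopFile once it is respected. This is not the code from this commit. ExampleOutputCommitter, the object name, the application name, and the use of args(0) as the output path are all hypothetical. The sketch assumes the Spark 1.x and Hadoop "mapred" APIs are on the classpath.

```scala
import org.apache.hadoop.mapred.{FileOutputCommitter, JobConf, TextOutputFormat}
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._ // brings pair-RDD functions into scope on Spark 1.x

// Hypothetical committer used only for illustration; substitute whatever
// OutputCommitter your environment provides (for example, on EMR).
class ExampleOutputCommitter extends FileOutputCommitter

object SaveWithConfiguredCommitter {
  def main(args: Array[String]): Unit = {
    // Assumption: local master and args(0) as the output path, for demonstration.
    val sparkConf = new SparkConf().setAppName("committer-example").setMaster("local[2]")
    val sc = new SparkContext(sparkConf)

    val pairs = sc.parallelize(Seq(("a", 1), ("b", 2)))

    // Start from the context's Hadoop configuration and set the committer explicitly.
    val jobConf = new JobConf(sc.hadoopConfiguration)
    jobConf.setOutputCommitter(classOf[ExampleOutputCommitter])

    // Pass the JobConf so the save uses the committer configured above.
    pairs.saveAsHadoopFile(
      args(0),
      classOf[String],
      classOf[Int],
      classOf[TextOutputFormat[String, Int]],
      jobConf)

    sc.stop()
  }
}
```

Before this change, the committer set on the JobConf would have been replaced by the hardcoded one described above, so a configuration like this would have had no effect.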

Diffstat (limited to '.gitignore')

0 files changed, 0 insertions, 0 deletions

