author | jiangxingbo <jiangxb1987@gmail.com> | 2016-11-08 09:41:01 -0800 |
---|---|---|
committer | Reynold Xin <rxin@databricks.com> | 2016-11-08 09:41:01 -0800 |
commit | 9c419698fe110a805570031cac3387a51957d9d1 (patch) | |
tree | 847284e6313c49aedd0a864d3931cacaf92ea425 /yarn/src | |
parent | 73feaa30ebfb62c81c7ce2c60ce2163611dd8852 (diff) | |
[SPARK-18191][CORE] Port RDD API to use commit protocol
## What changes were proposed in this pull request?
This PR ports the RDD API to use the commit protocol. The changes made here:
1. Add a new internal helper class, `SparkNewHadoopWriter`, that saves an RDD using a Hadoop OutputFormat. It is similar to `SparkHadoopWriter` but uses the commit protocol, and it supports the newer `mapreduce` API rather than the old `mapred` API supported by `SparkHadoopWriter`;
2. Rewrite the `PairRDDFunctions.saveAsNewAPIHadoopDataset` function so that it now uses the commit protocol.
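For readers unfamiliar with the idea, the following is a minimal, self-contained sketch of what a commit protocol for file output generally looks like: each task writes to a staging location, and its output becomes visible only when the task is committed. The names `ToyCommitProtocol`, `ToyCommitDemo`, and every method here are hypothetical and chosen for illustration; this is not the code added by this PR and not Spark's actual API.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path, StandardCopyOption}

// Hypothetical, simplified commit protocol for illustration only.
// It is NOT Spark's implementation and has no Spark or Hadoop dependency.
class ToyCommitProtocol(outputDir: Path) {
  private val stagingDir: Path = outputDir.resolve("_temporary")

  // Job setup creates the staging area that tasks write into.
  def setupJob(): Unit = Files.createDirectories(stagingDir)

  // Each task writes to its own staging file, outside the final output listing.
  def newTaskTempFile(taskId: Int): Path = stagingDir.resolve(s"part-$taskId.tmp")

  // Committing a task moves its staged file to its final name.
  def commitTask(taskId: Int): Unit =
    Files.move(
      newTaskTempFile(taskId),
      outputDir.resolve(f"part-$taskId%05d"),
      StandardCopyOption.REPLACE_EXISTING)

  // Aborting a task discards whatever it staged.
  def abortTask(taskId: Int): Unit = Files.deleteIfExists(newTaskTempFile(taskId))

  // Committing the job cleans up the staging area and writes a success marker.
  def commitJob(): Unit = {
    val leftovers = stagingDir.toFile.listFiles
    if (leftovers != null) leftovers.foreach(_.delete())
    Files.deleteIfExists(stagingDir)
    Files.createFile(outputDir.resolve("_SUCCESS"))
  }
}

object ToyCommitDemo {
  // Runs two tasks through the protocol and returns the final file names, sorted.
  def run(): Seq[String] = {
    val out = Files.createTempDirectory("toy-commit")
    val protocol = new ToyCommitProtocol(out)
    protocol.setupJob()
    for (taskId <- 0 until 2) {
      val record = s"record from task $taskId\n".getBytes(StandardCharsets.UTF_8)
      Files.write(protocol.newTaskTempFile(taskId), record)
      protocol.commitTask(taskId)
    }
    protocol.commitJob()
    out.toFile.list().sorted.toSeq
  }

  def main(args: Array[String]): Unit = println(run().mkString(", "))
}
```

The point of the staging step is that a failed or aborted task leaves nothing in the final output directory, so a reader never sees partial results from an uncommitted task.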
## How was this patch tested?
Existing test cases.
Author: jiangxingbo <jiangxb1987@gmail.com>
Closes #15769 from jiangxb1987/rdd-commit.
Diffstat (limited to 'yarn/src')
0 files changed, 0 insertions, 0 deletions