author | jeanlyn <jeanlyn92@gmail.com> | 2015-06-21 00:13:40 -0700 |
---|---|---|
committer | Cheng Lian <lian@databricks.com> | 2015-06-21 00:13:40 -0700 |
commit | a1e3649c8775d71ca78796b6544284e942ac1331 (patch) | |
tree | 041bb25b0bb2aecee5447ad62bd7b98defdb1c5f /docker | |
parent | 41ab2853f41de2abc415358b69671f37a0653533 (diff) | |
[SPARK-8379] [SQL] avoid speculative tasks write to the same file
Issue link: [SPARK-8379](https://issues.apache.org/jira/browse/SPARK-8379)
Currently, when we insert data into dynamic partitions with speculative tasks enabled, we get the following exception:
```
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException):
Lease mismatch on /tmp/hive-jeanlyn/hive_2015-06-15_15-20-44_734_8801220787219172413-1/-ext-10000/ds=2015-06-15/type=2/part-00301.lzo
owned by DFSClient_attempt_201506031520_0011_m_000189_0_-1513487243_53
but is accessed by DFSClient_attempt_201506031520_0011_m_000042_0_-1275047721_57
```
This PR writes the data to a temporary directory when using dynamic partitioning, so that speculative tasks do not write to the same file.
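The idea above can be illustrated with a minimal, standalone sketch. This is not the code from this patch and it does not use the Hadoop or Spark APIs; the directory layout (`_temporary/<attempt id>/...`) and the helper names `attempt_output_path`, `write_attempt`, and `commit_attempt` are assumptions made only for illustration. It shows that two attempts of the same task get distinct paths for the same partition file, and that only the committed attempt is moved to the final location.

```python
import os
import tempfile


def attempt_output_path(output_dir, partition, attempt_id, file_name):
    """Return a path private to one task attempt (hypothetical layout)."""
    return os.path.join(output_dir, "_temporary", attempt_id, partition, file_name)


def write_attempt(output_dir, partition, attempt_id, file_name, rows):
    """Write rows under the attempt's own temporary directory."""
    path = attempt_output_path(output_dir, partition, attempt_id, file_name)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        for row in rows:
            f.write(row + "\n")
    return path


def commit_attempt(output_dir, partition, attempt_id, file_name):
    """Move the winning attempt's file to its final partition directory."""
    src = attempt_output_path(output_dir, partition, attempt_id, file_name)
    dst = os.path.join(output_dir, partition, file_name)
    os.makedirs(os.path.dirname(dst), exist_ok=True)
    os.replace(src, dst)
    return dst


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    partition = os.path.join("ds=2015-06-15", "type=2")
    # The original attempt and its speculative copy write to different files.
    first = write_attempt(root, partition, "attempt_0", "part-00301", ["a", "b"])
    second = write_attempt(root, partition, "attempt_1", "part-00301", ["a", "b"])
    print(first != second)
    # Only one attempt is committed to the final location.
    print(commit_attempt(root, partition, "attempt_0", "part-00301"))
```

Because each attempt owns its file until commit, no two writers ever hold the same path open, which is the condition that triggers the lease mismatch shown above.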
Author: jeanlyn <jeanlyn92@gmail.com>
Closes #6833 from jeanlyn/speculation and squashes the following commits:
64bbfab [jeanlyn] use FileOutputFormat.getTaskOutputPath to get the path
8860af0 [jeanlyn] remove the never using code
e19a3bd [jeanlyn] avoid speculative tasks write same file