author | hongshen <shenh062326@126.com> | 2016-08-12 09:58:02 +0100 |
---|---|---|
committer | Sean Owen <sowen@cloudera.com> | 2016-08-12 09:58:02 +0100 |
commit | 993923c8f5ca719daf905285738b7fdcaf944d8c (patch) | |
tree | c215c4beeda67201b8ded63c4e669d1718a6fd10 /docs/streaming-programming-guide.md | |
parent | 00e103a6edd1a1f001a94d41dd1f7acc40a1e30f (diff) | |
[SPARK-16985] Change dataFormat from yyyyMMddHHmm to yyyyMMddHHmmss
## What changes were proposed in this pull request?
In our cluster, the output of a SQL statement is sometimes overwritten. This happens when I submit several SQL statements that all insert into the same table and each one takes less than a minute. Here are the details:
1. sql1, 11:03, insert into table.
2. sql2, 11:04:11, insert into table.
3. sql3, 11:04:48, insert into table.
4. sql4, 11:05, insert into table.
5. sql5, 11:06, insert into table.
The output file of sql3 overwrites the output file of sql2. Here is the log:
```
16/05/04 11:04:11 INFO hive.SparkHiveHadoopWriter: XXfinalPath=hdfs://tl-sng-gdt-nn-tdw.tencent-distribute.com:54310/tmp/assorz/tdw-tdwadmin/20160504/04559505496526517_-1_1204544348/10000/_tmp.p_20160428/attempt_201605041104_0001_m_000000_1
16/05/04 11:04:48 INFO hive.SparkHiveHadoopWriter: XXfinalPath=hdfs://tl-sng-gdt-nn-tdw.tencent-distribute.com:54310/tmp/assorz/tdw-tdwadmin/20160504/04559505496526517_-1_212180468/10000/_tmp.p_20160428/attempt_201605041104_0001_m_000000_1
```
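The reason is that the output file name is built with SimpleDateFormat("yyyyMMddHHmm"), so if two SQL statements insert into the same table within the same minute, the output of the first is overwritten by the second. In the log above, both writers end up with the same attempt ID, attempt_201605041104_0001_m_000000_1. I think we should change the dateFormat to "yyyyMMddHHmmss"; in our cluster, a SQL statement cannot finish within one second.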
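The collision can be reproduced outside Spark with a minimal Java sketch. The class `JobIdCollisionDemo` and the helper `jobTrackerId` are hypothetical names used only for illustration, not Spark's actual code. The two timestamps are the sql2 and sql3 start times from the log above, and `Locale.US` is an added assumption to keep the output deterministic.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;

public class JobIdCollisionDemo {
    // Hypothetical helper: formats a job start time with the given pattern.
    // Locale.US is an assumption added here so digits are always ASCII.
    static String jobTrackerId(String pattern, Date startTime) {
        return new SimpleDateFormat(pattern, Locale.US).format(startTime);
    }

    public static void main(String[] args) {
        // Start times of sql2 (11:04:11) and sql3 (11:04:48) on 2016-05-04.
        Date sql2 = new GregorianCalendar(2016, Calendar.MAY, 4, 11, 4, 11).getTime();
        Date sql3 = new GregorianCalendar(2016, Calendar.MAY, 4, 11, 4, 48).getTime();

        // Minute resolution: both statements get the same identifier.
        String oldId2 = jobTrackerId("yyyyMMddHHmm", sql2);
        String oldId3 = jobTrackerId("yyyyMMddHHmm", sql3);
        System.out.println(oldId2 + " vs " + oldId3
                + " -> collide: " + oldId2.equals(oldId3));

        // Second resolution: the identifiers differ.
        String newId2 = jobTrackerId("yyyyMMddHHmmss", sql2);
        String newId3 = jobTrackerId("yyyyMMddHHmmss", sql3);
        System.out.println(newId2 + " vs " + newId3
                + " -> collide: " + newId2.equals(newId3));
    }
}
```

With the minute-resolution pattern, both start times format to `201605041104`, which matches the identical attempt IDs in the log. With seconds included they become `20160504110411` and `20160504110448`. Two statements that start within the same second would still collide, which is the case the description above considers not to occur in this cluster.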
## How was this patch tested?
(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)
(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)
Author: hongshen <shenh062326@126.com>
Closes #14574 from shenh062326/SPARK-16985.