| author | Dongjoon Hyun <dongjoon@apache.org> | 2016-08-16 10:01:30 -0700 |
|---|---|---|
| committer | Davies Liu <davies.liu@gmail.com> | 2016-08-16 10:01:30 -0700 |
| commit | 12a89e55cbd630fa2986da984e066cd07d3bf1f7 | |
| tree | 3bfcd749953b0e17b25374b971f3b44bf7dc175e /R/pkg/inst | |
| parent | 6f0988b1293a5e5ee3620b2727ed969155d7ac0d | |
[SPARK-17035] [SQL] [PYSPARK] Improve Timestamp not to lose precision for all cases
## What changes were proposed in this pull request?
`PySpark` loses `microsecond` precision in some corner cases when converting a `Timestamp` into a `Long`. For example, `datetime.max` should be converted to a value whose last 6 digits are '999999'. This PR improves the conversion logic so that precision is not lost in any case.
**Corner case**
```python
>>> datetime.datetime.max
datetime.datetime(9999, 12, 31, 23, 59, 59, 999999)
```
**Before**
```python
>>> from datetime import datetime
>>> from pyspark.sql import Row
>>> from pyspark.sql.types import StructType, StructField, TimestampType
>>> schema = StructType([StructField("dt", TimestampType(), False)])
>>> [schema.toInternal(row) for row in [{"dt": datetime.max}]]
[(253402329600000000,)]
```
**After**
```python
>>> [schema.toInternal(row) for row in [{"dt": datetime.max}]]
[(253402329599999999,)]
```
## How was this patch tested?
Pass the Jenkins test with a new test case.
Author: Dongjoon Hyun <dongjoon@apache.org>
Closes #14631 from dongjoon-hyun/SPARK-17035.