author | sureshthalamati <suresh.thalamati@gmail.com> | 2016-03-01 17:34:21 -0800 |
---|---|---|
committer | Reynold Xin <rxin@databricks.com> | 2016-03-01 17:34:21 -0800 |
commit | e42724b12b976b3276accc1132f446fa67f7f981 (patch) | |
tree | 67d5384e145cd8f71cba208828f39e190cf191b0 /python/pyspark/ml/classification.py | |
parent | a640c5b4fbd653919a5897a7b11f16328f2094eb (diff) | |
[SPARK-13167][SQL] Include rows with null values for partition column when reading from JDBC datasources.
Rows with null values in the partition column are not included in the results because none of the partition
WHERE clauses specifies an IS NULL predicate on the partition column. This fix adds an IS NULL predicate on the partition column to the first JDBC partition's WHERE clause.
Example:
JDBCPartition(THEID < 1 or THEID is null, 0),JDBCPartition(THEID >= 1 AND THEID < 2,1),
JDBCPartition(THEID >= 2, 2)
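The example above can be reproduced with a minimal sketch of range partitioning that attaches the null check to the first partition. This is an illustration only, not the code changed by this commit: the function name `partition_where_clauses`, the simplified stride calculation, and the bounds (lower bound 0, upper bound 3, 3 partitions) are assumptions chosen to match the example.

```python
def partition_where_clauses(column, lower, upper, num_partitions):
    """Build one WHERE clause per partition (hypothetical helper).

    The first partition is open below and also picks up NULLs;
    the last partition is open above.
    """
    if num_partitions <= 1:
        # A single partition needs no filter, so NULL rows are included anyway.
        return [None]

    # Simplified stride; assumed here only for illustration.
    stride = (upper - lower) // num_partitions
    clauses = []
    current = lower
    for i in range(num_partitions):
        lower_pred = f"{column} >= {current}" if i != 0 else None
        current += stride
        upper_pred = f"{column} < {current}" if i != num_partitions - 1 else None

        if upper_pred is None:
            # Last partition: no upper bound.
            clause = lower_pred
        elif lower_pred is None:
            # First partition: no lower bound, plus the null predicate.
            clause = f"{upper_pred} or {column} is null"
        else:
            clause = f"{lower_pred} AND {upper_pred}"
        clauses.append(clause)
    return clauses


clauses = partition_where_clauses("THEID", 0, 3, 3)
for index, clause in enumerate(clauses):
    print(f"JDBCPartition({clause}, {index})")
```

Because a comparison such as `THEID < 1` evaluates to unknown for a NULL value, rows with a NULL partition column match none of the range predicates; adding `is null` to exactly one partition returns them once without duplicating them.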
Author: sureshthalamati <suresh.thalamati@gmail.com>
Closes #11063 from sureshthalamati/nullable_jdbc_part_col_spark-13167.
Diffstat (limited to 'python/pyspark/ml/classification.py')
0 files changed, 0 insertions, 0 deletions