path: root/python/pyspark/ml/clustering.py
author    sureshthalamati <suresh.thalamati@gmail.com>  2016-03-01 17:34:21 -0800
committer Reynold Xin <rxin@databricks.com>  2016-03-01 17:34:21 -0800
commit    e42724b12b976b3276accc1132f446fa67f7f981 (patch)
tree      67d5384e145cd8f71cba208828f39e190cf191b0 /python/pyspark/ml/clustering.py
parent    a640c5b4fbd653919a5897a7b11f16328f2094eb (diff)
[SPARK-13167][SQL] Include rows with null values for partition column when reading from JDBC datasources.
Rows with null values in the partition column are not included in the results, because none of the partition WHERE clauses specifies an IS NULL predicate on the partition column. This fix adds an IS NULL predicate on the partition column to the first JDBC partition's WHERE clause.

Example: JDBCPartition(THEID < 1 or THEID is null, 0), JDBCPartition(THEID >= 1 AND THEID < 2, 1), JDBCPartition(THEID >= 2, 2)

Author: sureshthalamati <suresh.thalamati@gmail.com>

Closes #11063 from sureshthalamati/nullable_jdbc_part_col_spark-13167.
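The partitioning scheme above can be illustrated with a small sketch. This is not Spark's actual implementation (the real fix lives in the Scala JDBC datasource code); it is a hypothetical Python function, with an assumed name `column_partition` and integer-only bounds, that generates one WHERE clause per partition and folds NULL rows into the first partition as the commit describes.

```python
# Hedged sketch (not Spark's real code): build JDBC partition WHERE clauses
# so that rows whose partition column is NULL land in the first partition,
# mirroring the behavior described in SPARK-13167.
def column_partition(column, lower_bound, upper_bound, num_partitions):
    """Return a list of WHERE-clause strings, one per partition.

    Assumes integer bounds and an evenly divisible range for simplicity.
    """
    stride = (upper_bound - lower_bound) // num_partitions
    clauses = []
    current = lower_bound + stride
    for i in range(num_partitions):
        if i == 0:
            # First partition additionally picks up NULL values.
            clauses.append(f"{column} < {current} OR {column} IS NULL")
        elif i == num_partitions - 1:
            # Last partition is unbounded above.
            clauses.append(f"{column} >= {current - stride}")
        else:
            clauses.append(f"{column} >= {current - stride} AND {column} < {current}")
        current += stride
    return clauses
```

With `column_partition("THEID", 0, 3, 3)` this reproduces the three example partitions from the commit message: `THEID < 1 OR THEID IS NULL`, `THEID >= 1 AND THEID < 2`, and `THEID >= 2`.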
Diffstat (limited to 'python/pyspark/ml/clustering.py')
0 files changed, 0 insertions, 0 deletions