path: root/python/pyspark/ml/clustering.py
author    sureshthalamati <suresh.thalamati@gmail.com>  2016-03-01 17:34:21 -0800
committer Reynold Xin <rxin@databricks.com>  2016-03-01 17:34:21 -0800
commit    e42724b12b976b3276accc1132f446fa67f7f981 (patch)
tree      67d5384e145cd8f71cba208828f39e190cf191b0 /python/pyspark/ml/clustering.py
parent    a640c5b4fbd653919a5897a7b11f16328f2094eb (diff)
[SPARK-13167][SQL] Include rows with null values for partition column when reading from JDBC datasources.
Rows with null values in the partition column are not included in the results, because none of the partition WHERE clauses specifies an IS NULL predicate on the partition column. This fix adds an IS NULL predicate on the partition column to the first JDBC partition's WHERE clause.

Example: JDBCPartition(THEID < 1 or THEID is null, 0), JDBCPartition(THEID >= 1 AND THEID < 2, 1), JDBCPartition(THEID >= 2, 2)

Author: sureshthalamati <suresh.thalamati@gmail.com>

Closes #11063 from sureshthalamati/nullable_jdbc_part_col_spark-13167.
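The partitioning scheme above can be illustrated with a small sketch. This is not Spark's actual implementation (the real fix lives in the Scala JDBC datasource code); it is a hypothetical Python function, with an assumed name `column_partition` and integer-only bounds, that generates one WHERE clause per partition and folds NULL rows into the first partition as the commit describes.

```python
# Hedged sketch (not Spark's real code): build JDBC partition WHERE clauses
# so that rows whose partition column is NULL land in the first partition,
# mirroring the behavior described in SPARK-13167.
def column_partition(column, lower_bound, upper_bound, num_partitions):
    """Return a list of WHERE-clause strings, one per partition.

    Assumes integer bounds and an evenly divisible range for simplicity.
    """
    stride = (upper_bound - lower_bound) // num_partitions
    clauses = []
    current = lower_bound + stride
    for i in range(num_partitions):
        if i == 0:
            # First partition additionally picks up NULL values.
            clauses.append(f"{column} < {current} OR {column} IS NULL")
        elif i == num_partitions - 1:
            # Last partition is unbounded above.
            clauses.append(f"{column} >= {current - stride}")
        else:
            clauses.append(f"{column} >= {current - stride} AND {column} < {current}")
        current += stride
    return clauses
```

With `column_partition("THEID", 0, 3, 3)` this reproduces the three example partitions from the commit message: `THEID < 1 OR THEID IS NULL`, `THEID >= 1 AND THEID < 2`, and `THEID >= 2`.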
Diffstat (limited to 'python/pyspark/ml/clustering.py')
0 files changed, 0 insertions, 0 deletions