path: root/python/pyspark/sql/functions.py
author: hyukjinkwon <gurwls223@gmail.com> 2016-02-12 11:54:58 -0800
committer: Reynold Xin <rxin@databricks.com> 2016-02-12 11:54:58 -0800
commitac7d6af1cafc6b159d1df6cf349bb0c7ffca01cd (patch)
tree8b8a0d66c0c6c48b6626b7654e848f6ced61e04d /python/pyspark/sql/functions.py
parentc4d5ad80c8091c961646a82e85ecbc335b8ffe2d (diff)
[SPARK-13260][SQL] count(*) does not work with CSV data source
https://issues.apache.org/jira/browse/SPARK-13260

This is a quick fix for `count(*)`. Currently, when `requiredColumns` is empty, the CSV data source returns `sqlContext.sparkContext.emptyRDD[Row]`, which loses the row count. Just like the JSON data source, this PR lets the CSV data source count the rows without parsing each set of tokens.

Author: hyukjinkwon <gurwls223@gmail.com>

Closes #11169 from HyukjinKwon/SPARK-13260.
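The pattern the commit message describes can be sketched outside Spark. The function name and shape below are hypothetical, not the actual Spark source: when no columns are required (as with `count(*)`), emit one empty row per input record instead of returning an empty collection, so the count survives without tokenizing any line.

```python
# Hypothetical sketch of the SPARK-13260 fix pattern (not the actual
# Spark implementation): preserve the row count when no columns are
# projected, and only parse tokens when columns are actually requested.
def csv_rows(required_columns, lines):
    if not required_columns:
        # Before the fix, the CSV source effectively returned an empty
        # RDD here, so count(*) saw zero rows. Returning one empty row
        # per record keeps the count correct without parsing.
        return [() for _ in lines]
    # Normal path: tokenize each record.
    return [tuple(line.split(",")) for line in lines]

lines = ["a,1", "b,2", "c,3"]
print(len(csv_rows([], lines)))           # 3 rows, so count(*) works
print(csv_rows(["name", "value"], lines))  # parsed tuples
```

This mirrors the JSON data source's behavior cited in the commit message: skipping per-record parsing is purely an optimization, and the projected-empty case must still yield one row per record.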
Diffstat (limited to 'python/pyspark/sql/functions.py')
0 files changed, 0 insertions, 0 deletions