author: Tathagata Das <tathagata.das1565@gmail.com> 2016-05-06 15:04:16 -0700
committer: Yin Huai <yhuai@databricks.com> 2016-05-06 15:04:16 -0700
commit: f7b7ef41662d7d02fc4f834f3c6c4ee8802e949c
tree: 715c731c578d7ebe519ae3b0473882164a418a20 /streaming/src
parent: e20cd9f4ce977739ce80a2c39f8ebae5e53f72f6
[SPARK-14997][SQL] Fixed FileCatalog to return correct set of files when there is no partitioning scheme in the given paths
## What changes were proposed in this pull request?

Let's say there are JSON files in the following directory structure:

```
xyz/file0.json
xyz/subdir1/file1.json
xyz/subdir2/file2.json
xyz/subdir1/subsubdir1/file3.json
```

`sqlContext.read.json("xyz")` should read only file0.json, according to the behavior in Spark 1.6.1. However, in current master, all 4 files are read.

The fix is to make FileCatalog return only the child files of the given path if no partitioning is detected (instead of the full recursive list of files).

Closes #12774

## How was this patch tested?

unit tests

Author: Tathagata Das <tathagata.das1565@gmail.com>

Closes #12856 from tdas/SPARK-14997.
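
The listing rule described above can be illustrated with a small standalone sketch. This is not Spark's FileCatalog code. It is a simplified Python model that uses only the standard library, and the function names (`has_partition_dirs`, `list_input_files`, `make_tree`) are hypothetical. It also assumes that "partitioning detected" simply means some subdirectory is named like `key=value`, which is much cruder than Spark's actual partition discovery.

```python
import os
import tempfile


def has_partition_dirs(root):
    """Return True if any directory under root is named like `key=value`.

    Assumption: this is a crude stand-in for real partition discovery.
    """
    for _dirpath, dirnames, _filenames in os.walk(root):
        if any("=" in d for d in dirnames):
            return True
    return False


def list_input_files(root):
    """Hypothetical model of the rule in the commit message (not Spark code)."""
    if has_partition_dirs(root):
        # Partitioned layout: collect files from every level, relative to root.
        return sorted(
            os.path.relpath(os.path.join(dirpath, name), root)
            for dirpath, _dirnames, filenames in os.walk(root)
            for name in filenames
        )
    # No partitioning detected: return only the direct child files of root.
    return sorted(
        name
        for name in os.listdir(root)
        if os.path.isfile(os.path.join(root, name))
    )


def make_tree(root, relative_paths):
    """Create small placeholder files at the given paths under root."""
    for rel in relative_paths:
        full = os.path.join(root, rel)
        os.makedirs(os.path.dirname(full), exist_ok=True)
        with open(full, "w") as handle:
            handle.write("{}\n")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = os.path.join(tmp, "xyz")
        make_tree(
            root,
            [
                "file0.json",
                "subdir1/file1.json",
                "subdir2/file2.json",
                "subdir1/subsubdir1/file3.json",
            ],
        )
        # Only the direct child is listed, since no `key=value` directory exists.
        print(list_input_files(root))
```

With the directory layout from the description, the sketch lists only `file0.json`. A layout containing directories such as `year=2016/` would instead be listed recursively.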
Diffstat (limited to 'streaming/src')
0 files changed, 0 insertions, 0 deletions