author: Liwei Lin <lwlin7@gmail.com> 2017-02-28 22:58:51 -0800
committer: Shixiong Zhu <shixiong@databricks.com> 2017-02-28 22:58:51 -0800
commit: 4913c92c2fbfcc22b41afb8ce79687165392d7da
tree: 3879e2eed39d386aaf67383b7f6abdb170e923f0
parent: 89cd3845b6edb165236a6498dcade033975ee276
[SPARK-19633][SS] FileSource read from FileSink
## What changes were proposed in this pull request?

Right now file source always uses `InMemoryFileIndex` to scan files from a given path.

But when reading the outputs from another streaming query, the file source should use `MetadataFileIndex` to list files from the sink log. This patch adds this support.

## `MetadataFileIndex` or `InMemoryFileIndex`

```scala
spark
  .readStream
  .format(...)
  .load("/some/path")
// for a non-glob path:
//   - use `MetadataFileIndex` when `/some/path/_spark_meta` exists
//   - fall back to `InMemoryFileIndex` otherwise
```

```scala
spark
  .readStream
  .format(...)
  .load("/some/path/*/*")
// for a glob path: always use `InMemoryFileIndex`
```

## How was this patch tested?

two newly added tests

Author: Liwei Lin <lwlin7@gmail.com>

Closes #16987 from lw-lin/source-read-from-sink.
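The selection rule described in the commit message can be sketched as a small standalone function. This is only an illustration of the decision, not the code in the patch: `IndexChoice`, `IndexKind`, `isGlob`, `chooseIndex`, and the `dirExists` parameter are hypothetical names, the glob check is a simplified assumption, and the metadata directory name is taken verbatim from the message above and left configurable.

```scala
import java.nio.file.{Files, Paths}

object IndexChoice {
  // Hypothetical stand-ins for the two index types named in the commit message.
  sealed trait IndexKind
  case object MetadataIndex extends IndexKind // stands in for `MetadataFileIndex`
  case object InMemoryIndex extends IndexKind // stands in for `InMemoryFileIndex`

  // Assumption: a path is treated as a glob if it contains any of these characters.
  private val globChars: Set[Char] = Set('*', '?', '[', ']', '{', '}', '\\')

  def isGlob(path: String): Boolean = path.exists(globChars.contains)

  // Chooses which kind of index to use for `path`.
  // `metadataDirName` defaults to the name used in the commit message;
  // `dirExists` is injectable so the rule can be checked without touching a file system.
  def chooseIndex(
      path: String,
      metadataDirName: String = "_spark_meta",
      dirExists: String => Boolean = p => Files.isDirectory(Paths.get(p))
  ): IndexKind = {
    if (isGlob(path)) {
      // Glob path: always list files directly.
      InMemoryIndex
    } else if (dirExists(s"${path.stripSuffix("/")}/$metadataDirName")) {
      // Non-glob path written by a file sink: read the file list from the sink log.
      MetadataIndex
    } else {
      // Non-glob path without a sink log: fall back to direct listing.
      InMemoryIndex
    }
  }
}
```

Passing `dirExists` explicitly keeps the rule testable in isolation; for example, `IndexChoice.chooseIndex("/some/path", dirExists = _ => false)` takes the fall-back branch without consulting the local file system.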
Diffstat (limited to 'sql/core/src/test/resources/sql-tests/results')
0 files changed, 0 insertions, 0 deletions