author: Shixiong Zhu <shixiong@databricks.com> 2016-02-09 18:50:06 -0800
committer: Tathagata Das <tathagata.das1565@gmail.com> 2016-02-09 18:50:06 -0800
commit: b385ce38825de4b1420c5a0e8191e91fc8afecf5
tree: ef988edcab7bdbf37082d07781b5addd9c3a364c /bin/load-spark-env.cmd
parent: 6f710f9fd4f85370557b7705020ff16f2385e645
[SPARK-13149][SQL] Add FileStreamSource

`FileStreamSource` is an implementation of `org.apache.spark.sql.execution.streaming.Source`. It takes advantage of the existing `HadoopFsRelationProvider` to support various file formats. It remembers the files in each batch and stores them in metadata files so that they can be recovered on restart. The metadata files are stored in the file system. A further PR will clean up the metadata files periodically.

This is based on the initial work from marmbrus.

Author: Shixiong Zhu <shixiong@databricks.com>

Closes #11034 from zsxwing/stream-df-file-source.

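
The batch bookkeeping described above can be illustrated with a minimal sketch. It is not the code from this commit and does not use any Spark API. The class name `FileBatchLog`, the one-JSON-file-per-batch layout, and the temp-file-then-rename write are all assumptions made for illustration only.

```python
import json
import os


class FileBatchLog:
    """Hypothetical sketch: records which files belong to each batch."""

    def __init__(self, source_dir, metadata_dir):
        self.source_dir = source_dir
        self.metadata_dir = metadata_dir
        os.makedirs(metadata_dir, exist_ok=True)
        # Recover state from metadata files written by an earlier run.
        self.batches = self._recover()

    def _recover(self):
        batches = {}
        for name in os.listdir(self.metadata_dir):
            # Assumed layout: "<batch_id>.json" holds that batch's file list.
            if name.endswith(".json"):
                with open(os.path.join(self.metadata_dir, name)) as f:
                    batches[int(name[:-5])] = json.load(f)
        return batches

    def seen_files(self):
        return {p for files in self.batches.values() for p in files}

    def next_batch(self):
        """Return (batch_id, new_files), or None when nothing new arrived."""
        seen = self.seen_files()
        new_files = [f for f in sorted(os.listdir(self.source_dir)) if f not in seen]
        if not new_files:
            return None
        batch_id = max(self.batches, default=-1) + 1
        # Write to a temporary name, then rename, so a crash mid-write
        # never leaves a partially written batch entry behind.
        final_path = os.path.join(self.metadata_dir, f"{batch_id}.json")
        tmp_path = final_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(new_files, f)
        os.replace(tmp_path, final_path)
        self.batches[batch_id] = new_files
        return batch_id, new_files
```

In this sketch, constructing a second `FileBatchLog` over the same metadata directory stands in for a restart: files assigned to earlier batches are read back from disk and are not handed out again.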
Diffstat (limited to 'bin/load-spark-env.cmd')
0 files changed, 0 insertions, 0 deletions