author | Aaron Davidson <aaron@databricks.com> | 2015-07-28 10:12:09 -0700 |
---|---|---|
committer | Michael Armbrust <michael@databricks.com> | 2015-07-28 10:12:09 -0700 |
commit | 35ef853b3f9d955949c464e4a0d445147e0e9a07 (patch) | |
tree | 24350c93d1ece87827827d246c51a59b21200245 /R/pkg/inst/tests/test_client.R | |
parent | 9bbe0171cb434edb160fad30ea2d4221f525c919 (diff) | |
[SPARK-9397] DataFrame should provide an API to find source data files if applicable
Certain applications would benefit from being able to inspect DataFrames that are produced directly by file-based data sources, and find out their source data. For example, one might want to display to a user the size of the data underlying a table, or to copy or mutate that data.
This PR exposes an `inputFiles` method on DataFrame which attempts to discover the source data in a best-effort manner, by inspecting HadoopFsRelations and JSONRelations.
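The best-effort discovery described above can be pictured with a small, self-contained model. This is only a sketch under stated assumptions, not Spark's implementation: the classes `FileRelation`, `JsonRelation`, `InMemoryRelation`, and `Combine`, and the function `input_files`, are hypothetical stand-ins invented for illustration. The idea shown is that the plan is walked, file-backed relations contribute their paths, and anything unrecognized contributes nothing instead of raising an error.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FileRelation:
    """Hypothetical stand-in for a file-backed relation with several paths."""
    paths: List[str]


@dataclass
class JsonRelation:
    """Hypothetical stand-in for a JSON relation; path is None if not file-based."""
    path: Optional[str]


@dataclass
class InMemoryRelation:
    """Hypothetical relation with no files behind it."""
    rows: list


@dataclass
class Combine:
    """Hypothetical inner plan node that has child plans (e.g. a union)."""
    children: list


def input_files(plan) -> List[str]:
    """Collect source file paths from a plan on a best-effort basis."""
    found = []
    stack = [plan]
    while stack:
        node = stack.pop()
        if isinstance(node, FileRelation):
            found.extend(node.paths)
        elif isinstance(node, JsonRelation):
            if node.path is not None:
                found.append(node.path)
        elif isinstance(node, Combine):
            stack.extend(node.children)
        # Any other node type is silently skipped: it has no known file source.
    # Deduplicate and sort so the result is stable across traversal orders.
    return sorted(set(found))


# Example plan mixing file-backed and in-memory sources (paths are made up).
example_plan = Combine([
    FileRelation(["/data/a.parquet", "/data/b.parquet"]),
    JsonRelation("/data/c.json"),
    InMemoryRelation([1, 2, 3]),
])
example_files = input_files(example_plan)
```

A caller that wants, say, the total size of the underlying data could then stat each returned path, keeping in mind that an empty result means only that no file source was recognized, not that the data has none.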
Author: Aaron Davidson <aaron@databricks.com>
Closes #7717 from aarondav/paths and squashes the following commits:
ff67430 [Aaron Davidson] inputFiles
0acd3ad [Aaron Davidson] [SPARK-9397] DataFrame should provide an API to find source data files if applicable
Diffstat (limited to 'R/pkg/inst/tests/test_client.R')
0 files changed, 0 insertions, 0 deletions