author | hyukjinkwon <gurwls223@gmail.com> | 2017-03-22 08:41:46 +0800
committer | Wenchen Fan <wenchen@databricks.com> | 2017-03-22 08:41:46 +0800
commit | 9281a3d504d526440c1d445075e38a6d9142ac93 (patch)
tree | b39fe650c06e306a52bc48be4e4098833c5c6463 /sql/core/src/test/scala
parent | a04dcde8cb191e591a5f5d7a67a5371e31e7343c (diff)
[SPARK-19919][SQL] Defer throwing the exception for empty paths in CSV datasource into `DataSource`
## What changes were proposed in this pull request?
This PR proposes to defer throwing the exception within `DataSource`.
Currently, if other datasources fail to infer the schema, it returns `None` and then this is being validated in `DataSource` as below:
```
scala> spark.read.json("emptydir")
org.apache.spark.sql.AnalysisException: Unable to infer schema for JSON. It must be specified manually.;
```
```
scala> spark.read.orc("emptydir")
org.apache.spark.sql.AnalysisException: Unable to infer schema for ORC. It must be specified manually.;
```
```
scala> spark.read.parquet("emptydir")
org.apache.spark.sql.AnalysisException: Unable to infer schema for Parquet. It must be specified manually.;
```
However, CSV performs this check within the datasource implementation itself and throws a different exception message, as below:
```
scala> spark.read.csv("emptydir")
java.lang.IllegalArgumentException: requirement failed: Cannot infer schema from an empty set of files
```
We could remove this duplicated check and validate it in one place, in the same way and with the same message as the other datasources.
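The consolidated pattern can be sketched as follows. This is a minimal standalone sketch, not Spark's actual `DataSource` code: the `StructType`, `AnalysisException`, and method names below are simplified stand-ins. The idea is that each datasource's schema inference returns an `Option`, and a single place raises the uniform error when inference fails:
```scala
// Simplified stand-ins for Spark's types (hypothetical, for illustration only)
case class StructType(fieldNames: Seq[String])
class AnalysisException(message: String) extends Exception(message)

object DataSourceSketch {
  // A datasource returns None when it cannot infer a schema,
  // e.g. when given an empty set of files.
  def inferSchema(format: String, files: Seq[String]): Option[StructType] =
    if (files.isEmpty) None else Some(StructType(Seq("value")))

  // Validation happens once here, producing one uniform message
  // regardless of which format (CSV, JSON, ORC, Parquet) failed.
  def resolveSchema(format: String, files: Seq[String]): StructType =
    inferSchema(format, files).getOrElse {
      throw new AnalysisException(
        s"Unable to infer schema for $format. It must be specified manually.")
    }
}

// An empty file list now fails with the same message for CSV
// that the other datasources already produce.
val err =
  try { DataSourceSketch.resolveSchema("CSV", Seq.empty); "" }
  catch { case e: AnalysisException => e.getMessage }
println(err)
// prints: Unable to infer schema for CSV. It must be specified manually.
```
With this shape, the CSV-specific `require(...)` that produced `IllegalArgumentException` becomes unnecessary: returning `None` is enough, and the error surface stays consistent across formats.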
## How was this patch tested?
Unit test in `CSVSuite` and manual test.
Author: hyukjinkwon <gurwls223@gmail.com>
Closes #17256 from HyukjinKwon/SPARK-19919.
Diffstat (limited to 'sql/core/src/test/scala')
-rw-r--r-- | sql/core/src/test/scala/org/apache/spark/sql/test/DataFrameReaderWriterSuite.scala | 6 |
1 file changed, 4 insertions, 2 deletions
```
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/test/DataFrameReaderWriterSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/test/DataFrameReaderWriterSuite.scala
index 8a8ba05534..8287776f8f 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/test/DataFrameReaderWriterSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/test/DataFrameReaderWriterSuite.scala
@@ -370,9 +370,11 @@ class DataFrameReaderWriterSuite extends QueryTest with SharedSQLContext with Be
     val schema = df.schema

     // Reader, without user specified schema
-    intercept[IllegalArgumentException] {
+    val message = intercept[AnalysisException] {
       testRead(spark.read.csv(), Seq.empty, schema)
-    }
+    }.getMessage
+    assert(message.contains("Unable to infer schema for CSV. It must be specified manually."))
+
     testRead(spark.read.csv(dir), data, schema)
     testRead(spark.read.csv(dir, dir), data ++ data, schema)
     testRead(spark.read.csv(Seq(dir, dir): _*), data ++ data, schema)
```