author | Cheng Lian <lian@databricks.com> | 2016-04-19 09:37:00 -0700 |
---|---|---|
committer | Yin Huai <yhuai@databricks.com> | 2016-04-19 09:37:00 -0700 |
commit | 5e360c93bed9d4f9250cf79bbcebd8552557f548 (patch) | |
tree | ce3a791360d08ebedd126e91764ca008a304058b /sql/hive/src/main/scala/org/apache | |
parent | 3d46d796a3a2b60b37dc318652eded5e992be1e5 (diff) | |
[SPARK-13681][SPARK-14458][SPARK-14566][SQL] Add back once removed CommitFailureTestRelationSuite and SimpleTextHadoopFsRelationSuite
## What changes were proposed in this pull request?
These test suites were removed while refactoring the `HadoopFsRelation`-related API. This PR brings them back.
This PR also fixes two regressions:
- SPARK-14458, which causes a runtime error when saving partitioned tables using `FileFormat` data sources that cannot infer their own schemata. No built-in data source exposed this bug because all of them happen to support schema inference.
- SPARK-14566, which happens to be covered by SPARK-14458 and causes wrong query results or a runtime error when
  - appending a Dataset `ds` to a persisted partitioned data source relation `t`, and
  - the partition columns in `ds` don't all appear after the data columns.
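The column-order problem behind SPARK-14566 can be illustrated with a small model in plain Scala, without Spark. This is only a sketch under simplifying assumptions: rows are plain sequences, and `AppendByName`, `reinterpret`, and `selectByName` are hypothetical names invented for this illustration. It contrasts applying the table schema to the rows positionally with selecting columns by name in the table's order, which is roughly the difference between the two `case Some(s)` lines in the `commands.scala` change.

```scala
// Hypothetical model, not Spark code: a row is a sequence of values whose
// meaning depends entirely on the column order of its schema.
object AppendByName {
  type Row = Seq[Any]

  // Positional reinterpretation: label the values with the table's column
  // names without moving them. Wrong when the column orders differ.
  def reinterpret(tableCols: Seq[String], row: Row): Map[String, Any] =
    tableCols.zip(row).toMap

  // Name-based selection: look up each table column by name in the
  // incoming data, producing a row in the table's column order.
  def selectByName(dataCols: Seq[String], tableCols: Seq[String], row: Row): Row = {
    val index = dataCols.zipWithIndex.toMap
    tableCols.map(name => row(index(name)))
  }

  def main(args: Array[String]): Unit = {
    val dsCols = Seq("part", "value")    // partition column comes first in ds
    val tableCols = Seq("value", "part") // table t keeps partition columns last
    val row: Row = Seq(2016, "x")        // part = 2016, value = "x"

    // Positional: "value" is paired with 2016 and "part" with "x" (swapped).
    println(reinterpret(tableCols, row))
    // By name: the row is reordered to ("x", 2016), matching the table schema.
    println(selectByName(dsCols, tableCols, row))
  }
}
```

In this model the positional version silently swaps the two values, which corresponds to the "wrong query results" case; when the swapped values have incompatible types, a runtime error is the more likely outcome.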
## How was this patch tested?
`CommitFailureTestRelationSuite` uses a testing relation that always fails when committing write tasks to test write job cleanup.
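As a rough illustration of the property this kind of test checks, the following plain-Scala sketch models a write job whose task commit always fails and verifies that temporary output is cleaned up afterwards. It does not use the Spark or Hadoop committer APIs; `TaskCommitter`, `AlwaysFailingCommitter`, `runJob`, and the `_temporary` directory layout are all assumptions made for this example.

```scala
import java.io.File
import java.nio.file.Files
import scala.util.control.NonFatal

// Hypothetical model of write job cleanup, not the actual Spark test relation.
object CommitFailureSketch {
  trait TaskCommitter { def commit(taskFile: File): Unit }

  // Stand-in for a testing relation whose task commit always fails.
  object AlwaysFailingCommitter extends TaskCommitter {
    def commit(taskFile: File): Unit =
      throw new RuntimeException("Intentional commit failure for testing")
  }

  // Delete a file, or a directory together with everything under it.
  def deleteRecursively(f: File): Unit = {
    Option(f.listFiles()).foreach(_.foreach(deleteRecursively))
    f.delete()
  }

  // Write one task file into a temporary directory and try to commit it.
  // Returns true if the commit succeeded. The temporary directory is
  // removed in either case, so a failed job leaves nothing behind.
  def runJob(outputDir: File, committer: TaskCommitter): Boolean = {
    val tempDir = new File(outputDir, "_temporary")
    tempDir.mkdirs()
    try {
      val taskFile = new File(tempDir, "part-00000")
      Files.write(taskFile.toPath, "data".getBytes("UTF-8"))
      committer.commit(taskFile)
      true
    } catch {
      case NonFatal(_) => false
    } finally {
      deleteRecursively(tempDir)
    }
  }

  def main(args: Array[String]): Unit = {
    val out = Files.createTempDirectory("commit-failure-sketch").toFile
    val succeeded = runJob(out, AlwaysFailingCommitter)
    println(s"succeeded=$succeeded, tempExists=${new File(out, "_temporary").exists()}")
    deleteRecursively(out)
  }
}
```

A test built on this model would assert that the job reports failure and that no `_temporary` directory remains under the output path.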
`SimpleTextHadoopFsRelationSuite` uses a testing relation to test general `HadoopFsRelation` and `FileFormat` interfaces.
The two regressions are both covered by existing test cases.
Author: Cheng Lian <lian@databricks.com>
Closes #12179 from liancheng/spark-13681-commit-failure-test.
Diffstat (limited to 'sql/hive/src/main/scala/org/apache')
-rw-r--r-- | sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala | 11 |
-rw-r--r-- | sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/commands.scala | 2 |
2 files changed, 7 insertions, 6 deletions
diff --git a/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala b/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala
index 697cf719c1..79fe23b258 100644
--- a/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala
+++ b/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala
@@ -504,11 +504,12 @@ private[hive] class HiveMetastoreCatalog(val client: HiveClient, hive: HiveConte
     }
   }
 
-  private def convertToLogicalRelation(metastoreRelation: MetastoreRelation,
-                                       options: Map[String, String],
-                                       defaultSource: FileFormat,
-                                       fileFormatClass: Class[_ <: FileFormat],
-                                       fileType: String): LogicalRelation = {
+  private def convertToLogicalRelation(
+      metastoreRelation: MetastoreRelation,
+      options: Map[String, String],
+      defaultSource: FileFormat,
+      fileFormatClass: Class[_ <: FileFormat],
+      fileType: String): LogicalRelation = {
     val metastoreSchema = StructType.fromAttributes(metastoreRelation.output)
     val tableIdentifier =
       QualifiedTableName(metastoreRelation.databaseName, metastoreRelation.tableName)
diff --git a/sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/commands.scala b/sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/commands.scala
index 5ef502afa5..8f7c4e8289 100644
--- a/sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/commands.scala
+++ b/sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/commands.scala
@@ -300,7 +300,7 @@ case class CreateMetastoreDataSourceAsSelect(
     val data = Dataset.ofRows(hiveContext, query)
     val df = existingSchema match {
       // If we are inserting into an existing table, just use the existing schema.
-      case Some(s) => sqlContext.internalCreateDataFrame(data.queryExecution.toRdd, s)
+      case Some(s) => data.selectExpr(s.fieldNames: _*)
       case None => data
     }