author: thomastechs <thomas.sebastian@tcs.com> 2016-02-25 22:52:25 -0800
committer: Yin Huai <yhuai@databricks.com> 2016-02-25 22:52:25 -0800
commit: 8afe49141d9b6a603eb3907f32dce802a3d05172 (patch)
tree: a7871c801eb7802c9e2d9dd2e25db2098513bd93 /docker-integration-tests
parent: 50e60e36f7775a10cf39338e7c5716578a24d89f (diff)
download: spark-8afe49141d9b6a603eb3907f32dce802a3d05172.tar.gz
spark-8afe49141d9b6a603eb3907f32dce802a3d05172.tar.bz2
spark-8afe49141d9b6a603eb3907f32dce802a3d05172.zip
[SPARK-12941][SQL][MASTER] Spark-SQL JDBC Oracle dialect fails to map string datatypes to Oracle VARCHAR datatype
## What changes were proposed in this pull request?

This pull request fixes SPARK-12941 by adding a data type mapping to Oracle for the DataFrame data type StringType. This PR targets the master branch; a separate PR has already been tested against branch 1.4.

## How was this patch tested?

This patch was tested using the Oracle docker image, and a new integration suite was created for it. The oracle.jdbc jar was to be downloaded from the Maven repository, but since no such jdbc jar is available in Maven, the jar was downloaded manually from the Oracle site and installed in the local repository, and the suite was tested against that. So, for a SparkQA test run, the ojdbc jar may need to be placed manually in the local Maven repository (com/oracle/ojdbc6/11.2.0.2.0).

Author: thomastechs <thomas.sebastian@tcs.com>

Closes #11306 from thomastechs/master.
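For context, the dialect-side change this suite exercises is not shown in the diff below (the view is limited to docker-integration-tests); per the title above, it maps the DataFrame StringType to Oracle's VARCHAR. A minimal sketch of what such a mapping looks like through the public org.apache.spark.sql.jdbc API; the object name and the VARCHAR2(255) width are illustrative assumptions, not the committed values:

```scala
import java.sql.Types

import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects, JdbcType}
import org.apache.spark.sql.types.{DataType, StringType}

// Sketch only: the real fix lives in Spark's Oracle dialect. Without an
// Oracle-specific mapping, StringType falls back to a generic SQL type that
// Oracle's CREATE TABLE rejects, which is the failure SPARK-12941 describes.
case object OracleDialectSketch extends JdbcDialect {
  override def canHandle(url: String): Boolean = url.startsWith("jdbc:oracle")

  override def getJDBCType(dt: DataType): Option[JdbcType] = dt match {
    case StringType => Some(JdbcType("VARCHAR2(255)", Types.VARCHAR))
    case _ => None // defer to the common mappings for everything else
  }
}

// Once registered, a dialect is selected by JDBC URL prefix:
// JdbcDialects.registerDialect(OracleDialectSketch)
```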
Diffstat (limited to 'docker-integration-tests')
-rw-r--r--  docker-integration-tests/pom.xml                                                               | 13
-rw-r--r--  docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala | 80
2 files changed, 93 insertions, 0 deletions
diff --git a/docker-integration-tests/pom.xml b/docker-integration-tests/pom.xml
index 833ca29cd8..048e58d888 100644
--- a/docker-integration-tests/pom.xml
+++ b/docker-integration-tests/pom.xml
@@ -131,6 +131,19 @@
<artifactId>postgresql</artifactId>
<scope>test</scope>
</dependency>
+    <!-- Oracle ojdbc jar, used by the Oracle integration suite for Docker testing.
+         See https://github.com/apache/spark/pull/11306 for background on why an
+         ojdbc jar is needed for the test case. The Maven dependency here is commented
+         out because the Maven repository does not currently contain the ojdbc jar;
+         once the jar is available in Maven, it can be uncommented. -->
+ <!--
+ <dependency>
+ <groupId>com.oracle</groupId>
+ <artifactId>ojdbc6</artifactId>
+ <version>11.2.0.2.0</version>
+ <scope>test</scope>
+ </dependency>
+ -->
<!-- Jersey dependencies, used to override version.
See https://github.com/apache/spark/pull/9503#issuecomment-154369560 for
background on why we need to use a newer Jersey only in this test module;
diff --git a/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala b/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala
new file mode 100644
index 0000000000..b5416d7072
--- /dev/null
+++ b/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala
@@ -0,0 +1,80 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.jdbc
+
+import java.math.BigDecimal
+import java.sql.{Connection, Date, Timestamp}
+import java.util.Properties
+
+import org.apache.spark.sql.test.SharedSQLContext
+import org.apache.spark.sql.types._
+import org.apache.spark.tags.DockerTest
+
+/**
+ * This suite was tested using the Oracle docker image. The ojdbc6-11.2.0.2.0.jar would
+ * normally be pulled from the Maven repository, but since no such jdbc jar is available
+ * there, it was downloaded manually from the Oracle site, installed in the local Maven
+ * repository, and the suite was tested against that. So, for a SparkQA test run, the
+ * ojdbc jar may need to be placed manually in the local Maven repository
+ * (com/oracle/ojdbc6/11.2.0.2.0).
+ *
+ * The following are the steps to test this:
+ * 1. Pull oracle 11g image - docker pull wnameless/oracle-xe-11g
+ * 2. Start docker - sudo service docker start
+ * 3. Download oracle 11g driver jar and put it in maven local repo:
+ * (com/oracle/ojdbc6/11.2.0.2.0/ojdbc6-11.2.0.2.0.jar)
+ * 4. Increase the timeout and interval parameters from 60,1 to high values for the Oracle test
+ *    in DockerJDBCIntegrationSuite.scala (locally tested with 200,200 and executed successfully).
+ * 5. Run spark test - ./build/sbt "test-only org.apache.spark.sql.jdbc.OracleIntegrationSuite"
+ *
+ * All tests in this suite are ignored because of the dependency on the Oracle jar, which is
+ * not available in the Maven repository.
+ */
+@DockerTest
+class OracleIntegrationSuite extends DockerJDBCIntegrationSuite with SharedSQLContext {
+ import testImplicits._
+
+ override val db = new DatabaseOnDocker {
+ override val imageName = "wnameless/oracle-xe-11g:latest"
+ override val env = Map(
+ "ORACLE_ROOT_PASSWORD" -> "oracle"
+ )
+ override val jdbcPort: Int = 1521
+ override def getJdbcUrl(ip: String, port: Int): String =
+ s"jdbc:oracle:thin:system/oracle@//$ip:$port/xe"
+ }
+
+  // No data preparation is needed; the test below creates its own table.
+  override def dataPreparation(conn: Connection): Unit = {}
+
+ ignore("SPARK-12941: String datatypes to be mapped to Varchar in Oracle") {
+    // create a sample DataFrame with a string-typed column
+    val df1 = sparkContext.parallelize(Seq("foo")).toDF("x")
+    // write the DataFrame to the Oracle table tbl2
+    df1.write.jdbc(jdbcUrl, "tbl2", new Properties)
+    // read the table back from Oracle
+    val dfRead = sqlContext.read.jdbc(jdbcUrl, "tbl2", new Properties)
+    // get the rows
+    val rows = dfRead.collect()
+    // verify that the column comes back as a string
+    val types = rows(0).toSeq.map(x => x.getClass.toString)
+    assert(types(0).equals("class java.lang.String"))
+    // verify that the inserted value is correct
+    assert(rows(0).getString(0).equals("foo"))
+ }
+}
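As a possible hardening of the type check in the test above: the assertion inspects the class of the round-tripped value, but the Oracle-side column type can also be read directly through standard JDBC metadata. A short sketch of that idea; the helper name is hypothetical, and note that Oracle upper-cases unquoted identifiers, hence "TBL2" and "X":

```scala
import java.sql.{Connection, DriverManager}

// Hypothetical helper: look up a column's database-side type name via JDBC
// DatabaseMetaData. With the StringType-to-VARCHAR mapping in place, the
// column written for a string DataFrame field should report VARCHAR2.
def columnTypeOf(jdbcUrl: String, table: String, column: String): String = {
  val conn: Connection = DriverManager.getConnection(jdbcUrl)
  try {
    val rs = conn.getMetaData.getColumns(null, null, table, column)
    if (rs.next()) rs.getString("TYPE_NAME")
    else sys.error(s"column $table.$column not found")
  } finally {
    conn.close()
  }
}

// Possible extra assertion inside the test body:
// assert(columnTypeOf(jdbcUrl, "TBL2", "X") == "VARCHAR2")
```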