| Commit message | Author | Age | Files | Lines |
| |
|
|
|
|
| |
This reverts commit 2b72c569a674cccf79ebbe8d067b8dbaaf78007f.
|
|
|
|
| |
This reverts commit bc05df8a23ba7ad485f6844f28f96551b13ba461.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Enables Kryo and disables reference tracking by default in Spark SQL Thrift server. Configurations explicitly defined by users in `spark-defaults.conf` are respected (the Thrift server is started by `spark-submit`, which handles configuration properties properly).
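The "defaults that still respect user settings" behaviour described above can be sketched as a plain dictionary merge. This is only an illustration: `apply_defaults` and `THRIFT_SERVER_DEFAULTS` are hypothetical names, not the code in this commit; only the two Spark property keys are real.

```python
# Hypothetical illustration: server-side defaults that never override
# values the user already supplied (for example via spark-defaults.conf).
THRIFT_SERVER_DEFAULTS = {
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
    "spark.kryo.referenceTracking": "false",
}


def apply_defaults(user_conf, defaults=THRIFT_SERVER_DEFAULTS):
    """Return a new config where user-provided keys win over defaults."""
    merged = dict(defaults)   # start from the defaults
    merged.update(user_conf)  # explicit user settings take precedence
    return merged


if __name__ == "__main__":
    # The user re-enables reference tracking; the serializer default still applies.
    print(apply_defaults({"spark.kryo.referenceTracking": "true"}))
```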
Author: Cheng Lian <lian@databricks.com>
Closes #3621 from liancheng/kryo-by-default and squashes the following commits:
70c2775 [Cheng Lian] Enables Kryo by default in Spark SQL Thrift server
(cherry picked from commit 6f61e1f961826a6c9e98a66d10b271b7e3c7dd55)
Signed-off-by: Patrick Wendell <pwendell@gmail.com>
|
| |
|
| |
|
|
|
|
| |
This reverts commit 1056e9ec13203d0c51564265e94d77a054498fdb.
|
|
|
|
| |
This reverts commit 00316cc87983b844f6603f351a8f0b84fe1f6035.
|
| |
|
| |
|
|
|
|
| |
This reverts commit 39c7d1c1f9a7785285cf4c20dfbffd96f72d5634.
|
|
|
|
| |
This reverts commit fc7bff00ac731d2632213a98cd92dc5e84ce7dcd.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
HiveThriftServer2
This PR disables HiveThriftServer2 asynchronous execution by setting the `runInBackground` argument in `ExecuteStatementOperation` to `false`, and by reverting `SparkExecuteStatementOperation.run` in the Hive 13 shim to the Hive 12 version. This change makes the Simba ODBC driver v1.0.0.1000 work.
Author: Cheng Lian <lian@databricks.com>
Closes #3506 from liancheng/disable-async-exec and squashes the following commits:
593804d [Cheng Lian] Disables asynchronous execution in Hive 0.13.1 HiveThriftServer2
|
|
|
|
|
|
|
|
|
|
| |
In `HiveThriftServer2`, when an exception is thrown during SQL execution, the SQL operation state should be set to `ERROR`, but it currently remains `RUNNING`. This affects the result of the `GetOperationStatus` Thrift API.
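As a rough sketch of the intended behaviour (not the Scala code in this commit), an operation can record `ERROR` in its exception path before re-raising, so that a later status query does not see a stale `RUNNING`. The `Operation` class and its state names below are assumptions made for illustration.

```python
from enum import Enum


class OperationState(Enum):
    INITIALIZED = "INITIALIZED"
    RUNNING = "RUNNING"
    FINISHED = "FINISHED"
    ERROR = "ERROR"


class Operation:
    """Hypothetical stand-in for a SQL operation tracked by the server."""

    def __init__(self, execute):
        self._execute = execute
        self.state = OperationState.INITIALIZED

    def run(self):
        self.state = OperationState.RUNNING
        try:
            result = self._execute()
        except Exception:
            # Without this assignment the state would stay RUNNING forever.
            self.state = OperationState.ERROR
            raise
        self.state = OperationState.FINISHED
        return result
```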
Author: Cheng Lian <lian@databricks.com>
Closes #3175 from liancheng/fix-op-state and squashes the following commits:
6d4c1fe [Cheng Lian] Sets SQL operation state to ERROR when exception is thrown
|
|
|
|
| |
This reverts commit cc2c05e4ee81d2f34873a2ebb9a5272867cb65c2.
|
|
|
|
| |
This reverts commit 380eba5f49eca1dbd4084e6c84e19866fffd4efa.
|
| |
|
| |
|
|
|
|
| |
This reverts commit 5247dd859b95a440baa562b9827bdeb26aa6530e.
|
|
|
|
| |
This reverts commit 79df6b43ae762263a8120f423ddb4a0811dd4b6f.
|
| |
|
| |
|
|
|
|
| |
This reverts commit db7f4a898af22a02b36428507f8ef2b429d78dc1.
|
|
|
|
| |
This reverts commit d7b1ecb25676d228deb6fe05efdb4e2ab9c3e30b.
|
| |
|
| |
|
|
|
|
| |
This reverts commit 38c1fbd9694430cefd962c90bc36b0d108c6124b.
|
|
|
|
| |
This reverts commit d7ac6013483e83caff8ea54c228f37aeca159db8.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
`timeTaken` should not include the time spent printing the result.
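One way to picture the fix is to stop the clock before any output is written. The sketch below is an assumption about the general shape, not the actual change; `run_and_report` and its parameters are hypothetical.

```python
import time


def run_and_report(execute, emit=print, clock=time.perf_counter):
    """Run a query, then print rows; return the time spent on execution only."""
    start = clock()
    rows = list(execute())        # materialize the result while the clock runs
    time_taken = clock() - start  # stop timing before anything is printed
    for row in rows:
        emit(row)                 # printing is deliberately not measured
    return time_taken
```

Passing `emit` and `clock` as parameters is only there to make the behaviour easy to check with a fake clock.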
Author: w00228970 <wangfei1@huawei.com>
Closes #3423 from scwf/time-taken-bug and squashes the following commits:
da7e102 [w00228970] compute time taken correctly
(cherry picked from commit 723be60e233d0f85944d948efd06845ef546c9f5)
Signed-off-by: Reynold Xin <rxin@databricks.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This file is for Hive 0.13.1 I think.
Author: Daniel Darabos <darabos.daniel@gmail.com>
Closes #3432 from darabos/patch-2 and squashes the following commits:
4fd22ed [Daniel Darabos] Fix comment. This file is for Hive 0.13.1.
(cherry picked from commit d5834f0732b586731034a7df5402c25454770fc5)
Signed-off-by: Michael Armbrust <michael@databricks.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
for complex types
This PR is exactly the same as #3178 except it reverts the `FileStatus.isDir` to `FileStatus.isDirectory` change, since it doesn't compile with Hadoop 1.
Author: Cheng Lian <lian@databricks.com>
Closes #3298 from liancheng/date-for-thriftserver and squashes the following commits:
866037e [Cheng Lian] Revers isDirectory to isDir (it breaks Hadoop 1 profile)
6f71d0b [Cheng Lian] Makes toHiveString static
26fa955 [Cheng Lian] Fixes complex type support in Hive 0.13.1 shim
a92882a [Cheng Lian] Updates HiveShim for 0.13.1
73f442b [Cheng Lian] Adds Date support for HiveThriftServer2 (Hive 0.12.0)
(cherry picked from commit 6b7f2f753d16ff038881772f1958e3f4fd5597a7)
Signed-off-by: Michael Armbrust <michael@databricks.com>
|
| |
|
| |
|
|
|
|
| |
This reverts commit bc09875799aa373f4320d38b02618173ffa4c96f.
|
|
|
|
| |
This reverts commit 6c6fd218c83a049c874b8a0ea737333c1899c94a.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
and fixes for complex types"
Author: Michael Armbrust <michael@databricks.com>
Closes #3292 from marmbrus/revert4309 and squashes the following commits:
808e96e [Michael Armbrust] Revert "[SPARK-4309][SPARK-4407][SQL] Date type support for Thrift server, and fixes for complex types"
(cherry picked from commit 45ce3273cb618d14ec4d20c4c95699634b951086)
Signed-off-by: Michael Armbrust <michael@databricks.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
for complex types
SPARK-4407 was detected while working on SPARK-4309. Merged these two into a single PR since 1.2.0 RC is approaching.
Author: Cheng Lian <lian@databricks.com>
Closes #3178 from liancheng/date-for-thriftserver and squashes the following commits:
6f71d0b [Cheng Lian] Makes toHiveString static
26fa955 [Cheng Lian] Fixes complex type support in Hive 0.13.1 shim
a92882a [Cheng Lian] Updates HiveShim for 0.13.1
73f442b [Cheng Lian] Adds Date support for HiveThriftServer2 (Hive 0.12.0)
(cherry picked from commit cb6bd83a91d9b4a227dc6467255231869c1820e2)
Signed-off-by: Michael Armbrust <michael@databricks.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Running `select * from src` returns the wrong result set, as follows:
```
...
| 309 | val_309 |
| 309 | val_309 |
| 309 | val_309 |
| 309 | val_309 |
| 309 | val_309 |
| 309 | val_309 |
| 309 | val_309 |
| 309 | val_309 |
| 309 | val_309 |
| 309 | val_309 |
| 97 | val_97 |
| 97 | val_97 |
| 97 | val_97 |
| 97 | val_97 |
| 97 | val_97 |
| 97 | val_97 |
| 97 | val_97 |
| 97 | val_97 |
| 97 | val_97 |
| 97 | val_97 |
| 97 | val_97 |
...
```
Author: wangfei <wangfei1@huawei.com>
Closes #3149 from scwf/SPARK-4292 and squashes the following commits:
1574a43 [wangfei] using result.collect
8b2d845 [wangfei] adding test
f64eddf [wangfei] result set iter bug
(cherry picked from commit d6e55524437026c0c76addeba8f99249a8316716)
Signed-off-by: Michael Armbrust <michael@databricks.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This PR resorts to `SparkContext.version` rather than META-INF/MANIFEST.MF in the assembly jar to determine the Spark version. Currently, when built with Maven, the MANIFEST.MF file in the assembly jar is incorrectly replaced by the Guava 15.0 MANIFEST.MF, probably because of the assembly/shading tricks.
Another related PR is #3103, which tries to fix the MANIFEST issue.
Author: Cheng Lian <lian@databricks.com>
Closes #3105 from liancheng/spark-4225 and squashes the following commits:
d9585e1 [Cheng Lian] Resorts to SparkContext.version to inspect Spark version
(cherry picked from commit 86e9eaa3f0ec23cb38bce67585adb2d5f484f4ee)
Signed-off-by: Michael Armbrust <michael@databricks.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This PR overrides the `GetInfo` Hive Thrift API to provide correct version information. Another property `spark.sql.hive.version` is added to reveal the underlying Hive version. These are generally useful for Spark SQL ODBC driver providers. The Spark version information is extracted from the jar manifest. Also took the chance to remove the `SET -v` hack, which was a workaround for Simba ODBC driver connectivity.
TODO
- [x] Find a general way to figure out Hive (or even any dependency) version.
This [blog post](http://blog.soebes.de/blog/2014/01/02/version-information-into-your-appas-with-maven/) suggests several methods to inspect the application version. In the case of Spark, this can be tricky because the chosen method:
1. must apply to both Maven and SBT builds.
For Maven builds, we can retrieve the version information from the META-INF/maven directory within the assembly jar. But this doesn't work for SBT builds.
2. must not rely on the original jars of dependencies to extract a specific dependency version, because Spark uses an assembly jar.
This implies we can't read the Hive version from Hive jar files, since the standard Spark distribution doesn't include them.
3. should play well with `SPARK_PREPEND_CLASSES` to ease local testing during development.
`SPARK_PREPEND_CLASSES` prevents classes from being loaded from the assembly jar, so we can't locate the jar file and read its manifest.
Given these, maybe the only reliable method is to generate a source file containing version information at build time. pwendell, do you have any suggestions from the perspective of the build process?
**Update** Hive version is now retrieved from the newly introduced `HiveShim` object.
Author: Cheng Lian <lian.cs.zju@gmail.com>
Author: Cheng Lian <lian@databricks.com>
Closes #2843 from liancheng/get-info and squashes the following commits:
a873d0f [Cheng Lian] Updates test case
53f43cd [Cheng Lian] Retrieves underlying Hive verson via HiveShim
1d282b8 [Cheng Lian] Removes the Simba ODBC "SET -v" hack
f857fce [Cheng Lian] Overrides Hive GetInfo Thrift API and adds Hive version property
|
|
|
|
|
|
|
|
|
|
| |
`CliSuite` has been flaky for a while. This PR tries to improve the situation by fixing a race condition in `CliSuite`. The `captureOutput` function captures both the stdout and stderr output of the forked external process in two background threads and searches for expected strings, but it was not properly synchronized before.
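The synchronization problem can be illustrated with a small sketch in which two reader threads share one buffer and one set of expected strings, all guarded by a single lock. This is not the Scala test code; `OutputWatcher` and `capture_output` are hypothetical names, and the streams are plain iterables rather than real process pipes.

```python
import threading


class OutputWatcher:
    """Shared state for several reader threads, guarded by one lock."""

    def __init__(self, expected):
        self._lock = threading.Lock()
        self._pending = set(expected)  # strings still to be seen
        self._lines = []
        self.all_found = threading.Event()
        if not self._pending:
            self.all_found.set()

    def feed(self, source, line):
        # Both the buffer and the pending set are only touched under the lock.
        with self._lock:
            self._lines.append((source, line))
            self._pending = {e for e in self._pending if e not in line}
            if not self._pending:
                self.all_found.set()

    def lines(self):
        with self._lock:
            return list(self._lines)


def capture_output(streams, expected, timeout=5.0):
    """Read every stream in its own thread and wait for all expected strings."""
    watcher = OutputWatcher(expected)

    def pump(name, stream):
        for line in stream:
            watcher.feed(name, line)

    threads = [threading.Thread(target=pump, args=(name, stream))
               for name, stream in streams.items()]
    for t in threads:
        t.start()
    found = watcher.all_found.wait(timeout)
    for t in threads:
        t.join()
    return found, watcher.lines()
```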
Author: Cheng Lian <lian@databricks.com>
Closes #3060 from liancheng/fix-cli-suite and squashes the following commits:
a70569c [Cheng Lian] Fixes race condition in CliSuite
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
optimizations
- Adds optional precision and scale to Spark SQL's decimal type, which behave similarly to those in Hive 13 (https://cwiki.apache.org/confluence/download/attachments/27362075/Hive_Decimal_Precision_Scale_Support.pdf)
- Replaces our internal representation of decimals with a Decimal class that can store small values in a mutable Long, saving memory in this situation and letting some operations happen directly on Longs
This is still marked WIP because there are a few TODOs, but I'll remove that tag when done.
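The idea of keeping small decimals in a Long can be sketched as follows. This is a simplified illustration and not the `Decimal` class from this PR: `CompactDecimal`, the 18-digit threshold name `MAX_LONG_DIGITS`, and the equal-scale restriction are assumptions made to keep the example short.

```python
from decimal import Decimal as BigDecimal, localcontext

# Assumption for this sketch: any value with at most 18 digits fits in a
# signed 64-bit integer, so it can live in a "long" instead of a big decimal.
MAX_LONG_DIGITS = 18


def _to_big(unscaled, scale):
    """Build an exact big decimal equal to unscaled / 10**scale."""
    sign = 1 if unscaled < 0 else 0
    digits = tuple(int(d) for d in str(abs(unscaled)))
    return BigDecimal((sign, digits, -scale))


class CompactDecimal:
    def __init__(self, unscaled, precision, scale):
        if len(str(abs(unscaled))) > precision:
            raise ValueError("value does not fit in the given precision")
        self.precision, self.scale = precision, scale
        if precision <= MAX_LONG_DIGITS:
            self.long_val, self.big_val = unscaled, None   # compact form
        else:
            self.long_val, self.big_val = None, _to_big(unscaled, scale)

    @property
    def is_compact(self):
        return self.long_val is not None

    def to_big_decimal(self):
        return _to_big(self.long_val, self.scale) if self.is_compact else self.big_val

    def add(self, other):
        if self.scale != other.scale:
            raise ValueError("this sketch only adds values with equal scale")
        precision = max(self.precision, other.precision) + 1
        if self.is_compact and other.is_compact and precision <= MAX_LONG_DIGITS:
            # Fast path: plain integer addition on the unscaled values.
            return CompactDecimal(self.long_val + other.long_val, precision, self.scale)
        with localcontext() as ctx:
            ctx.prec = precision  # enough digits to keep the sum exact
            total = self.to_big_decimal() + other.to_big_decimal()
            unscaled = int(total.scaleb(self.scale))
        return CompactDecimal(unscaled, precision, self.scale)
```

For example, `CompactDecimal(12345, 7, 2)` represents 123.45 with the unscaled value 12345 held as an integer.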
Author: Matei Zaharia <matei@databricks.com>
Closes #2983 from mateiz/decimal-1 and squashes the following commits:
35e6b02 [Matei Zaharia] Fix issues after merge
227f24a [Matei Zaharia] Review comments
31f915e [Matei Zaharia] Implement Davies's suggestions in Python
eb84820 [Matei Zaharia] Support reading/writing decimals as fixed-length binary in Parquet
4dc6bae [Matei Zaharia] Fix decimal support in PySpark
d1d9d68 [Matei Zaharia] Fix compile error and test issues after rebase
b28933d [Matei Zaharia] Support decimal precision/scale in Hive metastore
2118c0d [Matei Zaharia] Some test and bug fixes
81db9cb [Matei Zaharia] Added mutable Decimal that will be more efficient for small precisions
7af0c3b [Matei Zaharia] Add optional precision and scale to DecimalType, but use Unlimited for now
ec0a947 [Matei Zaharia] Make the result of AVG on Decimals be Decimal, not Double
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
`HiveThriftServer2` creates a global singleton `SessionState` instance and overrides `HiveContext` to inject the `SessionState` object. This messes up `SessionState` initialization and causes problems.
This PR replaces the global `SessionState` with `HiveContext.sessionState` to avoid the initialization conflict. `HiveContext` also reuses an existing, already started `SessionState` if there is one (this is required by `SparkSQLCLIDriver`, which uses a specialized `CliSessionState`).
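The "reuse an already started session if any" rule can be sketched like this. The classes and functions below are simplified, hypothetical stand-ins and do not mirror the real Hive or Spark APIs.

```python
_current_session = None  # stand-in for the session tracked as "started"


class SessionState:
    def __init__(self, name):
        self.name = name
        self.started = False


def start(state):
    """Mark a session as started and make it the current one."""
    global _current_session
    state.started = True
    _current_session = state


def current():
    return _current_session


class Context:
    """Hypothetical context that owns its session state lazily."""

    def __init__(self):
        self._session_state = None

    @property
    def session_state(self):
        if self._session_state is None:
            existing = current()
            if existing is not None:
                # Reuse whatever was started earlier (for example by a CLI driver).
                self._session_state = existing
            else:
                fresh = SessionState("default")
                start(fresh)  # started exactly once, on first access
                self._session_state = fresh
        return self._session_state
```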
Author: Cheng Lian <lian@databricks.com>
Closes #2887 from liancheng/spark-4037 and squashes the following commits:
8446675 [Cheng Lian] Removes redundant Driver initialization
a28fef5 [Cheng Lian] Avoid starting HiveContext.sessionState multiple times
49b1c5b [Cheng Lian] Reuses existing started SessionState if any
3cd6fab [Cheng Lian] Fixes SPARK-4037
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In #2241, hive-thriftserver is not enabled. This patch enables hive-thriftserver to support hive-0.13.1 by using a shim layer (refer to #2241).
1. A light shim layer (code in sql/hive-thriftserver/hive-version) for each Hive version to handle API compatibility.
2. New pom profiles "hive-default" and "hive-versions" (copied from #2241) to activate different Hive versions.
3. SBT commands for the different versions are as follows:
hive-0.12.0 --- sbt/sbt -Phive,hadoop-2.3 -Phive-0.12.0 assembly
hive-0.13.1 --- sbt/sbt -Phive,hadoop-2.3 -Phive-0.13.1 assembly
4. Since hive-thriftserver depends on the hive subproject, this patch should be merged with #2241 to enable hive-0.13.1 for hive-thriftserver.
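The shim-layer idea in item 1 can be sketched as one small class per supported version behind a common interface, selected by version string. Everything here is hypothetical (class names, `decimal_type_name`, `load_shim`); in the actual patch the shim is chosen at build time through the profiles above, not looked up at runtime.

```python
class Hive12Shim:
    """Hypothetical shim for the 0.12.0 API surface."""
    version = "0.12.0"

    def decimal_type_name(self, precision, scale):
        # Illustrative difference only: no precision/scale in the type name.
        return "decimal"


class Hive13Shim:
    """Hypothetical shim for the 0.13.1 API surface."""
    version = "0.13.1"

    def decimal_type_name(self, precision, scale):
        return f"decimal({precision},{scale})"


_SHIMS = {shim.version: shim for shim in (Hive12Shim, Hive13Shim)}


def load_shim(version):
    """Return the shim instance for a supported version string."""
    try:
        return _SHIMS[version]()
    except KeyError:
        raise ValueError(f"unsupported Hive version: {version}") from None
```

Callers then depend only on the shim interface, so version-specific differences stay in one place.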
Author: wangfei <wangfei1@huawei.com>
Author: scwf <wangfei1@huawei.com>
Closes #2685 from scwf/shim-thriftserver1 and squashes the following commits:
f26f3be [wangfei] remove clean to save time
f5cac74 [wangfei] remove local hivecontext test
578234d [wangfei] use new shaded hive
18fb1ff [wangfei] exclude kryo in hive pom
fa21d09 [wangfei] clean package assembly/assembly
8a4daf2 [wangfei] minor fix
0d7f6cf [wangfei] address comments
f7c93ae [wangfei] adding build with hive 0.13 before running tests
bcf943f [wangfei] Merge branch 'master' of https://github.com/apache/spark into shim-thriftserver1
c359822 [wangfei] reuse getCommandProcessor in hiveshim
52674a4 [scwf] sql/hive included since examples depend on it
3529e98 [scwf] move hive module to hive profile
f51ff4e [wangfei] update and fix conflicts
f48d3a5 [scwf] Merge branch 'master' of https://github.com/apache/spark into shim-thriftserver1
41f727b [scwf] revert pom changes
13afde0 [scwf] fix small bug
4b681f4 [scwf] enable thriftserver in profile hive-0.13.1
0bc53aa [scwf] fixed when result filed is null
dfd1c63 [scwf] update run-tests to run hive-0.12.0 default now
c6da3ce [scwf] Merge branch 'master' of https://github.com/apache/spark into shim-thriftserver
7c66b8e [scwf] update pom according spark-2706
ae47489 [scwf] update and fix conflicts
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the SQL statement is wrong, the console prints the error once.
For example:
<pre>
spark-sql> show tabless;
show tabless;
14/10/13 21:03:48 INFO ParseDriver: Parsing command: show tabless
............
at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:274)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:209)
at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
Caused by: org.apache.hadoop.hive.ql.parse.ParseException: line 1:5 cannot recognize input near 'show' 'tabless' '<EOF>' in ddl statement
at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:193)
at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:161)
at org.apache.spark.sql.hive.HiveQl$.getAst(HiveQl.scala:218)
at org.apache.spark.sql.hive.HiveQl$.createPlan(HiveQl.scala:226)
... 47 more
Time taken: 4.35 seconds
14/10/13 21:03:51 INFO CliDriver: Time taken: 4.35 seconds
</pre>
Author: wangxiaojing <u9jing@gmail.com>
Closes #2790 from wangxiaojing/spark-3940 and squashes the following commits:
e2e5c14 [wangxiaojing] sql Print the error code three times
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
JDBC server
Write the properties of hive-site.xml to HiveContext when initializing the session state in SparkSQLEnv.scala.
The SparkSQLEnv.init() method called in HiveThriftServer2.scala cannot write the properties of hive-site.xml to HiveContext, for example when the configuration property spark.sql.shuffle.partitions is added to hive-site.xml.
Author: luogankun <luogankun@gmail.com>
Closes #2800 from luogankun/SPARK-3945 and squashes the following commits:
3679efc [luogankun] [SPARK-3945]Write properties of hive-site.xml to HiveContext when initilize session state In SparkSQLEnv.scala
|