| Commit message (Collapse) | Author | Age | Files | Lines |
| |
Author: Reynold Xin <rxin@databricks.com>
Closes #10764 from rxin/SPARK-12830.
| |
**Please attribute this PR to `Zhichao Li <zhichao.li@intel.com>`.**
This PR is based on PR #8476 authored by zhichao-li. It fixes SPARK-10310 by adding a field delimiter SerDe property to the default `LazySimpleSerDe`, and by enabling the default record reader/writer classes.
Currently, we only support `LazySimpleSerDe`, used together with `TextRecordReader` and `TextRecordWriter`, and don't support customizing record reader/writer using `RECORDREADER`/`RECORDWRITER` clauses. This should be addressed in separate PR(s).
Author: Cheng Lian <lian@databricks.com>
Closes #8860 from liancheng/spark-10310/fix-script-trans-delimiters.
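As a rough illustration of why the field delimiter matters for script transformation, the sketch below serializes rows to delimited text lines, hands them to a script, and splits the script's output back into fields. It is not Spark's or Hive's implementation: `serialize_row`, `deserialize_row`, `transform`, and `swap_columns` are hypothetical names, and the "script" is an in-process function standing in for an external process.

```python
def serialize_row(fields, field_delim="\t"):
    # Join one row's fields into a single text line (hypothetical helper).
    return field_delim.join(str(f) for f in fields)


def deserialize_row(line, field_delim="\t"):
    # Split one line of script output back into string fields.
    return line.rstrip("\n").split(field_delim)


def transform(rows, script, in_delim="\t", out_delim="\t"):
    # Feed delimited lines to the script, then parse whatever it emits.
    lines = [serialize_row(row, in_delim) for row in rows]
    return [deserialize_row(line, out_delim) for line in script(lines)]


def swap_columns(lines):
    # Stand-in for an external script that reads and writes comma-separated text.
    for line in lines:
        a, b = line.split(",")
        yield f"{b},{a}"


result = transform([(1, "x"), (2, "y")], swap_columns, in_delim=",", out_delim=",")
print(result)
```

If the delimiter used to write the rows does not match what the script splits on, the script sees each row as a single field, which is the kind of mismatch a configurable delimiter property avoids.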
| |
of Udf
Follow-up to #6902 that makes naming consistent between ```Udf``` and ```UDF```.
Author: BenFradet <benjamin.fradet@gmail.com>
Closes #6920 from BenFradet/SPARK-8478 and squashes the following commits:
c500f29 [BenFradet] renamed a few variables in functions to use UDF
8ab0f2d [BenFradet] renamed idUdf to idUDF in SQLQuerySuite
98696c2 [BenFradet] renamed originalUdfs in TestHive to originalUDFs
7738f74 [BenFradet] modified HiveUDFSuite to use only UDF
c52608d [BenFradet] renamed HiveUdfSuite to HiveUDFSuite
e51b9ac [BenFradet] renamed ExtractPythonUdfs to ExtractPythonUDFs
8c756f1 [BenFradet] renamed Hive UDF related code
2a1ca76 [BenFradet] renamed pythonUdfs to pythonUDFs
261e6fb [BenFradet] renamed ScalaUdf to ScalaUDF
| |
values
In `org.apache.hadoop.hive.serde2.io.TimestampWritable.set`, if the next entry is null, the current timestamp object is reset.
Because of this, `HiveInspectors.unwrap` cannot reuse the same timestamp object without creating a copy.
Author: Venkata Ramana G <ramana.gollamudi@huawei.com>
Author: Venkata Ramana Gollamudi <ramana.gollamudi@huawei.com>
Closes #3019 from gvramana/spark_4077 and squashes the following commits:
32d818f [Venkata Ramana Gollamudi] fixed check style
fa01e71 [Venkata Ramana Gollamudi] cloned timestamp object as org.apache.hadoop.hive.serde2.io.TimestampWritable.set will reset current time object
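The aliasing hazard described in this commit can be sketched in a few lines. This is an illustration only, not the Hive or Spark code: `ReusableTimestamp`, `unwrap_shared`, `unwrap_copied`, and `read_column` are hypothetical names, and the holder simply models an object that a reader overwrites in place for each entry.

```python
import copy


class ReusableTimestamp:
    """Hypothetical stand-in for a writable holder that a reader reuses across rows."""

    def __init__(self):
        self.millis = None

    def set(self, millis):
        # Overwrites the held value in place; a null entry resets it to None.
        self.millis = millis


def unwrap_shared(holder):
    # Hands out the holder itself, so later set() calls change what the caller sees.
    return holder


def unwrap_copied(holder):
    # Hands out an independent snapshot of the current value.
    return copy.copy(holder)


def read_column(entries, unwrap):
    holder = ReusableTimestamp()
    unwrapped = []
    for entry in entries:
        holder.set(entry)
        unwrapped.append(unwrap(holder))
    return [t.millis for t in unwrapped]


entries = [1000, 2000, None]
shared = read_column(entries, unwrap_shared)
copied = read_column(entries, unwrap_copied)
print(shared, copied)
```

With the shared holder, every unwrapped value reflects the last `set` call, so the trailing null wipes out the earlier timestamps; copying on unwrap preserves each value at the cost of one allocation per entry.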
| |
As part of the upgrade I also copy the newest version of the query tests, and whitelist a bunch of new ones that are now passing.
Author: Michael Armbrust <michael@databricks.com>
Closes #2936 from marmbrus/fix13tests and squashes the following commits:
d9cbdab [Michael Armbrust] Remove user specific tests
65801cd [Michael Armbrust] style and rat
8f6b09a [Michael Armbrust] Update test harness to work with both Hive 12 and 13.
f044843 [Michael Armbrust] Update Hive query tests and golden files to 0.13
| |
Author: Xi Liu <xil@conviva.com>
Closes #796 from xiliu82/sqlbug and squashes the following commits:
328dfc4 [Xi Liu] [Spark SQL] remove a temporary function after test
354386a [Xi Liu] [Spark SQL] add test suite for UDF on struct
8fc6f51 [Xi Liu] [SparkSQL] allow UDF on struct
| |
This PR removes our test dependence on files hosted at Berkeley by checking the test queries and answers into the repository. This should also fix the Maven Jenkins build.
I realize this is a *giant* commit, but size-wise it's actually pretty small. We are only looking at ~1.2 MB compressed (~30 MB uncompressed). Given that we already have a ~80 MB file permanently added to the Spark code lineage, I do not think that this will change the developer experience significantly.
Furthermore, I think it is good engineering practice to consider such test support files as "code", since changes to them would indicate a change in functionality. These files were only excluded from the initial PR as I wanted the diff to be readable.
Author: Michael Armbrust <michael@databricks.com>
Closes #199 from marmbrus/hiveTestFiles and squashes the following commits:
b9b9b17 [Michael Armbrust] Add hive test files to repository. Remove download script.