author | Zhichao Li <zhichao.li@intel.com> | 2015-09-22 19:41:57 -0700 |
---|---|---|
committer | Yin Huai <yhuai@databricks.com> | 2015-09-22 19:41:57 -0700 |
commit | 84f81e035e1dab1b42c36563041df6ba16e7b287 (patch) | |
tree | 36d06cf10253cc10a201bec6d2e26d7b44862e5e /sql/hive/src/test/resources | |
parent | 61d4c07f4becb42f054e588be56ed13239644410 (diff) | |
[SPARK-10310] [SQL] Fixes script transformation field/line delimiters
**Please attribute this PR to `Zhichao Li <zhichao.li@intel.com>`.**
This PR is based on PR #8476 authored by zhichao-li. It fixes SPARK-10310 by adding a field delimiter SerDe property to the default `LazySimpleSerDe`, and by enabling the default record reader/writer classes.
Currently, we only support `LazySimpleSerDe`, used together with `TextRecordReader` and `TextRecordWriter`, and don't support customizing record reader/writer using `RECORDREADER`/`RECORDWRITER` clauses. This should be addressed in separate PR(s).
Author: Cheng Lian <lian@databricks.com>
Closes #8860 from liancheng/spark-10310/fix-script-trans-delimiters.
Diffstat (limited to 'sql/hive/src/test/resources')
-rwxr-xr-x | sql/hive/src/test/resources/data/scripts/test_transform.py | 6 |
1 files changed, 6 insertions, 0 deletions
diff --git a/sql/hive/src/test/resources/data/scripts/test_transform.py b/sql/hive/src/test/resources/data/scripts/test_transform.py
new file mode 100755
index 0000000000..ac6d11d8b9
--- /dev/null
+++ b/sql/hive/src/test/resources/data/scripts/test_transform.py
@@ -0,0 +1,6 @@
+import sys
+
+delim = sys.argv[1]
+
+for row in sys.stdin:
+    print(delim.join([w + '#' for w in row[:-1].split(delim)]))
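
For readers who want to see what the test script does to a single record, here is a minimal sketch of its per-row logic. The function name `transform_row` and the sample inputs are hypothetical and are not part of the commit. The sketch uses `rstrip("\n")` where the script uses `row[:-1]`, so it behaves the same on newline-terminated input but will not drop a character from a final line that lacks a newline.

```python
def transform_row(row: str, delim: str) -> str:
    """Append '#' to every field of one input line.

    Illustrative helper (not in the commit) mirroring the loop body of
    test_transform.py: split on the delimiter, tag each field, re-join.
    """
    # Remove the trailing newline that each line read from stdin carries.
    fields = row.rstrip("\n").split(delim)
    # Re-join with the same delimiter so the field boundaries are preserved.
    return delim.join(field + "#" for field in fields)


if __name__ == "__main__":
    # Tab-delimited record, as a script transformation might pass on stdin.
    print(repr(transform_row("a\tb\tc\n", "\t")))
    # Comma-delimited record, to show the delimiter is a parameter.
    print(repr(transform_row("1,2\n", ",")))
```

If the delimiter passed to the script differs from the one used to write its input, `split` returns a single field and only one `#` is appended per row. That presumably makes the script a convenient check that the field delimiter is propagated correctly.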