[SPARK-4176] [SQL] Supports decimal types with precision > 18 in Parquet - spark

author	Rene Treffer <treffer+github@measite.de>	2015-07-27 23:29:40 +0800
committer	Cheng Lian <lian@databricks.com>	2015-07-27 23:29:40 +0800
commit	aa19c696e25ebb07fd3df110cfcbcc69954ce335 (patch)
tree	f8d93995a6b7c91a799fe6529578bdcdba0eaff1 /docs/graphx-programming-guide.md
parent	622838165756e9669cbf7af13eccbc719638f40b (diff)

[SPARK-4176] [SQL] Supports decimal types with precision > 18 in Parquet

This PR is based on #6796 authored by rtreffer.

To support large decimal precisions (> 18), we do the following things in this PR:

1. Making `CatalystSchemaConverter` support large decimal precision

   Decimal types with large precision are always converted to fixed-length byte arrays.

2. Making `CatalystRowConverter` support reading decimal values with large precision

   When the precision is > 18, it constructs `Decimal` values with an unscaled `BigInteger` rather than an unscaled `Long`.

3. Making `RowWriteSupport` support writing decimal values with large precision

   In this PR we always write decimals as fixed-length byte arrays, because the Parquet write path hasn't been refactored to conform to the Parquet format spec (see SPARK-6774 & SPARK-8848).

Two follow-up tasks should be done in future PRs:

- [ ] Writing decimals as `INT32`, `INT64` when possible while fixing SPARK-8848
- [ ] Adding compatibility tests as part of SPARK-5463

Author: Cheng Lian <lian@databricks.com>

Closes #7455 from liancheng/spark-4176 and squashes the following commits:

a543d10 [Cheng Lian] Fixes errors introduced while rebasing
9e31cdf [Cheng Lian] Supports decimals with precision > 18 for Parquet
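The fixed-length byte array encoding described above can be illustrated with a minimal Python sketch. This is not Spark's code: the function names (`min_bytes_for_precision`, `encode_unscaled`, `decode_decimal`) are hypothetical, and the sketch assumes the layout the Parquet format spec describes for decimals, namely the unscaled value stored as big-endian two's complement. Python integers are arbitrary precision, so the `Long` versus `BigInteger` distinction from step 2 does not appear here.

```python
from decimal import Decimal


def min_bytes_for_precision(precision: int) -> int:
    """Smallest byte count whose signed range can hold 10**precision - 1."""
    num_bytes = 1
    while 2 ** (8 * num_bytes - 1) < 10 ** precision:
        num_bytes += 1
    return num_bytes


def encode_unscaled(unscaled: int, precision: int) -> bytes:
    """Write an unscaled decimal value as a fixed-length big-endian byte array."""
    if abs(unscaled) >= 10 ** precision:
        raise ValueError("value does not fit in the declared precision")
    width = min_bytes_for_precision(precision)
    return unscaled.to_bytes(width, "big", signed=True)


def decode_decimal(raw: bytes, scale: int) -> Decimal:
    """Read the byte array back and apply the scale without rounding."""
    unscaled = int.from_bytes(raw, "big", signed=True)
    sign = 1 if unscaled < 0 else 0
    digits = tuple(int(d) for d in str(abs(unscaled)))
    # The tuple constructor is exact and ignores the context precision.
    return Decimal((sign, digits, -scale))


# Hypothetical usage: a decimal(24, 2) value, too wide for a 64-bit long.
raw = encode_unscaled(-123456789012345678901234, precision=24)
value = decode_decimal(raw, scale=2)
```

Under these assumptions, a precision of 18 needs 8 bytes (the width of a 64-bit long), while any larger precision needs more, which is why the read path can no longer rely on an unscaled `Long`.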


