[SPARK-18857][SQL] Don't use `Iterator.duplicate` for `incrementalCollect` in Thrift Server - spark


author	Dongjoon Hyun <dongjoon@apache.org>	2017-01-10 13:27:55 +0000
committer	Sean Owen <sowen@cloudera.com>	2017-01-10 13:27:55 +0000
commit	a2c6adcc5d2702d2f0e9b239517353335e5f911e (patch)
tree	6691ef8d2fc499df589622e256fc60eb5e82b5d8 /docs/contributing-to-spark.md
parent	2cfd41ac02193aaf121afcddcb6383f4d075ea1e (diff)
download	spark-a2c6adcc5d2702d2f0e9b239517353335e5f911e.tar.gz spark-a2c6adcc5d2702d2f0e9b239517353335e5f911e.tar.bz2 spark-a2c6adcc5d2702d2f0e9b239517353335e5f911e.zip


## What changes were proposed in this pull request?

To support `FETCH_FIRST`, SPARK-16563 used Scala's `Iterator.duplicate`. However, `Iterator.duplicate` uses a **queue to buffer all items between both iterators**, which causes GC pressure and hangs for queries that return a large number of rows. We should not use it, especially with `spark.sql.thriftServer.incrementalCollect`.

https://github.com/scala/scala/blob/2.12.x/src/library/scala/collection/Iterator.scala#L1262-L1300

## How was this patch tested?

Pass the existing tests.

Author: Dongjoon Hyun <dongjoon@apache.org>

Closes #16440 from dongjoon-hyun/SPARK-18857.
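
As an illustration of the idea, the following is a minimal sketch (not the code in this patch) of one way to support `FETCH_FIRST` without `Iterator.duplicate`: keep a factory that can produce a fresh iterator and call it again on rewind, so nothing is buffered between two live iterators. The class `RestartableCursor` and its methods `fetchNext` and `fetchFirst` are hypothetical names chosen for this example, and the sketch assumes the underlying result can be iterated again from the start.

```scala
// Hypothetical helper for illustration only; not a class from the Spark code base.
// Assumption: `newIterator` can be invoked repeatedly, each call yielding the rows
// from the beginning (for example by re-running a lazy computation).
final class RestartableCursor[A](newIterator: () => Iterator[A]) {
  private var iter: Iterator[A] = newIterator()

  // Rewind by discarding the current iterator and asking for a fresh one.
  // No queue of already-consumed rows is kept, unlike Iterator.duplicate.
  def fetchFirst(): Unit = {
    iter = newIterator()
  }

  // Return up to `maxRows` rows and advance the cursor past them.
  def fetchNext(maxRows: Int): List[A] = {
    val rows = List.newBuilder[A]
    var n = 0
    while (n < maxRows && iter.hasNext) {
      rows += iter.next()
      n += 1
    }
    rows.result()
  }
}

object RestartableCursorDemo {
  def main(args: Array[String]): Unit = {
    val cursor = new RestartableCursor[Int](() => Iterator.range(0, 5))
    println(cursor.fetchNext(2)) // first two rows
    cursor.fetchFirst()          // rewind to the start
    println(cursor.fetchNext(3)) // rows are read from the beginning again
  }
}
```

The trade-off in this sketch is that rewinding costs a new pass over the source instead of memory proportional to the number of rows already read, which matters most when rows are collected incrementally.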

Diffstat (limited to 'docs/contributing-to-spark.md')

0 files changed, 0 insertions, 0 deletions

