author | zsxwing <zsxwing@gmail.com> | 2014-12-22 14:26:28 -0800 |
---|---|---|
committer | Josh Rosen <joshrosen@databricks.com> | 2014-12-22 14:26:28 -0800 |
commit | c233ab3d8d75a33495298964fe73dbf7dd8fe305 (patch) | |
tree | 49811eb00136741ab5d320ed7d60519561b177f8 /bin | |
parent | de9d7d2b5b6d80963505571700e83779fd98f850 (diff) | |
[SPARK-4818][Core] Add 'iterator' to reduce memory consumed by join
In Scala, `map` and `flatMap` on an `Iterable` are strict: they evaluate every element immediately and copy the results into a new collection (here a `Seq`). For example,
```Scala
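// Strict: map runs over every element right away and builds a new Seq.
val iterable = Seq(1, 2, 3).map(v => {
  println(v)
  v
})
println("Iterable map done")
// Lazy: map on an Iterator does nothing until the iterator is consumed.
val iterator = Seq(1, 2, 3).iterator.map(v => {
  println(v)
  v
})
println("Iterator map done")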
```
outputs
```
1
2
3
Iterable map done
Iterator map done
```
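The `Iterator` version prints nothing because its `map` is lazy and the iterator is never consumed. So we should use 'iterator' in join to avoid materializing intermediate collections and reduce the memory consumed.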
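As a rough illustration of why this matters for join, the sketch below pairs up two groups of values for one key, first strictly and then through `iterator`. The object and method names (`IteratorJoinSketch`, `strictPairs`, `lazyPairs`) are hypothetical and are not taken from the Spark source; this only mirrors the idea, not the actual patch.

```scala
object IteratorJoinSketch {
  // Strict: the for-comprehension over Iterables builds the whole
  // cross product as a new collection before returning.
  def strictPairs[V, W](vs: Iterable[V], ws: Iterable[W]): Iterable[(V, W)] =
    for (v <- vs; w <- ws) yield (v, w)

  // Lazy: going through `iterator` yields pairs one at a time as the
  // result is consumed, so the full cross product is never held at once.
  def lazyPairs[V, W](vs: Iterable[V], ws: Iterable[W]): Iterator[(V, W)] =
    for (v <- vs.iterator; w <- ws.iterator) yield (v, w)

  def main(args: Array[String]): Unit = {
    val vs = Seq(1, 2)
    val ws = Seq("a", "b")
    println(strictPairs(vs, ws))      // already fully built
    val pairs = lazyPairs(vs, ws)     // nothing computed yet
    println(pairs.next())             // computes only the first pair
  }
}
```

Both methods produce the same pairs in the same order; the difference is only when the pairs are computed and how many of them are held in memory at a time.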
Found by Johannes Simon in http://mail-archives.apache.org/mod_mbox/spark-user/201412.mbox/%3C5BE70814-9D03-4F61-AE2C-0D63F2DE4446%40mail.de%3E
Author: zsxwing <zsxwing@gmail.com>
Closes #3671 from zsxwing/SPARK-4824 and squashes the following commits:
48ee7b9 [zsxwing] Remove the explicit types
95d59d6 [zsxwing] Add 'iterator' to reduce memory consumed by join