author | Daniel Darabos <darabos.daniel@gmail.com> | 2015-07-16 08:16:54 +0100 |
---|---|---|
committer | Sean Owen <sowen@cloudera.com> | 2015-07-16 08:16:54 +0100 |
commit | 011551620faa87107a787530f074af3d9be7e695 (patch) | |
tree | 23d59a693288f5897e85b2834375a6f8cd474737 /pom.xml | |
parent | 0a795336df20c7ec969366e613286f0c060a4eeb (diff) | |
[SPARK-8893] Add runtime checks against non-positive number of partitions
https://issues.apache.org/jira/browse/SPARK-8893
> What does `sc.parallelize(1 to 3).repartition(p).collect` return? I would expect `Array(1, 2, 3)` regardless of `p`. But if `p` < 1, it returns `Array()`. I think instead it should throw an `IllegalArgumentException`.
> I think the case is pretty clear for `p` < 0. But the behavior for `p` = 0 is also error-prone. In fact, that's how I found this strange behavior. I used `rdd.repartition(a/b)` with positive `a` and `b`, but `a/b` was rounded down to zero and the results surprised me. I'd prefer an exception instead of unexpected (corrupt) results.
Author: Daniel Darabos <darabos.daniel@gmail.com>
Closes #7285 from darabos/patch-1 and squashes the following commits:
decba82 [Daniel Darabos] Allow repartitioning empty RDDs to zero partitions.
97de852 [Daniel Darabos] Allow zero partition count in HashPartitioner
f6ba5fb [Daniel Darabos] Use require() for simpler syntax.
d5e3df8 [Daniel Darabos] Require positive number of partitions in HashPartitioner
897c628 [Daniel Darabos] Require positive maxPartitions in CoalescedRDD
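
The check described above can be illustrated with a minimal, Spark-free sketch. It is not the code from this commit: `PartitionCheckSketch` and `requirePositivePartitions` are hypothetical names, and the error message wording is an assumption. It only shows how integer division can silently produce a zero partition count and how Scala's `require()` turns that into an `IllegalArgumentException`.

```scala
object PartitionCheckSketch {
  // Hypothetical helper, not part of Spark: rejects non-positive partition counts.
  def requirePositivePartitions(numPartitions: Int): Int = {
    // require() throws IllegalArgumentException when the condition is false.
    require(numPartitions > 0, s"Number of partitions ($numPartitions) must be positive.")
    numPartitions
  }

  def main(args: Array[String]): Unit = {
    val a = 3
    val b = 4
    // Int division truncates toward zero, so a / b is 0 here even though
    // both operands are positive.
    val p = a / b
    try {
      requirePositivePartitions(p)
    } catch {
      // The caller sees an explicit failure instead of an empty result.
      case e: IllegalArgumentException => println(e.getMessage)
    }
  }
}
```

Note that the squashed commits listed above also mention allowing a zero partition count in some cases (for example, repartitioning an empty RDD), so the checks actually merged may be less strict than this sketch.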
Diffstat (limited to 'pom.xml')
0 files changed, 0 insertions, 0 deletions