author | Davies Liu <davies.liu@gmail.com> | 2014-10-06 14:07:53 -0700 |
---|---|---|
committer | Josh Rosen <joshrosen@apache.org> | 2014-10-06 14:07:53 -0700 |
commit | 4f01265f7d62e070ba42c251255e385644c1b16c (patch) | |
tree | e6dbec031ebe0653ab232ac613548289c720eb48 /python/pyspark/mllib/tests.py | |
parent | 20ea54cc7a5176ebc63bfa9393a9bf84619bfc66 (diff) | |
[SPARK-3786] [PySpark] speedup tests
This patch tries to speed up the PySpark tests. It re-uses the SparkContext in tests.py and mllib/tests.py to reduce the overhead of creating a SparkContext, and removes some test cases that did not make sense. It also improves the performance of some cases, such as MergerTests and SortTests.
before this patch:
real 21m27.320s
user 4m42.967s
sys 0m17.343s
after this patch:
real 9m47.541s
user 2m12.947s
sys 0m14.543s
This cuts the wall-clock time by more than half (about 54%).
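As a quick check of the wall-clock figures above, the following sketch converts the two `real` timings to seconds and computes the reduction and the speedup factor. The helper name `to_seconds` is made up for this illustration; only the two timing strings come from the commit message.

```python
def to_seconds(stamp):
    """Convert a `time`-style stamp such as '21m27.320s' to seconds."""
    minutes, seconds = stamp.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


# `real` timings quoted in the commit message
before = to_seconds("21m27.320s")
after = to_seconds("9m47.541s")

reduction = 1 - after / before  # fraction of wall-clock time removed
speedup = before / after        # how many times faster the run is

if __name__ == "__main__":
    print(f"wall-clock reduction: {reduction:.1%}, speedup: {speedup:.2f}x")
```

The reduction comes out at roughly 54%, which corresponds to a speedup of a little over 2x.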
Author: Davies Liu <davies.liu@gmail.com>
Closes #2646 from davies/tests and squashes the following commits:
c54de60 [Davies Liu] revert change about memory limit
6a2a4b0 [Davies Liu] refactor of tests, speedup 100%
Diffstat (limited to 'python/pyspark/mllib/tests.py')
-rw-r--r-- | python/pyspark/mllib/tests.py | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/python/pyspark/mllib/tests.py b/python/pyspark/mllib/tests.py
index f72e88ba6e..5c20e100e1 100644
--- a/python/pyspark/mllib/tests.py
+++ b/python/pyspark/mllib/tests.py
@@ -32,7 +32,7 @@ else:
 from pyspark.serializers import PickleSerializer
 from pyspark.mllib.linalg import Vector, SparseVector, DenseVector, _convert_to_vector
 from pyspark.mllib.regression import LabeledPoint
-from pyspark.tests import PySparkTestCase
+from pyspark.tests import ReusedPySparkTestCase as PySparkTestCase
 _have_scipy = False
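The one-line change above swaps the base class through an import alias, so every existing test class in mllib/tests.py picks up the shared-context behaviour without further edits. The body of `ReusedPySparkTestCase` is not part of this diff, so the following is only a minimal sketch of the general technique, assuming it relies on the standard `unittest` class-level fixtures `setUpClass` and `tearDownClass`. `FakeContext`, `PerTestCase`, `ReusedCase`, and `count_contexts` are hypothetical names introduced for illustration; `FakeContext` stands in for a SparkContext so that the example runs without Spark installed.

```python
import unittest


class FakeContext:
    """Hypothetical stand-in for an expensive resource such as a SparkContext."""

    created = 0  # number of contexts constructed so far

    def __init__(self):
        FakeContext.created += 1
        self.stopped = False

    def stop(self):
        self.stopped = True


class PerTestCase(unittest.TestCase):
    """Creates a fresh context for every test method (the slow pattern)."""

    def setUp(self):
        self.sc = FakeContext()

    def tearDown(self):
        self.sc.stop()


class ReusedCase(unittest.TestCase):
    """Creates one context per test class and shares it across its tests."""

    @classmethod
    def setUpClass(cls):
        cls.sc = FakeContext()

    @classmethod
    def tearDownClass(cls):
        cls.sc.stop()


def count_contexts(base):
    """Run three trivial tests built on `base`; return how many contexts were created."""

    class Tests(base):
        def test_a(self):
            self.assertFalse(self.sc.stopped)

        def test_b(self):
            self.assertFalse(self.sc.stopped)

        def test_c(self):
            self.assertFalse(self.sc.stopped)

    FakeContext.created = 0
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(Tests)
    result = unittest.TestResult()
    suite.run(result)  # TestSuite.run also drives setUpClass/tearDownClass
    assert result.wasSuccessful()
    return FakeContext.created


if __name__ == "__main__":
    print("per-test contexts:", count_contexts(PerTestCase))
    print("reused contexts:  ", count_contexts(ReusedCase))
```

The trade-off of the shared fixture is that tests in the same class are no longer isolated from each other's side effects on the context, so this pattern fits tests that only read from it or clean up after themselves.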