StatCounter on NumPy arrays [PYSPARK][SPARK-2012] - spark


author	Jeremy Freeman <the.freeman.lab@gmail.com>	2014-08-01 22:33:25 -0700
committer	Josh Rosen <joshrosen@apache.org>	2014-08-01 22:33:25 -0700
commit	4bc3bb29a4b6ab24b6b7e1f8df26414c41c80ace (patch)
tree	ece0def1b321943074f43a6670040c02711604e3 /project
parent	fda475987f3b8b37d563033b0e45706ce433824a (diff)

StatCounter on NumPy arrays [PYSPARK][SPARK-2012]

These changes allow StatCounters to work properly on NumPy arrays, to fix the issue reported here (https://issues.apache.org/jira/browse/SPARK-2012).

If NumPy is installed, the NumPy functions ``maximum``, ``minimum``, and ``sqrt``, which work on arrays, are used to merge statistics. If not, we fall back on scalar operators, so it will work on arrays with NumPy, but will also work without NumPy.

New unit tests added, along with a check for NumPy in the tests.

Author: Jeremy Freeman <the.freeman.lab@gmail.com>

Closes #1725 from freeman-lab/numpy-max-statcounter and squashes the following commits:

fe973b1 [Jeremy Freeman] Avoid duplicate array import in tests
7f0e397 [Jeremy Freeman] Refactored check for numpy
8e764dd [Jeremy Freeman] Explicit numpy imports
875414c [Jeremy Freeman] Fixed indents
1c8a832 [Jeremy Freeman] Unit tests for StatCounter with NumPy arrays
176a127 [Jeremy Freeman] Use numpy arrays in StatCounter
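
For readers without the diff at hand, the fallback described in the commit message can be sketched roughly as follows. This is an illustrative sketch, not the code from the patch: the class name ``MiniStatCounter`` and its attributes are hypothetical, and the running-variance update (Welford's method) is an assumption chosen only to keep the example short. The part that mirrors the commit message is the import guard, which uses NumPy's element-wise ``maximum``, ``minimum``, and ``sqrt`` when available and falls back to scalar operators otherwise.

```python
# Prefer NumPy's element-wise functions; fall back to scalar
# operators when NumPy is not installed.
try:
    from numpy import maximum, minimum, sqrt
except ImportError:
    maximum = max
    minimum = min
    from math import sqrt


class MiniStatCounter(object):
    """Hypothetical, simplified counter (not PySpark's StatCounter)."""

    def __init__(self, values=()):
        self.n = 0                      # number of merged values
        self.mu = 0.0                   # running mean
        self.m2 = 0.0                   # running sum of squared deviations
        self.max_value = float("-inf")
        self.min_value = float("inf")
        for v in values:
            self.merge(v)

    def merge(self, value):
        # Welford-style update; every operation here is element-wise
        # when `value` is a NumPy array, scalar otherwise.
        delta = value - self.mu
        self.n += 1
        self.mu = self.mu + delta / self.n
        self.m2 = self.m2 + delta * (value - self.mu)
        # The builtin max/min would fail on arrays, which is why the
        # NumPy versions are preferred when they can be imported.
        self.max_value = maximum(self.max_value, value)
        self.min_value = minimum(self.min_value, value)
        return self

    def stdev(self):
        # Population standard deviation of the merged values.
        if self.n == 0:
            return float("nan")
        return sqrt(self.m2 / self.n)
```

With scalars, ``MiniStatCounter([1.0, 2.0, 3.0, 4.0])`` behaves the same with or without NumPy. With NumPy installed, passing a sequence of equal-length arrays yields per-element statistics, since ``maximum`` and ``minimum`` compare arrays element by element.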

Diffstat (limited to 'project')

0 files changed, 0 insertions, 0 deletions

