author | Doris Xin <doris.s.xin@gmail.com> | 2014-07-31 20:32:57 -0700 |
---|---|---|
committer | Xiangrui Meng <meng@databricks.com> | 2014-07-31 20:32:57 -0700 |
commit | d8430148ee1f6ba02569db0538eeae473a32c78e (patch) | |
tree | d5103a5bc8f3068c48e0d581abe515560c1ecfe5 /python/pyspark/__init__.py | |
parent | 8f51491ea78d8e88fc664c2eac3b4ac14226d98f (diff) | |
[SPARK-2724] Python version of RandomRDDGenerators
Python version of RandomRDDGenerators, but without support for randomRDD and randomVectorRDD, which take an arbitrary DistributionGenerator.
`randomRDD.py` is named to avoid collision with the built-in Python `random` package.
Author: Doris Xin <doris.s.xin@gmail.com>
Closes #1628 from dorx/pythonRDD and squashes the following commits:
55c6de8 [Doris Xin] review comments. all python units passed.
f831d9b [Doris Xin] moved default args logic into PythonMLLibAPI
2d73917 [Doris Xin] fix for linalg.py
8663e6a [Doris Xin] reverting back to a single python file for random
f47c481 [Doris Xin] docs update
687aac0 [Doris Xin] add RandomRDDGenerators.py to run-tests
4338f40 [Doris Xin] renamed randomRDD to rand and import as random
29d205e [Doris Xin] created mllib.random package
bd2df13 [Doris Xin] typos
07ddff2 [Doris Xin] units passed.
23b2ecd [Doris Xin] WIP
Diffstat (limited to 'python/pyspark/__init__.py')
-rw-r--r-- | python/pyspark/__init__.py | 10 |
1 files changed, 10 insertions, 0 deletions
diff --git a/python/pyspark/__init__.py b/python/pyspark/__init__.py
index 312c75d112..c58555fc9d 100644
--- a/python/pyspark/__init__.py
+++ b/python/pyspark/__init__.py
@@ -49,6 +49,16 @@ Hive:
 Main entry point for accessing data stored in Apache Hive..
 """
 
+# The following block allows us to import python's random instead of mllib.random for scripts in
+# mllib that depend on top level pyspark packages, which transitively depend on python's random.
+# Since Python's import logic looks for modules in the current package first, we eliminate
+# mllib.random as a candidate for C{import random} by removing the first search path, the script's
+# location, in order to force the loader to look in Python's top-level modules for C{random}.
+import sys
+s = sys.path.pop(0)
+import random
+sys.path.insert(0, s)
+
 from pyspark.conf import SparkConf
 from pyspark.context import SparkContext
 from pyspark.sql import SQLContext
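The `sys.path` manipulation in the diff above can be tried in isolation. The following is a minimal sketch, not code from this commit: the helper `import_skipping_first_path` and the temporary directory are hypothetical, and the standard-library module `colorsys` stands in for `random` only because it is unlikely to be imported already. It assumes, as the comment in the diff describes, that the shadowing file lives in the first entry of `sys.path`.

```python
import importlib
import os
import shutil
import sys
import tempfile


def import_skipping_first_path(name):
    """Import `name` with sys.path[0] temporarily removed, then restore it.

    Hypothetical helper mirroring the pop / import / insert sequence in the diff.
    """
    first = sys.path.pop(0)
    try:
        return importlib.import_module(name)
    finally:
        sys.path.insert(0, first)


# Hypothetical setup: a directory holding a file that shadows the stdlib
# module `colorsys` (standing in for mllib/random.py shadowing `random`).
shadow_dir = tempfile.mkdtemp()
with open(os.path.join(shadow_dir, "colorsys.py"), "w") as f:
    f.write("IS_SHADOW = True\n")

sys.path.insert(0, shadow_dir)      # mimic the script's directory being first
importlib.invalidate_caches()
sys.modules.pop("colorsys", None)   # make sure no cached copy is reused

# A plain import resolves to the local file because its directory comes first.
shadowed = importlib.import_module("colorsys")
shadowed_is_local = hasattr(shadowed, "IS_SHADOW")

# With the first path entry removed, the import falls through to the stdlib.
sys.modules.pop("colorsys", None)
real = import_skipping_first_path("colorsys")
real_is_stdlib = hasattr(real, "rgb_to_hsv") and not hasattr(real, "IS_SHADOW")

# The path is restored, and later imports reuse the module cached in sys.modules.
path_restored = sys.path[0] == shadow_dir
cached = importlib.import_module("colorsys")

# Clean up the hypothetical shadow directory.
sys.path.remove(shadow_dir)
shutil.rmtree(shadow_dir, ignore_errors=True)
```

Once the standard-library module has been imported this way, it stays in `sys.modules`, so a later `import` of the same name returns it even though the shadowing directory is back at the front of `sys.path`. That caching appears to be what the added block relies on for `random`.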