author Joseph K. Bradley <joseph@databricks.com> 2016-04-16 11:23:28 -0700
committer Joseph K. Bradley <joseph@databricks.com> 2016-04-16 11:23:28 -0700
commit 36da5e323487aa851a45475109185b9b0653db75
tree 3088ccde1eeefa430d8ca58dcdadc6e8caa64126 /python/pyspark/ml/param
parent 9f678e97549b19d6d979b22fa4079094ce9fb2c0
[SPARK-14605][ML][PYTHON] Changed Python to use unicode UIDs for spark.ml Identifiable
## What changes were proposed in this pull request?

Python spark.ml Identifiable classes use UIDs of type str, but they should use unicode (in Python 2.x) to match Java. This could be a problem if someone created a class in Java with odd unicode characters, saved it, and loaded it in Python.

This PR: Use unicode everywhere in Python.

## How was this patch tested?

Updated persistence unit test to check uid type

Author: Joseph K. Bradley <joseph@databricks.com>

Closes #12368 from jkbradley/python-uid-unicode.
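
The str/unicode mismatch described above can be illustrated with a minimal sketch. The names `text_type` and `normalize_uid` are assumptions made for this example and are not part of pyspark; the sketch only shows how a UID might be coerced to a text type on both Python 2 and Python 3.

```python
import sys

# Python 3 has no separate `unicode` type; `str` is already a text type.
# `text_type` is a name chosen for this sketch, not part of pyspark.
text_type = str if sys.version_info[0] >= 3 else unicode  # noqa: F821


def normalize_uid(uid):
    """Return the uid as a text string (hypothetical helper)."""
    return text_type(uid)


uid_from_java = u"MyStage_\u00e9_4b1f"   # uid containing a non-ASCII character
uid_plain = "LogisticRegression_4a1b"    # uid built from an ordinary literal

# Both uids end up with the same text type after normalization.
assert isinstance(normalize_uid(uid_from_java), text_type)
assert isinstance(normalize_uid(uid_plain), text_type)
```

Coercing at a single entry point, as the patch does, means code that later compares or stores UIDs does not have to handle two string types.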
Diffstat (limited to 'python/pyspark/ml/param')
-rw-r--r-- python/pyspark/ml/param/__init__.py | 3
1 files changed, 2 insertions, 1 deletions
diff --git a/python/pyspark/ml/param/__init__.py b/python/pyspark/ml/param/__init__.py
index 9f0b063aac..40d8300625 100644
--- a/python/pyspark/ml/param/__init__.py
+++ b/python/pyspark/ml/param/__init__.py
@@ -485,10 +485,11 @@ class Params(Identifiable):
         Changes the uid of this instance. This updates both
         the stored uid and the parent uid of params and param maps.
         This is used by persistence (loading).
-        :param newUid: new uid to use
+        :param newUid: new uid to use, which is converted to unicode
         :return: same instance, but with the uid and Param.parent values
                  updated, including within param maps
         """
+        newUid = unicode(newUid)
         self.uid = newUid
         newDefaultParamMap = dict()
         newParamMap = dict()
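
The docstring in this hunk describes updating both the stored uid and the parent uid of each param. A simplified sketch of that flow follows. `Param`, `TinyParams`, and `reset_uid` here are hypothetical stand-ins written for illustration, not the pyspark classes, and the sketch uses `str` where the Python 2 code in the patch calls `unicode`.

```python
class Param(object):
    """Simplified stand-in for a parameter that records its parent's uid."""

    def __init__(self, parent_uid, name):
        self.parent = parent_uid
        self.name = name


class TinyParams(object):
    """Hypothetical, trimmed-down holder of params; not the pyspark class."""

    def __init__(self, uid):
        self.uid = str(uid)
        self.maxIter = Param(self.uid, "maxIter")
        self._paramMap = {self.maxIter: 10}

    def reset_uid(self, new_uid):
        # Coerce first so every later use sees the text form of the uid.
        new_uid = str(new_uid)  # the patch uses unicode(newUid) on Python 2
        self.uid = new_uid
        new_param_map = dict()
        for param, value in self._paramMap.items():
            param.parent = new_uid        # keep Param.parent in sync
            new_param_map[param] = value  # carry the stored value over
        self._paramMap = new_param_map
        return self


stage = TinyParams("stage_old")
same = stage.reset_uid("stage_new")
```

Converting before the assignment to `self.uid` matters because the converted value is also what gets written into each param's parent, so the two stay the same type.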