author Tarek Auel <tarek.auel@gmail.com> 2015-06-29 11:57:19 -0700
committer Davies Liu <davies@databricks.com> 2015-06-29 11:57:19 -0700
commit a5c2961caaafd751f11bdd406bb6885443d7572e
tree 8cdb6288d459f82e155e4510baa0a2523a76b6ad /python
parent 3664ee25f0a67de5ba76e9487a55a55216ae589f
[SPARK-8235] [SQL] misc function sha / sha1
Jira: https://issues.apache.org/jira/browse/SPARK-8235

I added support for sha1. If I understood rxin correctly, sha and sha1 should execute the same algorithm, shouldn't they?

Please take a close look at the Python part. This is adopted from #6934

Author: Tarek Auel <tarek.auel@gmail.com>
Author: Tarek Auel <tarek.auel@googlemail.com>

Closes #6963 from tarekauel/SPARK-8235 and squashes the following commits:

f064563 [Tarek Auel] change to shaHex
7ce3cdc [Tarek Auel] rely on automatic cast
a1251d6 [Tarek Auel] Merge remote-tracking branch 'upstream/master' into SPARK-8235
68eb043 [Tarek Auel] added docstring
be5aff1 [Tarek Auel] improved error message
7336c96 [Tarek Auel] added type check
cf23a80 [Tarek Auel] simplified example
ebf75ef [Tarek Auel] [SPARK-8301] updated the python documentation. Removed sha in python and scala
6d6ff0d [Tarek Auel] [SPARK-8233] added docstring
ea191a9 [Tarek Auel] [SPARK-8233] fixed signature of python function. Added expected type to misc
e3fd7c3 [Tarek Auel] SPARK[8235] added sha to the list of __all__
e5dad4e [Tarek Auel] SPARK[8235] sha / sha1
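For reviewers who want to check the doctest value in this patch without a Spark session, here is a minimal sketch using only Python's standard `hashlib`. The helper name `sha1_hex` is hypothetical and not part of PySpark, and the sketch assumes string input is hashed as its UTF-8 bytes; it only illustrates what "hex string result of SHA-1" means for the input 'ABC'.

```python
import hashlib


def sha1_hex(value):
    """Return the SHA-1 digest of `value` as a lowercase hex string.

    Hypothetical helper for illustration only; it is not the PySpark
    implementation, which delegates to the JVM.
    """
    if isinstance(value, str):
        # Assumption: text is hashed as its UTF-8 encoded bytes.
        value = value.encode("utf-8")
    return hashlib.sha1(value).hexdigest()


# Same input as the doctest added in this commit.
print(sha1_hex("ABC"))  # 3c01bdbb26f358bab27f267924aa2c9a03fcfdb8
```

The printed digest is the same 40-character hex string that the new `sha1` doctest expects for the single-row DataFrame containing 'ABC'.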
Diffstat (limited to 'python')
-rw-r--r-- python/pyspark/sql/functions.py 14
1 files changed, 14 insertions, 0 deletions
diff --git a/python/pyspark/sql/functions.py b/python/pyspark/sql/functions.py
index 7d3d036161..45ecd826bd 100644
--- a/python/pyspark/sql/functions.py
+++ b/python/pyspark/sql/functions.py
@@ -42,6 +42,7 @@ __all__ = [
     'monotonicallyIncreasingId',
     'rand',
     'randn',
+    'sha1',
     'sha2',
     'sparkPartitionId',
     'struct',
@@ -382,6 +383,19 @@ def sha2(col, numBits):
     return Column(jc)


+@ignore_unicode_prefix
+@since(1.5)
+def sha1(col):
+    """Returns the hex string result of SHA-1.
+
+    >>> sqlContext.createDataFrame([('ABC',)], ['a']).select(sha1('a').alias('hash')).collect()
+    [Row(hash=u'3c01bdbb26f358bab27f267924aa2c9a03fcfdb8')]
+    """
+    sc = SparkContext._active_spark_context
+    jc = sc._jvm.functions.sha1(_to_java_column(col))
+    return Column(jc)
+
+
 @since(1.4)
 def sparkPartitionId():
     """A column for partition ID of the Spark task.