aboutsummaryrefslogtreecommitdiff
path: root/python/pyspark/sql/functions.py
diff options
context:
space:
mode:
authorLiang-Chi Hsieh <viirya@gmail.com>2015-06-17 23:31:30 -0700
committerReynold Xin <rxin@databricks.com>2015-06-17 23:31:30 -0700
commitfee3438a32136a8edbca71efb566965587a88826 (patch)
tree2fa725811cd231a35130bcc0e75b22df1257c73d /python/pyspark/sql/functions.py
parent78a430ea4d2aef58a8bf38ce488553ca6acea428 (diff)
downloadspark-fee3438a32136a8edbca71efb566965587a88826.tar.gz
spark-fee3438a32136a8edbca71efb566965587a88826.tar.bz2
spark-fee3438a32136a8edbca71efb566965587a88826.zip
[SPARK-8218][SQL] Add binary log math function
JIRA: https://issues.apache.org/jira/browse/SPARK-8218 Because there is already `log` unary function defined, the binary log function is called `logarithm` for now. Author: Liang-Chi Hsieh <viirya@gmail.com> Closes #6725 from viirya/expr_binary_log and squashes the following commits: bf96bd9 [Liang-Chi Hsieh] Compare log result in string. 102070d [Liang-Chi Hsieh] Round log result to better comparing in python test. fd01863 [Liang-Chi Hsieh] For comments. beed631 [Liang-Chi Hsieh] Merge remote-tracking branch 'upstream/master' into expr_binary_log 6089d11 [Liang-Chi Hsieh] Remove unnecessary override. 8cf37b7 [Liang-Chi Hsieh] For comments. bc89597 [Liang-Chi Hsieh] For comments. db7dc38 [Liang-Chi Hsieh] Use ctor instead of companion object. 0634ef7 [Liang-Chi Hsieh] Merge remote-tracking branch 'upstream/master' into expr_binary_log 1750034 [Liang-Chi Hsieh] Merge remote-tracking branch 'upstream/master' into expr_binary_log 3d75bfc [Liang-Chi Hsieh] Fix scala style. 5b39c02 [Liang-Chi Hsieh] Merge remote-tracking branch 'upstream/master' into expr_binary_log 23c54a3 [Liang-Chi Hsieh] Fix scala style. ebc9929 [Liang-Chi Hsieh] Let Logarithm accept one parameter too. 605574d [Liang-Chi Hsieh] Merge remote-tracking branch 'upstream/master' into expr_binary_log 21c3bfd [Liang-Chi Hsieh] Fix scala style. c6c187f [Liang-Chi Hsieh] For comments. c795342 [Liang-Chi Hsieh] Merge remote-tracking branch 'upstream/master' into expr_binary_log f373bac [Liang-Chi Hsieh] Add binary log expression.
Diffstat (limited to 'python/pyspark/sql/functions.py')
-rw-r--r--python/pyspark/sql/functions.py18
1 files changed, 17 insertions, 1 deletions
diff --git a/python/pyspark/sql/functions.py b/python/pyspark/sql/functions.py
index bbf465aca8..177fc196e0 100644
--- a/python/pyspark/sql/functions.py
+++ b/python/pyspark/sql/functions.py
@@ -18,6 +18,7 @@
"""
A collections of builtin functions
"""
+import math
import sys
if sys.version < "3":
@@ -143,7 +144,7 @@ _binary_mathfunctions = {
'atan2': 'Returns the angle theta from the conversion of rectangular coordinates (x, y) to' +
'polar coordinates (r, theta).',
'hypot': 'Computes `sqrt(a^2^ + b^2^)` without intermediate overflow or underflow.',
- 'pow': 'Returns the value of the first argument raised to the power of the second argument.'
+ 'pow': 'Returns the value of the first argument raised to the power of the second argument.',
}
_window_functions = {
@@ -404,6 +405,21 @@ def when(condition, value):
@since(1.4)
+def log(col, base=math.e):
+ """Returns the first argument-based logarithm of the second argument.
+
+ >>> df.select(log(df.age, 10.0).alias('ten')).map(lambda l: str(l.ten)[:7]).collect()
+ ['0.30102', '0.69897']
+
+ >>> df.select(log(df.age).alias('e')).map(lambda l: str(l.e)[:7]).collect()
+ ['0.69314', '1.60943']
+ """
+ sc = SparkContext._active_spark_context
+ jc = sc._jvm.functions.log(base, _to_java_column(col))
+ return Column(jc)
+
+
+@since(1.4)
def lag(col, count=1, default=None):
"""
Window function: returns the value that is `offset` rows before the current row, and