aboutsummaryrefslogtreecommitdiff
path: root/python/pyspark/sql
diff options
context:
space:
mode:
authorYin Huai <yhuai@databricks.com>2016-04-05 12:27:06 -0700
committerAndrew Or <andrew@databricks.com>2016-04-05 12:27:06 -0700
commit72544d6f2a72b9e44e0a32de1fb379e3342be5c3 (patch)
tree8767dc0a6e575dac7cbd87ccbb5c01f504d6b851 /python/pyspark/sql
parentbc36df127d3b9f56b4edaeb5eca7697d4aef761a (diff)
downloadspark-72544d6f2a72b9e44e0a32de1fb379e3342be5c3.tar.gz
spark-72544d6f2a72b9e44e0a32de1fb379e3342be5c3.tar.bz2
spark-72544d6f2a72b9e44e0a32de1fb379e3342be5c3.zip
[SPARK-14123][SPARK-14384][SQL] Handle CreateFunction/DropFunction
## What changes were proposed in this pull request? This PR implements CreateFunction and DropFunction commands. Besides implementing these two commands, we also change how to manage functions. Here are the main changes. * `FunctionRegistry` will be a container to store all functions builders and it will not actively load any functions. Because of this change, we do not need to maintain a separate registry for HiveContext. So, `HiveFunctionRegistry` is deleted. * SessionCatalog takes care the job of loading a function if this function is not in the `FunctionRegistry` but its metadata is stored in the external catalog. For this case, SessionCatalog will (1) load the metadata from the external catalog, (2) load all needed resources (i.e. jars and files), (3) create a function builder based on the function definition, (4) register the function builder in the `FunctionRegistry`. * A `UnresolvedGenerator` is created. So, the parser will not need to call `FunctionRegistry` directly during parsing, which is not a good time to create a Hive UDTF. In the analysis phase, we will resolve `UnresolvedGenerator`. This PR is based on viirya's https://github.com/apache/spark/pull/12036/ ## How was this patch tested? Existing tests and new tests. ## TODOs [x] Self-review [x] Cleanup [x] More tests for create/drop functions (we need to more tests for permanent functions). [ ] File JIRAs for all TODOs [x] Standardize the error message when a function does not exist. Author: Yin Huai <yhuai@databricks.com> Author: Liang-Chi Hsieh <simonh@tw.ibm.com> Closes #12117 from yhuai/function.
Diffstat (limited to 'python/pyspark/sql')
0 files changed, 0 insertions, 0 deletions