author | linweizhong <linweizhong@huawei.com> | 2015-04-24 20:23:19 -0700 |
---|---|---|
committer | Reynold Xin <rxin@databricks.com> | 2015-04-24 20:23:19 -0700 |
commit | d874f8b546d8fae95bc92d8461b8189e51cb731b (patch) | |
tree | 81654bbc695f2e74e1def1a9e0ceba778c7a33ca /examples | |
parent | 438859eb7c4e605bb4041d9a547a16be9c827c75 (diff) | |
[PySpark][Minor] Update sql example, so that can read file correctly
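When a path does not specify a URI scheme, Spark resolves it against the default filesystem, which is HDFS on a cluster configured that way, so the example could not find the local sample file. The example now prefixes its default path with file:// and accepts an alternative path as the first command-line argument.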
Author: linweizhong <linweizhong@huawei.com>
Closes #5684 from Sephiroth-Lin/pyspark_example_minor and squashes the following commits:
19fe145 [linweizhong] Update example sql.py, so that can read file correctly
Diffstat (limited to 'examples')
-rw-r--r-- | examples/src/main/python/sql.py | 7 |
1 files changed, 6 insertions, 1 deletions
diff --git a/examples/src/main/python/sql.py b/examples/src/main/python/sql.py
index 87d7b088f0..2c18875932 100644
--- a/examples/src/main/python/sql.py
+++ b/examples/src/main/python/sql.py
@@ -18,6 +18,7 @@
 from __future__ import print_function
 
 import os
+import sys
 
 from pyspark import SparkContext
 from pyspark.sql import SQLContext
@@ -50,7 +51,11 @@ if __name__ == "__main__":
 
     # A JSON dataset is pointed to by path.
     # The path can be either a single text file or a directory storing text files.
-    path = os.path.join(os.environ['SPARK_HOME'], "examples/src/main/resources/people.json")
+    if len(sys.argv) < 2:
+        path = "file://" + \
+            os.path.join(os.environ['SPARK_HOME'], "examples/src/main/resources/people.json")
+    else:
+        path = sys.argv[1]
     # Create a DataFrame from the file(s) pointed to by path
     people = sqlContext.jsonFile(path)
     # root
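
The path-selection branch added by this patch can be exercised without a SparkContext. The sketch below pulls it into a standalone function; the name resolve_input_path, its parameters, and the /opt/spark and hdfs:///data/people.json values are hypothetical and chosen only for illustration. They are not part of the patch.

```python
import os


def resolve_input_path(argv, spark_home):
    """Return the JSON path the example would read.

    Hypothetical helper, not part of the patch: it mirrors the branch
    added to sql.py so the logic can be checked in isolation.
    """
    if len(argv) < 2:
        # No path given: fall back to the bundled sample file. The explicit
        # file:// scheme marks it as a local path, so it is not resolved
        # against the cluster's default filesystem.
        return "file://" + os.path.join(
            spark_home, "examples/src/main/resources/people.json")
    # A user-supplied path is passed through unchanged, scheme included.
    return argv[1]


if __name__ == "__main__":
    # Assumed install location, for illustration only.
    print(resolve_input_path(["sql.py"], "/opt/spark"))
    print(resolve_input_path(["sql.py", "hdfs:///data/people.json"], "/opt/spark"))
```

Passing the argument list and SPARK_HOME in as parameters, rather than reading sys.argv and os.environ directly as the patch does, is a choice made here only to keep the sketch easy to test.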