aboutsummaryrefslogtreecommitdiff
path: root/python/pyspark/sql/readwriter.py
Commit message (Collapse)AuthorAgeFilesLines
* [SPARK-10373] [PYSPARK] move @since into pyspark from sqlDavies Liu2015-09-081-2/+1
| | | | | | | | cc mengxr Author: Davies Liu <davies@databricks.com> Closes #8657 from davies/move_since.
* [SPARK-9964] [PYSPARK] [SQL] PySpark DataFrameReader accept RDD of String ↵Yanbo Liang2015-08-261-6/+22
| | | | | | | | | | | for JSON PySpark DataFrameReader should could accept an RDD of Strings (like the Scala version does) for JSON, rather than only taking a path. If this PR is merged, it should be duplicated to cover the other input types (not just JSON). Author: Yanbo Liang <ybliang8@gmail.com> Closes #8444 from yanboliang/spark-9964.
* [SPARK-9828] [PYSPARK] Mutable values should not be default argumentsMechCoder2015-08-141-2/+6
| | | | | | Author: MechCoder <manojkumarsivaraj334@gmail.com> Closes #8110 from MechCoder/spark-9828.
* [SPARK-6591] [SQL] Python data source load options should auto convert ↵Yijie Shen2015-08-051-3/+14
| | | | | | | | | | | | | | | common types into strings JIRA: https://issues.apache.org/jira/browse/SPARK-6591 Author: Yijie Shen <henry.yijieshen@gmail.com> Closes #7926 from yjshen/py_dsload_opt and squashes the following commits: b207832 [Yijie Shen] fix style efdf834 [Yijie Shen] resolve comment 7a8f6a2 [Yijie Shen] lowercase 822e769 [Yijie Shen] convert load opts to string
* [SPARK-9100] [SQL] Adds DataFrame reader/writer shortcut methods for ORCCheng Lian2015-07-211-3/+41
| | | | | | | | | | | This PR adds DataFrame reader/writer shortcut methods for ORC in both Scala and Python. Author: Cheng Lian <lian@databricks.com> Closes #7444 from liancheng/spark-9100 and squashes the following commits: 284d043 [Cheng Lian] Fixes PySpark test cases and addresses PR comments e0b09fb [Cheng Lian] Adds DataFrame reader/writer shortcut methods for ORC
* [SPARK-8698] partitionBy in Python DataFrame reader/writer interface should ↵Reynold Xin2015-06-291-8/+13
| | | | | | | | | | not default to empty tuple. Author: Reynold Xin <rxin@databricks.com> Closes #7079 from rxin/SPARK-8698 and squashes the following commits: 8513e1c [Reynold Xin] [SPARK-8698] partitionBy in Python DataFrame reader/writer interface should not default to empty tuple.
* [SPARK-8355] [SQL] Python DataFrameReader/Writer should mirror ScalaCheolsoo Park2015-06-291-0/+14
| | | | | | | | | | | | | I compared PySpark DataFrameReader/Writer against Scala ones. `Option` function is missing in both reader and writer, but the rest seems to all match. I added `Option` to reader and writer and updated the `pyspark-sql` test. Author: Cheolsoo Park <cheolsoop@netflix.com> Closes #7078 from piaozhexiu/SPARK-8355 and squashes the following commits: c63d419 [Cheolsoo Park] Fix version 524e0aa [Cheolsoo Park] Add option function to df reader and writer
* [SPARK-8532] [SQL] In Python's DataFrameWriter, ↵Yin Huai2015-06-221-11/+19
| | | | | | | | | | | | | | | | | | | | | save/saveAsTable/json/parquet/jdbc always override mode https://issues.apache.org/jira/browse/SPARK-8532 This PR has two changes. First, it fixes the bug that save actions (i.e. `save/saveAsTable/json/parquet/jdbc`) always override mode. Second, it adds input argument `partitionBy` to `save/saveAsTable/parquet`. Author: Yin Huai <yhuai@databricks.com> Closes #6937 from yhuai/SPARK-8532 and squashes the following commits: f972d5d [Yin Huai] davies's comment. d37abd2 [Yin Huai] style. d21290a [Yin Huai] Python doc. 889eb25 [Yin Huai] Minor refactoring and add partitionBy to save, saveAsTable, and parquet. 7fbc24b [Yin Huai] Use None instead of "error" as the default value of mode since JVM-side already uses "error" as the default value. d696dff [Yin Huai] Python style. 88eb6c4 [Yin Huai] If mode is "error", do not call mode method. c40c461 [Yin Huai] Regression test.
* [SPARK-8060] Improve DataFrame Python test coverage and documentation.Reynold Xin2015-06-031-119/+98
| | | | | | | | | | Author: Reynold Xin <rxin@databricks.com> Closes #6601 from rxin/python-read-write-test-and-doc and squashes the following commits: baa8ad5 [Reynold Xin] Code review feedback. f081d47 [Reynold Xin] More documentation updates. c9902fa [Reynold Xin] [SPARK-8060] Improve DataFrame Python reader/writer interface doc and testing.
* [SPARK-8021] [SQL] [PYSPARK] make Python read/write API consistent with ScalaDavies Liu2015-06-021-27/+94
| | | | | | | | | | | | | | add schema()/format()/options() for reader, add mode()/format()/options()/partitionBy() for writer cc rxin yhuai pwendell Author: Davies Liu <davies@databricks.com> Closes #6578 from davies/readwrite and squashes the following commits: 720d293 [Davies Liu] address comments b65dfa2 [Davies Liu] Update readwriter.py 1299ab6 [Davies Liu] make Python API consistent with Scala
* [SPARK-7840] add insertInto() to WriterDavies Liu2015-05-231-7/+15
| | | | | | | | | | Add tests later. Author: Davies Liu <davies@databricks.com> Closes #6375 from davies/insertInto and squashes the following commits: 826423e [Davies Liu] add insertInto() to Writer
* [SPARK-7606] [SQL] [PySpark] add version to Python SQL API docsDavies Liu2015-05-201-0/+15
| | | | | | | | | | | | | Add version info for public Python SQL API. cc rxin Author: Davies Liu <davies@databricks.com> Closes #6295 from davies/versions and squashes the following commits: cfd91e6 [Davies Liu] add more version for DataFrame API 600834d [Davies Liu] add version to SQL API docs
* [SPARK-7738] [SQL] [PySpark] add reader and writer API in PythonDavies Liu2015-05-191-0/+338
cc rxin, please take a quick look, I'm working on tests. Author: Davies Liu <davies@databricks.com> Closes #6238 from davies/readwrite and squashes the following commits: c7200eb [Davies Liu] update tests 9cbf01b [Davies Liu] Merge branch 'master' of github.com:apache/spark into readwrite f0c5a04 [Davies Liu] use sqlContext.read.load 5f68bc8 [Davies Liu] update tests 6437e9a [Davies Liu] Merge branch 'master' of github.com:apache/spark into readwrite bcc6668 [Davies Liu] add reader amd writer API in Python