spark - Mirror of Apache Spark

	Commit message	Author	Age	Files	Lines
*	[SPARK-17472] [PYSPARK] Better error message for serialization failures of la...	Eric Liang	2016-09-14	2	-1/+20
*	[SPARK-17514] df.take(1) and df.limit(1).collect() should perform the same in...	Josh Rosen	2016-09-14	2	-4/+19
*	[SPARK-17525][PYTHON] Remove SparkContext.clearFiles() from the PySpark API a...	Sami Jaktholm	2016-09-14	1	-8/+0
*	[SPARK-17474] [SQL] fix python udf in TakeOrderedAndProjectExec	Davies Liu	2016-09-12	1	-0/+8
*	[SPARK-17389][FOLLOW-UP][ML] Change KMeans k-means|| default init steps from ...	Yanbo Liang	2016-09-11	2	-8/+8
*	[MINOR][ML] Correct weights doc of MultilayerPerceptronClassificationModel.	Yanbo Liang	2016-09-06	1	-1/+1
*	[SPARK-17311][MLLIB] Standardize Python-Java MLlib API to accept optional lon...	Sean Owen	2016-09-04	1	-2/+2
*	[SPARK-17298][SQL] Require explicit CROSS join for cartesian products	Srinath Shankar	2016-09-03	1	-1/+1
*	[SPARK-17261] [PYSPARK] Using HiveContext after re-creating SparkContext in S...	Jeff Zhang	2016-09-02	1	-0/+1
*	[SPARK-17264][SQL] DataStreamWriter should document that it only supports Par...	Sean Owen	2016-08-30	1	-1/+1
*	[SPARK-17001][ML] Enable standardScaler to standardize sparse vectors when wi...	Sean Owen	2016-08-27	1	-3/+2
*	[SPARK-17197][ML][PYSPARK] PySpark LiR/LoR supports tree aggregation level co...	Yanbo Liang	2016-08-25	4	-11/+42
*	[SPARK-17215][SQL] Method `SQLContext.parseDataType(dataTypeString: String)` ...	jiangxingbo	2016-08-24	6	-13/+16
*	[SPARK-16216][SQL] Read/write timestamps and dates in ISO 8601 and dateFormat...	hyukjinkwon	2016-08-24	2	-20/+66
*	[SPARK-16781][PYSPARK] java launched by PySpark as gateway may not be the sam...	Sean Owen	2016-08-24	3	-1/+1
*	[SPARK-15113][PYSPARK][ML] Add missing num features num classes	Holden Karau	2016-08-22	3	-11/+64
*	[SPARK-15018][PYSPARK][ML] Improve handling of PySpark Pipeline when used wit...	Bryan Cutler	2016-08-19	2	-8/+14
*	[SPARK-16965][MLLIB][PYSPARK] Fix bound checking for SparseVector.	Jeff Zhang	2016-08-19	1	-0/+15
*	[SPARK-16961][CORE] Fixed off-by-one error that biased randomizeInPlace	Nick Lavers	2016-08-19	3	-8/+8
*	[MINOR][DOC] Fix the descriptions for `properties` argument in the documenati...	mvervuurt	2016-08-16	1	-5/+6
*	[SPARK-17035] [SQL] [PYSPARK] Improve Timestamp not to lose precision for all...	Dongjoon Hyun	2016-08-16	2	-1/+6
*	[SPARK-16700][PYSPARK][SQL] create DataFrame from dict/Row with schema	Davies Liu	2016-08-15	4	-28/+62
*	[MINOR][ML] Rename TreeEnsembleModels to TreeEnsembleModel for PySpark	Yanbo Liang	2016-08-11	2	-6/+6
*	[SPARK-16324][SQL] regexp_extract should doc that it returns empty string whe...	Sean Owen	2016-08-10	1	-1/+5
*	[SPARK-16950] [PYSPARK] fromOffsets parameter support in KafkaUtils.createDir...	Mariusz Strzelecki	2016-08-09	2	-9/+6
*	[SPARK-16409][SQL] regexp_extract with optional groups causes NPE	Sean Owen	2016-08-07	1	-0/+3
*	[SPARK-16772][PYTHON][DOCS] Fix API doc references to UDFRegistration + Updat...	Nicholas Chammas	2016-08-06	3	-9/+5
*	[SPARK-16831][PYTHON] Fixed bug in CrossValidator.avgMetrics	=^_^=	2016-08-03	1	-1/+3
*	[SPARK-16062] [SPARK-15989] [SQL] Fix two bugs of Python-only UDTs	Liang-Chi Hsieh	2016-08-02	1	-0/+35
*	[SPARK-16772][PYTHON][DOCS] Restore "datatype string" to Python API docstrings	Nicholas Chammas	2016-07-29	2	-12/+8
*	[SPARK-16772] Correct API doc references to PySpark classes + formatting fixes	Nicholas Chammas	2016-07-28	8	-58/+75
*	[SPARK-15254][DOC] Improve ML pipeline Cross Validation Scaladoc & PyDoc	krishnakalyan3	2016-07-27	1	-2/+11
*	[SPARK-16653][ML][OPTIMIZER] update ANN convergence tolerance param default t...	WeichenXu	2016-07-25	1	-4/+4
*	[PYSPARK] add picklable SparseMatrix in pyspark.ml.common	WeichenXu	2016-07-24	1	-0/+1
*	[SPARK-16662][PYSPARK][SQL] fix HiveContext warning bug	WeichenXu	2016-07-23	1	-5/+4
*	[SPARK-16651][PYSPARK][DOC] Make `withColumnRenamed/drop` description more co...	Dongjoon Hyun	2016-07-22	1	-0/+2
*	[SPARK-16494][ML] Upgrade breeze version to 0.12	Yanbo Liang	2016-07-19	1	-1/+1
*	[DOC] improve python doc for rdd.histogram and dataframe.join	Mortada Mehyar	2016-07-18	2	-14/+14
*	[SPARK-14817][ML][MLLIB][DOC] Made DataFrame-based API primary in MLlib guide	Joseph K. Bradley	2016-07-15	3	-4/+7
*	[SPARK-16546][SQL][PYSPARK] update python dataframe.drop	WeichenXu	2016-07-14	1	-8/+19
*	[SPARK-16503] SparkSession should provide Spark version	Liwei Lin	2016-07-13	1	-0/+6
*	[SPARK-16536][SQL][PYSPARK][MINOR] Expose `sql` in PySpark Shell	Dongjoon Hyun	2016-07-13	1	-0/+1
*	[SPARK-14812][ML][MLLIB][PYTHON] Experimental, DeveloperApi annotation audit ...	Joseph K. Bradley	2016-07-13	13	-197/+11
*	[SPARK-16429][SQL] Include `StringType` columns in `describe()`	Dongjoon Hyun	2016-07-08	1	-4/+4
*	[SPARK-13638][SQL] Add quoteAll option to CSV DataFrameWriter	Jurriaan Pruis	2016-07-08	1	-2/+5
*	[SPARK-16052][SQL] Improve `CollapseRepartition` optimizer for Repartition/Re...	Dongjoon Hyun	2016-07-08	1	-2/+2
*	[MINOR][PYSPARK][DOC] Fix wrongly formatted examples in PySpark documentation	hyukjinkwon	2016-07-06	6	-23/+26
*	[SPARK-16348][ML][MLLIB][PYTHON] Use full classpaths for pyspark ML JVM calls	Joseph K. Bradley	2016-07-05	8	-26/+28
*	[SPARK-16335][SQL] Structured streaming should fail if source directory does ...	Reynold Xin	2016-07-01	1	-7/+4
*	[SPARK-15954][SQL] Disable loading test tables in Python tests	Reynold Xin	2016-06-30	1	-1/+1