| Commit message | Author | Age | Files | Lines |
[SPARK-10446][SQL] Support to specify join type when calling join with usingColumns
JIRA: https://issues.apache.org/jira/browse/SPARK-10446
Currently the method `join(right: DataFrame, usingColumns: Seq[String])` only supports inner join. It would be more convenient if it also supported the other join types.
Author: Liang-Chi Hsieh <viirya@appier.com>
Closes #8600 from viirya/usingcolumns_df.
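For illustration, the equivalent call from PySpark after this change might look like the following sketch (assuming a shell with `sqlContext` predefined; the `id` column is just an example):
```python
left = sqlContext.createDataFrame([(1, "a"), (2, "b")], ["id", "l"])
right = sqlContext.createDataFrame([(1, "x")], ["id", "r"])
# Previously a usingColumns join was always an inner join; now the join
# type can be passed explicitly as a third argument.
left.join(right, ["id"], "left_outer").show()
```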
https://issues.apache.org/jira/browse/SPARK-10577
Author: Jian Feng <jzhang.chs@gmail.com>
Closes #8801 from Jianfeng-chs/master.
[SPARK-10716] [BUILD] Distribution tar archive fails to uncompress on OS X due to hidden file
Remove the ._SUCCESS.crc hidden file, which may cause problems in the distribution tar archive and is not used.
Author: Sean Owen <sowen@cloudera.com>
Closes #8846 from srowen/SPARK-10716.
[SPARK-9821] [PYSPARK] reduceByKey should take a custom partitioner
From the issue:
In Scala, I can supply a custom partitioner to reduceByKey (and other aggregation/repartitioning methods like aggregateByKey and combineByKey), but as far as I can tell from the PySpark API, there's no way to do the same in Python.
Here's an example of my code in Scala:
```
weblogs.map(s => (getFileType(s), 1)).reduceByKey(new FileTypePartitioner(), _ + _)
```
But I can't figure out how to do the same in Python. The closest I can get is to call repartition before reduceByKey, like so:
```
weblogs.map(lambda s: (getFileType(s), 1)).partitionBy(3, hash_filetype).reduceByKey(lambda v1, v2: v1 + v2).collect()
```
But that defeats the purpose, because I'm shuffling twice instead of once, so my performance is worse instead of better.
Author: Holden Karau <holden@pigscanfly.ca>
Closes #8569 from holdenk/SPARK-9821-pyspark-reduceByKey-should-take-a-custom-partitioner.
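With this change, a custom partitioning function can be passed straight to `reduceByKey`, giving a single shuffle. A sketch, where `weblogs` is the RDD from the issue and `getFileType`/`hash_filetype` are the hypothetical helpers from the issue, stubbed here:
```python
def getFileType(s):
    # hypothetical stand-in: extract a file-type key from a log line
    return s.rsplit(".", 1)[-1]

def hash_filetype(key):
    # hypothetical custom partitioning function
    return hash(key)

counts = weblogs.map(lambda s: (getFileType(s), 1)) \
                .reduceByKey(lambda v1, v2: v1 + v2,
                             numPartitions=3, partitionFunc=hash_filetype) \
                .collect()
```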
Added newlines before `:param ...:` and `:return:` markup. Without these, parameter lists aren't formatted correctly in the API docs, i.e. this:
![screen shot 2015-09-21 at 21 49 26](https://cloud.githubusercontent.com/assets/11915197/10004686/de3c41d4-60aa-11e5-9c50-a46dcb51243f.png)
... looks like this once the newline is added:
![screen shot 2015-09-21 at 21 50 14](https://cloud.githubusercontent.com/assets/11915197/10004706/f86bfb08-60aa-11e5-8524-ae4436713502.png)
Author: noelsmith <mail@noelsmith.com>
Closes #8851 from noel-smith/docstring-missing-newline-fix.
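For reference, a minimal sketch of the pattern the fix enforces:
```python
def train(data, iterations):
    """Trains a model.

    :param data: the training RDD
    :param iterations: number of passes over the data
    :return: the fitted model
    """
    # The blank line after the summary sentence is what this commit adds:
    # without it, Sphinx runs the summary and the :param: list together.
```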
From JIRA: Add Python API, user guide and example for ml.feature.CountVectorizerModel
Author: Holden Karau <holden@pigscanfly.ca>
Closes #8561 from holdenk/SPARK-9769-add-python-api-for-countvectorizermodel.
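A minimal usage sketch of the new API (PySpark shell with `sqlContext` predefined; the data is illustrative):
```python
from pyspark.ml.feature import CountVectorizer

df = sqlContext.createDataFrame([(["a", "b", "b"],), (["a", "c"],)], ["words"])
cv = CountVectorizer(inputCol="words", outputCol="features", vocabSize=3)
model = cv.fit(df)            # learns the vocabulary from the corpus
model.transform(df).show(truncate=False)
print(model.vocabulary)       # e.g. ['a', 'b', 'c']
```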
There are some missing API docs in pyspark.mllib.linalg.Vector (including DenseVector and SparseVector). We should add them based on their Scala counterparts.
Author: vinodkc <vinod.kc.in@gmail.com>
Closes #8834 from vinodkc/fix_SPARK-10631.
It does not make much sense to set `spark.shuffle.spill` or `spark.sql.planner.externalSort` to false: I believe that these configurations were initially added as "escape hatches" to guard against bugs in the external operators, but these operators are now mature and well-tested. In addition, these configurations are not handled in a consistent way anymore: SQL's Tungsten codepath ignores these configurations and will continue to use spilling operators. Similarly, Spark Core's `tungsten-sort` shuffle manager does not respect `spark.shuffle.spill=false`.
This pull request removes these configurations, adds warnings at the appropriate places, and deletes a large amount of code which was only used in code paths that did not support spilling.
Author: Josh Rosen <joshrosen@databricks.com>
Closes #8831 from JoshRosen/remove-ability-to-disable-spilling.
As ```assertEquals``` is deprecated, we need to change ```assertEquals``` to ```assertEqual``` in the existing Python unit tests.
Author: Yanbo Liang <ybliang8@gmail.com>
Closes #8814 from yanboliang/spark-10615.
JIRA: https://issues.apache.org/jira/browse/SPARK-10642
When calling `rdd.lookup()` on an RDD with tuple keys, `portable_hash` will return a long. That causes `DAGScheduler.submitJob` to throw `java.lang.ClassCastException: java.lang.Long cannot be cast to java.lang.Integer`.
Author: Liang-Chi Hsieh <viirya@appier.com>
Closes #8796 from viirya/fix-pyrdd-lookup.
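A minimal reproduction of the now-fixed case (PySpark shell, `sc` predefined):
```python
rdd = sc.parallelize([((1, 2), "a"), ((3, 4), "b")])
# portable_hash on a tuple key could yield a Python long, which the JVM
# side failed to cast to an Integer; after the fix the lookup just works.
print(rdd.lookup((1, 2)))   # ['a']
```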
Add @since annotation to pyspark.ml.recommendation
Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com>
Closes #8692 from yu-iskw/SPARK-10282.
Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com>
Closes #8665 from yu-iskw/SPARK-10274.
Add @since annotation to pyspark.mllib.util
Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com>
Closes #8689 from yu-iskw/SPARK-10279.
Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com>
Closes #8685 from yu-iskw/SPARK-10278.
Add @since annotation to pyspark.ml.clustering
Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com>
Closes #8691 from yu-iskw/SPARK-10281.
Add @since annotation to pyspark.ml.regression
Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com>
Closes #8693 from yu-iskw/SPARK-10283.
Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com>
Closes #8694 from yu-iskw/SPARK-10284.
Add @since annotation to pyspark.mllib.recommendation
Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com>
Closes #8677 from yu-iskw/SPARK-10276.
Author: Vinod K C <vinod.kc@huawei.com>
Closes #8682 from vinodkc/fix_SPARK-10516.
Missed this when reviewing `pyspark.mllib.random` for SPARK-10275.
Author: noelsmith <mail@noelsmith.com>
Closes #8773 from noel-smith/mllib-random-versionadded-fix.
Add @since annotation to pyspark.mllib.random (SPARK-10275)
Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com>
Closes #8666 from yu-iskw/SPARK-10275.
Duplicated the since decorator from pyspark.sql into pyspark (also tweaked to handle functions without docstrings).
Added since to methods + "versionadded::" to classes (derived from the git file history in pyspark).
Author: noelsmith <mail@noelsmith.com>
Closes #8633 from noel-smith/SPARK-10273-since-mllib-feature.
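For readers unfamiliar with the decorator, a minimal sketch of the idea (not the exact PySpark implementation):
```python
def since(version):
    """Append a Sphinx ``.. versionadded::`` note to a function's docstring."""
    def deco(f):
        doc = f.__doc__ or ""   # the tweak: tolerate missing docstrings
        f.__doc__ = doc.rstrip() + "\n\n.. versionadded:: %s" % version
        return f
    return deco

@since("1.5.0")
def transform(vector):
    """Apply the transformation."""
    return vector
```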
PySpark DenseVector and SparseVector should implement __eq__ and __hash__ correctly
The PySpark DenseVector and SparseVector ```__eq__``` methods should use semantic equality, and a DenseVector should be comparable with a SparseVector.
Implement the PySpark DenseVector and SparseVector ```__hash__``` methods based on the first 16 entries, so that PySpark Vector objects can be used in collections.
Author: Yanbo Liang <ybliang8@gmail.com>
Closes #8166 from yanboliang/spark-9793.
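A sketch of the intended semantics (equality by value, not by representation):
```python
from pyspark.mllib.linalg import DenseVector, SparseVector

dv = DenseVector([1.0, 0.0, 3.0])
sv = SparseVector(3, {0: 1.0, 2: 3.0})
print(dv == sv)              # True: element-wise (semantic) equality
print(hash(dv) == hash(sv))  # True: equal vectors hash equally
print(len({dv, sv}))         # 1: vectors are now usable in sets/dicts
```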
Author: Davies Liu <davies@databricks.com>
Closes #8707 from davies/fix_namedtuple.
[SPARK-10194] SGD-based algorithms need a convergenceTol parameter in Python
[SPARK-3382](https://issues.apache.org/jira/browse/SPARK-3382) added a ```convergenceTol``` parameter for GradientDescent-based methods in Scala. We need that parameter in Python as well; otherwise, Python users will not be able to adjust that behavior (or even reproduce behavior from previous releases, since the default changed).
Author: Yanbo Liang <ybliang8@gmail.com>
Closes #8457 from yanboliang/spark-10194.
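A usage sketch, assuming the Python keyword mirrors the Scala parameter name (PySpark shell, `sc` predefined; the data is illustrative):
```python
from pyspark.mllib.classification import LogisticRegressionWithSGD
from pyspark.mllib.regression import LabeledPoint

data = sc.parallelize([LabeledPoint(0.0, [0.0]), LabeledPoint(1.0, [1.0])])
# convergenceTol lets Python callers stop SGD once the solution stabilizes,
# matching the Scala-side knob added in SPARK-3382.
model = LogisticRegressionWithSGD.train(data, iterations=100, convergenceTol=1e-4)
```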
Adding STDDEV support for DataFrame using a one-pass online/parallel algorithm to compute variance. Please review the code change.
Author: JihongMa <linlin200605@gmail.com>
Author: Jihong MA <linlin200605@gmail.com>
Author: Jihong MA <jihongma@jihongs-mbp.usca.ibm.com>
Author: Jihong MA <jihongma@Jihongs-MacBook-Pro.local>
Closes #6297 from JihongMA/SPARK-SQL.
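Once exposed through `pyspark.sql.functions`, usage looks roughly like this (shell with `sqlContext` predefined):
```python
from pyspark.sql.functions import stddev

df = sqlContext.createDataFrame([(1.0,), (2.0,), (3.0,)], ["v"])
df.agg(stddev("v")).show()   # sample standard deviation of column v
```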
This PR addresses [SPARK-9014](https://issues.apache.org/jira/browse/SPARK-9014)
Added functionality: the `Column` object in Python now supports the exponential operator `**`
Example:
```
from pyspark.sql import *
df = sqlContext.createDataFrame([Row(a=2)])
df.select(3**df.a,df.a**3,df.a**df.a).collect()
```
Outputs:
```
[Row(POWER(3.0, a)=9.0, POWER(a, 3.0)=8.0, POWER(a, a)=4.0)]
```
Author: 0x0FFF <programmerag@gmail.com>
Closes #8658 from 0x0FFF/SPARK-9014.
Just fixing a typo in the exception message raised when attempting to pickle a SparkContext.
Author: Icaro Medeiros <icaro.medeiros@gmail.com>
Closes #8724 from icaromedeiros/master.
jira: https://issues.apache.org/jira/browse/SPARK-8530
Add a Python API for MinMaxScaler.
jira for MinMaxScaler: https://issues.apache.org/jira/browse/SPARK-7514
Author: Yuhao Yang <hhbyyh@gmail.com>
Closes #7150 from hhbyyh/pythonMinMax.
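A minimal usage sketch (the `mllib.linalg` import reflects the 1.x-era package layout):
```python
from pyspark.ml.feature import MinMaxScaler
from pyspark.mllib.linalg import Vectors

df = sqlContext.createDataFrame([(Vectors.dense([0.0, 10.0]),),
                                 (Vectors.dense([5.0, 20.0]),)], ["features"])
scaler = MinMaxScaler(inputCol="features", outputCol="scaled")  # default range [0, 1]
model = scaler.fit(df)        # learns per-feature min and max
model.transform(df).show(truncate=False)
```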
Changes:
* Make the Scala doc for StringIndexerInverse clearer. Also remove the Scala doc from transformSchema, so that the doc is inherited.
* MetadataUtils.scala: "Helper utilities for tree-based algorithms" -> no longer just for trees
CC: holdenk mengxr
Author: Joseph K. Bradley <joseph@databricks.com>
Closes #8679 from jkbradley/doc-fixes-1.5.
Add Python API for ```MultilayerPerceptronClassifier```.
Author: Yanbo Liang <ybliang8@gmail.com>
Closes #8067 from yanboliang/SPARK-9773.
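A usage sketch (toy data; the `mllib.linalg` import reflects the 1.x-era layout):
```python
from pyspark.ml.classification import MultilayerPerceptronClassifier
from pyspark.mllib.linalg import Vectors

df = sqlContext.createDataFrame([(0.0, Vectors.dense([0.0, 0.0])),
                                 (1.0, Vectors.dense([1.0, 1.0]))],
                                ["label", "features"])
# layers: 2 input features, one hidden layer of 5 units, 2 output classes
mlp = MultilayerPerceptronClassifier(maxIter=100, layers=[2, 5, 2],
                                     blockSize=128, seed=123)
model = mlp.fit(df)
```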
[SPARK-10026] [ML] [PYSPARK] Implement some common Params for regression in PySpark
LinearRegression and LogisticRegression lack some Params on the Python side, and some Params are not shared classes, which means we need to write them for each class. These kinds of Params are listed here:
```scala
HasElasticNetParam
HasFitIntercept
HasStandardization
HasThresholds
```
Here we implement them in shared params on the Python side and make the LinearRegression/LogisticRegression parameters match their Scala counterparts.
Author: Yanbo Liang <ybliang8@gmail.com>
Closes #8508 from yanboliang/spark-10026.
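After this change, the shared Params can be set directly from Python, e.g.:
```python
from pyspark.ml.classification import LogisticRegression

lr = LogisticRegression(elasticNetParam=0.5,  # HasElasticNetParam
                        fitIntercept=True,    # HasFitIntercept
                        standardization=True) # HasStandardization
```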
The missing methods of ml.feature are listed here:
```StringIndexer``` lacks the parameter ```handleInvalid```.
```StringIndexerModel``` lacks the method ```labels```.
```VectorIndexerModel``` lacks the methods ```numFeatures``` and ```categoryMaps```.
Author: Yanbo Liang <ybliang8@gmail.com>
Closes #8313 from yanboliang/spark-10027.
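A sketch of the newly exposed pieces:
```python
from pyspark.ml.feature import StringIndexer

df = sqlContext.createDataFrame([("a",), ("b",), ("a",)], ["cat"])
# handleInvalid="skip" drops unseen labels at transform time instead of erroring
indexer = StringIndexer(inputCol="cat", outputCol="idx", handleInvalid="skip")
model = indexer.fit(df)
print(model.labels)   # labels ordered by index, e.g. ['a', 'b']
```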
pyspark.sql.types.Row implements ```__getitem__```
Author: Yanbo Liang <ybliang8@gmail.com>
Closes #8333 from yanboliang/spark-7544.
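A minimal example of the new access pattern:
```python
from pyspark.sql import Row

r = Row(name="Alice", age=5)
print(r["name"], r["age"])   # Alice 5 -- field lookup via __getitem__
```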
Add Python API for ml.feature.VectorSlicer.
Author: Yanbo Liang <ybliang8@gmail.com>
Closes #8102 from yanboliang/SPARK-9772.
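A usage sketch (toy vector; `mllib.linalg` import reflects the 1.x-era layout):
```python
from pyspark.ml.feature import VectorSlicer
from pyspark.mllib.linalg import Vectors

df = sqlContext.createDataFrame([(Vectors.dense([0.0, 1.5, 7.0]),)], ["features"])
slicer = VectorSlicer(inputCol="features", outputCol="sliced", indices=[0, 2])
slicer.transform(df).show(truncate=False)   # keeps features 0 and 2
```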
Adds IndexToString to PySpark.
Author: Holden Karau <holden@pigscanfly.ca>
Closes #7976 from holdenk/SPARK-9654-add-string-indexer-inverse-in-pyspark.
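A sketch pairing it with StringIndexer to round-trip labels:
```python
from pyspark.ml.feature import IndexToString, StringIndexer

df = sqlContext.createDataFrame([("a",), ("b",)], ["cat"])
model = StringIndexer(inputCol="cat", outputCol="idx").fit(df)
indexed = model.transform(df)
# Map numeric indices back to the original string labels:
converter = IndexToString(inputCol="idx", outputCol="orig", labels=model.labels)
converter.transform(indexed).show()
```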
Modified class-level docstrings to mark all feature transformers in pyspark.ml as experimental.
Author: noelsmith <mail@noelsmith.com>
Closes #8623 from noel-smith/SPARK-10094-mark-pyspark-ml-trans-exp.
cc mengxr
Author: Davies Liu <davies@databricks.com>
Closes #8657 from davies/move_since.
[SPARK-10440] [STREAMING] [DOCS] Update Python API information in the streaming programming guides and Python docs
- Fixed information around Python API tags in the streaming programming guides
- Added missing items to the Python docs
Author: Tathagata Das <tathagata.das1565@gmail.com>
Closes #8595 from tdas/SPARK-10440.
The `pyspark.sql.column.Column` object has a `__getitem__` method, which makes it iterable in Python. In fact, it has `__getitem__` to address the case when the column might be a list or dict, so that you can access a certain element of it in the DataFrame API. The ability to iterate over it is just a side effect that may confuse people getting familiar with Spark DataFrames (since you might iterate this way over a Pandas DataFrame, for instance).
Issue reproduction:
```
df = sqlContext.jsonRDD(sc.parallelize(['{"name": "El Magnifico"}']))
for i in df["name"]: print i
```
Author: 0x0FFF <programmerag@gmail.com>
Closes #8574 from 0x0FFF/SPARK-10417.
This PR addresses issue [SPARK-10392](https://issues.apache.org/jira/browse/SPARK-10392)
The problem is that for the "start of epoch" date (01 Jan 1970), the PySpark DateType class returns 0 instead of a `datetime.date`, due to the implementation of its return statement.
Issue reproduction on master:
```
>>> from pyspark.sql.types import *
>>> a = DateType()
>>> a.fromInternal(0)
0
>>> a.fromInternal(1)
datetime.date(1970, 1, 2)
```
Author: 0x0FFF <programmerag@gmail.com>
Closes #8556 from 0x0FFF/SPARK-10392.
[SPARK-10162] Fix timezone handling in the PySpark DataFrame filter() function
This PR addresses [SPARK-10162](https://issues.apache.org/jira/browse/SPARK-10162)
The issue is with the DataFrame filter() function when a datetime.datetime is passed to it:
* The timezone information of this datetime is ignored
* The datetime is assumed to be in the local timezone, which depends on the OS timezone setting
The fix includes both a code change and a regression test. Problem reproduction code on master:
```python
import pytz
from datetime import datetime
from pyspark.sql import *
from pyspark.sql.types import *
sqc = SQLContext(sc)
df = sqc.createDataFrame([], StructType([StructField("dt", TimestampType())]))
m1 = pytz.timezone('UTC')
m2 = pytz.timezone('Etc/GMT+3')
df.filter(df.dt > datetime(2000, 01, 01, tzinfo=m1)).explain()
df.filter(df.dt > datetime(2000, 01, 01, tzinfo=m2)).explain()
```
Both filters produce the same timestamp, ignoring the time zone:
```
>>> df.filter(df.dt > datetime(2000, 01, 01, tzinfo=m1)).explain()
Filter (dt#0 > 946713600000000)
Scan PhysicalRDD[dt#0]
>>> df.filter(df.dt > datetime(2000, 01, 01, tzinfo=m2)).explain()
Filter (dt#0 > 946713600000000)
Scan PhysicalRDD[dt#0]
```
After the fix:
```
>>> df.filter(df.dt > datetime(2000, 01, 01, tzinfo=m1)).explain()
Filter (dt#0 > 946684800000000)
Scan PhysicalRDD[dt#0]
>>> df.filter(df.dt > datetime(2000, 01, 01, tzinfo=m2)).explain()
Filter (dt#0 > 946695600000000)
Scan PhysicalRDD[dt#0]
```
PR [8536](https://github.com/apache/spark/pull/8536) was accidentally closed by me when dropping the repo
Author: 0x0FFF <programmerag@gmail.com>
Closes #8555 from 0x0FFF/SPARK-10162.
Add a Python API for the StopWordsRemover.
Author: Holden Karau <holden@pigscanfly.ca>
Closes #8118 from holdenk/SPARK-9679-python-StopWordsRemover.
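A minimal usage sketch:
```python
from pyspark.ml.feature import StopWordsRemover

df = sqlContext.createDataFrame([(["the", "quick", "fox"],)], ["raw"])
remover = StopWordsRemover(inputCol="raw", outputCol="filtered")
remover.transform(df).show(truncate=False)   # drops default English stop words
```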
Add Python API for SQLTransformer
Author: Yanbo Liang <ybliang8@gmail.com>
Closes #8527 from yanboliang/spark-10355.
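A minimal usage sketch (`__THIS__` is the placeholder for the input DataFrame):
```python
from pyspark.ml.feature import SQLTransformer

df = sqlContext.createDataFrame([(0, 1.0), (1, 2.0)], ["id", "v"])
sql = SQLTransformer(statement="SELECT *, (v + 1) AS v1 FROM __THIS__")
sql.transform(df).show()
```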
Add Python API for ml.feature.DCT.
Author: Yanbo Liang <ybliang8@gmail.com>
Closes #8485 from yanboliang/spark-8472.
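A minimal usage sketch (`mllib.linalg` import reflects the 1.x-era layout):
```python
from pyspark.ml.feature import DCT
from pyspark.mllib.linalg import Vectors

df = sqlContext.createDataFrame([(Vectors.dense([0.0, 1.0, -2.0, 3.0]),)], ["vec"])
dct = DCT(inverse=False, inputCol="vec", outputCol="dctVec")  # forward DCT-II
dct.transform(df).show(truncate=False)
```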
* Added an isLargerBetter() method to the PySpark Evaluator to match the Scala version.
* JavaEvaluator delegates isLargerBetter() to the underlying Scala object.
* Added a check for isLargerBetter() in CrossValidator to determine whether to use argmin or argmax.
* Added test cases for where smaller is better (RMSE) and where larger is better (R-Squared).
(This contribution is my original work and I license the work to the project under Spark's open source license.)
Author: noelsmith <mail@noelsmith.com>
Closes #8399 from noel-smith/pyspark-rmse-xval-fix.
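A sketch of how the flag drives model selection:
```python
from pyspark.ml.evaluation import RegressionEvaluator

# RMSE: smaller is better, so CrossValidator should take the argmin...
print(RegressionEvaluator(metricName="rmse").isLargerBetter())   # False
# ...while for R-squared it should take the argmax.
print(RegressionEvaluator(metricName="r2").isLargerBetter())     # True
```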
[SPARK-9964] [PYSPARK] [SQL] PySpark DataFrameReader should accept an RDD of Strings for JSON
The PySpark DataFrameReader should be able to accept an RDD of Strings (like the Scala version does) for JSON, rather than only taking a path.
If this PR is merged, it should be duplicated to cover the other input types (not just JSON).
Author: Yanbo Liang <ybliang8@gmail.com>
Closes #8444 from yanboliang/spark-9964.
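With the change, an RDD of JSON strings can be read directly (shell with `sc`/`sqlContext` predefined):
```python
rdd = sc.parallelize(['{"name": "Alice"}', '{"name": "Bob"}'])
df = sqlContext.read.json(rdd)   # previously only a path was accepted
df.show()
```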
cc jkbradley
Author: Davies Liu <davies@databricks.com>
Closes #8470 from davies/fix_create_df.
[SPARK-9613] [CORE] Migrate usage of JavaConversions to JavaConverters
Replace `JavaConversions` implicits with `JavaConverters`.
Most occurrences I've seen so far are necessary conversions; a few have been avoidable. None are in critical code as far as I can see, yet.
Author: Sean Owen <sowen@cloudera.com>
Closes #8033 from srowen/SPARK-9613.
This PR removed the `outputFile` configuration from pom.xml and updated `tests.py` to search jars for both the sbt build and the maven build.
I ran `mvn -Pkinesis-asl -DskipTests clean install` locally and verified the jars in my local repository were correct. I also checked the Python tests for the maven build, and they passed all tests.
Author: zsxwing <zsxwing@gmail.com>
Closes #8373 from zsxwing/SPARK-10168 and squashes the following commits:
e0b5818 [zsxwing] Fix the sbt build
c697627 [zsxwing] Add the jar pathes to the exception message
be1d8a5 [zsxwing] Fix the issue that maven publishes wrong artifact jars