spark - Mirror of Apache Spark

	Commit message	Author	Age	Files	Lines
*	[SPARK-7997][CORE] Remove Akka from Spark Core and Streaming	Shixiong Zhu	2016-01-22	1	-1/+3
	- Remove Akka dependency from core. Note: the streaming-akka project still uses Akka.
	- Remove HttpFileServer
	- Remove Akka configs from SparkConf and SSLOptions
	- Rename `spark.akka.frameSize` to `spark.rpc.message.maxSize`. I think it's still worth keeping this config because using `DirectTaskResult` or `IndirectTaskResult` depends on it.
	- Update comments and docs
	Author: Shixiong Zhu <shixiong@databricks.com> Closes #10854 from zsxwing/remove-akka.
*	[SPARK-7799][SPARK-12786][STREAMING] Add "streaming-akka" project	Shixiong Zhu	2016-01-20	1	-0/+10
	Include the following changes:
	1. Add "streaming-akka" project and org.apache.spark.streaming.akka.AkkaUtils for creating an actorStream
	2. Remove "StreamingContext.actorStream" and "JavaStreamingContext.actorStream"
	3. Update the ActorWordCount example and add the JavaActorWordCount example
	4. Make "streaming-zeromq" depend on "streaming-akka" and update the code accordingly
	Author: Shixiong Zhu <shixiong@databricks.com> Closes #10744 from zsxwing/streaming-akka-2.
*	[SPARK-12847][CORE][STREAMING] Remove StreamingListenerBus and post all Streaming events to the same thread as Spark events	Shixiong Zhu	2016-01-20	1	-0/+4
	Including the following changes:
	1. Add StreamingListenerForwardingBus to WrappedStreamingListenerEvent process events in `onOtherEvent` to StreamingListener
	2. Remove StreamingListenerBus
	3. Merge AsynchronousListenerBus and LiveListenerBus to the same class LiveListenerBus
	4. Add `logEvent` method to SparkListenerEvent so that EventLoggingListener can use it to ignore WrappedStreamingListenerEvents
	Author: Shixiong Zhu <shixiong@databricks.com> Closes #10779 from zsxwing/streaming-listener.
*	[SPARK-12855][SQL] Remove parser dialect developer API	Reynold Xin	2016-01-18	1	-1/+3
	This pull request removes the public developer parser API for external parsers. Given that everything a parser depends on (e.g. logical plans and expressions) is internal and not stable, external parsers will break with every release of Spark. It is a bad idea to create the illusion that Spark actually supports pluggable parsers. In addition, this also reduces incentives for 3rd party projects to contribute parser improvements back to Spark. Author: Reynold Xin <rxin@databricks.com> Closes #10801 from rxin/SPARK-12855.
*	[SPARK-12667] Remove block manager's internal "external block store" API	Reynold Xin	2016-01-15	1	-1/+5
	This pull request removes the external block store API. This is rarely used, and the file system interface is actually a better, more standard way to interact with external storage systems. There are some other things to remove also, as pointed out by JoshRosen. We will do those as follow-up pull requests. Author: Reynold Xin <rxin@databricks.com> Closes #10752 from rxin/remove-offheap.
*	[SPARK-12692][BUILD][STREAMING] Scala style: Fix the style violation (Space before "," or ":")	Kousuke Saruta	2016-01-11	1	-0/+12
	Fix the style violation (space before , and :). This PR is a followup for #10643. Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp> Closes #10685 from sarutak/SPARK-12692-followup-streaming.
*	[SPARK-4819] Remove Guava's "Optional" from public API	Sean Owen	2016-01-08	1	-1/+10
	Replace Guava `Optional` with (an API clone of) Java 8 `java.util.Optional` (edit: and a clone of Guava `Optional`). See also https://github.com/apache/spark/pull/10512 Author: Sean Owen <sowen@cloudera.com> Closes #10513 from srowen/SPARK-4819.
*	[SPARK-12591][STREAMING] Register OpenHashMapBasedStateMap for Kryo	Shixiong Zhu	2016-01-07	1	-0/+4
	The default serializer in Kryo is FieldSerializer and it ignores transient fields and never calls `writeObject` or `readObject`. So we should register OpenHashMapBasedStateMap using `DefaultSerializer` to make it work with Kryo. Author: Shixiong Zhu <shixiong@databricks.com> Closes #10609 from zsxwing/SPARK-12591.
*	[SPARK-12510][STREAMING] Refactor ActorReceiver to support Java	Shixiong Zhu	2016-01-07	1	-0/+3
	This PR includes the following changes:
	1. Rename `ActorReceiver` to `ActorReceiverSupervisor`
	2. Remove `ActorHelper`
	3. Add a new `ActorReceiver` for Scala and `JavaActorReceiver` for Java
	4. Add `JavaActorWordCount` example
	Author: Shixiong Zhu <shixiong@databricks.com> Closes #10457 from zsxwing/java-actor-stream.
*	[SPARK-12665][CORE][GRAPHX] Remove Vector, VectorSuite and GraphKryoRegistrator which are deprecated and no longer used	Kousuke Saruta	2016-01-06	1	-0/+7
	The code in Vector.scala, VectorSuite.scala and GraphKryoRegistrator.scala is no longer used, so it's time to remove it in Spark 2.0. Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp> Closes #10613 from sarutak/SPARK-12665.
*	[SPARK-12659] fix NPE in UnsafeExternalSorter (used by cartesian product)	Davies Liu	2016-01-05	1	-0/+1
	Cartesian product uses UnsafeExternalSorter without a comparator to do spilling, so it throws an NPE if spilling happens. This bug was also hit by #10605. cc JoshRosen Author: Davies Liu <davies@databricks.com> Closes #10606 from davies/fix_spilling.
*	[SPARK-12615] Remove some deprecated APIs in RDD/SparkContext	Reynold Xin	2016-01-05	1	-1/+52
	I looked at each case individually and it looks like they can all be removed. The only one that I had to think twice about was toArray (I even thought about un-deprecating it, until I realized it was a problem in Java to have toArray returning java.util.List). Author: Reynold Xin <rxin@databricks.com> Closes #10569 from rxin/SPARK-12615.
*	[SPARK-12600][SQL] Remove deprecated methods in Spark SQL	Reynold Xin	2016-01-04	1	-3/+11
	Author: Reynold Xin <rxin@databricks.com> Closes #10559 from rxin/remove-deprecated-sql.
*	Update MimaExcludes now Spark 1.6 is in Maven.	Reynold Xin	2016-01-03	1	-147/+11
	Author: Reynold Xin <rxin@databricks.com> Closes #10561 from rxin/update-mima.
*	[SPARK-12481][CORE][STREAMING][SQL] Remove usage of Hadoop deprecated APIs and reflection that supported 1.x	Sean Owen	2016-01-02	1	-0/+5
	Remove use of deprecated Hadoop APIs now that 2.2+ is required. Author: Sean Owen <sowen@cloudera.com> Closes #10446 from srowen/SPARK-12481.
*	[SPARK-7995][SPARK-6280][CORE] Remove AkkaRpcEnv and remove systemName from setupEndpointRef	Shixiong Zhu	2015-12-31	1	-0/+11
	### Remove AkkaRpcEnv
	Keep `SparkEnv.actorSystem` because Streaming still uses it. Will remove it and AkkaUtils after refactoring Streaming actorStream API.
	### Remove systemName
	There are 2 places using `systemName`:
	* `RpcEnvConfig.name`. Actually, although it's used as `systemName` in `AkkaRpcEnv`, `NettyRpcEnv` uses it as the service name to output the log `Successfully started service * on port `. Since the service name in log is useful, I keep `RpcEnvConfig.name`.
	* `def setupEndpointRef(systemName: String, address: RpcAddress, endpointName: String)`. Each `ActorSystem` has a `systemName`. Akka requires `systemName` in its URI and will refuse a connection if `systemName` is not matched. However, `NettyRpcEnv` doesn't use it. So we can remove `systemName` from `setupEndpointRef` since we are removing `AkkaRpcEnv`.
	### Remove RpcEnv.uriOf
	`uriOf` exists because Akka uses different URI formats for with and without authentication, e.g., `akka.ssl.tcp...` and `akka.tcp://...`. But `NettyRpcEnv` uses the same format. So it's not necessary after removing `AkkaRpcEnv`.
	Author: Shixiong Zhu <shixiong@databricks.com> Closes #10459 from zsxwing/remove-akka-rpc-env.
*	[SPARK-12588] Remove HttpBroadcast in Spark 2.0.	Reynold Xin	2015-12-30	1	-1/+2
	We switched to TorrentBroadcast in Spark 1.1, and HttpBroadcast has been undocumented since then. It's time to remove it in Spark 2.0. Author: Reynold Xin <rxin@databricks.com> Closes #10531 from rxin/SPARK-12588.
*	[SPARK-2331] SparkContext.emptyRDD should return RDD[T] not EmptyRDD[T]	Reynold Xin	2015-12-21	1	-0/+3
	Author: Reynold Xin <rxin@databricks.com> Closes #10394 from rxin/SPARK-2331.
*	Bump master version to 2.0.0-SNAPSHOT.	Reynold Xin	2015-12-19	1	-0/+136
	Author: Reynold Xin <rxin@databricks.com> Closes #10387 from rxin/version-bump.
*	[SPARK-11530][MLLIB] Return eigenvalues with PCA model	Sean Owen	2015-12-10	1	-0/+3
	Add `computePrincipalComponentsAndVariance` to also compute PCA's explained variance. CC mengxr Author: Sean Owen <sowen@cloudera.com> Closes #9736 from srowen/SPARK-11530.
*	[SPARK-11155][WEB UI] Stage summary json should include stage duration	Xin Ren	2015-12-08	1	-1/+3
	The json endpoint for stages doesn't include information on the stage duration that is present in the UI. This looks like a simple oversight; it should be included. E.g., the metrics should be included at api/v1/applications/<appId>/stages. Metrics I've added are: submissionTime, firstTaskLaunchedTime and completionTime. Author: Xin Ren <iamshrek@126.com> Closes #10107 from keypointt/SPARK-11155.
*	[SPARK-11314][BUILD][HOTFIX] Add exclusion for moved YARN classes.	Marcelo Vanzin	2015-12-04	1	-1/+4
	Author: Marcelo Vanzin <vanzin@cloudera.com> Closes #10147 from vanzin/SPARK-11314.
*	[SPARK-3580][CORE] Add Consistent Method To Get Number of RDD Partitions Across Different Languages	Jeroen Schot	2015-12-02	1	-0/+4
	I have tried to address all the comments in pull request https://github.com/apache/spark/pull/2447. Note that the second commit (using the new method in all internal code of all components) is quite intrusive and could be omitted. Author: Jeroen Schot <jeroen.schot@surfsara.nl> Closes #9767 from schot/master.
*	[SPARK-11996][CORE] Make the executor thread dump work again	Shixiong Zhu	2015-11-26	1	-0/+8
	In the previous implementation, the driver needs to know the executor listening address to send the thread dump request. However, in Netty RPC, the executor doesn't listen to any port, so the executor thread dump feature is broken. This patch makes the driver use the endpointRef stored in BlockManagerMasterEndpoint to send the thread dump request to fix it. Author: Shixiong Zhu <shixiong@databricks.com> Closes #9976 from zsxwing/executor-thread-dump.
*	[SPARK-11947][SQL] Mark deprecated methods with "This will be removed in Spark 2.0."	Reynold Xin	2015-11-24	1	-1/+2
	Also fixed some documentation issues as I saw them. Author: Reynold Xin <rxin@databricks.com> Closes #9930 from rxin/SPARK-11947.
*	[SPARK-4557][STREAMING] Spark Streaming foreachRDD Java API method should accept a VoidFunction<...>	Bryan Cutler	2015-11-18	1	-0/+4
	Currently streaming foreachRDD Java API uses a function prototype requiring a return value of null. This PR deprecates the old method and uses VoidFunction to allow for more concise declaration. Also added VoidFunction2 to Java API in order to use in Streaming methods. Unit test is added for using foreachRDD with VoidFunction, and changes have been tested with Java 7 and Java 8 using lambdas. Author: Bryan Cutler <bjcutler@us.ibm.com> Closes #9488 from BryanCutler/foreachRDD-VoidFunction-SPARK-4557.
*	[SPARK-9065][STREAMING][PYSPARK] Add MessageHandler for Kafka Python API	jerryshao	2015-11-17	1	-0/+6
	Fixed the merge conflicts in #7410. Closes #7410 Author: Shixiong Zhu <shixiong@databricks.com> Author: jerryshao <saisai.shao@intel.com> Author: jerryshao <sshao@hortonworks.com> Closes #9742 from zsxwing/pr7410.
*	[SPARK-11732] Removes some MiMa false positives	Timothy Hunter	2015-11-17	1	-6/+1
	This adds an extra filter for private or protected classes. We only filter for package private right now. Author: Timothy Hunter <timhunter@databricks.com> Closes #9697 from thunterdb/spark-11732.
*	[SPARK-11766][MLLIB] add toJson/fromJson to Vector/Vectors	Xiangrui Meng	2015-11-17	1	-0/+4
	This is to support JSON serialization of Param[Vector] in the pipeline API. It could be used for other purposes too. The schema is the same as `VectorUDT`. jkbradley Author: Xiangrui Meng <meng@databricks.com> Closes #9751 from mengxr/SPARK-11766.
*	[SPARK-10565][CORE] add missing web UI stats to /api/v1/applications JSON	Charles Yeh	2015-11-09	1	-0/+3
	I looked at the other endpoints, and they don't seem to be missing any fields. Added fields: ![image](https://cloud.githubusercontent.com/assets/613879/10948801/58159982-82e4-11e5-86dc-62da201af910.png) Author: Charles Yeh <charlesyeh@dropbox.com> Closes #9472 from CharlesYeh/api_vars.
*	[SPARK-11541][SQL] Break JdbcDialects.scala into multiple files and mark various dialects as private.	Reynold Xin	2015-11-05	1	-1/+18
	Author: Reynold Xin <rxin@databricks.com> Closes #9511 from rxin/SPARK-11541.
*	Revert "[SPARK-11469][SQL] Allow users to define nondeterministic udfs."	Reynold Xin	2015-11-05	1	-47/+0
	This reverts commit 9cf56c96b7d02a14175d40b336da14c2e1c88339.
*	[SPARK-11485][SQL] Make DataFrameHolder and DatasetHolder public.	Reynold Xin	2015-11-04	1	-0/+3
	These two classes should be public, since they are used in public code. Author: Reynold Xin <rxin@databricks.com> Closes #9445 from rxin/SPARK-11485.
*	[SPARK-9492][ML][R] LogisticRegression in R should provide model statistics	Yanbo Liang	2015-11-04	1	-1/+3
	Like ml `LinearRegression`, `LogisticRegression` should provide a training summary including feature names and their coefficients. Author: Yanbo Liang <ybliang8@gmail.com> Closes #9303 from yanboliang/spark-9492.
*	[SPARK-11469][SQL] Allow users to define nondeterministic udfs.	Yin Huai	2015-11-02	1	-0/+47
	This is the first task (https://issues.apache.org/jira/browse/SPARK-11469) of https://issues.apache.org/jira/browse/SPARK-11438 Author: Yin Huai <yhuai@databricks.com> Closes #9393 from yhuai/udfNondeterministic.
*	[SPARK-11423] remove MapPartitionsWithPreparationRDD	Davies Liu	2015-10-30	1	-1/+5
	Since we do not need to preserve a page before calling compute(), MapPartitionsWithPreparationRDD is not needed anymore. This PR basically reverts #8543, #8511, #8038, #8011. Author: Davies Liu <davies@databricks.com> Closes #9381 from davies/remove_prepare2.
*	[SPARK-10708] Consolidate sort shuffle implementations	Josh Rosen	2015-10-22	1	-2/+7
	There's a lot of duplication between SortShuffleManager and UnsafeShuffleManager. Given that these now provide the same set of functionality, now that UnsafeShuffleManager supports large records, I think that we should replace SortShuffleManager's serialized shuffle implementation with UnsafeShuffleManager's and should merge the two managers together. Author: Josh Rosen <joshrosen@databricks.com> Closes #8829 from JoshRosen/consolidate-sort-shuffle-implementations.
*	[SPARK-10921][YARN] Completely remove the use of SparkContext.prefer…	Jacek Laskowski	2015-10-19	1	-0/+3
	…redNodeLocationData Author: Jacek Laskowski <jacek.laskowski@deepsense.io> Closes #8976 from jaceklaskowski/SPARK-10921.
*	[SPARK-10810] [SPARK-10902] [SQL] Improve session management in SQL	Davies Liu	2015-10-08	1	-1/+21
	This PR improves session management by replacing the thread-local based approach with one SQLContext per session, and introduces separate temporary tables and UDFs/UDAFs for each session. A new session of SQLContext can be created by:
	1) creating a new SQLContext
	2) calling newSession() on an existing SQLContext
	For HiveContext, in order to reduce the cost for each session, the classloader and Hive client are shared across multiple sessions (created by newSession). CacheManager is also shared by multiple sessions, so caching a table multiple times in different sessions will not cause multiple copies of the in-memory cache. Added jars are still shared by all the sessions, because SparkContext does not support sessions. cc marmbrus yhuai rxin
	Author: Davies Liu <davies@databricks.com> Closes #8909 from davies/sessions.
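The sharing rules described in this commit can be pictured with a small toy model. This is only a sketch in plain Python, not Spark code: the `Session` and `SharedState` classes and all method names are hypothetical, and the cache is keyed by table name purely for simplicity.

```python
class SharedState:
    """Hypothetical stand-in for state shared by all sessions (e.g. the cache)."""

    def __init__(self):
        self.cache = {}  # table name -> cached rows, one copy for every session


class Session:
    """Hypothetical stand-in for one SQL session."""

    def __init__(self, shared=None):
        # A brand-new session creates its own shared state;
        # new_session() passes the existing one along instead.
        self.shared = shared if shared is not None else SharedState()
        self.temp_tables = {}  # per-session temporary tables
        self.udfs = {}         # per-session UDFs/UDAFs

    def new_session(self):
        # Isolated temp tables and UDFs, but the same shared state.
        return Session(self.shared)

    def register_temp_table(self, name, rows):
        self.temp_tables[name] = rows

    def cache_table(self, name, rows):
        # Caching the same table from several sessions keeps a single copy.
        self.shared.cache.setdefault(name, rows)


first = Session()
second = first.new_session()

first.register_temp_table("people", [("alice", 30)])
visible_in_second = "people" in second.temp_tables

rows = [("x", 1), ("y", 2)]
first.cache_table("events", rows)
second.cache_table("events", list(rows))
cached_copies = len(first.shared.cache)
```

In this model a temporary table registered in one session is invisible to the other, while the cache holds exactly one entry no matter which session cached the table.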
*	[SPARK-10938] [SQL] remove typeId in columnar cache	Davies Liu	2015-10-06	1	-1/+3
	This PR removes the typeId in the columnar cache, which is not needed anymore; it also removes DATE and TIMESTAMP (use INT/LONG instead). Author: Davies Liu <davies@databricks.com> Closes #8989 from davies/refactor_cache.
*	[SPARK-9642] [ML] LinearRegression should supported weighted data	Meihua Wu	2015-09-21	1	-2/+6
	In many modeling applications, data points are not necessarily sampled with equal probabilities. Linear regression should support weighting, which accounts for over- or under-sampling. Work in progress. Author: Meihua Wu <meihuawu@umich.edu> Closes #8631 from rotationsymmetry/SPARK-9642.
*	[SPARK-9808] Remove hash shuffle file consolidation.	Reynold Xin	2015-09-18	1	-0/+4
	Author: Reynold Xin <rxin@databricks.com> Closes #8812 from rxin/SPARK-9808-1.
*	[SPARK-10381] Fix mixup of taskAttemptNumber & attemptId in OutputCommitCoordinator	Josh Rosen	2015-09-15	1	-1/+35
	When speculative execution is enabled, consider a scenario where the authorized committer of a particular output partition fails during the OutputCommitter.commitTask() call. In this case, the OutputCommitCoordinator is supposed to release that committer's exclusive lock on committing once that task fails. However, due to a unit mismatch (we used task attempt number in one place and task attempt id in another) the lock will not be released, causing Spark to go into an infinite retry loop. This bug was masked by the fact that the OutputCommitCoordinator does not have enough end-to-end tests (the current tests use many mocks). Other factors contributing to this bug are the fact that we have many similarly-named identifiers that have different semantics but the same data types (e.g. attemptNumber and taskAttemptId, with inconsistent variable naming which makes them difficult to distinguish). This patch adds a regression test and fixes this bug by always using task attempt numbers throughout this code. Author: Josh Rosen <joshrosen@databricks.com> Closes #8544 from JoshRosen/SPARK-10381.
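The failure mode described in this commit can be reproduced with a toy coordinator. The sketch below is plain Python and is not the Spark implementation: the `CommitCoordinator` class, its methods, and the identifier values (attempt number 0, task attempt id 17) are all hypothetical and chosen only to show how comparing two different kinds of identifier leaves the lock held.

```python
class CommitCoordinator:
    """Toy model: one authorized committer per partition, keyed by attempt number."""

    def __init__(self):
        self.authorized = {}  # partition -> attempt number holding the commit lock

    def can_commit(self, partition, attempt_number):
        # The first attempt to ask becomes the authorized committer.
        holder = self.authorized.setdefault(partition, attempt_number)
        return holder == attempt_number

    def task_failed(self, partition, attempt_identifier):
        # Release the lock only if the failed attempt is the current holder.
        if self.authorized.get(partition) == attempt_identifier:
            del self.authorized[partition]


# Hypothetical identifiers for the same failed task: per-task attempt number 0,
# globally unique task attempt id 17.
ATTEMPT_NUMBER, TASK_ATTEMPT_ID = 0, 17

# Mismatch: the lock is taken with the attempt number but released with the id.
buggy = CommitCoordinator()
buggy.can_commit(0, ATTEMPT_NUMBER)
buggy.task_failed(0, TASK_ATTEMPT_ID)      # compares 0 with 17, nothing released
buggy_retry_allowed = buggy.can_commit(0, 1)

# Consistent: attempt numbers are used on both sides.
fixed = CommitCoordinator()
fixed.can_commit(0, ATTEMPT_NUMBER)
fixed.task_failed(0, ATTEMPT_NUMBER)       # lock released
fixed_retry_allowed = fixed.can_commit(0, 1)
```

With mismatched identifiers the retry (attempt number 1) is never authorized, which corresponds to the retry loop the commit message describes; with attempt numbers on both sides the retry obtains the lock.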
*	[SPARK-7685] [ML] Apply weights to different samples in Logistic Regression	DB Tsai	2015-09-15	1	-1/+9
	In a fraud detection dataset, almost all the samples are negative while only a couple of them are positive. This type of highly imbalanced data will bias the models toward negative, resulting in poor performance. In python-scikit, they provide a correction allowing users to over-/undersample the samples of each class according to the given weights. In auto mode, it selects weights inversely proportional to class frequencies in the training set. This can be done in a more efficient way by multiplying the weights into loss and gradient instead of doing actual over/undersampling in the training dataset, which is very expensive. http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html On the other hand, some of the training data may be more important, like the training samples from tenure users, while the training samples from new users may be less important. We should be able to provide another "weight: Double" information in the LabeledPoint to weight them differently in the learning algorithm. Author: DB Tsai <dbt@netflix.com> Author: DB Tsai <dbt@dbs-mac-pro.corp.netflix.com> Closes #7884 from dbtsai/SPARK-7685.
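The claim that weights can be multiplied into the loss and gradient instead of physically oversampling can be checked on a tiny example. The following is a minimal sketch in plain Python, assuming the standard unregularized logistic loss; the function name, the sample data, and the coefficients are illustrative and are not taken from the patch.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def weighted_loss_and_gradient(samples, coef):
    """Logistic loss and gradient; samples are (features, label, weight), label in {0, 1}."""
    loss = 0.0
    grad = [0.0] * len(coef)
    for features, label, weight in samples:
        margin = sum(c * x for c, x in zip(coef, features))
        p = sigmoid(margin)
        # The sample weight scales this sample's contribution to the loss...
        loss -= weight * (label * math.log(p) + (1 - label) * math.log(1.0 - p))
        # ...and to the gradient.
        for j, x in enumerate(features):
            grad[j] += weight * (p - label) * x
    return loss, grad


coef = [0.1, -0.2]

# One positive sample with weight 3 and one negative sample with weight 1.
weighted = [([1.0, 2.0], 1, 3.0), ([1.0, -1.0], 0, 1.0)]
# The same data with the positive sample physically duplicated three times.
duplicated = [([1.0, 2.0], 1, 1.0)] * 3 + [([1.0, -1.0], 0, 1.0)]

loss_w, grad_w = weighted_loss_and_gradient(weighted, coef)
loss_d, grad_d = weighted_loss_and_gradient(duplicated, coef)
```

For integer weights the two datasets give the same loss and gradient up to floating-point rounding, so the optimizer sees the same objective without the training set having to grow.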
*	Update version to 1.6.0-SNAPSHOT.	Reynold Xin	2015-09-15	1	-2/+11
	Author: Reynold Xin <rxin@databricks.com> Closes #8350 from rxin/1.6.
*	[SPARK-9767] Remove ConnectionManager.	Reynold Xin	2015-09-07	1	-627/+630
	We introduced the Netty network module for shuffle in Spark 1.2, and have had it turned on by default for 3 releases. The old ConnectionManager is difficult to maintain. If we merge the patch now, by the time it is released, ConnectionManager will have been off by default for 1 year. It's time to remove it. Author: Reynold Xin <rxin@databricks.com> Closes #8161 from rxin/SPARK-9767.
*	[SPARK-10004] [SHUFFLE] Perform auth checks when clients read shuffle data.	Marcelo Vanzin	2015-09-02	1	-0/+1
	To correctly isolate applications, when requests to read shuffle data arrive at the shuffle service, proper authorization checks need to be performed. This change makes sure that only the application that created the shuffle data can read from it. Such checks are only enabled when "spark.authenticate" is enabled, otherwise there's no secure way to make sure that the client is really who it says it is. Author: Marcelo Vanzin <vanzin@cloudera.com> Closes #8218 from vanzin/SPARK-10004.
*	[SPARK-9580] [SQL] Replace singletons in SQL tests	Andrew Or	2015-08-13	1	-0/+10
	A fundamental limitation of the existing SQL tests is that there is simply no way to create your own `SparkContext`. This is a serious limitation because the user may wish to use a different master or config. As a case in point, `BroadcastJoinSuite` is entirely commented out because there is no way to make it pass with the existing infrastructure. This patch removes the singletons `TestSQLContext` and `TestData`, and instead introduces a `SharedSQLContext` that starts a context per suite. Unfortunately the singletons were so ingrained in the SQL tests that this patch necessarily needed to touch all the SQL test files. Author: Andrew Or <andrew@databricks.com> Closes #8111 from andrewor14/sql-tests-refactor.
*	[SPARK-9704] [ML] Made ProbabilisticClassifier, Identifiable, VectorUDT public APIs	Joseph K. Bradley	2015-08-12	1	-0/+4
	Made ProbabilisticClassifier, Identifiable, VectorUDT public. All are annotated as DeveloperApi. CC: mengxr EronWright
	Author: Joseph K. Bradley <joseph@databricks.com> Closes #8004 from jkbradley/ml-api-public-items and squashes the following commits:
	7ebefda [Joseph K. Bradley] update per code review
	7ff0768 [Joseph K. Bradley] attepting to add mima fix
	756d84c [Joseph K. Bradley] VectorUDT annotated as AlphaComponent
	ae7767d [Joseph K. Bradley] added another warning
	94fd553 [Joseph K. Bradley] Made ProbabilisticClassifier, Identifiable, VectorUDT public APIs
*	[SPARK-9763][SQL] Minimize exposure of internal SQL classes.	Reynold Xin	2015-08-10	1	-3/+21
	There are a few changes in this pull request:
	1. Moved all data sources to execution.datasources, except the public JDBC APIs.
	2. In order to maintain backward compatibility from 1, added a backward compatibility translation map in data source resolution.
	3. Moved ui and metric package into execution.
	4. Added more documentation on some internal classes.
	5. Renamed DataSourceRegister.format -> shortName.
	6. Added "override" modifier on shortName.
	7. Removed IntSQLMetric.
	Author: Reynold Xin <rxin@databricks.com> Closes #8056 from rxin/SPARK-9763 and squashes the following commits:
	9df4801 [Reynold Xin] Removed hardcoded name in test cases.
	d9babc6 [Reynold Xin] Shorten.
	e484419 [Reynold Xin] Removed VisibleForTesting.
	171b812 [Reynold Xin] MimaExcludes.
	2041389 [Reynold Xin] Compile ...
	79dda42 [Reynold Xin] Compile.
	0818ba3 [Reynold Xin] Removed IntSQLMetric.
	c46884f [Reynold Xin] Two more fixes.
	f9aa88d [Reynold Xin] [SPARK-9763][SQL] Minimize exposure of internal SQL classes.