This reverts commit b77f87673d1f9f03d4c83cf583158227c551359b.
 
This reverts commit 0a16abadc59082b7d3a24d7f3625236658632813.
 
MetastoreRelation's sameresult method only compare databasename and table name)"
This reverts commit 54864403c4f132d9c1380c015122a849dd44dff8.
 
override the MetastoreRelation's sameResult method to compare only database name and table name
Previously, after:
    cache table t1;
    select count(*) from t1;
the data would be read from memory, but the query below would instead read it from HDFS:
    select count(*) from t1 t;
Cached data is keyed by the logical plan and looked up with sameResult, so a table referenced through an alias yields a different logical plan than the same table without the alias. The fix is to make sameResult compare only the database name and the table name.
Author: seayi <405078363@qq.com>
Author: Michael Armbrust <michael@databricks.com>
Closes #3898 from seayi/branch-1.2 and squashes the following commits:
8f0c7d2 [seayi] Update CachedTableSuite.scala
a277120 [seayi] Update HiveMetastoreCatalog.scala
8d910aa [seayi] Update HiveMetastoreCatalog.scala
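A minimal sketch of the idea (simplified, not the exact HiveMetastoreCatalog code; the class shape here is illustrative):

    // Key cache lookups on database and table name only, so that an aliased
    // reference like "t1 t" still matches the cached plan for "t1".
    case class MetastoreRelation(databaseName: String, tableName: String, alias: Option[String]) {
      def sameResult(other: MetastoreRelation): Boolean =
        other.databaseName == databaseName && other.tableName == tableName
    }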
 
This patch makes Spark 1.2.1rc2 work again on Windows.
Without it you get the following log output on creating a Spark context:
INFO org.apache.spark.SparkEnv:59 - Registering BlockManagerMaster
ERROR org.apache.spark.util.Utils:75 - Failed to create local root dir in .... Ignoring this directory.
ERROR org.apache.spark.storage.DiskBlockManager:75 - Failed to create any local dir.
Author: Martin Weindel <martin.weindel@gmail.com>
Author: mweindel <m.weindel@usu-software.de>
Closes #4299 from MartinWeindel/branch-1.2 and squashes the following commits:
535cb7f [Martin Weindel] fixed last commit
f17072e [Martin Weindel] moved condition to caller to avoid confusion on chmod700() return value
4de5e91 [Martin Weindel] reverted to unix line ends
fe2740b [mweindel] moved comment
ac4749c [mweindel] fixed chmod700 for Windows
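The portability problem comes down to java.io.File's permission setters, which can return false on Windows for operations it cannot perform; a hedged sketch of an owner-only chmod in that style (not the verbatim Utils code):

    import java.io.File

    // Restrict a directory to its owner. Some of these calls return false on
    // Windows, so the caller decides how strict to be instead of this helper
    // failing outright.
    def chmod700(file: File): Boolean = {
      file.setReadable(false, false) &&   // clear read for everyone...
      file.setReadable(true, true) &&     // ...then grant it to the owner only
      file.setWritable(false, false) &&
      file.setWritable(true, true) &&
      file.setExecutable(false, false) &&
      file.setExecutable(true, true)
    }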
 
Author: Nicholas Chammas <nicholas.chammas@gmail.com>
Closes #4312 from nchammas/patch-2 and squashes the following commits:
9d943aa [Nicholas Chammas] [Docs] Fix Building Spark link text
(cherry picked from commit 3f941b68a2336aa7876aeda99865e7c19b53bc5c)
Signed-off-by: Andrew Or <andrew@databricks.com>
 
This reverts commit 3e2d7d310b76c293b9ac787f204e6880f508f6ec.
 
This reverts commit f53a4319ba5f0843c077e64ae5a41e2fac835a5b.
 
Fix the Python example of ALS in the guide: use Rating instead of np.array.
Author: Davies Liu <davies@databricks.com>
Closes #4226 from davies/fix_als_guide and squashes the following commits:
1433d76 [Davies Liu] fix python example of als in guide
(cherry picked from commit fdaad4eb0388cfe43b5b6600927eb7b9182646f9)
Signed-off-by: Xiangrui Meng <meng@databricks.com>
 
Here's one way to make the hashes match what Maven's plugins would create. It takes a little extra footwork since OS X doesn't have the same command line tools. An alternative is just to make Maven output these of course - would that be better? I ask in case there is a reason I'm missing, like, we need to hash files that Maven doesn't build.
Author: Sean Owen <sowen@cloudera.com>
Closes #4161 from srowen/SPARK-5308 and squashes the following commits:
70d09d0 [Sean Owen] Use $(...) syntax
e25eff8 [Sean Owen] Generate MD5, SHA1 hashes in a format like Maven's plugin
(cherry picked from commit ff356e2a21e31998cda3062e560a276a3bfaa7ab)
Signed-off-by: Patrick Wendell <patrick@databricks.com>
 
This reverts commit e87eb2b42f137c22194cfbca2abf06fecdf943da.
 
This reverts commit adfed7086f10fa8db4eeac7996c84cf98f625e9a.
 
Defer use of log4j class until it's known that log4j 1.2 is being used. This may avoid dealing with log4j dependencies for callers that reroute slf4j to another logging framework. The only change is to push one half of the check in the original `if` condition inside. This is a trivial change, may or may not actually solve a problem, but I think it's all that makes sense to do for SPARK-4147.
Author: Sean Owen <sowen@cloudera.com>
Closes #4190 from srowen/SPARK-4147 and squashes the following commits:
4e99942 [Sean Owen] Defer use of log4j class until it's known that log4j 1.2 is being used. This may avoid dealing with log4j dependencies for callers that reroute slf4j to another logging framework.
(cherry picked from commit 54e7b456dd56c9e52132154e699abca87563465b)
Signed-off-by: Patrick Wendell <patrick@databricks.com>
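The shape of the deferred check, sketched under the assumption that slf4j's StaticLoggerBinder reports the bound factory (close to, but not verbatim, the Spark Logging code):

    import org.slf4j.impl.StaticLoggerBinder
    import org.apache.log4j.LogManager

    // Only touch log4j classes after confirming slf4j is actually bound to
    // log4j 1.2, so callers using another backend never load them.
    val binderClass = StaticLoggerBinder.getSingleton.getLoggerFactoryClassStr
    val usingLog4j12 = "org.slf4j.impl.Log4jLoggerFactory" == binderClass
    if (usingLog4j12) {
      val log4j12Initialized = LogManager.getRootLogger.getAllAppenders.hasMoreElements
      if (!log4j12Initialized) {
        // initialize a default log4j.properties here
      }
    }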
 
j.u.c.ConcurrentHashMap is more battle tested.
cc rxin JoshRosen pwendell
Author: Davies Liu <davies@databricks.com>
Closes #4208 from davies/safe-conf and squashes the following commits:
c2182dc [Davies Liu] address comments, fix tests
3a1d821 [Davies Liu] fix test
da14ced [Davies Liu] Merge branch 'master' of github.com:apache/spark into safe-conf
ae4d305 [Davies Liu] change to j.u.c.ConcurrentMap
f8fa1cf [Davies Liu] change to TrieMap
a1d769a [Davies Liu] make SparkConf thread-safe
(cherry picked from commit 142093179a4c40bdd90744191034de7b94a963ff)
Signed-off-by: Josh Rosen <joshrosen@databricks.com>
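A minimal sketch of the j.u.c-backed settings map (class and method names are illustrative, not Spark's):

    import java.util.concurrent.ConcurrentHashMap
    import scala.collection.JavaConverters._

    class SafeConf {
      private val settings = new ConcurrentHashMap[String, String]()
      def set(key: String, value: String): Unit = settings.put(key, value)
      def get(key: String): Option[String] = Option(settings.get(key))
      // Iteration is weakly consistent: it never throws and never observes a
      // half-applied update, even with concurrent writers.
      def getAll: Array[(String, String)] = settings.asScala.toArray
    }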
 
Another trivial one. The RAT failure was due to temp files from `FailureSuite` not being cleaned up. This just makes the cleanup more reliable by using the standard temp dir mechanism.
Author: Sean Owen <sowen@cloudera.com>
Closes #4189 from srowen/SPARK-4430 and squashes the following commits:
9ea63ff [Sean Owen] Properly acquire a temp directory to ensure it is cleaned up at shutdown, which helps avoid a RAT check failure
(cherry picked from commit 0528b85cf96f9c9c074b5fbb5b9c5dd8071c0bc7)
Signed-off-by: Andrew Or <andrew@databricks.com>
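The "standard temp dir mechanism" here means creating the directory through a helper that registers shutdown-time deletion; a JDK-only sketch of that pattern (assumed, not the actual Spark helper):

    import java.io.File
    import java.nio.file.Files

    // A temp dir deleted recursively at JVM exit, so suites cannot leave
    // stray files behind for the RAT check to trip over.
    def createTempDir(prefix: String = "spark-test"): File = {
      val dir = Files.createTempDirectory(prefix).toFile
      sys.addShutdownHook {
        def delete(f: File): Unit = {
          Option(f.listFiles).getOrElse(Array.empty[File]).foreach(delete)
          f.delete()
        }
        delete(dir)
      }
      dir
    }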
 
file was renamed to completed file"
This reverts commit 8f55beeb51e6ea72e63af3f276497f61dd24d09b.
 
`FsHistoryProvider` tries to update the application status, but if `checkForLogs` is called before the `.inprogress` file is renamed to the completed file, the file is not recognized as completed.
Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>
Closes #4132 from sarutak/SPARK-5344 and squashes the following commits:
9658008 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-5344
d2c72b6 [Kousuke Saruta] Fixed update issue of FsHistoryProvider
(cherry picked from commit 8f5c827b01026bf45fc774ed7387f11a941abea8)
Signed-off-by: Andrew Or <andrew@databricks.com>
Conflicts:
core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala
 
Update more docs to reflect that standalone works in cluster mode
This is a trivial addendum to SPARK-4506, which was already resolved; noted by Asim Jalis in SPARK-4506.
Author: Sean Owen <sowen@cloudera.com>
Closes #4160 from srowen/SPARK-4506 and squashes the following commits:
5f5f7df [Sean Owen] Update more docs to reflect that standalone works in cluster mode
(cherry picked from commit 9f6435763d173d2abf82d16b5878983fa8bf3419)
Signed-off-by: Andrew Or <andrew@databricks.com>
 
SPARK-5382: Use SPARK_CONF_DIR in spark-class and spark-submit, spark-submit2.cmd if it is defined
Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>
Closes #4177 from jacek-lewandowski/SPARK-5382-1.2 and squashes the following commits:
41cef25 [Jacek Lewandowski] SPARK-5382: Use SPARK_CONF_DIR in spark-class and spark-submit, spark-submit2.cmd if it is defined
 
Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>
Closes #4179 from jacek-lewandowski/SPARK-5382-1.3 and squashes the following commits:
55d7791 [Jacek Lewandowski] SPARK-5382: Use SPARK_CONF_DIR in spark-class if it is defined
 
As per the JIRA. I copied the `spark.executor.extra*` text, but removed info that appears to be specific to the `executor` config and not `driver`.
Author: Sean Owen <sowen@cloudera.com>
Closes #4185 from srowen/SPARK-3852 and squashes the following commits:
f60a8a1 [Sean Owen] Document spark.driver.extra* configs
(cherry picked from commit c586b45dd25b50be7f195df2ce91b307e1ed71a9)
Signed-off-by: Andrew Or <andrew@databricks.com>
 
also rename "slaveHostname" to "executorHostname"
Author: Ryan Williams <ryan.blake.williams@gmail.com>
Closes #4195 from ryan-williams/exec and squashes the following commits:
e60a7bb [Ryan Williams] log executor ID at executor-construction time
(cherry picked from commit aea25482c370fbcf712a464501605bc16ee4ed5d)
Signed-off-by: Andrew Or <andrew@databricks.com>
Conflicts:
core/src/main/scala/org/apache/spark/executor/Executor.scala
 
Author: Ryan Williams <ryan.blake.williams@gmail.com>
Closes #4194 from ryan-williams/metrics and squashes the following commits:
7c5a33f [Ryan Williams] set executor ID before creating MetricsSystem
 
- Also fixed java link
Author: Jongyoul Lee <jongyoul@gmail.com>
Closes #4172 from jongyoul/SPARK-FIXDOC and squashes the following commits:
6be03e5 [Jongyoul Lee] [SPARK-5058] Part 2. Typos and broken URL - Also fixed java link
(cherry picked from commit 09e09c548e7722fca1cdc89bd37de2cee58f4ce9)
Signed-off-by: Reynold Xin <rxin@databricks.com>
 
Do not use Partitioner.defaultPartitioner as a partitioner of EdgeRDDImpl
If the value of 'spark.default.parallelism' does not match the number of partitions in EdgePartition (EdgeRDDImpl),
the following error occurs in ReplicatedVertexView.scala:72:
    object GraphTest extends Logging {
      def run[VD: ClassTag, ED: ClassTag](graph: Graph[VD, ED]): VertexRDD[Int] = {
        graph.aggregateMessages(
          ctx => {
            ctx.sendToSrc(1)
            ctx.sendToDst(2)
          },
          _ + _)
      }
    }

    val g = GraphLoader.edgeListFile(sc, "graph.txt")
    val rdd = GraphTest.run(g)
    java.lang.IllegalArgumentException: Can't zip RDDs with unequal numbers of partitions
        at org.apache.spark.rdd.ZippedPartitionsBaseRDD.getPartitions(ZippedPartitionsRDD.scala:57)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:206)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:204)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:204)
        at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:206)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:204)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:204)
        at org.apache.spark.ShuffleDependency.<init>(Dependency.scala:82)
        at org.apache.spark.rdd.ShuffledRDD.getDependencies(ShuffledRDD.scala:80)
        at org.apache.spark.rdd.RDD$$anonfun$dependencies$2.apply(RDD.scala:193)
        at org.apache.spark.rdd.RDD$$anonfun$dependencies$2.apply(RDD.scala:191)
        ...
Author: Takeshi Yamamuro <linguin.m.s@gmail.com>
Closes #4136 from maropu/EdgePartitionBugFix and squashes the following commits:
0cd8942 [Ankur Dave] Use more concise getOrElse
aad4a2c [Ankur Dave] Add unit test for non-default number of edge partitions
0a2f32b [Takeshi Yamamuro] Do not use Partitioner.defaultPartitioner as a partitioner of EdgeRDDImpl
(cherry picked from commit e224dbb011789297cd6c6ba095f702c042869ed6)
Signed-off-by: Ankur Dave <ankurdave@gmail.com>
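The eventual fix avoids Partitioner.defaultPartitioner (which consults spark.default.parallelism); a hedged sketch of deriving the partitioner from the RDD being shuffled instead:

    import org.apache.spark.{HashPartitioner, Partitioner}
    import org.apache.spark.rdd.RDD

    // Use the RDD's own partitioner, or one sized to its partition count,
    // so the partition counts of the zipped RDDs always line up.
    def partitionerFor(edges: RDD[_]): Partitioner =
      edges.partitioner.getOrElse(new HashPartitioner(edges.partitions.length))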
 
This patch adds more helpful error messages for invalid programs that define nested RDDs, broadcast RDDs, perform actions inside of transformations (e.g. calling `count()` from inside of `map()`), and call certain methods on stopped SparkContexts. Currently, these invalid programs lead to confusing NullPointerExceptions at runtime and have been a major source of questions on the mailing list and StackOverflow.
In a few cases, I chose to log warnings instead of throwing exceptions in order to avoid any chance that this patch breaks programs that worked "by accident" in earlier Spark releases (e.g. programs that define nested RDDs but never run any jobs with them).
In SparkContext, the new `assertNotStopped()` method is used to check whether methods are being invoked on a stopped SparkContext. In some cases, user programs will not crash in spite of calling methods on stopped SparkContexts, so I've only added `assertNotStopped()` calls to methods that always throw exceptions when called on stopped contexts (e.g. by dereferencing a null `dagScheduler` pointer).
Author: Josh Rosen <joshrosen@databricks.com>
Closes #3884 from JoshRosen/SPARK-5063 and squashes the following commits:
a38774b [Josh Rosen] Fix spelling typo
a943e00 [Josh Rosen] Convert two exceptions into warnings in order to avoid breaking user programs in some edge-cases.
2d0d7f7 [Josh Rosen] Fix test to reflect 1.2.1 compatibility
3f0ea0c [Josh Rosen] Revert two unintentional formatting changes
8e5da69 [Josh Rosen] Remove assertNotStopped() calls for methods that were sometimes safe to call on stopped SC's in Spark 1.2
8cff41a [Josh Rosen] IllegalStateException fix
6ef68d0 [Josh Rosen] Fix Python line length issues.
9f6a0b8 [Josh Rosen] Add improved error messages to PySpark.
13afd0f [Josh Rosen] SparkException -> IllegalStateException
8d404f3 [Josh Rosen] Merge remote-tracking branch 'origin/master' into SPARK-5063
b39e041 [Josh Rosen] Fix BroadcastSuite test which broadcasted an RDD
99cc09f [Josh Rosen] Guard against calling methods on stopped SparkContexts.
34833e8 [Josh Rosen] Add more descriptive error message.
57cc8a1 [Josh Rosen] Add error message when directly broadcasting RDD.
15b2e6b [Josh Rosen] [SPARK-5063] Useful error messages for nested RDDs and actions inside of transformations
(cherry picked from commit cef1f092a628ac20709857b4388bb10e0b5143b0)
Signed-off-by: Josh Rosen <joshrosen@databricks.com>
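The guard itself is a small pattern; sketched here outside SparkContext for illustration:

    class StoppableContext {
      @volatile private var stopped = false

      def stop(): Unit = { stopped = true }

      // Fail fast with a descriptive message instead of letting a null
      // dagScheduler surface later as a NullPointerException.
      private def assertNotStopped(): Unit =
        if (stopped) {
          throw new IllegalStateException("Cannot call methods on a stopped SparkContext")
        }
    }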
 
Because `BlockAllocationEvent` is missing in WAL recovery, dangling events get mixed into the new batch, which leads to wrong results. Details can be seen in [SPARK-5233](https://issues.apache.org/jira/browse/SPARK-5233).
Author: jerryshao <saisai.shao@intel.com>
Closes #4032 from jerryshao/SPARK-5233 and squashes the following commits:
f0b0c0b [jerryshao] Further address the comments
a237c75 [jerryshao] Address the comments
e356258 [jerryshao] Fix bug in unit test
558bdc3 [jerryshao] Correctly replay the WAL log when recovering from failure
(cherry picked from commit 3c3fa632e6ba45ce536065aa1145698385301fb2)
Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>
 
conversions.
 
This is a refactored fix based on jerryshao's PR #4037.
This enables deletion of old WAL files containing the received block data.
Improvements over #4037
- Respects the rememberDuration of all receiver streams (see the sketch after this entry). In #4037, if there were two receiver streams with different remember durations, deletion would have been based on the shortest one, thus deleting data prematurely for the stream with the longer remember duration.
- Added a unit test covering creation of the receiver WAL, automatic deletion, and respect for the remember duration.
jerryshao I am going to merge this ASAP to make it into 1.2.1. Thanks for the initial draft of this PR. Made my job much easier.
Author: Tathagata Das <tathagata.das1565@gmail.com>
Author: jerryshao <saisai.shao@intel.com>
Closes #4149 from tdas/SPARK-5147 and squashes the following commits:
730798b [Tathagata Das] Added comments.
c4cf067 [Tathagata Das] Minor fixes
2579b27 [Tathagata Das] Refactored the fix to make sure that the cleanup respects the remember duration of all the receiver streams
2736fd1 [jerryshao] Delete the old WAL log periodically
(cherry picked from commit 3027f06b4127ab23a43c5ce8cebf721e3b6766e5)
Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>
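"Respecting the rememberDuration of all receiver streams" amounts to cleaning against the longest one; a hedged sketch (function name illustrative):

    import org.apache.spark.streaming.{Duration, Time}

    // With several receiver streams, clean WAL data only past the longest
    // remember duration, so no stream loses blocks it may still need.
    def cleanupThreshold(current: Time, rememberDurations: Seq[Duration]): Time =
      current - rememberDurations.maxBy(_.milliseconds)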
 
SparkConf is not thread-safe but is accessed by many threads. getAll() could return only part of the configs if another thread is modifying it concurrently.
This PR changes SparkConf.settings to a thread-safe TrieMap.
Author: Davies Liu <davies@databricks.com>
Closes #4143 from davies/safe-conf and squashes the following commits:
f8fa1cf [Davies Liu] change to TrieMap
a1d769a [Davies Liu] make SparkConf thread-safe
(cherry picked from commit 9bad062268676aaa66dcbddd1e0ab7f2d7742425)
Signed-off-by: Josh Rosen <joshrosen@databricks.com>
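The consistency property TrieMap buys, illustrated (keys and values made up):

    import scala.collection.concurrent.TrieMap

    val settings = TrieMap("spark.app.name" -> "demo")
    val writer = new Thread(new Runnable {
      def run(): Unit = (1 to 1000).foreach(i => settings.put(s"key$i", i.toString))
    })
    writer.start()
    // Safe during concurrent writes: TrieMap iteration works on a snapshot,
    // so this never returns a torn, half-updated view the way getAll could
    // with a plain mutable map.
    val snapshot = settings.toArray
    writer.join()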
 
Whenever a directory is created by the utility method, immediately restrict
its permissions so that only the owner has access to its contents.
Signed-off-by: Josh Rosen <joshrosen@databricks.com>
 
https://issues.apache.org/jira/browse/SPARK-5006
I think the issue was introduced in https://github.com/apache/spark/pull/1777.
I have not dug into the Mesos backend yet; it may need the same logic as well.
Author: WangTaoTheTonic <barneystinson@aliyun.com>
Author: WangTao <barneystinson@aliyun.com>
Closes #3841 from WangTaoTheTonic/SPARK-5006 and squashes the following commits:
8cdf96d [WangTao] indent thing
2d86d65 [WangTaoTheTonic] fix line length
7cdfd98 [WangTaoTheTonic] fit for new HttpServer constructor
61a370d [WangTaoTheTonic] some minor fixes
bc6e1ec [WangTaoTheTonic] rebase
67bcb46 [WangTaoTheTonic] put conf at 3rd position, modify suite class, add comments
f450cd1 [WangTaoTheTonic] startServiceOnPort will use a SparkConf arg
29b751b [WangTaoTheTonic] rebase as ExecutorRunnableUtil changed to ExecutorRunnable
396c226 [WangTaoTheTonic] make the grammar more like scala
191face [WangTaoTheTonic] invalid value name
62ec336 [WangTaoTheTonic] spark.port.maxRetries doesn't work
Conflicts:
external/mqtt/src/test/scala/org/apache/spark/streaming/mqtt/MQTTStreamSuite.scala
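A hedged sketch of the retry pattern the setting controls (simplified; not the exact Utils.startServiceOnPort):

    import java.net.BindException

    // Try startPort, startPort + 1, ... and give up after maxRetries,
    // letting the final BindException propagate to the caller.
    def startServiceOnPort[T](startPort: Int, maxRetries: Int)(start: Int => T): T = {
      for (offset <- 0 to maxRetries) {
        val tryPort = if (startPort == 0) 0 else startPort + offset
        try {
          return start(tryPort)
        } catch {
          case _: BindException if offset < maxRetries => // try the next port
        }
      }
      throw new IllegalStateException(s"Could not start service on port $startPort")
    }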
 
Add numEdges upper-bound validation for the R-MAT graph generator to prevent an infinite loop
I looked into GraphGenerators#chooseCell and found that chooseCell cannot generate more edges than pow(2, 2 * (log2(numVertices) - 1)) when building a power-law graph. (Ex. numVertices: 4, upper bound: 4; numVertices: 8, upper bound: 16; numVertices: 16, upper bound: 64.)
If we request more edges than the upper bound, rmatGraph falls into an infinite loop. So, how about adding an argument validation?
Author: Kenji Kikushima <kikushima.kenji@lab.ntt.co.jp>
Closes #3950 from kj-ki/SPARK-5064 and squashes the following commits:
4ee18c7 [Ankur Dave] Reword error message and add unit test
d760bc7 [Kenji Kikushima] Add numEdges upperbound validation for R-MAT graph generator to prevent infinite loop.
(cherry picked from commit 3ee3ab592eee831d759c940eb68231817ad6d083)
Signed-off-by: Ankur Dave <ankurdave@gmail.com>
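A sketch of the validation being proposed, assuming numVertices has already been rounded up to a power of two (as chooseCell's quadrant recursion implies):

    // An R-MAT graph on 2^n vertices can produce at most 2^(2*(n-1)) edges
    // through chooseCell, so reject impossible requests up front instead of
    // looping forever.
    def requireEdgeCountFits(numVertices: Int, numEdges: Int): Unit = {
      val n = (math.log(numVertices) / math.log(2)).toInt
      val upperBound = math.pow(2, 2 * (n - 1)).toLong
      require(numEdges <= upperBound,
        s"numEdges must be <= $upperBound for $numVertices vertices, got $numEdges")
    }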
 
"spark.driver.extraClassPath" is set in defaults.conf
Author: GuoQiang Li <witgo@qq.com>
Closes #3050 from witgo/SPARK-4161 and squashes the following commits:
abb6fa4 [GuoQiang Li] move usejavacp opt to spark-shell
89e39e7 [GuoQiang Li] review commit
c2a6f04 [GuoQiang Li] Spark shell class path is not correctly set if "spark.driver.extraClassPath" is set in defaults.conf
 
Hi all - I've renamed the unhelpfully named variable and added a comment clarifying what's actually happening.
Author: Ilya Ganelin <ilya.ganelin@capitalone.com>
Closes #3666 from ilganeli/SPARK-4569B and squashes the following commits:
1810394 [Ilya Ganelin] [SPARK-4569] Rename 'externalSorting' in Aggregator
e2d2092 [Ilya Ganelin] [SPARK-4569] Rename 'externalSorting' in Aggregator
d7cefec [Ilya Ganelin] [SPARK-4569] Rename 'externalSorting' in Aggregator
5b3f39c [Ilya Ganelin] [SPARK-4569] Rename in Aggregator
 
The driver hangs sometimes when we coalesce RDD partitions. See JIRA for more details and reproduction.
This is because our use of empty string as default preferred location in `CoalescedRDDPartition` causes the `TaskSetManager` to schedule the corresponding task on host `""` (empty string). The intended semantics here, however, is that the partition does not have a preferred location, and the TSM should schedule the corresponding task accordingly.
Author: Andrew Or <andrew@databricks.com>
Closes #3633 from andrewor14/coalesce-preferred-loc and squashes the following commits:
e520d6b [Andrew Or] Oops
3ebf8bd [Andrew Or] A few comments
f370a4e [Andrew Or] Fix tests
2f7dfb6 [Andrew Or] Avoid using empty string as default preferred location
(cherry picked from commit 4f93d0cabe5d1fc7c0fd0a33d992fd85df1fecb4)
Signed-off-by: Andrew Or <andrew@databricks.com>
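The intended semantics map naturally onto Option; a hedged sketch of the fix's direction (types simplified):

    // Model "no preferred location" as None rather than the empty string,
    // so the scheduler never tries to match a host literally named "".
    case class CoalescedPartition(index: Int, preferredLocation: Option[String])

    def preferredLocations(p: CoalescedPartition): Seq[String] =
      p.preferredLocation.toSeq // empty when there is no preference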
 
Author: Kannan Rajah <rkannan82@gmail.com>
Closes #4108 from rkannan82/master and squashes the following commits:
eca095b [Kannan Rajah] Update pom.xml to pull MapR's Hadoop version 2.4.1.
(cherry picked from commit ec5b0f2cef4b30047c7f88bdc00d10b6aa308124)
Signed-off-by: Patrick Wendell <patrick@databricks.com>
 
Include the Python source code in the assembly jar.
cc mengxr pwendell
Author: Davies Liu <davies@databricks.com>
Closes #4128 from davies/build_streaming2 and squashes the following commits:
546af4c [Davies Liu] fix indent
48859b2 [Davies Liu] include python source code
(cherry picked from commit bad6c5721167153d7ed834b49f87bf2980c6ed67)
Signed-off-by: Patrick Wendell <patrick@databricks.com>
 
from a projection (backport to Spark-1.2)
This is a follow-up of #3796, which cannot be merged back to Spark-1.2 directly, so it is merged manually here.
Author: Cheng Hao <hao.cheng@intel.com>
Closes #4013 from chenghao-intel/spark_4959_backport and squashes the following commits:
1f6c93d [Cheng Hao] backport to Spark-1.2
 
SPARK-4660: Use correct class loader in JavaSerializer (copy of PR #3840 by Piotr Kolaczkowski)
Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>
Closes #4113 from jacek-lewandowski/SPARK-4660-master and squashes the following commits:
a5e84ca [Jacek Lewandowski] SPARK-4660: Use correct class loader in JavaSerializer (copy of PR #3840 by Piotr Kolaczkowski)
(cherry picked from commit c93a57f0d6dc32b127aa68dbe4092ab0b22a9667)
Signed-off-by: Patrick Wendell <patrick@databricks.com>
 
- The ReceiverTracker receives `RegisterReceiver` messages twice:
1) when the actor in `ReceiverSupervisorImpl`'s preStart is invoked, and
2) after the receiver is started at the executor, via `onReceiverStart()` in `ReceiverSupervisorImpl`.
Though the RegisterReceiver message uses the same streamId and the receiverInfo gets updated every time
the message is processed at the `ReceiverTracker`, it makes sense to register the receiver only after
it has started.
Author: Ilayaperumal Gopinathan <igopinathan@pivotal.io>
Closes #3648 from ilayaperumalg/RTActor-remove-prestart and squashes the following commits:
868efab [Ilayaperumal Gopinathan] Increase receiverInfo collector timeout to 2 secs
3118e5e [Ilayaperumal Gopinathan] Fix StreamingListenerSuite's startedReceiverStreamIds size
634abde [Ilayaperumal Gopinathan] Remove duplicate RegisterReceiver message
(cherry picked from commit 4afad9c7702239f6d5b1b49dc48ee08580964e17)
Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>
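The ordering after the change, sketched with illustrative types (the real code goes through the supervisor's actor):

    case class RegisterReceiver(streamId: Int, receiverType: String, host: String)

    // Register with the tracker only from onReceiverStart(), once the
    // receiver is actually running; preStart() no longer sends the message.
    class SupervisorSketch(streamId: Int, host: String, tracker: RegisterReceiver => Unit) {
      def onReceiverStart(): Unit =
        tracker(RegisterReceiver(streamId, "CustomReceiver", host))
    }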
 
Fix run-example script to fail fast with useful error message if multiple
example assembly JARs are present.
Author: Venkata Ramana Gollamudi <ramana.gollamudi@huawei.com>
Closes #3377 from gvramana/run-example_fails and squashes the following commits:
fa7f481 [Venkata Ramana Gollamudi] Fixed review comments, avoiding ls output scanning.
6aa1ab7 [Venkata Ramana Gollamudi] Fix run-examples script error during multiple jars
(cherry picked from commit 74de94ea6db96a04b278c6106264313504d7b8f3)
Signed-off-by: Josh Rosen <joshrosen@databricks.com>
Conflicts:
bin/compute-classpath.sh
 
Fix the possible int overflow in the memory computation warning
JIRA: https://issues.apache.org/jira/browse/SPARK-5282
Author: Yuhao Yang <hhbyyh@gmail.com>
Closes #4069 from hhbyyh/addscStop and squashes the following commits:
e54e5c8 [Yuhao Yang] change to MB based number
7afac23 [Yuhao Yang] 5282: fix int overflow in the warning
(cherry picked from commit 4432568aac1d4a44fa1a7c3469f095eb7a6ce945)
Signed-off-by: Xiangrui Meng <meng@databricks.com>
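The class of bug, illustrated with made-up sizes:

    // Int arithmetic wraps past 2^31 - 1, so a memory estimate must be
    // computed in Long; reporting it in MB also keeps the number readable.
    val rows = 300000
    val cols = 10000
    val wrong = rows * cols * 8                 // overflows Int, wraps negative
    val bytes = rows.toLong * cols * 8L         // correct in Long
    val megabytes = bytes / (1024L * 1024L)
    println(s"estimated memory: $megabytes MB (Int math gave $wrong)")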