| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This PR refactors LiveListenerBus and StreamingListenerBus and extracts the common codes to a parent class `ListenerBus`.
It also includes bug fixes in #3710:
1. Fix the race condition of queueFullErrorMessageLogged in LiveListenerBus and StreamingListenerBus to avoid outputing `queue-full-error` logs multiple times.
2. Make sure the SHUTDOWN message will be delivered to listenerThread, so that we can make sure listenerThread will always be able to exit.
3. Log the error from listener rather than crashing listenerThread in StreamingListenerBus.
During fixing the above bugs, we find it's better to make LiveListenerBus and StreamingListenerBus have the same bahaviors. Then there will be many duplicated codes in LiveListenerBus and StreamingListenerBus.
Therefore, I extracted their common codes to `ListenerBus` as a parent class: LiveListenerBus and StreamingListenerBus only need to extend `ListenerBus` and implement `onPostEvent` (how to process an event) and `onDropEvent` (do something when droppping an event).
Author: zsxwing <zsxwing@gmail.com>
Closes #4006 from zsxwing/SPARK-4859-refactor and squashes the following commits:
c8dade2 [zsxwing] Fix the code style after renaming
5715061 [zsxwing] Rename ListenerHelper to ListenerBus and the original ListenerBus to AsynchronousListenerBus
f0ef647 [zsxwing] Fix the code style
4e85ffc [zsxwing] Merge branch 'master' into SPARK-4859-refactor
d2ef990 [zsxwing] Add private[spark]
4539f91 [zsxwing] Remove final to pass MiMa tests
a9dccd3 [zsxwing] Remove SparkListenerShutdown
7cc04c3 [zsxwing] Refactor LiveListenerBus and StreamingListenerBus and make them share same code base
|
|
|
|
|
|
|
|
|
|
|
| |
Depends on [SPARK-5413](https://issues.apache.org/jira/browse/SPARK-5413) / #4209, included here, will rebase once the latter's merged.
Author: Ryan Williams <ryan.blake.williams@gmail.com>
Closes #4218 from ryan-williams/udp and squashes the following commits:
ebae393 [Ryan Williams] Add support for sending Graphite metrics via UDP
cb58262 [Ryan Williams] bump metrics dependency to v3.1.0
|
|
|
|
|
|
|
|
|
|
| |
These are more `javadoc` 8-related changes I spotted while investigating. These should be helpful in any event, but this does not nearly resolve SPARK-3359, which may never be feasible while using `unidoc` and `javadoc` 8.
Author: Sean Owen <sowen@cloudera.com>
Closes #4193 from srowen/SPARK-3359 and squashes the following commits:
5b33f66 [Sean Owen] Additional scaladoc fixes for javadoc 8; still not going to be javadoc 8 compatible
|
|
|
|
|
|
|
|
|
|
| |
Just in case there is a bug in the SerializationDebugger that makes error reporting worse than it was.
Author: Reynold Xin <rxin@databricks.com>
Closes #4297 from rxin/ser-config and squashes the following commits:
f1d4629 [Reynold Xin] [SPARK-5307] Add a config option for SerializationDebugger.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds a SerializationDebugger that is used to add serialization path to a NotSerializableException. When a NotSerializableException is encountered, the debugger visits the object graph to find the path towards the object that cannot be serialized, and constructs information to help user to find the object.
The patch uses the internals of JVM serialization (in particular, heavy usage of ObjectStreamClass). Compared with an earlier attempt, this one provides extra information including field names, array offsets, writeExternal calls, etc.
An example serialization stack:
```
Serialization stack:
- object not serializable (class: org.apache.spark.serializer.NotSerializable, value: org.apache.spark.serializer.NotSerializable2c43caa4)
- element of array (index: 0)
- array (class [Ljava.lang.Object;, size 1)
- field (class: org.apache.spark.serializer.SerializableArray, name: arrayField, type: class [Ljava.lang.Object;)
- object (class org.apache.spark.serializer.SerializableArray, org.apache.spark.serializer.SerializableArray193c5908)
- writeExternal data
- externalizable object (class org.apache.spark.serializer.ExternalizableClass, org.apache.spark.serializer.ExternalizableClass320bdadc)
```
Author: Reynold Xin <rxin@databricks.com>
Closes #4098 from rxin/SerializationDebugger and squashes the following commits:
553b3ff [Reynold Xin] Update SerializationDebuggerSuite.scala
572d0cb [Reynold Xin] Disable automatically when reflection fails.
b349b77 [Reynold Xin] [SPARK-5307] SerializationDebugger to help debug NotSerializableException - take 2
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously I had tried to solve this with by adding a line in Spark's log4j-defaults.properties.
The issue with the message in log4j-defaults.properties was that the log4j.properties packaged inside Hadoop was getting picked up instead. While it would be ideal to fix that as well, we still want to quiet this in situations where a user supplies their own custom log4j properties.
Author: Sandy Ryza <sandy@cloudera.com>
Closes #4192 from sryza/sandy-spark-5393 and squashes the following commits:
4d5dedc [Sandy Ryza] Only set log level if unset
46e07c5 [Sandy Ryza] SPARK-5393. Flood of util.RackResolver log messages after SPARK-1714
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, the Python process is released into pool only after the task had finished, it cause many process forked if coalesce() is called.
This PR will change it to release the process as soon as read all the data from it (finish the partition), then a process could be reused to process multiple partitions in a single task.
Author: Davies Liu <davies@databricks.com>
Closes #4238 from davies/py_leak and squashes the following commits:
ec80a43 [Davies Liu] add @volatile
6da437a [Davies Liu] address comments
24ed322 [Davies Liu] fix python process leak while coalesce()
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We have seen many use cases of `treeAggregate`/`treeReduce` outside the ML domain. Maybe it is time to move them to Core. pwendell
Author: Xiangrui Meng <meng@databricks.com>
Closes #4228 from mengxr/SPARK-5430 and squashes the following commits:
20ad40d [Xiangrui Meng] exclude tree* from mima
e89a43e [Xiangrui Meng] fix compile and update java doc
3ae1a4b [Xiangrui Meng] add treeReduce/treeAggregate to Python
6f948c5 [Xiangrui Meng] add treeReduce/treeAggregate to JavaRDDLike
d600b6c [Xiangrui Meng] move treeReduce and treeAggregate to core
|
|
|
|
|
|
|
|
|
|
|
| |
SerDeUtil.pairRDDToPython and SerDeUtil.pythonToPairRDD now both support empty RDDs by checking the result of take(1) instead of calling first which throws an exception.
Author: Michael Nazario <mnazario@palantir.com>
Closes #4236 from mnazario/feature/empty-first and squashes the following commits:
a531c0c [Michael Nazario] Added regression tests for SPARK-5441
e3b2fb6 [Michael Nazario] Added acceptance of the empty case
|
|
|
|
|
|
|
|
|
|
| |
This happens inside SparkEnv initialization as of #4194
Author: Ryan Williams <ryan.blake.williams@gmail.com>
Closes #4213 from ryan-williams/exec-id-set and squashes the following commits:
b3e4f7b [Ryan Williams] Remove redundant executor-id set() call
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In DriverSuite, we currently set a timeout of 60 seconds. If after this time the process has not terminated, we leak the process because we never destroy it.
In SparkSubmitSuite, we currently do not have a timeout so the test can hang indefinitely.
Author: Andrew Or <andrew@databricks.com>
Closes #4230 from andrewor14/fix-driver-suite and squashes the following commits:
f5c80fd [Andrew Or] Fix timeout behaviors in both suites
8092c36 [Andrew Or] Stop SparkContext after every individual test
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
construction in ConnectionManager
This change reshuffles the order of initialization in `ConnectionManager` so that the last thing that happens is running `selectorThread`, which invokes a method that relies on object state in `ConnectionManager`
zsxwing also reported a similar problem in `BlockManager` in the JIRA, but I can't find a similar pattern there. Maybe it was subsequently fixed?
Author: Sean Owen <sowen@cloudera.com>
Closes #4225 from srowen/SPARK-1934 and squashes the following commits:
c4dec3b [Sean Owen] Init all object state in ConnectionManager constructor before starting thread in constructor that accesses object's state
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is found through reading RDD from `sc.newAPIHadoopRDD` and writing it back using `rdd.saveAsNewAPIHadoopFile` in pyspark.
It turns out that whenever there are multiple RDD conversions from JavaRDD to PythonRDD then back to JavaRDD, the exception below happens:
```
15/01/16 10:28:31 ERROR Executor: Exception in task 0.0 in stage 3.0 (TID 7)
java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to java.util.ArrayList
at org.apache.spark.api.python.SerDeUtil$$anonfun$pythonToJava$1$$anonfun$apply$1.apply(SerDeUtil.scala:157)
at org.apache.spark.api.python.SerDeUtil$$anonfun$pythonToJava$1$$anonfun$apply$1.apply(SerDeUtil.scala:153)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:308)
```
The test case code below reproduces it:
```
from pyspark.rdd import RDD
dl = [
(u'2', {u'director': u'David Lean'}),
(u'7', {u'director': u'Andrew Dominik'})
]
dl_rdd = sc.parallelize(dl)
tmp = dl_rdd._to_java_object_rdd()
tmp2 = sc._jvm.SerDe.javaToPython(tmp)
t = RDD(tmp2, sc)
t.count()
tmp = t._to_java_object_rdd()
tmp2 = sc._jvm.SerDe.javaToPython(tmp)
t = RDD(tmp2, sc)
t.count() # it blows up here during the 2nd time of conversion
```
Author: Winston Chen <wchen@quid.com>
Closes #4146 from wingchen/master and squashes the following commits:
903df7d [Winston Chen] SPARK-5361, update to toSeq based on the PR
5d90a83 [Winston Chen] SPARK-5361, make python pretty, so to pass PEP 8 checks
126be6b [Winston Chen] SPARK-5361, add in test case
4cf1187 [Winston Chen] SPARK-5361, add in test case
9f1a097 [Winston Chen] add in tuple handling while converting form python RDD back to JavaRDD
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SparkListenerExecutorAdded and SparkListenerExecutorRemoved
Recently `SparkListenerExecutorAdded` and `SparkListenerExecutorRemoved` are added.
I think it's useful if they have timestamp and the reason why an executor is removed.
Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>
Closes #4082 from sarutak/SPARK-5291 and squashes the following commits:
a026ff2 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-5291
979dfe1 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-5291
cf9f9080 [Kousuke Saruta] Fixed test case
1f2a89b [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-5291
243f2a60 [Kousuke Saruta] Modified MesosSchedulerBackendSuite
a527c35 [Kousuke Saruta] Added timestamp to SparkListenerExecutorAdded
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
from all FSs
...mbineFileSplits
Author: Sandy Ryza <sandy@cloudera.com>
Closes #4050 from sryza/sandy-spark-5199 and squashes the following commits:
864514b [Sandy Ryza] Add tests and fix bug
0d504f1 [Sandy Ryza] Prettify
915c7e6 [Sandy Ryza] Get metrics from all filesystems
cdbc3e8 [Sandy Ryza] SPARK-5199. Input metrics should show up for InputFormats that return CombineFileSplits
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
stage" is broken
This reenables and fixes this test, after addressing two issues:
- The Semaphore that was intended to be shared locally was being serialized and copied; it's now a static member in the companion object as in other tests
- Later changes to Spark means that cancelling the first task will not cancel the shared stage and therefore the second task should succeed
Author: Sean Owen <sowen@cloudera.com>
Closes #4180 from srowen/SPARK-960 and squashes the following commits:
43da66f [Sean Owen] Fix 'two jobs sharing the same stage' test and reenable it: truly share a Semaphore locally as intended, and update expectation of failure in non-cancelled task
|
|
|
|
|
|
|
|
|
|
| |
Defer use of log4j class until it's known that log4j 1.2 is being used. This may avoid dealing with log4j dependencies for callers that reroute slf4j to another logging framework. The only change is to push one half of the check in the original `if` condition inside. This is a trivial change, may or may not actually solve a problem, but I think it's all that makes sense to do for SPARK-4147.
Author: Sean Owen <sowen@cloudera.com>
Closes #4190 from srowen/SPARK-4147 and squashes the following commits:
4e99942 [Sean Owen] Defer use of log4j class until it's known that log4j 1.2 is being used. This may avoid dealing with log4j dependencies for callers that reroute slf4j to another logging framework.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
j.u.c.ConcurrentHashMap is more battle tested.
cc rxin JoshRosen pwendell
Author: Davies Liu <davies@databricks.com>
Closes #4208 from davies/safe-conf and squashes the following commits:
c2182dc [Davies Liu] address comments, fix tests
3a1d821 [Davies Liu] fix test
da14ced [Davies Liu] Merge branch 'master' of github.com:apache/spark into safe-conf
ae4d305 [Davies Liu] change to j.u.c.ConcurrentMap
f8fa1cf [Davies Liu] change to TrieMap
a1d769a [Davies Liu] make SparkConf thread-safe
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
DisassociatedEvent
https://issues.apache.org/jira/browse/SPARK-5268
In CoarseGrainedExecutorBackend, we subscribe DisassociatedEvent in executor backend actor and exit the program upon receive such event...
let's consider the following case
The user may develop an Akka-based program which starts the actor with Spark's actor system and communicate with an external actor system (e.g. an Akka-based receiver in spark streaming which communicates with an external system) If the external actor system fails or disassociates with the actor within spark's system with purpose, we may receive DisassociatedEvent and the executor is restarted.
This is not the expected behavior.....
----
This is a simple fix to check the event before making the quit decision
Author: CodingCat <zhunansjtu@gmail.com>
Closes #4063 from CodingCat/SPARK-5268 and squashes the following commits:
4d7d48e [CodingCat] simplify the log
18c36f4 [CodingCat] more descriptive log
f299e0b [CodingCat] clean log
1632e79 [CodingCat] check whether DisassociatedEvent is relevant before quit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With this change, here's what the UI looks like:
![image](https://cloud.githubusercontent.com/assets/1108612/5809994/1ec8a904-9ff4-11e4-8f24-6a59a1a858f7.png)
If you want to locally test this, you need to spin up multiple executors, because the shuffle read metrics are only shown for data read remotely.
Author: Kay Ousterhout <kayousterhout@gmail.com>
Closes #4110 from kayousterhout/SPARK-5326 and squashes the following commits:
610051e [Kay Ousterhout] Josh style comments
5feaa28 [Kay Ousterhout] What is the difference here??
aa129cb [Kay Ousterhout] Removed inadvertent change
721c742 [Kay Ousterhout] Improved tooltip
f3a7111 [Kay Ousterhout] Style fix
679b4e9 [Kay Ousterhout] [SPARK-5326] Show fetch wait time as optional metric in the UI
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
renamed to completed file
`FsHistoryProvider` tries to update application status but if `checkForLogs` is called before `.inprogress` file is renamed to completed file, the file is not recognized as completed.
Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>
Closes #4132 from sarutak/SPARK-5344 and squashes the following commits:
9658008 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-5344
d2c72b6 [Kousuke Saruta] Fixed update issue of FsHistoryProvider
|
|
|
|
|
|
|
|
|
|
| |
also rename "slaveHostname" to "executorHostname"
Author: Ryan Williams <ryan.blake.williams@gmail.com>
Closes #4195 from ryan-williams/exec and squashes the following commits:
e60a7bb [Ryan Williams] log executor ID at executor-construction time
|
|
|
|
|
|
|
|
| |
Author: Ryan Williams <ryan.blake.williams@gmail.com>
Closes #4194 from ryan-williams/metrics and squashes the following commits:
7c5a33f [Ryan Williams] set executor ID before creating MetricsSystem
|
|
|
|
|
|
|
|
|
|
| |
Added a comment about using math.min for choosing default partition count
Author: Idan Zalzberg <idanzalz@gmail.com>
Closes #4102 from idanz/patch-2 and squashes the following commits:
50e9d58 [Idan Zalzberg] Update SparkContext.scala
|
|
|
|
|
|
|
|
|
|
|
| |
event thread
Author: zsxwing <zsxwing@gmail.com>
Closes #4174 from zsxwing/SPARK-5214-unittest and squashes the following commits:
443e564 [zsxwing] Change the check interval to 5ms
7aaa2d7 [zsxwing] Add a test to demonstrate EventLoop can be stopped in the event thread
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds more helpful error messages for invalid programs that define nested RDDs, broadcast RDDs, perform actions inside of transformations (e.g. calling `count()` from inside of `map()`), and call certain methods on stopped SparkContexts. Currently, these invalid programs lead to confusing NullPointerExceptions at runtime and have been a major source of questions on the mailing list and StackOverflow.
In a few cases, I chose to log warnings instead of throwing exceptions in order to avoid any chance that this patch breaks programs that worked "by accident" in earlier Spark releases (e.g. programs that define nested RDDs but never run any jobs with them).
In SparkContext, the new `assertNotStopped()` method is used to check whether methods are being invoked on a stopped SparkContext. In some cases, user programs will not crash in spite of calling methods on stopped SparkContexts, so I've only added `assertNotStopped()` calls to methods that always throw exceptions when called on stopped contexts (e.g. by dereferencing a null `dagScheduler` pointer).
Author: Josh Rosen <joshrosen@databricks.com>
Closes #3884 from JoshRosen/SPARK-5063 and squashes the following commits:
a38774b [Josh Rosen] Fix spelling typo
a943e00 [Josh Rosen] Convert two exceptions into warnings in order to avoid breaking user programs in some edge-cases.
2d0d7f7 [Josh Rosen] Fix test to reflect 1.2.1 compatibility
3f0ea0c [Josh Rosen] Revert two unintentional formatting changes
8e5da69 [Josh Rosen] Remove assertNotStopped() calls for methods that were sometimes safe to call on stopped SC's in Spark 1.2
8cff41a [Josh Rosen] IllegalStateException fix
6ef68d0 [Josh Rosen] Fix Python line length issues.
9f6a0b8 [Josh Rosen] Add improved error messages to PySpark.
13afd0f [Josh Rosen] SparkException -> IllegalStateException
8d404f3 [Josh Rosen] Merge remote-tracking branch 'origin/master' into SPARK-5063
b39e041 [Josh Rosen] Fix BroadcastSuite test which broadcasted an RDD
99cc09f [Josh Rosen] Guard against calling methods on stopped SparkContexts.
34833e8 [Josh Rosen] Add more descriptive error message.
57cc8a1 [Josh Rosen] Add error message when directly broadcasting RDD.
15b2e6b [Josh Rosen] [SPARK-5063] Useful error messages for nested RDDs and actions inside of transformations
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The SparkConf is not thread-safe, but is accessed by many threads. The getAll() could return parts of the configs if another thread is access it.
This PR changes SparkConf.settings to a thread-safe TrieMap.
Author: Davies Liu <davies@databricks.com>
Closes #4143 from davies/safe-conf and squashes the following commits:
f8fa1cf [Davies Liu] change to TrieMap
a1d769a [Davies Liu] make SparkConf thread-safe
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
description when it is very long
In some case the job description will be very long, such as a long sql. refer to #3718
This PR add a pop-up for job description when it is long.
![image](https://cloud.githubusercontent.com/assets/7018048/5847400/c757cbbc-a207-11e4-891f-528821c2e68d.png)
![image](https://cloud.githubusercontent.com/assets/7018048/5847409/d434b2b4-a207-11e4-8813-03a74b43d766.png)
Author: wangfei <wangfei1@huawei.com>
Closes #3819 from scwf/popup-descrip-ui and squashes the following commits:
ba02b83 [wangfei] address comments
a7c5e7b [wangfei] spot that it's been truncated
fbf6162 [wangfei] Merge branch 'master' into popup-descrip-ui
0bca96d [wangfei] remove no use val
4b55c3b [wangfei] fix style issue
353c6f4 [wangfei] pop up the description of job with a styled read-only text form field
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
...llocator
The goal of this PR is to simplify YarnAllocator as much as possible and get it up to the level of code quality we see in the rest of Spark.
In service of this, it does a few things:
* Uses AMRMClient APIs for matching containers to requests.
* Adds calls to AMRMClient.removeContainerRequest so that, when we use a container, we don't end up requesting it again.
* Removes YarnAllocator's host->rack cache. YARN's RackResolver already does this caching, so this is redundant.
* Adds tests for basic YarnAllocator functionality.
* Breaks up the allocateResources method, which was previously nearly 300 lines.
* A little bit of stylistic cleanup.
* Fixes a bug that causes three times the requests to be filed when preferred host locations are given.
The patch is lossy. In particular, it loses the logic for trying to avoid containers bunching up on nodes. As I understand it, the logic that's gone is:
* If, in a single response from the RM, we receive a set of containers on a node, and prefer some number of containers on that node greater than 0 but less than the number we received, give back the delta between what we preferred and what we received.
This seems like a weird way to avoid bunching E.g. it does nothing to avoid bunching when we don't request containers on particular nodes.
Author: Sandy Ryza <sandy@cloudera.com>
Closes #3765 from sryza/sandy-spark-1714 and squashes the following commits:
32a5942 [Sandy Ryza] Muffle RackResolver logs
74f56dd [Sandy Ryza] Fix a couple comments and simplify requestTotalExecutors
60ea4bd [Sandy Ryza] Fix scalastyle
ca35b53 [Sandy Ryza] Simplify further
e9cf8a6 [Sandy Ryza] Fix YarnClusterSuite
257acf3 [Sandy Ryza] Remove locality stuff and more cleanup
59a3c5e [Sandy Ryza] Take out rack stuff
5f72fd5 [Sandy Ryza] Further documentation and cleanup
89edd68 [Sandy Ryza] SPARK-1714. Take advantage of AMRMClient APIs to simplify logic in YarnAllocator
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
https://issues.apache.org/jira/browse/SPARK-5336
Author: WangTao <barneystinson@aliyun.com>
Author: WangTaoTheTonic <barneystinson@aliyun.com>
Closes #4123 from WangTaoTheTonic/SPARK-5336 and squashes the following commits:
6c9676a [WangTao] Update ClientArguments.scala
9632d3a [WangTaoTheTonic] minor comment fix
d03d6fa [WangTaoTheTonic] import ordering should be alphabetical'
3112af9 [WangTao] spark.executor.cores must not be less than spark.task.cpus
|
|
|
|
|
|
|
|
|
|
|
|
| |
Completed Stages and Failed Stages" when they are empty
Related to SPARK-5228 and #4028, `AllStagesPage` also should hide the table for `ActiveStages`, `CompleteStages` and `FailedStages` when they are empty.
Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>
Closes #4083 from sarutak/SPARK-5294 and squashes the following commits:
a7625c1 [Kousuke Saruta] Fixed conflicts
|
|
|
|
|
|
|
|
|
|
|
|
| |
UIWorkloadGenerator don't stop SparkContext. I ran UIWorkloadGenerator and try to watch the result at WebUI but Jobs are marked as finished.
It's because SparkContext is not stopped.
Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>
Closes #4112 from sarutak/SPARK-5329 and squashes the following commits:
bcc0fa9 [Kousuke Saruta] Disabled scalastyle for a bock comment
86a3b95 [Kousuke Saruta] Fixed UIWorkloadGenerator to stop SparkContext in it
|
|
|
|
|
|
|
|
|
|
| |
... by Piotr Kolaczkowski)
Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>
Closes #4113 from jacek-lewandowski/SPARK-4660-master and squashes the following commits:
a5e84ca [Jacek Lewandowski] SPARK-4660: Use correct class loader in JavaSerializer (copy of PR #3840 by Piotr Kolaczkowski)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Rewind ByteBuffer before making ByteString
(This fixes a bug introduced in #3849 / SPARK-4014)
Author: Jongyoul Lee <jongyoul@gmail.com>
Closes #4119 from jongyoul/SPARK-5333 and squashes the following commits:
c6693a8 [Jongyoul Lee] [SPARK-5333][Mesos] MesosTaskLaunchData occurs BufferUnderflowException - changed logDebug location
4141f58 [Jongyoul Lee] [SPARK-5333][Mesos] MesosTaskLaunchData occurs BufferUnderflowException - Added license information
2190606 [Jongyoul Lee] [SPARK-5333][Mesos] MesosTaskLaunchData occurs BufferUnderflowException - Adjusted imported libraries
b7f5517 [Jongyoul Lee] [SPARK-5333][Mesos] MesosTaskLaunchData occurs BufferUnderflowException - Rewind ByteBuffer before making ByteString
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Pretty minor, but submitted for consideration -- this would at least help people make this check in the most efficient way I know.
Author: Sean Owen <sowen@cloudera.com>
Closes #4074 from srowen/SPARK-5270 and squashes the following commits:
66885b8 [Sean Owen] Add note that JavaRDDLike should not be implemented by user code
2e9b490 [Sean Owen] More tests, and Mima-exclude the new isEmpty method in JavaRDDLike
28395ff [Sean Owen] Add isEmpty to Java, Python
7dd04b7 [Sean Owen] Add efficient RDD.isEmpty()
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This PR adds a simple `EventLoop` and use it to replace Actor in DAGScheduler. `EventLoop` is a general class to support that posting events in multiple threads and handling events in a single event thread.
Author: zsxwing <zsxwing@gmail.com>
Closes #4016 from zsxwing/event-loop and squashes the following commits:
aefa1ce [zsxwing] Add protected to on*** methods
5cfac83 [zsxwing] Remove null check of eventProcessLoop
dba35b2 [zsxwing] Add a test that onReceive swallows InterruptException
460f7b3 [zsxwing] Use volatile instead of Atomic things in unit tests
227bf33 [zsxwing] Add a stop flag and some tests
37f79c6 [zsxwing] Fix docs
55fb6f6 [zsxwing] Add private[spark] to EventLoop
1f73eac [zsxwing] Fix the import order
3b2e59c [zsxwing] Add EventLoop and change DAGScheduler to an EventLoop
|
|
|
|
|
|
|
|
|
|
|
|
| |
Author: Jongyoul Lee <jongyoul@gmail.com>
Closes #3897 from jongyoul/SPARK-5088 and squashes the following commits:
8232aa8 [Jongyoul Lee] [SPARK-5088] Use spark-class for running executors directly - Added a listenerBus for fixing test cases
932289f [Jongyoul Lee] [SPARK-5088] Use spark-class for running executors directly - Rebased from master
613cb47 [Jongyoul Lee] [SPARK-5088] Use spark-class for running executors directly - Fixed code if spark.executor.uri doesn't have any value - Added test cases
ff57bda [Jongyoul Lee] [SPARK-5088] Use spark-class for running executors directly - Adjusted orders of import
97e4bd4 [Jongyoul Lee] [SPARK-5088] Use spark-class for running executors directly - Changed command for using spark-class directly - Delete sbin/spark-executor and moved some codes into spark-class' case statement
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I've updated the fields and all usages of these fields in the Spark code. I've verified that this did not break anything on my local repo.
Author: Ilya Ganelin <ilya.ganelin@capitalone.com>
Closes #4020 from ilganeli/SPARK-3288 and squashes the following commits:
39f3810 [Ilya Ganelin] resolved merge issues
e446287 [Ilya Ganelin] Merge remote-tracking branch 'upstream/master' into SPARK-3288
b8c05cb [Ilya Ganelin] Missed making a variable private
6444391 [Ilya Ganelin] Made inc/dec functions private[spark]
1149e78 [Ilya Ganelin] Merge remote-tracking branch 'upstream/master' into SPARK-3288
26b312b [Ilya Ganelin] Debugging tests
17146c2 [Ilya Ganelin] Merge remote-tracking branch 'upstream/master' into SPARK-3288
5525c20 [Ilya Ganelin] Completed refactoring to make vars in TaskMetrics class private
c64da4f [Ilya Ganelin] Partially updated task metrics to make some vars private
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
AllStagesPage.
![screenshot from 2015-01-16 13 43 25](https://cloud.githubusercontent.com/assets/992952/5773256/d61df300-9d85-11e4-9b5a-6730058839fa.png)
This is a first step towards having time remaining estimates for queued and running jobs. See SPARK-5216
Author: Prashant Sharma <prashant.s@imaginea.com>
Closes #4043 from ScrapCodes/SPARK-5216/5217-show-waiting-stages and squashes the following commits:
3b11803 [Prashant Sharma] Review feedback.
0992842 [Prashant Sharma] Switched to Linked hashmap, changed the order to active->pending->completed->failed. And changed pending stages to not reverse sort.
c19d82a [Prashant Sharma] SPARK-5217 Spark UI should report pending stages during job execution on AllStagesPage.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we decrease the width of browsers, the header of WebUI wraps and collapses like as following image.
![2015-01-11 19 49 37](https://cloud.githubusercontent.com/assets/4736016/5698887/b0b9aeee-99cd-11e4-9020-08f3f0014de0.png)
Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>
Closes #3995 from sarutak/fixed-collapse-webui-layout and squashes the following commits:
3e60b5b [Kousuke Saruta] Modified line-height property in webui.css
7bfb5fb [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into fixed-collapse-webui-layout
5d83e18 [Kousuke Saruta] Fixed collapse of WebUI layout
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
History Server doesn't show collect job submission time.
It's because `JobProgressListener` updates job submission time every time `onJobStart` method is invoked from `ReplayListenerBus`.
Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>
Closes #4029 from sarutak/SPARK-5231 and squashes the following commits:
0af9e22 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-5231
da8bd14 [Kousuke Saruta] Made submissionTime in SparkListenerJobStartas and completionTime in SparkListenerJobEnd as regular Long
0412a6a [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-5231
26b9b99 [Kousuke Saruta] Fixed the test cases
2d47bd3 [Kousuke Saruta] Fixed to record job submission time and completion time collectly
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
method
There is an int overflow in the ParallelCollectionRDD.slice method. That's originally reported by SaintBacchus.
```
sc.makeRDD(1 to (Int.MaxValue)).count // result = 0
sc.makeRDD(1 to (Int.MaxValue - 1)).count // result = 2147483646 = Int.MaxValue - 1
sc.makeRDD(1 until (Int.MaxValue)).count // result = 2147483646 = Int.MaxValue - 1
```
see https://github.com/apache/spark/pull/2874 for more details.
This pr try to fix the overflow. However, There's another issue I don't address.
```
val largeRange = Int.MinValue to Int.MaxValue
largeRange.length // throws java.lang.IllegalArgumentException: -2147483648 to 2147483647 by 1: seqs cannot contain more than Int.MaxValue elements.
```
So, the range we feed to sc.makeRDD cannot contain more than Int.MaxValue elements. This is the limitation of Scala. However I think we may want to support that kind of range. But the fix is beyond this pr.
srowen andrewor14 would you mind take a look at this pr?
Author: Ye Xianjin <advancedxy@gmail.com>
Closes #4002 from advancedxy/SPARk-5201 and squashes the following commits:
96265a1 [Ye Xianjin] Update slice method comment and some responding docs.
e143d7a [Ye Xianjin] Update inclusive range check for splitting inclusive range.
b3f5577 [Ye Xianjin] We can include the last element in the last slice in general for inclusive range, hence eliminate the need to check Int.MaxValue or Int.MinValue.
7d39b9e [Ye Xianjin] Convert the two cases pattern matching to one case.
651c959 [Ye Xianjin] rename sign to needsInclusiveRange. add some comments
196f8a8 [Ye Xianjin] Add test cases for ranges end with Int.MaxValue or Int.MinValue
e66e60a [Ye Xianjin] Deal with inclusive and exclusive ranges in one case. If the range is inclusive and the end of the range is (Int.MaxValue or Int.MinValue), we should use inclusive range instead of exclusive
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Based on top of changes in https://github.com/apache/spark/pull/3806.
https://issues.apache.org/jira/browse/SPARK-1507
`--driver-cores` and `spark.driver.cores` for all cluster modes and `spark.yarn.am.cores` for yarn client mode.
Author: WangTaoTheTonic <barneystinson@aliyun.com>
Author: WangTao <barneystinson@aliyun.com>
Closes #4018 from WangTaoTheTonic/SPARK-1507 and squashes the following commits:
01419d3 [WangTaoTheTonic] amend the args name
b255795 [WangTaoTheTonic] indet thing
d86557c [WangTaoTheTonic] some comments amend
43c9392 [WangTao] fix compile error
b39a100 [WangTao] specify # cores for ApplicationMaster
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When calculating the input metrics there was an assumption that one task only reads from one block - this is not true for some operations including coalesce. This patch simply increments the task's input metrics if previous ones existed of the same read method.
A limitation to this patch is that if a task reads from two different blocks of different read methods, one will override the other.
Author: Kostas Sakellis <kostas@cloudera.com>
Closes #3120 from ksakellis/kostas-spark-4092 and squashes the following commits:
54e6658 [Kostas Sakellis] Drops metrics if conflicting read methods exist
f0e0cc5 [Kostas Sakellis] Add bytesReadCallback to InputMetrics
a2a36d4 [Kostas Sakellis] CR feedback
5a0c770 [Kostas Sakellis] [SPARK-4092] [CORE] Fix InputMetrics for coalesce'd Rdds
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adds onExecutorAdded and onExecutorRemoved events to the SparkListener. This will allow a client to get notified when an executor has been added/removed and provide additional information such as how many vcores it is consuming.
In addition, this commit adds a SparkListenerAdapter to the Java API that provides default implementations to the SparkListener. This is to get around the fact that default implementations for traits don't work in Java. Having Java clients extend SparkListenerAdapter moving forward will prevent breakage in java when we add new events to SparkListener.
Author: Kostas Sakellis <kostas@cloudera.com>
Closes #3711 from ksakellis/kostas-spark-4857 and squashes the following commits:
946d2c5 [Kostas Sakellis] Added executorAdded/Removed events to MesosSchedulerBackend
b1d054a [Kostas Sakellis] Remove executorInfo from ExecutorRemoved event
1727b38 [Kostas Sakellis] Renamed ExecutorDetails back to ExecutorInfo and other CR feedback
14fe78d [Kostas Sakellis] Added executor added/removed events to json protocol
93d087b [Kostas Sakellis] [SPARK-4857] [CORE] Adds Executor membership events to SparkListener
|
|
|
|
|
|
|
|
|
|
| |
In BlockManager, there is a word `BlockTranserService` but I think it's typo for `BlockTransferService`.
Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>
Closes #4046 from sarutak/fix-tiny-typo and squashes the following commits:
a3e2a2f [Kousuke Saruta] Fixed tiny typo in BlockManager
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
`TaskContext.attemptId` is misleadingly-named, since it currently returns a taskId, which uniquely identifies a particular task attempt within a particular SparkContext, instead of an attempt number, which conveys how many times a task has been attempted.
This patch deprecates `TaskContext.attemptId` and add `TaskContext.taskId` and `TaskContext.attemptNumber` fields. Prior to this change, it was impossible to determine whether a task was being re-attempted (or was a speculative copy), which made it difficult to write unit tests for tasks that fail on early attempts or speculative tasks that complete faster than original tasks.
Earlier versions of the TaskContext docs suggest that `attemptId` behaves like `attemptNumber`, so there's an argument to be made in favor of changing this method's implementation. Since we've decided against making that change in maintenance branches, I think it's simpler to add better-named methods and retain the old behavior for `attemptId`; if `attemptId` behaved differently in different branches, then this would cause confusing build-breaks when backporting regression tests that rely on the new `attemptId` behavior.
Most of this patch is fairly straightforward, but there is a bit of trickiness related to Mesos tasks: since there's no field in MesosTaskInfo to encode the attemptId, I packed it into the `data` field alongside the task binary.
Author: Josh Rosen <joshrosen@databricks.com>
Closes #3849 from JoshRosen/SPARK-4014 and squashes the following commits:
89d03e0 [Josh Rosen] Merge remote-tracking branch 'origin/master' into SPARK-4014
5cfff05 [Josh Rosen] Introduce wrapper for serializing Mesos task launch data.
38574d4 [Josh Rosen] attemptId -> taskAttemptId in PairRDDFunctions
a180b88 [Josh Rosen] Merge remote-tracking branch 'origin/master' into SPARK-4014
1d43aa6 [Josh Rosen] Merge remote-tracking branch 'origin/master' into SPARK-4014
eee6a45 [Josh Rosen] Merge remote-tracking branch 'origin/master' into SPARK-4014
0b10526 [Josh Rosen] Use putInt instead of putLong (silly mistake)
8c387ce [Josh Rosen] Use local with maxRetries instead of local-cluster.
cbe4d76 [Josh Rosen] Preserve attemptId behavior and deprecate it:
b2dffa3 [Josh Rosen] Address some of Reynold's minor comments
9d8d4d1 [Josh Rosen] Doc typo
1e7a933 [Josh Rosen] [SPARK-4014] Change TaskContext.attemptId to return attempt number instead of task ID.
fd515a5 [Josh Rosen] Add failing test for SPARK-4014
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
when they are empty
In current WebUI, tables for Active Stages, Completed Stages, Skipped Stages and Failed Stages are hidden when they are empty while tables for Active Jobs, Completed Jobs and Failed Jobs are not hidden though they are empty.
This is before my patch is applied.
![2015-01-13 14 13 03](https://cloud.githubusercontent.com/assets/4736016/5730793/2b73d6f4-9b32-11e4-9a24-1784d758c644.png)
And this is after my patch is applied.
![2015-01-13 14 38 13](https://cloud.githubusercontent.com/assets/4736016/5730797/359ea2da-9b32-11e4-97b0-544739ddbf4c.png)
Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>
Closes #4028 from sarutak/SPARK-5228 and squashes the following commits:
b1e6e8b [Kousuke Saruta] Fixed a small typo
daab563 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-5228
9493a1d [Kousuke Saruta] Modified AllJobPage.scala so that hide Active Jobs/Completed Jobs/Failed Jobs when they are empty
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I found some arguments in yarn module take environment variables before system properties while the latter override the former in core module.
Author: WangTaoTheTonic <barneystinson@aliyun.com>
Author: WangTao <barneystinson@aliyun.com>
Closes #3557 from WangTaoTheTonic/SPARK4697 and squashes the following commits:
836b9ef [WangTaoTheTonic] fix type mismatch
e3e486a [WangTaoTheTonic] remove the comma
1262d57 [WangTaoTheTonic] handle spark.app.name and SPARK_YARN_APP_NAME in SparkSubmitArguments
bee9447 [WangTaoTheTonic] wrong brace
81833bb [WangTaoTheTonic] rebase
40934b4 [WangTaoTheTonic] just switch blocks
5f43f45 [WangTao] System property can override environment variable
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
https://issues.apache.org/jira/browse/SPARK-5006
I think the issue is produced in https://github.com/apache/spark/pull/1777.
Not digging mesos's backend yet. Maybe should add same logic either.
Author: WangTaoTheTonic <barneystinson@aliyun.com>
Author: WangTao <barneystinson@aliyun.com>
Closes #3841 from WangTaoTheTonic/SPARK-5006 and squashes the following commits:
8cdf96d [WangTao] indent thing
2d86d65 [WangTaoTheTonic] fix line length
7cdfd98 [WangTaoTheTonic] fit for new HttpServer constructor
61a370d [WangTaoTheTonic] some minor fixes
bc6e1ec [WangTaoTheTonic] rebase
67bcb46 [WangTaoTheTonic] put conf at 3rd position, modify suite class, add comments
f450cd1 [WangTaoTheTonic] startServiceOnPort will use a SparkConf arg
29b751b [WangTaoTheTonic] rebase as ExecutorRunnableUtil changed to ExecutorRunnable
396c226 [WangTaoTheTonic] make the grammar more like scala
191face [WangTaoTheTonic] invalid value name
62ec336 [WangTaoTheTonic] spark.port.maxRetries doesn't work
|