| Commit message (Collapse) | Author | Age | Files | Lines |
| |
...llocator
The goal of this PR is to simplify YarnAllocator as much as possible and get it up to the level of code quality we see in the rest of Spark.
In service of this, it does a few things:
* Uses AMRMClient APIs for matching containers to requests.
* Adds calls to AMRMClient.removeContainerRequest so that, when we use a container, we don't end up requesting it again.
* Removes YarnAllocator's host->rack cache. YARN's RackResolver already does this caching, so this is redundant.
* Adds tests for basic YarnAllocator functionality.
* Breaks up the allocateResources method, which was previously nearly 300 lines.
* A little bit of stylistic cleanup.
* Fixes a bug that causes three times the requests to be filed when preferred host locations are given.
The patch is lossy. In particular, it loses the logic for trying to avoid containers bunching up on nodes. As I understand it, the logic that's gone is:
* If, in a single response from the RM, we receive a set of containers on a node, and prefer some number of containers on that node greater than 0 but less than the number we received, give back the delta between what we preferred and what we received.
This seems like a weird way to avoid bunching. E.g., it does nothing to avoid bunching when we don't request containers on particular nodes.
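As a rough illustration of the removed rule as described above (this is not the original YarnAllocator code; `BunchingSketch` and `containersToRelease` are hypothetical names), the per-node calculation amounts to something like:

```scala
object BunchingSketch {
  /**
   * Number of containers to hand back for one node in one RM response,
   * following the rule described above: only when we preferred some
   * containers on the node (> 0) but received more than we preferred.
   */
  def containersToRelease(received: Int, preferred: Int): Int =
    if (preferred > 0 && preferred < received) received - preferred else 0
}
```

Note that a node with no stated preference (`preferred == 0`) never gives anything back, which is the gap pointed out above.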
Author: Sandy Ryza <sandy@cloudera.com>
Closes #3765 from sryza/sandy-spark-1714 and squashes the following commits:
32a5942 [Sandy Ryza] Muffle RackResolver logs
74f56dd [Sandy Ryza] Fix a couple comments and simplify requestTotalExecutors
60ea4bd [Sandy Ryza] Fix scalastyle
ca35b53 [Sandy Ryza] Simplify further
e9cf8a6 [Sandy Ryza] Fix YarnClusterSuite
257acf3 [Sandy Ryza] Remove locality stuff and more cleanup
59a3c5e [Sandy Ryza] Take out rack stuff
5f72fd5 [Sandy Ryza] Further documentation and cleanup
89edd68 [Sandy Ryza] SPARK-1714. Take advantage of AMRMClient APIs to simplify logic in YarnAllocator
| |
https://issues.apache.org/jira/browse/SPARK-5336
Author: WangTao <barneystinson@aliyun.com>
Author: WangTaoTheTonic <barneystinson@aliyun.com>
Closes #4123 from WangTaoTheTonic/SPARK-5336 and squashes the following commits:
6c9676a [WangTao] Update ClientArguments.scala
9632d3a [WangTaoTheTonic] minor comment fix
d03d6fa [WangTaoTheTonic] import ordering should be alphabetical'
3112af9 [WangTao] spark.executor.cores must not be less than spark.task.cpus
| |
Completed Stages and Failed Stages" when they are empty
Related to SPARK-5228 and #4028, `AllStagesPage` also should hide the table for `ActiveStages`, `CompleteStages` and `FailedStages` when they are empty.
Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>
Closes #4083 from sarutak/SPARK-5294 and squashes the following commits:
a7625c1 [Kousuke Saruta] Fixed conflicts
| |
UIWorkloadGenerator doesn't stop SparkContext. I ran UIWorkloadGenerator and tried to watch the result in the WebUI, but jobs are marked as finished.
This is because SparkContext is not stopped.
Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>
Closes #4112 from sarutak/SPARK-5329 and squashes the following commits:
bcc0fa9 [Kousuke Saruta] Disabled scalastyle for a bock comment
86a3b95 [Kousuke Saruta] Fixed UIWorkloadGenerator to stop SparkContext in it
| |
... by Piotr Kolaczkowski)
Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>
Closes #4113 from jacek-lewandowski/SPARK-4660-master and squashes the following commits:
a5e84ca [Jacek Lewandowski] SPARK-4660: Use correct class loader in JavaSerializer (copy of PR #3840 by Piotr Kolaczkowski)
| |
- Rewind ByteBuffer before making ByteString
(This fixes a bug introduced in #3849 / SPARK-4014)
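A minimal sketch of why the rewind matters, using only `java.nio.ByteBuffer` (the `RewindSketch` object and its helpers are made up for illustration; the actual change builds a `ByteString`, which is not shown here). After writing, the buffer's position sits at the end, so a relative read underflows until the buffer is rewound:

```scala
import java.nio.{BufferUnderflowException, ByteBuffer}

object RewindSketch {
  /** Returns a buffer that has just been written to; its position is at the end. */
  def filled(): ByteBuffer = {
    val buf = ByteBuffer.allocate(8)
    buf.putInt(7)   // illustrative header value
    buf.putInt(42)  // illustrative payload
    buf
  }

  /** Relative read from the current position; None if nothing remains to read. */
  def readFirstInt(buf: ByteBuffer): Option[Int] =
    try Some(buf.getInt())
    catch { case _: BufferUnderflowException => None }
}
```

Calling `rewind()` on the buffer resets its position to 0, after which `readFirstInt` sees the data that was written.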
Author: Jongyoul Lee <jongyoul@gmail.com>
Closes #4119 from jongyoul/SPARK-5333 and squashes the following commits:
c6693a8 [Jongyoul Lee] [SPARK-5333][Mesos] MesosTaskLaunchData occurs BufferUnderflowException - changed logDebug location
4141f58 [Jongyoul Lee] [SPARK-5333][Mesos] MesosTaskLaunchData occurs BufferUnderflowException - Added license information
2190606 [Jongyoul Lee] [SPARK-5333][Mesos] MesosTaskLaunchData occurs BufferUnderflowException - Adjusted imported libraries
b7f5517 [Jongyoul Lee] [SPARK-5333][Mesos] MesosTaskLaunchData occurs BufferUnderflowException - Rewind ByteBuffer before making ByteString
| |
Pretty minor, but submitted for consideration -- this would at least help people make this check in the most efficient way I know.
Author: Sean Owen <sowen@cloudera.com>
Closes #4074 from srowen/SPARK-5270 and squashes the following commits:
66885b8 [Sean Owen] Add note that JavaRDDLike should not be implemented by user code
2e9b490 [Sean Owen] More tests, and Mima-exclude the new isEmpty method in JavaRDDLike
28395ff [Sean Owen] Add isEmpty to Java, Python
7dd04b7 [Sean Owen] Add efficient RDD.isEmpty()
| |
This PR adds a simple `EventLoop` and uses it to replace the Actor in DAGScheduler. `EventLoop` is a general class that supports posting events from multiple threads and handling them in a single event thread.
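A minimal sketch of the idea, assuming a blocking queue drained by one daemon thread (`SimpleEventLoop` is a hypothetical name and this is not the code added by the PR):

```scala
import java.util.concurrent.{BlockingQueue, LinkedBlockingDeque}
import java.util.concurrent.atomic.AtomicBoolean
import scala.util.control.NonFatal

abstract class SimpleEventLoop[E](name: String) {
  private val queue: BlockingQueue[E] = new LinkedBlockingDeque[E]()
  private val stopped = new AtomicBoolean(false)

  // The single thread on which every event is handled.
  private val thread: Thread = new Thread(name) {
    setDaemon(true)
    override def run(): Unit = {
      try {
        while (!stopped.get) {
          val event = queue.take()
          try onReceive(event)
          catch { case NonFatal(e) => onError(e) }
        }
      } catch {
        case _: InterruptedException => // stop() interrupts a blocked take(); exit quietly
      }
    }
  }

  def start(): Unit = thread.start()

  def stop(): Unit =
    if (stopped.compareAndSet(false, true)) {
      thread.interrupt()
      // Avoid joining ourselves if stop() is called from a handler.
      if (Thread.currentThread() ne thread) thread.join()
    }

  /** Safe to call from any thread; the event is handled later on the event thread. */
  def post(event: E): Unit = queue.put(event)

  protected def onReceive(event: E): Unit
  protected def onError(e: Throwable): Unit
}
```

Because handlers only ever run on the one event thread, state touched exclusively inside `onReceive` needs no extra locking in this sketch.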
Author: zsxwing <zsxwing@gmail.com>
Closes #4016 from zsxwing/event-loop and squashes the following commits:
aefa1ce [zsxwing] Add protected to on*** methods
5cfac83 [zsxwing] Remove null check of eventProcessLoop
dba35b2 [zsxwing] Add a test that onReceive swallows InterruptException
460f7b3 [zsxwing] Use volatile instead of Atomic things in unit tests
227bf33 [zsxwing] Add a stop flag and some tests
37f79c6 [zsxwing] Fix docs
55fb6f6 [zsxwing] Add private[spark] to EventLoop
1f73eac [zsxwing] Fix the import order
3b2e59c [zsxwing] Add EventLoop and change DAGScheduler to an EventLoop
| |
Author: Jongyoul Lee <jongyoul@gmail.com>
Closes #3897 from jongyoul/SPARK-5088 and squashes the following commits:
8232aa8 [Jongyoul Lee] [SPARK-5088] Use spark-class for running executors directly - Added a listenerBus for fixing test cases
932289f [Jongyoul Lee] [SPARK-5088] Use spark-class for running executors directly - Rebased from master
613cb47 [Jongyoul Lee] [SPARK-5088] Use spark-class for running executors directly - Fixed code if spark.executor.uri doesn't have any value - Added test cases
ff57bda [Jongyoul Lee] [SPARK-5088] Use spark-class for running executors directly - Adjusted orders of import
97e4bd4 [Jongyoul Lee] [SPARK-5088] Use spark-class for running executors directly - Changed command for using spark-class directly - Delete sbin/spark-executor and moved some codes into spark-class' case statement
| |
I've updated the fields and all usages of these fields in the Spark code. I've verified that this did not break anything on my local repo.
Author: Ilya Ganelin <ilya.ganelin@capitalone.com>
Closes #4020 from ilganeli/SPARK-3288 and squashes the following commits:
39f3810 [Ilya Ganelin] resolved merge issues
e446287 [Ilya Ganelin] Merge remote-tracking branch 'upstream/master' into SPARK-3288
b8c05cb [Ilya Ganelin] Missed making a variable private
6444391 [Ilya Ganelin] Made inc/dec functions private[spark]
1149e78 [Ilya Ganelin] Merge remote-tracking branch 'upstream/master' into SPARK-3288
26b312b [Ilya Ganelin] Debugging tests
17146c2 [Ilya Ganelin] Merge remote-tracking branch 'upstream/master' into SPARK-3288
5525c20 [Ilya Ganelin] Completed refactoring to make vars in TaskMetrics class private
c64da4f [Ilya Ganelin] Partially updated task metrics to make some vars private
| |
AllStagesPage.
![screenshot from 2015-01-16 13 43 25](https://cloud.githubusercontent.com/assets/992952/5773256/d61df300-9d85-11e4-9b5a-6730058839fa.png)
This is a first step towards having time remaining estimates for queued and running jobs. See SPARK-5216
Author: Prashant Sharma <prashant.s@imaginea.com>
Closes #4043 from ScrapCodes/SPARK-5216/5217-show-waiting-stages and squashes the following commits:
3b11803 [Prashant Sharma] Review feedback.
0992842 [Prashant Sharma] Switched to Linked hashmap, changed the order to active->pending->completed->failed. And changed pending stages to not reverse sort.
c19d82a [Prashant Sharma] SPARK-5217 Spark UI should report pending stages during job execution on AllStagesPage.
| |
When we decrease the width of the browser window, the header of the WebUI wraps and collapses, as in the following image.
![2015-01-11 19 49 37](https://cloud.githubusercontent.com/assets/4736016/5698887/b0b9aeee-99cd-11e4-9020-08f3f0014de0.png)
Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>
Closes #3995 from sarutak/fixed-collapse-webui-layout and squashes the following commits:
3e60b5b [Kousuke Saruta] Modified line-height property in webui.css
7bfb5fb [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into fixed-collapse-webui-layout
5d83e18 [Kousuke Saruta] Fixed collapse of WebUI layout
| |
History Server doesn't show the correct job submission time.
This is because `JobProgressListener` updates the job submission time every time the `onJobStart` method is invoked from `ReplayListenerBus`.
Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>
Closes #4029 from sarutak/SPARK-5231 and squashes the following commits:
0af9e22 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-5231
da8bd14 [Kousuke Saruta] Made submissionTime in SparkListenerJobStartas and completionTime in SparkListenerJobEnd as regular Long
0412a6a [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-5231
26b9b99 [Kousuke Saruta] Fixed the test cases
2d47bd3 [Kousuke Saruta] Fixed to record job submission time and completion time collectly
| |
method
There is an int overflow in the ParallelCollectionRDD.slice method. It was originally reported by SaintBacchus.
```
sc.makeRDD(1 to (Int.MaxValue)).count // result = 0
sc.makeRDD(1 to (Int.MaxValue - 1)).count // result = 2147483646 = Int.MaxValue - 1
sc.makeRDD(1 until (Int.MaxValue)).count // result = 2147483646 = Int.MaxValue - 1
```
See https://github.com/apache/spark/pull/2874 for more details.
This PR tries to fix the overflow. However, there is another issue that I don't address.
```
val largeRange = Int.MinValue to Int.MaxValue
largeRange.length // throws java.lang.IllegalArgumentException: -2147483648 to 2147483647 by 1: seqs cannot contain more than Int.MaxValue elements.
```
So the range we feed to sc.makeRDD cannot contain more than Int.MaxValue elements. This is a limitation of Scala. I think we may want to support that kind of range, but the fix is beyond the scope of this PR.
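The overflow itself can be reproduced without Spark: adding 1 to an end of `Int.MaxValue` wraps around to `Int.MinValue`, so an exclusive range built that way is empty. The sketch below is a simplified stand-in for the slicing logic, not the actual `ParallelCollectionRDD.slice` code (`SliceSketch`, `positions` and `slice` are illustrative names). It computes offsets in `Long` and keeps the last slice inclusive when the input range is inclusive, in the spirit of the squashed commits listed below:

```scala
object SliceSketch {
  /** Start/end offsets of each slice, computed in Long so i * length cannot overflow. */
  def positions(length: Long, numSlices: Int): Seq[(Int, Int)] =
    (0 until numSlices).map { i =>
      val start = ((i * length) / numSlices).toInt
      val end = (((i + 1) * length) / numSlices).toInt
      (start, end)
    }

  /** Splits a range into numSlices sub-ranges without ever computing r.end + 1. */
  def slice(r: Range, numSlices: Int): Seq[Range] =
    positions(r.length, numSlices).zipWithIndex.map { case ((start, end), index) =>
      if (r.isInclusive && index == numSlices - 1) {
        // The last slice of an inclusive range keeps the original inclusive end.
        Range.inclusive(r.start + start * r.step, r.end, r.step)
      } else {
        Range(r.start + start * r.step, r.start + end * r.step, r.step)
      }
    }
}
```

For example, `SliceSketch.slice(1 to Int.MaxValue, 2)` yields two non-empty ranges whose lengths add up to `Int.MaxValue`. The sketch does not address the separate limitation above: a range with more than `Int.MaxValue` elements still cannot report its length.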
srowen andrewor14 would you mind taking a look at this PR?
Author: Ye Xianjin <advancedxy@gmail.com>
Closes #4002 from advancedxy/SPARk-5201 and squashes the following commits:
96265a1 [Ye Xianjin] Update slice method comment and some responding docs.
e143d7a [Ye Xianjin] Update inclusive range check for splitting inclusive range.
b3f5577 [Ye Xianjin] We can include the last element in the last slice in general for inclusive range, hence eliminate the need to check Int.MaxValue or Int.MinValue.
7d39b9e [Ye Xianjin] Convert the two cases pattern matching to one case.
651c959 [Ye Xianjin] rename sign to needsInclusiveRange. add some comments
196f8a8 [Ye Xianjin] Add test cases for ranges end with Int.MaxValue or Int.MinValue
e66e60a [Ye Xianjin] Deal with inclusive and exclusive ranges in one case. If the range is inclusive and the end of the range is (Int.MaxValue or Int.MinValue), we should use inclusive range instead of exclusive
| |
Based on top of changes in https://github.com/apache/spark/pull/3806.
https://issues.apache.org/jira/browse/SPARK-1507
`--driver-cores` and `spark.driver.cores` for all cluster modes and `spark.yarn.am.cores` for yarn client mode.
Author: WangTaoTheTonic <barneystinson@aliyun.com>
Author: WangTao <barneystinson@aliyun.com>
Closes #4018 from WangTaoTheTonic/SPARK-1507 and squashes the following commits:
01419d3 [WangTaoTheTonic] amend the args name
b255795 [WangTaoTheTonic] indet thing
d86557c [WangTaoTheTonic] some comments amend
43c9392 [WangTao] fix compile error
b39a100 [WangTao] specify # cores for ApplicationMaster
| |
When calculating the input metrics there was an assumption that one task only reads from one block. This is not true for some operations, including coalesce. This patch simply increments the task's input metrics if previous metrics with the same read method already exist.
A limitation to this patch is that if a task reads from two different blocks of different read methods, one will override the other.
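A rough sketch of the accumulation rule and its limitation, using simplified stand-in types (`InputMetricsSketch`, `ReadMethod` and `record` are hypothetical and are not Spark's actual classes):

```scala
object InputMetricsSketch {
  sealed trait ReadMethod
  case object Memory extends ReadMethod
  case object Disk extends ReadMethod

  final case class InputMetrics(readMethod: ReadMethod, bytesRead: Long)

  /**
   * Folds one more block read into a task's metrics: bytes are added when the
   * read method matches the previous one; otherwise the new read replaces the
   * old one, which is the limitation described above.
   */
  def record(previous: Option[InputMetrics], method: ReadMethod, bytes: Long): InputMetrics =
    previous match {
      case Some(prev) if prev.readMethod == method =>
        prev.copy(bytesRead = prev.bytesRead + bytes)
      case _ =>
        InputMetrics(method, bytes)
    }
}
```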
Author: Kostas Sakellis <kostas@cloudera.com>
Closes #3120 from ksakellis/kostas-spark-4092 and squashes the following commits:
54e6658 [Kostas Sakellis] Drops metrics if conflicting read methods exist
f0e0cc5 [Kostas Sakellis] Add bytesReadCallback to InputMetrics
a2a36d4 [Kostas Sakellis] CR feedback
5a0c770 [Kostas Sakellis] [SPARK-4092] [CORE] Fix InputMetrics for coalesce'd Rdds
| |
Adds onExecutorAdded and onExecutorRemoved events to the SparkListener. This will allow a client to get notified when an executor has been added/removed and provide additional information such as how many vcores it is consuming.
In addition, this commit adds a SparkListenerAdapter to the Java API that provides default implementations of the SparkListener methods. This is to get around the fact that default implementations for traits don't work in Java. Having Java clients extend SparkListenerAdapter moving forward will prevent breakage in Java when we add new events to SparkListener.
Author: Kostas Sakellis <kostas@cloudera.com>
Closes #3711 from ksakellis/kostas-spark-4857 and squashes the following commits:
946d2c5 [Kostas Sakellis] Added executorAdded/Removed events to MesosSchedulerBackend
b1d054a [Kostas Sakellis] Remove executorInfo from ExecutorRemoved event
1727b38 [Kostas Sakellis] Renamed ExecutorDetails back to ExecutorInfo and other CR feedback
14fe78d [Kostas Sakellis] Added executor added/removed events to json protocol
93d087b [Kostas Sakellis] [SPARK-4857] [CORE] Adds Executor membership events to SparkListener
| |
In BlockManager, there is the word `BlockTranserService`, which I think is a typo for `BlockTransferService`.
Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>
Closes #4046 from sarutak/fix-tiny-typo and squashes the following commits:
a3e2a2f [Kousuke Saruta] Fixed tiny typo in BlockManager
| |
`TaskContext.attemptId` is misleadingly-named, since it currently returns a taskId, which uniquely identifies a particular task attempt within a particular SparkContext, instead of an attempt number, which conveys how many times a task has been attempted.
This patch deprecates `TaskContext.attemptId` and adds `TaskContext.taskId` and `TaskContext.attemptNumber` fields. Prior to this change, it was impossible to determine whether a task was being re-attempted (or was a speculative copy), which made it difficult to write unit tests for tasks that fail on early attempts or speculative tasks that complete faster than original tasks.
Earlier versions of the TaskContext docs suggest that `attemptId` behaves like `attemptNumber`, so there's an argument to be made in favor of changing this method's implementation. Since we've decided against making that change in maintenance branches, I think it's simpler to add better-named methods and retain the old behavior for `attemptId`; if `attemptId` behaved differently in different branches, then this would cause confusing build-breaks when backporting regression tests that rely on the new `attemptId` behavior.
Most of this patch is fairly straightforward, but there is a bit of trickiness related to Mesos tasks: since there's no field in MesosTaskInfo to encode the attemptId, I packed it into the `data` field alongside the task binary.
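To illustrate the packing idea only (this is not the wrapper introduced by the patch; `LaunchDataSketch`, `pack` and `unpack` are hypothetical names, and plain byte arrays stand in for the Mesos `data` field), an attempt number can be written as a 4-byte prefix ahead of the task binary and split off again on the receiving side:

```scala
import java.nio.ByteBuffer

object LaunchDataSketch {
  /** Lays out [4-byte attempt number][task binary] in a single byte array. */
  def pack(attemptNumber: Int, taskBinary: Array[Byte]): Array[Byte] = {
    val buf = ByteBuffer.allocate(4 + taskBinary.length)
    buf.putInt(attemptNumber)
    buf.put(taskBinary)
    buf.array()
  }

  /** Reads the attempt number back and returns the remaining bytes as the task binary. */
  def unpack(data: Array[Byte]): (Int, Array[Byte]) = {
    val buf = ByteBuffer.wrap(data)
    val attemptNumber = buf.getInt()
    val taskBinary = new Array[Byte](buf.remaining())
    buf.get(taskBinary)
    (attemptNumber, taskBinary)
  }
}
```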
Author: Josh Rosen <joshrosen@databricks.com>
Closes #3849 from JoshRosen/SPARK-4014 and squashes the following commits:
89d03e0 [Josh Rosen] Merge remote-tracking branch 'origin/master' into SPARK-4014
5cfff05 [Josh Rosen] Introduce wrapper for serializing Mesos task launch data.
38574d4 [Josh Rosen] attemptId -> taskAttemptId in PairRDDFunctions
a180b88 [Josh Rosen] Merge remote-tracking branch 'origin/master' into SPARK-4014
1d43aa6 [Josh Rosen] Merge remote-tracking branch 'origin/master' into SPARK-4014
eee6a45 [Josh Rosen] Merge remote-tracking branch 'origin/master' into SPARK-4014
0b10526 [Josh Rosen] Use putInt instead of putLong (silly mistake)
8c387ce [Josh Rosen] Use local with maxRetries instead of local-cluster.
cbe4d76 [Josh Rosen] Preserve attemptId behavior and deprecate it:
b2dffa3 [Josh Rosen] Address some of Reynold's minor comments
9d8d4d1 [Josh Rosen] Doc typo
1e7a933 [Josh Rosen] [SPARK-4014] Change TaskContext.attemptId to return attempt number instead of task ID.
fd515a5 [Josh Rosen] Add failing test for SPARK-4014
| |
when they are empty
In the current WebUI, the tables for Active Stages, Completed Stages, Skipped Stages and Failed Stages are hidden when they are empty, while the tables for Active Jobs, Completed Jobs and Failed Jobs are shown even when they are empty.
This is before my patch is applied.
![2015-01-13 14 13 03](https://cloud.githubusercontent.com/assets/4736016/5730793/2b73d6f4-9b32-11e4-9a24-1784d758c644.png)
And this is after my patch is applied.
![2015-01-13 14 38 13](https://cloud.githubusercontent.com/assets/4736016/5730797/359ea2da-9b32-11e4-97b0-544739ddbf4c.png)
Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>
Closes #4028 from sarutak/SPARK-5228 and squashes the following commits:
b1e6e8b [Kousuke Saruta] Fixed a small typo
daab563 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-5228
9493a1d [Kousuke Saruta] Modified AllJobPage.scala so that hide Active Jobs/Completed Jobs/Failed Jobs when they are empty
| |
I found that some arguments in the yarn module take their values from environment variables before system properties, whereas in the core module system properties override environment variables.
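A minimal sketch of the intended precedence, where a system property wins over an environment variable (`ConfPrecedenceSketch` and `resolve` are hypothetical names; the maps are passed in so the lookup is easy to test):

```scala
object ConfPrecedenceSketch {
  /** Returns the system property if set, otherwise falls back to the environment variable. */
  def resolve(
      props: scala.collection.Map[String, String],
      env: scala.collection.Map[String, String],
      propKey: String,
      envKey: String): Option[String] =
    props.get(propKey).orElse(env.get(envKey))
}
```

With the real process settings this could be called as `ConfPrecedenceSketch.resolve(sys.props, sys.env, "spark.app.name", "SPARK_YARN_APP_NAME")`, using the two names mentioned in the squashed commits below.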
Author: WangTaoTheTonic <barneystinson@aliyun.com>
Author: WangTao <barneystinson@aliyun.com>
Closes #3557 from WangTaoTheTonic/SPARK4697 and squashes the following commits:
836b9ef [WangTaoTheTonic] fix type mismatch
e3e486a [WangTaoTheTonic] remove the comma
1262d57 [WangTaoTheTonic] handle spark.app.name and SPARK_YARN_APP_NAME in SparkSubmitArguments
bee9447 [WangTaoTheTonic] wrong brace
81833bb [WangTaoTheTonic] rebase
40934b4 [WangTaoTheTonic] just switch blocks
5f43f45 [WangTao] System property can override environment variable
| |
https://issues.apache.org/jira/browse/SPARK-5006
I think the issue was introduced in https://github.com/apache/spark/pull/1777.
I have not dug into the Mesos backend yet. It may need the same logic as well.
Author: WangTaoTheTonic <barneystinson@aliyun.com>
Author: WangTao <barneystinson@aliyun.com>
Closes #3841 from WangTaoTheTonic/SPARK-5006 and squashes the following commits:
8cdf96d [WangTao] indent thing
2d86d65 [WangTaoTheTonic] fix line length
7cdfd98 [WangTaoTheTonic] fit for new HttpServer constructor
61a370d [WangTaoTheTonic] some minor fixes
bc6e1ec [WangTaoTheTonic] rebase
67bcb46 [WangTaoTheTonic] put conf at 3rd position, modify suite class, add comments
f450cd1 [WangTaoTheTonic] startServiceOnPort will use a SparkConf arg
29b751b [WangTaoTheTonic] rebase as ExecutorRunnableUtil changed to ExecutorRunnable
396c226 [WangTaoTheTonic] make the grammar more like scala
191face [WangTaoTheTonic] invalid value name
62ec336 [WangTaoTheTonic] spark.port.maxRetries doesn't work
| |
Currently, Spark lets you set the IP address using SPARK_LOCAL_IP, but this is then given to Akka after doing a reverse DNS lookup. This makes it difficult to run Spark in Docker. You can already change the hostname that is used programmatically, but it would be nice to be able to do this with an environment variable as well.
Author: Michael Armbrust <michael@databricks.com>
Closes #3893 from marmbrus/localHostnameEnv and squashes the following commits:
85045b6 [Michael Armbrust] Optionally read from SPARK_LOCAL_HOSTNAME
| |
CompressedMapStatus and HighlyCompressedMapStatus need to be registered with Kryo, because they are subclasses of MapStatus.
Author: lianhuiwang <lianhuiwang09@gmail.com>
Closes #4007 from lianhuiwang/SPARK-5102 and squashes the following commits:
9d2238a [lianhuiwang] remove register of MapStatus
05a285d [lianhuiwang] subclass of MapStatus needs to be registered with Kryo
| |
A few changes to fix this issue:
1. Handle the case of receiving `SparkListenerTaskStart` before `SparkListenerBlockManagerAdded`.
2. Don't add `executorId` to `removeTimes` when the executor is busy.
3. Use `HashMap.retain` to safely traverse the HashMap and remove items.
4. Use the same lock in ExecutorAllocationManager and ExecutorAllocationListener to fix the race condition in `totalPendingTasks`.
5. Move the blocking code out of the message processing code in YarnSchedulerActor.
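As an illustration of the `HashMap.retain` item in the list above, here is a small sketch that drops expired entries while traversing a mutable map (`RemoveTimesSketch` and `expire` are hypothetical names, and the map only loosely mirrors `removeTimes`). Note that Scala 2.13 deprecates `retain` in favor of `filterInPlace`:

```scala
import scala.collection.mutable

object RemoveTimesSketch {
  /**
   * Removes every executor whose expiry time has passed and returns the ids
   * that were removed. retain filters the map in place, so no entry is
   * removed behind the back of a separate iterator.
   */
  def expire(removeTimes: mutable.HashMap[String, Long], now: Long): Seq[String] = {
    val expired = mutable.ArrayBuffer.empty[String]
    removeTimes.retain { (executorId, expireTime) =>
      val keep = now < expireTime
      if (!keep) expired += executorId
      keep
    }
    expired.toSeq
  }
}
```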
Author: zsxwing <zsxwing@gmail.com>
Closes #3783 from zsxwing/SPARK-4951 and squashes the following commits:
d51fa0d [zsxwing] Add comments
2e365ce [zsxwing] Remove expired executors from 'removeTimes' and add idle executors back when a new executor joins
49f61a9 [zsxwing] Eliminate duplicate executor registered warnings
d4c4e9a [zsxwing] Minor fixes for the code style
05f6238 [zsxwing] Move the blocking codes out of the message processing code
105ba3a [zsxwing] Fix the race condition in totalPendingTasks
d5c615d [zsxwing] Fix the issue that a busy executor may be killed
| |
Because the page size of most major OSes is about 4KB, the default value of spark.storage.memoryMapThreshold is unified to 2 * 4096.
Author: lewuathe <lewuathe@me.com>
Closes #3900 from Lewuathe/integrate-memoryMapThreshold and squashes the following commits:
e417acd [lewuathe] [SPARK-5073] Update docs/configuration
834aba4 [lewuathe] [SPARK-5073] Fix style
adcea33 [lewuathe] [SPARK-5073] Integrate memory map threshold to 2MB
fcce2e5 [lewuathe] [SPARK-5073] spark.storage.memoryMapThreshold have two default value
| |
Author: wangfei <wangfei1@huawei.com>
Closes #3718 from scwf/sparksqlui and squashes the following commits:
e0d6b5d [wangfei] format fix
383b505 [wangfei] fix conflicts
4d2038a [wangfei] using setJobDescription
df79837 [wangfei] fix compile error
92ce834 [wangfei] show sql statement in spark ui when run sql use spark-sql
| |
Dealing with [SPARK-4737], the handling of serialization errors should not be the DAGScheduler's responsibility. The task set manager now catches the error and aborts the stage.
If the TaskSetManager throws a TaskNotSerializableException, the TaskSchedulerImpl will return an empty list of task descriptions, because no tasks were started. The scheduler should abort the stage gracefully.
Note that I'm not too familiar with this part of the codebase and its place in the overall architecture of the Spark stack. If implementing it this way has any adverse side effects, please voice that loudly.
Author: mcheah <mcheah@palantir.com>
Closes #3638 from mccheah/task-set-manager-properly-handle-ser-err and squashes the following commits:
1545984 [mcheah] Some more style fixes from Andrew Or.
5267929 [mcheah] Fixing style suggestions from Andrew Or.
dfa145b [mcheah] Fixing style from Josh Rosen's feedback
b2a430d [mcheah] Not returning empty seq when a task set cannot be serialized.
94844d7 [mcheah] Fixing compilation error, one brace too many
5f486f4 [mcheah] Adding license header for fake task class
bf5e706 [mcheah] Fixing indentation.
097e7a2 [mcheah] [SPARK-4737] Catching task serialization exception in TaskSetManager
| |
driver memory...
... size
Ways to set the Application Master's memory in yarn-client mode:
1. `spark.yarn.am.memory` in SparkConf or System Properties
2. default value 512m
Note: this argument is only available in yarn-client mode.
Author: WangTaoTheTonic <barneystinson@aliyun.com>
Closes #3607 from WangTaoTheTonic/SPARK4181 and squashes the following commits:
d5ceb1b [WangTaoTheTonic] spark.driver.memeory is used in both modes
6c1b264 [WangTaoTheTonic] rebase
b8410c0 [WangTaoTheTonic] minor optiminzation
ddcd592 [WangTaoTheTonic] fix the bug produced in rebase and some improvements
3bf70cc [WangTaoTheTonic] rebase and give proper hint
987b99d [WangTaoTheTonic] disable --driver-memory in client mode
2b27928 [WangTaoTheTonic] inaccurate description
b7acbb2 [WangTaoTheTonic] incorrect method invoked
2557c5e [WangTaoTheTonic] missing a single blank
42075b0 [WangTaoTheTonic] arrange the args and warn logging
69c7dba [WangTaoTheTonic] rebase
1960d16 [WangTaoTheTonic] fix wrong comment
7fa9e2e [WangTaoTheTonic] log a warning
f6bee0e [WangTaoTheTonic] docs issue
d619996 [WangTaoTheTonic] Merge branch 'master' into SPARK4181
b09c309 [WangTaoTheTonic] use code format
ab16bb5 [WangTaoTheTonic] fix bug and add comments
44e48c2 [WangTaoTheTonic] minor fix
6fd13e1 [WangTaoTheTonic] add overhead mem and remove some configs
0566bb8 [WangTaoTheTonic] yarn client mode Application Master memory size is same as driver memory size
| |
The current TaskSchedulerImplSuite includes some tests that are
actually for the TaskSchedulerImpl, but the remainder of the tests avoid using
the TaskSchedulerImpl entirely, and actually test the pool and scheduling
algorithm mechanisms. This commit separates the pool/scheduling algorithm
tests into their own suite, and also simplifies those tests.
The pull request replaces #339.
Author: Kay Ousterhout <kayousterhout@gmail.com>
Closes #3967 from kayousterhout/SPARK-1143 and squashes the following commits:
8a898c4 [Kay Ousterhout] [SPARK-1143] Separate pool tests into their own suite.
| |
This change does a few things to make the hadoop-provided profile more useful:
- Create new profiles for other libraries / services that might be provided by the infrastructure
- Simplify and fix the poms so that the profiles are only activated while building assemblies.
- Fix tests so that they're able to run when the profiles are activated
- Add a new env variable to be used by distributions that use these profiles to provide the runtime
classpath for Spark jobs and daemons.
Author: Marcelo Vanzin <vanzin@cloudera.com>
Closes #2982 from vanzin/SPARK-4048 and squashes the following commits:
82eb688 [Marcelo Vanzin] Add a comment.
eb228c0 [Marcelo Vanzin] Fix borked merge.
4e38f4e [Marcelo Vanzin] Merge branch 'master' into SPARK-4048
9ef79a3 [Marcelo Vanzin] Alternative way to propagate test classpath to child processes.
371ebee [Marcelo Vanzin] Review feedback.
52f366d [Marcelo Vanzin] Merge branch 'master' into SPARK-4048
83099fc [Marcelo Vanzin] Merge branch 'master' into SPARK-4048
7377e7b [Marcelo Vanzin] Merge branch 'master' into SPARK-4048
322f882 [Marcelo Vanzin] Fix merge fail.
f24e9e7 [Marcelo Vanzin] Merge branch 'master' into SPARK-4048
8b00b6a [Marcelo Vanzin] Merge branch 'master' into SPARK-4048
9640503 [Marcelo Vanzin] Cleanup child process log message.
115fde5 [Marcelo Vanzin] Simplify a comment (and make it consistent with another pom).
e3ab2da [Marcelo Vanzin] Fix hive-thriftserver profile.
7820d58 [Marcelo Vanzin] Fix CliSuite with provided profiles.
1be73d4 [Marcelo Vanzin] Restore flume-provided profile.
d1399ed [Marcelo Vanzin] Restore jetty dependency.
82a54b9 [Marcelo Vanzin] Remove unused profile.
5c54a25 [Marcelo Vanzin] Fix HiveThriftServer2Suite with *-provided profiles.
1fc4d0b [Marcelo Vanzin] Update dependencies for hive-thriftserver.
f7b3bbe [Marcelo Vanzin] Add snappy to hadoop-provided list.
9e4e001 [Marcelo Vanzin] Remove duplicate hive profile.
d928d62 [Marcelo Vanzin] Redirect child stderr to parent's log.
4d67469 [Marcelo Vanzin] Propagate SPARK_DIST_CLASSPATH on Yarn.
417d90e [Marcelo Vanzin] Introduce "SPARK_DIST_CLASSPATH".
2f95f0d [Marcelo Vanzin] Propagate classpath to child processes during testing.
1adf91c [Marcelo Vanzin] Re-enable maven-install-plugin for a few projects.
284dda6 [Marcelo Vanzin] Rework the "hadoop-provided" profile, add new ones.
| |
remaining even if application finished when external shuffle is enabled
When the external shuffle service is enabled, local directories of the driver in client mode remain even after the application has finished.
I think local directories for drivers should be deleted.
Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>
Closes #3811 from sarutak/SPARK-4973 and squashes the following commits:
ad944ab [Kousuke Saruta] Fixed DiskBlockManager to cleanup local directory if it's the driver
43770da [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-4973
88feecd [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-4973
d99718e [Kousuke Saruta] Fixed SparkSubmit.scala and DiskBlockManager.scala in order to delete local directories of the driver of local-mode when external shuffle service is enabled
| |
This pull request is my own work and I license it under Spark's open-source license.
This contribution is an improvement to the documentation. I documented that the maximum number of values per key for groupByKey is limited by available RAM (see [Datablox][datablox link] and [the spark mailing list][list link]).
Just saying that better performance is available is not sufficient. Sometimes you need to do a group-by - your operation needs all the items available in order to complete. This warning explains the problem.
[datablox link]: http://databricks.gitbooks.io/databricks-spark-knowledge-base/content/best_practices/prefer_reducebykey_over_groupbykey.html
[list link]: http://apache-spark-user-list.1001560.n3.nabble.com/Understanding-RDD-GroupBy-OutOfMemory-Exceptions-tp11427p11466.html
Author: Eric Moyer <eric_moyer@yahoo.com>
Closes #3936 from RadixSeven/better-group-by-docs and squashes the following commits:
5b6f4e9 [Eric Moyer] groupByKey docs naming updates
238e81b [Eric Moyer] Doc that groupByKey will OOM for large keys
| |
The property `spark.executor.id` can represent both `driver` and `<driver>` for one driver.
It's inconsistent.
This issue is minor so I didn't file this in JIRA.
Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>
Closes #3812 from sarutak/fix-driver-identifier and squashes the following commits:
d885498 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into fix-driver-identifier
4275663 [Kousuke Saruta] Fixed the value represented by spark.executor.id of local mode
| |
standalone mode
When the event log is enabled in standalone mode and the configuration is wrong, the standalone cluster goes down (the Master restarts and loses its connection with the Workers).
How to reproduce: give an invalid value to "spark.eventLog.dir", for example: spark.eventLog.dir=hdfs://tmp/logdir1, hdfs://tmp/logdir2. This throws an IllegalArgumentException, which causes the Master to restart, and the whole cluster becomes unavailable.
Author: Zhang, Liye <liye.zhang@intel.com>
Closes #3824 from liyezhang556520/wrongConf4Cluster and squashes the following commits:
3c24d98 [Zhang, Liye] revert change with logwarning and excetption for FileNotFoundException
3c1ac2e [Zhang, Liye] change var to val
a49c52f [Zhang, Liye] revert wrong modification
12eee85 [Zhang, Liye] add more message in log and on webUI
5c1fa33 [Zhang, Liye] catch exceptions when eventlog with wrong conf
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
urls can crash the process.
Because `actorSelection` returns `deadLetters` for an invalid path, the Worker stays silent when it is given an invalid master url. It's better to log an error so that people can find such problems quickly.
This PR checks the url before passing it to `actorSelection`, and throws and logs a SparkException if the url is invalid.
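A minimal sketch of this kind of up-front check, written in plain Python rather than taken from the patch. `parse_spark_url` is a hypothetical name, `ValueError` stands in for SparkException, and the exact rules (scheme `spark`, a host, a numeric port, nothing else) are assumptions for illustration.

```python
from urllib.parse import urlsplit


def parse_spark_url(url):
    """Return (host, port) for a spark://host:port URL, or raise ValueError."""
    try:
        parts = urlsplit(url)
        # Accessing .port raises ValueError when the port is not a valid number.
        host, port = parts.hostname, parts.port
    except ValueError as exc:
        raise ValueError(f"Invalid master URL: {url}") from exc
    if (parts.scheme != "spark" or not host or port is None
            or parts.path or parts.query or parts.fragment):
        raise ValueError(f"Invalid master URL: {url}")
    return host, port
```

Failing loudly at parse time means a typo in the url surfaces as an error immediately, instead of messages being silently dropped later.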
Author: zsxwing <zsxwing@gmail.com>
Closes #3927 from zsxwing/SPARK-5126 and squashes the following commits:
9d429ee [zsxwing] Create a utility method in Utils to parse Spark url; verify urls before creating Actors so that invalid urls can crash the process.
8286e51 [zsxwing] Check the url before sending to Akka and log the error if the url is invalid
|
|
|
|
|
|
|
|
|
|
|
|
| |
SPARK-5132:
stageInfoToJson: Stage Attempt Id
stageInfoFromJson: Attempt Id
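The mismatch above is a writer and a reader disagreeing on a JSON key. The plain-Python sketch below illustrates one way to avoid that, by sharing a single constant between both directions. The function names mirror the ones above, but the bodies, the `Stage ID` key, and the default of 0 are assumptions for illustration, not the actual JsonProtocol code.

```python
import json

# Hypothetical shared constant: both directions use the same key.
STAGE_ATTEMPT_KEY = "Stage Attempt Id"


def stage_info_to_json(stage_id, attempt_id):
    """Serialize the two ids into a JSON object."""
    return json.dumps({"Stage ID": stage_id, STAGE_ATTEMPT_KEY: attempt_id})


def stage_info_from_json(text):
    """Read the ids back using the same key the writer used."""
    obj = json.loads(text)
    # Reading a different key (for example "Attempt Id") would miss the value
    # and silently fall back to the default.
    return obj["Stage ID"], obj.get(STAGE_ATTEMPT_KEY, 0)
```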
Author: hushan[胡珊] <hushan@xiaomi.com>
Closes #3932 from suyanNone/json-stage and squashes the following commits:
41419ab [hushan[胡珊]] Correct stage Attempt Id key in stageInfofromJson
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Enabled HistoryServer to show incomplete applications.
We can see the log for incomplete applications by clicking the bottom link.
Author: Masayoshi TSUZUKI <tsudukim@oss.nttdata.co.jp>
Closes #3467 from tsudukim/feature/SPARK-2458-2 and squashes the following commits:
76205d2 [Masayoshi TSUZUKI] Fixed and added test code.
29a04a9 [Masayoshi TSUZUKI] Merge branch 'master' of github.com:tsudukim/spark into feature/SPARK-2458-2
f9ef854 [Masayoshi TSUZUKI] Added space between "if" and "(". Fixed "Incomplete" as capitalized in the web UI. Modified double negative variable name.
9b465b0 [Masayoshi TSUZUKI] Modified typo and better implementation.
3ed8a41 [Masayoshi TSUZUKI] Modified too long lines.
08ea14d [Masayoshi TSUZUKI] [SPARK-2458] Make failed application log visible on History Server
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This PR:
- Reenables `surefire`, and copies config from `scalatest` (which is itself an old fork of `surefire`, so similar)
- Tells `surefire` to test only Java tests
- Enables `surefire` and `scalatest` for all children, and in turn eliminates some duplication.
For me this causes the Scala and Java tests to be run once each, it seems, as desired. It doesn't affect the SBT build but works for Maven. I still need to verify that all of the Scala tests and Java tests are being run.
Author: Sean Owen <sowen@cloudera.com>
Closes #3651 from srowen/SPARK-4159 and squashes the following commits:
2e8a0af [Sean Owen] Remove specialized SPARK_HOME setting for REPL, YARN tests as it appears to be obsolete
12e4558 [Sean Owen] Append to unit-test.log instead of overwriting, so that both surefire and scalatest output is preserved. Also standardize/correct comments a bit.
e6f8601 [Sean Owen] Reenable Java tests by reenabling surefire with config cloned from scalatest; centralize test config in the parent
|
|
|
|
|
|
|
|
| |
Author: Reynold Xin <rxin@databricks.com>
Closes #3903 from rxin/timeout-120 and squashes the following commits:
7c2138e [Reynold Xin] [SPARK-5093] Set spark.network.timeout to 120s consistently.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
...nt at all.
- Changed the scope of runAsSparkUser from MesosExecutorDriver.run to MesosExecutorBackend.launchTask
- See the JIRA issue for more details.
Author: Jongyoul Lee <jongyoul@gmail.com>
Closes #3741 from jongyoul/SPARK-4465 and squashes the following commits:
46ad71e [Jongyoul Lee] [SPARK-4465] runAsSparkUser doesn't affect TaskRunner in Mesos environment at all. - Removed unused import
3d6631f [Jongyoul Lee] [SPARK-4465] runAsSparkUser doesn't affect TaskRunner in Mesos environment at all. - Removed comments and adjusted indentations
2343f13 [Jongyoul Lee] [SPARK-4465] runAsSparkUser doesn't affect TaskRunner in Mesos environment at all. - fixed a scope of runAsSparkUser from MesosExecutorDriver.run to MesosExecutorBackend.launchTask
|
|
|
|
|
|
|
|
|
|
|
|
| |
https://issues.apache.org/jira/browse/SPARK-5057
Author: WangTao <barneystinson@aliyun.com>
Author: WangTaoTheTonic <barneystinson@aliyun.com>
Closes #3875 from WangTaoTheTonic/SPARK-5057 and squashes the following commits:
1503487 [WangTao] use string interpolation
706c8a7 [WangTaoTheTonic] log more messages
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[SPARK-4688] Have a single shared network timeout in Spark
Author: Varun Saxena <vsaxena.varun@gmail.com>
Author: varunsaxena <vsaxena.varun@gmail.com>
Closes #3562 from varunsaxena/SPARK-4688 and squashes the following commits:
6e97f72 [Varun Saxena] [SPARK-4688] Single shared network timeout
cd783a2 [Varun Saxena] SPARK-4688
d6f8c29 [Varun Saxena] SCALA-4688
9562b15 [Varun Saxena] SPARK-4688
a75f014 [varunsaxena] SPARK-4688
594226c [varunsaxena] SPARK-4688
|
|
|
|
|
|
|
|
|
|
| |
Add `assert(sc.listenerBus.waitUntilEmpty(WAIT_TIMEOUT_MILLIS))` to make sure `sparkListener` receives the message.
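Why such a wait removes the non-determinism can be shown with a small plain-Python sketch. It is not Spark's listener bus: `ListenerBus` and its methods are hypothetical stand-ins. Events are delivered on a background thread, so a test has to wait for the queue to drain before it inspects what the listener saw.

```python
import queue
import threading


class ListenerBus:
    """Delivers posted events to listeners on a background thread."""

    def __init__(self):
        self._queue = queue.Queue()
        self._listeners = []
        self._pending = 0
        self._cond = threading.Condition()
        threading.Thread(target=self._drain, daemon=True).start()

    def add_listener(self, listener):
        self._listeners.append(listener)

    def post(self, event):
        with self._cond:
            self._pending += 1
        self._queue.put(event)

    def _drain(self):
        while True:
            event = self._queue.get()
            for listener in self._listeners:
                listener(event)
            with self._cond:
                self._pending -= 1
                self._cond.notify_all()

    def wait_until_empty(self, timeout_s):
        """Return True once every posted event is delivered, False on timeout."""
        with self._cond:
            return self._cond.wait_for(lambda: self._pending == 0, timeout=timeout_s)
```

A test would post its events, assert that `wait_until_empty` returned True, and only then check the listener's state.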
Author: zsxwing <zsxwing@gmail.com>
Closes #3889 from zsxwing/SPARK-5074 and squashes the following commits:
e61c198 [zsxwing] Fix a non-deterministic test failure
|
|
|
|
|
|
|
|
|
|
| |
Because `sparkEnv.blockManager.master.removeBlock` is asynchronous, we need to make sure the block has already been removed before calling `super.enqueueSuccessfulTask`.
Author: zsxwing <zsxwing@gmail.com>
Closes #3894 from zsxwing/SPARK-5083 and squashes the following commits:
d97c03d [zsxwing] Fix a flaky test in TaskResultGetterSuite
|
|
|
|
|
|
|
|
|
|
| |
It's not necessary to set `TaskSchedulerImpl.dagScheduler` in preStart. It's safe to set it after `initializeEventProcessActor()`.
Author: zsxwing <zsxwing@gmail.com>
Closes #3887 from zsxwing/SPARK-5069 and squashes the following commits:
d95894f [zsxwing] Fix the race condition of TaskSchedulerImpl.dagScheduler
|
|
|
|
|
|
|
|
|
|
| |
A simple fix would be adding `assert(e1.appId == e2.appId)` for `SparkListenerApplicationStart`. But we can actually use `===` directly on well-defined case classes. Therefore, instead of applying that simple fix, I use `===` to compare those well-defined case classes (those whose fields all implement a correct `equals` method, such as primitive types).
Author: zsxwing <zsxwing@gmail.com>
Closes #3886 from zsxwing/SPARK-5067 and squashes the following commits:
0a51711 [zsxwing] Use '===' to compare well-defined case class
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch disables output spec. validation for jobs launched through Spark Streaming, since this interferes with checkpoint recovery.
Hadoop OutputFormats have a `checkOutputSpecs` method which performs certain checks prior to writing output, such as checking whether the output directory already exists. SPARK-1100 added checks for FileOutputFormat, SPARK-1677 (#947) added a SparkConf configuration to disable these checks, and SPARK-2309 (#1088) extended these checks to run for all OutputFormats, not just FileOutputFormat.
In Spark Streaming, we might have to re-process a batch during checkpoint recovery, so `save` actions may be called multiple times. In addition to `DStream`'s own save actions, users might use `transform` or `foreachRDD` and call the `RDD` and `PairRDD` save actions. When output spec. validation is enabled, the second calls to these actions will fail due to existing output.
This patch automatically disables output spec. validation for jobs submitted by the Spark Streaming scheduler. This is done by using Scala's `DynamicVariable` to propagate the bypass setting without having to mutate SparkConf or introduce a global variable.
Author: Josh Rosen <joshrosen@databricks.com>
Closes #3832 from JoshRosen/SPARK-4835 and squashes the following commits:
36eaf35 [Josh Rosen] Add comment explaining use of transform() in test.
6485cf8 [Josh Rosen] Add test case in Streaming; fix bug for transform()
7b3e06a [Josh Rosen] Remove Streaming-specific setting to undo this change; update conf. guide
bf9094d [Josh Rosen] Revise disableOutputSpecValidation() comment to not refer to Spark Streaming.
e581d17 [Josh Rosen] Deduplicate isOutputSpecValidationEnabled logic.
762e473 [Josh Rosen] [SPARK-4835] Disable validateOutputSpecs for Spark Streaming jobs.
|
|
|
|
|
|
|
|
|
| |
Author: Dale <tigerquoll@outlook.com>
Closes #3809 from tigerquoll/SPARK-4787 and squashes the following commits:
5661e01 [Dale] [SPARK-4787] Ensure that call to stop() doesn't lose the exception by using a finally block.
2172578 [Dale] [SPARK-4787] Stop context properly if an exception occurs during DAGScheduler initialization.
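The pattern named in these two commits, cleaning up after a failed initialization without losing the original exception, can be sketched in plain Python. `start_context`, `init_scheduler`, and `stop` are hypothetical names, and this is an illustration of the pattern rather than the SparkContext code.

```python
def start_context(init_scheduler, stop):
    """Run initialization; on failure, stop and re-raise the original error."""
    try:
        return init_scheduler()
    except Exception as init_error:
        try:
            stop()  # best-effort cleanup; may itself fail
        finally:
            # Re-raise the initialization failure even if stop() raised.
            raise init_error
```

Without the `finally`, an exception thrown by `stop()` would replace the initialization error, hiding the real cause from the caller.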
|
|
|
|
|
|
|
|
|
|
| |
Removed the `sleep()` call from the `stop()` method of the `TaskSchedulerImpl` class. According to the JIRA ticket, it is believed to be a legacy artifact, originally introduced in the `ClusterScheduler` class, that slows down testing.
Author: Brennon York <brennon.york@capitalone.com>
Closes #3851 from brennonyork/SPARK-794 and squashes the following commits:
04c3e64 [Brennon York] Removed sleep() from the stop() method
|