path: root/resource-managers
Commit message | Author | Date | Files | Lines (-/+)
* [SPARK-19227][SPARK-19251] remove unused imports and outdated comments | uncleGen | 2017-01-18 | 2 | -4/+1

## What changes were proposed in this pull request?

Remove unused imports and outdated comments, and fix some minor code style issues.

## How was this patch tested?

Existing unit tests.

Author: uncleGen <hustyugm@gmail.com>
Closes #16591 from uncleGen/SPARK-19227.
* [SPARK-19179][YARN] Change spark.yarn.access.namenodes config and update docs | jerryshao | 2017-01-17 | 2 | -4/+9

## What changes were proposed in this pull request?

The name of the `spark.yarn.access.namenodes` configuration does not reflect how it is actually used: inside the code it lists the Hadoop filesystems we obtain tokens from, not NameNodes. This PR proposes renaming the configuration and updating the related code and docs.

## How was this patch tested?

Local verification.

Author: jerryshao <sshao@hortonworks.com>
Closes #16560 from jerryshao/SPARK-19179.
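A minimal sketch of the intent behind the rename: the configuration value is a comma-separated list of Hadoop filesystem URIs (hdfs://, webhdfs://, wasb://, ...) from which delegation tokens are fetched, not a list of NameNodes. The object and helper names below are hypothetical, not Spark's actual code.

```scala
import java.net.URI

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.FileSystem

object AccessFileSystemsSketch {
  // Hypothetical helper: parse a comma-separated list of filesystem URIs from
  // the config value and resolve each one to a Hadoop FileSystem, which is the
  // interpretation the renamed configuration is meant to convey.
  def resolve(confValue: String, hadoopConf: Configuration): Seq[FileSystem] = {
    confValue.split(",").map(_.trim).filter(_.nonEmpty).toSeq
      .map(uri => FileSystem.get(new URI(uri), hadoopConf))
  }
}
```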
* [MINOR][YARN] Move YarnSchedulerBackendSuite to resource-managers/yarn directory | Yanbo Liang | 2017-01-17 | 1 | -0/+58

## What changes were proposed in this pull request?

#16092 moved YARN resource manager related code to the resource-managers/yarn directory. The test case `YarnSchedulerBackendSuite` was added after that, but in the wrong place; this PR moves it to the correct directory.

## How was this patch tested?

Existing test.

Author: Yanbo Liang <ybliang8@gmail.com>
Closes #16595 from yanboliang/yarn.
* [SPARK-19021][YARN] Generalize HDFSCredentialProvider to support non-HDFS security filesystems | jerryshao | 2017-01-11 | 5 | -39/+48

Currently Spark can only get the token renewal interval from secure HDFS (hdfs://). If Spark runs against other secure filesystems such as webHDFS (webhdfs://), WASB (wasb://), or ADLS, it ignores those tokens and does not get renewal intervals from them, which makes Spark unable to work with these secure clusters. So instead of only checking the HDFS token, we should generalize the logic to support different kinds of `DelegationTokenIdentifier`.

## How was this patch tested?

Manually verified in a secure cluster.

Author: jerryshao <sshao@hortonworks.com>
Closes #16432 from jerryshao/SPARK-19021.
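A simplified sketch of the generalization described above, assuming the hadoop-client API: instead of looking up only the HDFS token kind, walk every token in the Credentials and decode any identifier that is an `AbstractDelegationTokenIdentifier`, so tokens from webhdfs, wasb, or adl are handled the same way. The object and method names are illustrative, not Spark's actual provider code.

```scala
import scala.collection.JavaConverters._

import org.apache.hadoop.security.Credentials
import org.apache.hadoop.security.token.delegation.AbstractDelegationTokenIdentifier

object DelegationTokenSketch {
  // Decode every delegation token identifier in the credentials, regardless of
  // which filesystem issued it, and return the issue dates that a renewal
  // interval calculation could start from.
  def issueDates(creds: Credentials): Seq[Long] = {
    creds.getAllTokens.asScala.toSeq.flatMap { token =>
      token.decodeIdentifier() match {
        case id: AbstractDelegationTokenIdentifier => Some(id.getIssueDate)
        case _ => None
      }
    }
  }
}
```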
* [SPARK-17931] Eliminate unnecessary task (de)serialization | Kay Ousterhout | 2017-01-06 | 5 | -95/+26

In the existing code, there are three layers of serialization involved in sending a task from the scheduler to an executor:

- A Task object is serialized.
- The Task object is copied to a byte buffer that also contains serialized information about any additional JARs, files, and Properties needed for the task to execute. This byte buffer is stored as the member variable serializedTask in the TaskDescription class.
- The TaskDescription is serialized (in addition to the serialized task + JARs, the TaskDescription class contains the task ID and other metadata) and sent in a LaunchTask message.

While it *is* necessary to have two layers of serialization, so that the JAR, file, and Property info can be deserialized prior to deserializing the Task object, the third layer of (de)serialization is unnecessary. This commit eliminates a layer of serialization by moving the JARs, files, and Properties into the TaskDescription class. This commit also serializes the Properties manually (by traversing the map), as is done with the JARs and files, which reduces the final serialized size.

Unit tests.

This is a simpler alternative to the approach proposed in #15505. shivaram and I did some benchmarking of this and #15505 on a 20-machine m2.4xlarge EC2 cluster (160 cores). We ran ~30 trials of code [1] (a very simple job with 10K tasks per stage) and measured the average time per stage:

- Before this change: 2490 ms
- With this change: 2345 ms (~6% improvement over the baseline)
- With witgo's approach in #15505: 2046 ms (~18% improvement over the baseline)

The reason that #15505 has a more significant improvement is that it also moves the serialization from the TaskSchedulerImpl thread to the CoarseGrainedSchedulerBackend thread. I added that functionality on top of this change and got almost the same improvement [1] as #15505 (average of 2103 ms). I think we should decouple these two changes, both so we have some record of the improvement from each individual change, and because this change is more about simplifying the code base (the improvement is negligible) while the other is about performance improvement. The plan, currently, is to merge this PR and then merge the remaining part of #15505 that moves serialization.

[1] The reason the improvement wasn't quite as good as with #15505 when we ran the benchmarks is almost certainly because, at the point when we ran the benchmarks, I hadn't updated the code to manually serialize the Properties (instead the code was using Java's default serialization for the Properties object, whereas #15505 manually serialized the Properties). This PR has since been updated to manually serialize the Properties, just like the other maps.

Author: Kay Ousterhout <kayousterhout@gmail.com>
Closes #16053 from kayousterhout/SPARK-17931.
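A minimal sketch, not Spark's actual TaskDescription code, of what serializing the Properties manually by traversing the map can look like: write the entry count followed by each key/value pair with a DataOutputStream, instead of Java-serializing the whole Properties object.

```scala
import java.io.{ByteArrayOutputStream, DataOutputStream}
import java.util.Properties

import scala.collection.JavaConverters._

object PropertiesSerdeSketch {
  // Encode the Properties as: entry count, then (key, value) pairs as UTF strings.
  def serialize(props: Properties): Array[Byte] = {
    val bytes = new ByteArrayOutputStream()
    val out = new DataOutputStream(bytes)
    val keys = props.stringPropertyNames().asScala.toSeq
    out.writeInt(keys.size)
    keys.foreach { key =>
      out.writeUTF(key)
      out.writeUTF(props.getProperty(key))
    }
    out.flush()
    bytes.toByteArray
  }
}
```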
* [MINOR][DOCS] Remove consecutive duplicated words/typo in Spark Repo | Niranjan Padmanabhan | 2017-01-04 | 1 | -1/+1

## What changes were proposed in this pull request?

There are many locations in the Spark repo where the same word occurs consecutively. Sometimes this is appropriate, but many times it is not. This PR removes the inappropriately duplicated words.

## How was this patch tested?

N/A, since only docs and comments were updated.

Author: Niranjan Padmanabhan <niranjan.padmanabhan@gmail.com>
Closes #16455 from neurons/np.structure_streaming_doc.
* [SPARK-19073] LauncherState should only be set to SUBMITTED after the application is submitted | mingfei | 2017-01-04 | 1 | -2/+3

## What changes were proposed in this pull request?

LauncherState should only be set to SUBMITTED after the application has actually been submitted; currently the state is set before the application is submitted.

## How was this patch tested?

No test is added in this patch.

Author: mingfei <mingfei.smf@alipay.com>
Closes #16459 from shimingfei/fixLauncher.
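A tiny sketch of the ordering fix described above, with hypothetical names (this is not the Spark launcher code): the handle is marked SUBMITTED only after the submission call has completed.

```scala
object LauncherStateOrderingSketch {
  sealed trait State
  case object Unknown extends State
  case object Submitted extends State

  // Submit first, then report SUBMITTED; reversing these two lines is the
  // ordering problem this change addresses.
  def launch(submit: () => Unit, setState: State => Unit): Unit = {
    submit()
    setState(Submitted)
  }
}
```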
* [SPARK-15555][MESOS] Driver with --supervise option cannot be killed in Mesos mode | Devaraj K | 2017-01-03 | 2 | -2/+56

## What changes were proposed in this pull request?

Do not add killed applications to the retry list.

## How was this patch tested?

Verified manually in a Mesos cluster: with this change, killed applications move to the Finished Drivers section and are not retried.

Author: Devaraj K <devaraj@apache.org>
Closes #13323 from devaraj-kavali/SPARK-15555.
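A hypothetical sketch of the retry decision for a supervised driver (illustrative names, not the Mesos dispatcher's actual code): a driver that was explicitly killed is never queued for relaunch, even when --supervise is set.

```scala
object SuperviseRetrySketch {
  // Relaunch only supervised drivers that failed on their own; an explicit
  // kill always wins and the driver moves to the finished list instead.
  def shouldRelaunch(supervise: Boolean, exitedCleanly: Boolean, wasKilled: Boolean): Boolean =
    supervise && !exitedCleanly && !wasKilled
}
```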
* [MINOR][DOC] Minor doc change for YARN credential providers | Liang-Chi Hsieh | 2017-01-02 | 1 | -0/+2

## What changes were proposed in this pull request?

The configuration `spark.yarn.security.tokens.{service}.enabled` is deprecated; we should now use `spark.yarn.security.credentials.{service}.enabled`. Some places in the docs have not been updated yet.

## How was this patch tested?

N/A. Just a doc change.

Please review http://spark.apache.org/contributing.html before opening a pull request.

Author: Liang-Chi Hsieh <viirya@gmail.com>
Closes #16444 from viirya/minor-credential-provider-doc.
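A small sketch of setting the current-style key through `SparkConf`; "hive" is just an example service name used here for illustration.

```scala
import org.apache.spark.SparkConf

object CredentialConfSketch {
  // Deprecated form:  spark.yarn.security.tokens.hive.enabled
  // Current form:     spark.yarn.security.credentials.hive.enabled
  val conf: SparkConf = new SparkConf()
    .set("spark.yarn.security.credentials.hive.enabled", "false")
}
```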
* [SPARK-17807][CORE] split test-tags into test-JAR | Ryan Williams | 2016-12-21 | 1 | -0/+2

Remove spark-tags' compile-scope dependency (and, indirectly, spark-core's compile-scope transitive dependency) on scalatest by splitting the test-oriented tags into spark-tags' test JAR.

Alternative to #16303.

Author: Ryan Williams <ryan.blake.williams@gmail.com>
Closes #16311 from ryan-williams/tt.
* [SPARK-8425][SCHEDULER][HOTFIX] fix scala 2.10 compile error | Imran Rashid | 2016-12-15 | 1 | -3/+3

## What changes were proposed in this pull request?

https://github.com/apache/spark/commit/93cdb8a7d0f124b4db069fd8242207c82e263c52 introduced a compile error under Scala 2.10; this fixes that error.

## How was this patch tested?

Locally ran

```
dev/change-version-to-2.10.sh
build/sbt -Pyarn -Phadoop-2.4 -Dhadoop.version=2.6.0 -Dscala-2.10 "project yarn" "test-only *YarnAllocatorSuite"
```

(which failed at test compilation before this change)

Author: Imran Rashid <irashid@cloudera.com>
Closes #16298 from squito/blacklist-2.10.
* [SPARK-8425][CORE] Application Level Blacklisting | Imran Rashid | 2016-12-15 | 4 | -13/+59

## What changes were proposed in this pull request?

This builds upon the blacklisting introduced in SPARK-17675 to add blacklisting of executors and nodes for an entire Spark application. Resources are blacklisted based on tasks that fail, in tasksets that eventually complete successfully; they are automatically returned to the pool of active resources based on a timeout. Full details are available in a design doc attached to the jira.

## How was this patch tested?

Added unit tests, ran them via Jenkins, also ran a handful of them in a loop to check for flakiness. The added tests include:

- verifying BlacklistTracker works correctly
- verifying TaskSchedulerImpl interacts with BlacklistTracker correctly (via a mock BlacklistTracker)
- an integration test for the entire scheduler with blacklisting in a few different scenarios

Author: Imran Rashid <irashid@cloudera.com>
Author: mwws <wei.mao@intel.com>
Closes #14079 from squito/blacklist-SPARK-8425.
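A much-simplified sketch of application-level blacklisting with timeout-based expiry (this is not Spark's BlacklistTracker, and the threshold handling is made up for illustration): an executor is blacklisted once its failure count crosses a limit and returns to the active pool after the timeout elapses.

```scala
import scala.collection.mutable

class SimpleBlacklistSketch(
    maxFailures: Int,
    timeoutMs: Long,
    clock: () => Long = () => System.currentTimeMillis()) {

  private val failures = mutable.Map.empty[String, Int].withDefaultValue(0)
  private val blacklistedUntil = mutable.Map.empty[String, Long]

  // Record a task failure; crossing the threshold blacklists the executor.
  def taskFailed(executorId: String): Unit = {
    failures(executorId) += 1
    if (failures(executorId) >= maxFailures) {
      blacklistedUntil(executorId) = clock() + timeoutMs
    }
  }

  // Blacklisted executors are released automatically once the timeout expires.
  def isBlacklisted(executorId: String): Boolean =
    blacklistedUntil.get(executorId) match {
      case Some(expiry) if clock() < expiry => true
      case Some(_) =>
        blacklistedUntil.remove(executorId)
        failures(executorId) = 0
        false
      case None => false
    }
}
```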
* [SPARK-18840][YARN] Avoid throwing exception when getting token renewal interval in non-HDFS security environment | jerryshao | 2016-12-13 | 1 | -10/+11

## What changes were proposed in this pull request?

Fix the `java.util.NoSuchElementException` thrown when running Spark in a non-HDFS security environment. The current code assumes an `HDFS_DELEGATION_KIND` token will be found in the Credentials, but in some cloud environments HDFS is not required, so we should avoid this exception.

## How was this patch tested?

Manually verified in a local environment.

Author: jerryshao <sshao@hortonworks.com>
Closes #16265 from jerryshao/SPARK-18840.
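A sketch of the defensive lookup pattern this kind of fix relies on (illustrative names, not the actual Spark change): look the token up as an Option instead of assuming it is present, so a deployment without HDFS simply gets None rather than a NoSuchElementException.

```scala
import scala.collection.JavaConverters._

import org.apache.hadoop.io.Text
import org.apache.hadoop.security.Credentials
import org.apache.hadoop.security.token.{Token, TokenIdentifier}

object TokenLookupSketch {
  // Returns Some(token) if a token of the requested kind exists, None otherwise.
  def findToken(creds: Credentials, kind: Text): Option[Token[_ <: TokenIdentifier]] =
    creds.getAllTokens.asScala.find(_.getKind == kind)
}
```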
* [SPARK-18662] Move resource managers to separate directory | Anirudh | 2016-12-06 | 79 | -0/+15723

## What changes were proposed in this pull request?

* Moves yarn and mesos scheduler backends to the resource-managers/ sub-directory (in preparation for https://issues.apache.org/jira/browse/SPARK-18278)
* Corresponding change in top-level pom.xml

Ref: https://github.com/apache/spark/pull/16061#issuecomment-263649340

## How was this patch tested?

* Manual tests

/cc rxin

Author: Anirudh <ramanathana@google.com>
Closes #16092 from foxish/fix-scheduler-structure-2.