aboutsummaryrefslogtreecommitdiff
path: root/common/network-yarn
Commit message (Collapse)AuthorAgeFilesLines
* [SPARK-19139][CORE] New auth mechanism for transport library.Marcelo Vanzin2017-01-241-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This change introduces a new auth mechanism to the transport library, to be used when users enable strong encryption. This auth mechanism has better security than the currently used DIGEST-MD5. The new protocol uses symmetric key encryption to mutually authenticate the endpoints, and is very loosely based on ISO/IEC 9798. The new protocol falls back to SASL when it thinks the remote end is old. Because SASL does not support asking the server for multiple auth protocols, which would mean we could re-use the existing SASL code by just adding a new SASL provider, the protocol is implemented outside of the SASL API to avoid the boilerplate of adding a new provider. Details of the auth protocol are discussed in the included README.md file. This change partly undos the changes added in SPARK-13331; AES encryption is now decoupled from SASL authentication. The encryption code itself, though, has been re-used as part of this change. ## How was this patch tested? - Unit tests - Tested Spark 2.2 against Spark 1.6 shuffle service with SASL enabled - Tested Spark 2.2 against Spark 2.2 shuffle service with SASL fallback disabled Author: Marcelo Vanzin <vanzin@cloudera.com> Closes #16521 from vanzin/SPARK-19139.
* [SPARK-17807][CORE] split test-tags into test-JARRyan Williams2016-12-211-0/+11
| | | | | | | | | | Remove spark-tag's compile-scope dependency (and, indirectly, spark-core's compile-scope transitive-dependency) on scalatest by splitting test-oriented tags into spark-tags' test JAR. Alternative to #16303. Author: Ryan Williams <ryan.blake.williams@gmail.com> Closes #16311 from ryan-williams/tt.
* [SPARK-18773][CORE] Make commons-crypto config translation consistent.Marcelo Vanzin2016-12-121-0/+7
| | | | | | | | | | | | | | | | | | | | | This change moves the logic that translates Spark configuration to commons-crypto configuration to the network-common module. It also extends TransportConf and ConfigProvider to provide the necessary interfaces for the translation to work. As part of the change, I removed SystemPropertyConfigProvider, which was mostly used as an "empty config" in unit tests, and adjusted the very few tests that required a specific config. I also changed the config keys for AES encryption to live under the "spark.network." namespace, which is more correct than their previous names under "spark.authenticate.". Tested via existing unit test. Author: Marcelo Vanzin <vanzin@cloudera.com> Closes #16200 from vanzin/SPARK-18773.
* [SPARK-18695] Bump master branch version to 2.2.0-SNAPSHOTReynold Xin2016-12-021-1/+1
| | | | | | | | | | | | ## What changes were proposed in this pull request? This patch bumps master branch version to 2.2.0-SNAPSHOT. ## How was this patch tested? N/A Author: Reynold Xin <rxin@databricks.com> Closes #16126 from rxin/SPARK-18695.
* [SPARK-17611][YARN][TEST] Make shuffle service test really test auth.Marcelo Vanzin2016-09-201-5/+6
| | | | | | | | | | | Currently, the code is just swallowing exceptions, and not really checking whether the auth information was being recorded properly. Fix both problems, and also avoid tests inadvertently affecting other tests by modifying the shared config variable (by making it not shared). Author: Marcelo Vanzin <vanzin@cloudera.com> Closes #15161 from vanzin/SPARK-17611.
* [SPARK-17433] YarnShuffleService doesn't handle moving credentials levelDbThomas Graves2016-09-091-17/+39
| | | | | | | | | | | The secrets leveldb isn't being moved if you run spark shuffle services without yarn nm recovery on and then turn it on. This fixes that. I unfortunately missed this when I ported the patch from our internal branch 2 to master branch due to the changes for the recovery path. Note this only applies to master since it is the only place the yarn nm recovery dir is used. Unit tests ran and tested on 8 node cluster. Fresh startup with NM recovery, fresh startup no nm recovery, switching between no nm recovery and recovery. Also tested running applications to make sure wasn't affected by rolling upgrade. Author: Thomas Graves <tgraves@prevailsail.corp.gq1.yahoo.com> Author: Tom Graves <tgraves@apache.org> Closes #14999 from tgravescs/SPARK-17433.
* [SPARK-16711] YarnShuffleService doesn't re-init properly on YARN rolling ↵Thomas Graves2016-09-021-8/+127
| | | | | | | | | | | | | | upgrade The Spark Yarn Shuffle Service doesn't re-initialize the application credentials early enough which causes any other spark executors trying to fetch from that node during a rolling upgrade to fail with "java.lang.NullPointerException: Password cannot be null if SASL is enabled". Right now the spark shuffle service relies on the Yarn nodemanager to re-register the applications, unfortunately this is after we open the port for other executors to connect. If other executors connected before the re-register they get a null pointer exception which isn't a re-tryable exception and cause them to fail pretty quickly. To solve this I added another leveldb file so that it can save and re-initialize all the applications before opening the port for other executors to connect to it. Adding another leveldb was simpler from the code structure point of view. Most of the code changes are moving things to common util class. Patch was tested manually on a Yarn cluster with rolling upgrade was happing while spark job was running. Without the patch I consistently get the NullPointerException, with the patch the job gets a few Connection refused exceptions but the retries kick in and the it succeeds. Author: Thomas Graves <tgraves@staydecay.corp.gq1.yahoo.com> Closes #14718 from tgravescs/SPARK-16711.
* [SPARK-17332][CORE] Make Java Loggers static membersSean Owen2016-08-311-1/+1
| | | | | | | | | | | | | | ## What changes were proposed in this pull request? Make all Java Loggers static members ## How was this patch tested? Jenkins Author: Sean Owen <sowen@cloudera.com> Closes #14896 from srowen/SPARK-17332.
* [SPARK-16535][BUILD] In pom.xml, remove groupId which is redundant ↵Xin Ren2016-07-191-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | definition and inherited from the parent https://issues.apache.org/jira/browse/SPARK-16535 ## What changes were proposed in this pull request? When I scan through the pom.xml of sub projects, I found this warning as below and attached screenshot ``` Definition of groupId is redundant, because it's inherited from the parent ``` ![screen shot 2016-07-13 at 3 13 11 pm](https://cloud.githubusercontent.com/assets/3925641/16823121/744f893e-4916-11e6-8a52-042f83b9db4e.png) I've tried to remove some of the lines with groupId definition, and the build on my local machine is still ok. ``` <groupId>org.apache.spark</groupId> ``` As I just find now `<maven.version>3.3.9</maven.version>` is being used in Spark 2.x, and Maven-3 supports versionless parent elements: Maven 3 will remove the need to specify the parent version in sub modules. THIS is great (in Maven 3.1). ref: http://stackoverflow.com/questions/3157240/maven-3-worth-it/3166762#3166762 ## How was this patch tested? I've tested by re-building the project, and build succeeded. Author: Xin Ren <iamshrek@126.com> Closes #14189 from keypointt/SPARK-16535.
* [MINOR][BUILD] Fix Java Linter `LineLength` errorsDongjoon Hyun2016-07-191-1/+1
| | | | | | | | | | | | | | ## What changes were proposed in this pull request? This PR fixes four java linter `LineLength` errors. Those are all `LineLength` errors, but we had better remove all java linter errors before release. ## How was this patch tested? After pass the Jenkins, `./dev/lint-java`. Author: Dongjoon Hyun <dongjoon@apache.org> Closes #14255 from dongjoon-hyun/minor_java_linter.
* [SPARK-16505][YARN] Optionally propagate error during shuffle service startup.Marcelo Vanzin2016-07-141-32/+43
| | | | | | | | | | | This prevents the NM from starting when something is wrong, which would lead to later errors which are confusing and harder to debug. Added a unit test to verify startup fails if something is wrong. Author: Marcelo Vanzin <vanzin@cloudera.com> Closes #14162 from vanzin/SPARK-16505.
* [SPARK-14963][MINOR][YARN] Fix typo in YarnShuffleService recovery file namejerryshao2016-07-141-1/+1
| | | | | | | | | | | | | | | | ## What changes were proposed in this pull request? Due to the changes of [SPARK-14963](https://issues.apache.org/jira/browse/SPARK-14963), external shuffle recovery file name is changed mistakenly, so here change it back to the previous file name. This only affects the master branch, branch-2.0 is correct [here](https://github.com/apache/spark/blob/branch-2.0/common/network-yarn/src/main/java/org/apache/spark/network/yarn/YarnShuffleService.java#L195). ## How was this patch tested? N/A Author: jerryshao <sshao@hortonworks.com> Closes #14197 from jerryshao/fix-typo-file-name.
* [SPARK-16477] Bump master version to 2.1.0-SNAPSHOTReynold Xin2016-07-111-1/+1
| | | | | | | | | | | | ## What changes were proposed in this pull request? After SPARK-16476 (committed earlier today as #14128), we can finally bump the version number. ## How was this patch tested? N/A Author: Reynold Xin <rxin@databricks.com> Closes #14130 from rxin/SPARK-16477.
* [SPARK-16018][SHUFFLE] Shade netty to load shuffle jar in NodemangerDhruve Ashar2016-06-171-0/+7
| | | | | | | | | | | | ## What changes were proposed in this pull request? Shade the netty.io namespace so that we can use it in shuffle independent of the dependencies being pulled by hadoop jars. ## How was this patch tested? Ran a decent job involving shuffle write/read and tested the new spark-x-yarn-shuffle jar. After shading netty.io namespace, the nodemanager loads and shuffle job completes successfully. Author: Dhruve Ashar <dhruveashar@gmail.com> Closes #13739 from dhruve/bug/SPARK-16018.
* [SPARK-15290][BUILD] Move annotations, like @Since / @DeveloperApi, into ↵Sean Owen2016-05-171-1/+1
| | | | | | | | | | | | | | | | | | spark-tags ## What changes were proposed in this pull request? (See https://github.com/apache/spark/pull/12416 where most of this was already reviewed and committed; this is just the module structure and move part. This change does not move the annotations into test scope, which was the apparently problem last time.) Rename `spark-test-tags` -> `spark-tags`; move common annotations like `Since` to `spark-tags` ## How was this patch tested? Jenkins tests. Author: Sean Owen <sowen@cloudera.com> Closes #13074 from srowen/SPARK-15290.
* [SPARK-14963][YARN] Using recoveryPath if NM recovery is enabledjerryshao2016-05-101-11/+53
| | | | | | | | | | | | | | ## What changes were proposed in this pull request? From Hadoop 2.5+, Yarn NM supports NM recovery which using recovery path for auxiliary services such as spark_shuffle, mapreduce_shuffle. So here change to use this path install of NM local dir if NM recovery is enabled. ## How was this patch tested? Unit test + local test. Author: jerryshao <sshao@hortonworks.com> Closes #12994 from jerryshao/SPARK-14963.
* Revert "[SPARK-14613][ML] Add @Since into the matrix and vector classes in ↵Yin Huai2016-04-281-1/+1
| | | | | | spark-mllib-local" This reverts commit dae538a4d7c36191c1feb02ba87ffc624ab960dc.
* [SPARK-14613][ML] Add @Since into the matrix and vector classes in ↵Pravin Gadakh2016-04-281-1/+1
| | | | | | | | | | | | | | | | spark-mllib-local ## What changes were proposed in this pull request? This PR adds `since` tag into the matrix and vector classes in spark-mllib-local. ## How was this patch tested? Scala-style checks passed. Author: Pravin Gadakh <prgadakh@in.ibm.com> Closes #12416 from pravingadakh/SPARK-14613.
* [SPARK-14134][CORE] Change the package name used for shading classes.Marcelo Vanzin2016-04-061-2/+2
| | | | | | | | | | | | | | | The current package name uses a dash, which is a little weird but seemed to work. That is, until a new test tried to mock a class that references one of those shaded types, and then things started failing. Most changes are just noise to fix the logging configs. For reference, SPARK-8815 also raised this issue, although at the time it did not cause any issues in Spark, so it was not addressed. Author: Marcelo Vanzin <vanzin@cloudera.com> Closes #11941 from vanzin/SPARK-14134.
* [SPARK-13622][YARN] Issue creating level db for YARN shuffle servicenfraison2016-03-281-3/+4
| | | | | | | | | | | | ## What changes were proposed in this pull request? This patch will ensure that we trim all path set in yarn.nodemanager.local-dirs and that the the scheme is well removed so the level db can be created. ## How was this patch tested? manual tests. Author: nfraison <nfraison@yahoo.fr> Closes #11475 from ashangit/level_db_creation_issue.
* [SPARK-13529][BUILD] Move network/* modules into common/network-*Reynold Xin2016-02-283-0/+414
## What changes were proposed in this pull request? As the title says, this moves the three modules currently in network/ into common/network-*. This removes one top level, non-user-facing folder. ## How was this patch tested? Compilation and existing tests. We should run both SBT and Maven. Author: Reynold Xin <rxin@databricks.com> Closes #11409 from rxin/SPARK-13529.