aboutsummaryrefslogtreecommitdiff
path: root/conf
Commit message (Collapse)AuthorAgeFilesLines
* [SPARK-17979][SPARK-14453] Remove deprecated SPARK_YARN_USER_ENV and ↵Yong Tang2017-03-101-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | SPARK_JAVA_OPTS This fix removes deprecated support for config `SPARK_YARN_USER_ENV`, as is mentioned in SPARK-17979. This fix also removes deprecated support for the following: ``` SPARK_YARN_USER_ENV SPARK_JAVA_OPTS SPARK_CLASSPATH SPARK_WORKER_INSTANCES ``` Related JIRA: [SPARK-14453]: https://issues.apache.org/jira/browse/SPARK-14453 [SPARK-12344]: https://issues.apache.org/jira/browse/SPARK-12344 [SPARK-15781]: https://issues.apache.org/jira/browse/SPARK-15781 Existing tests should pass. Author: Yong Tang <yong.tang.github@outlook.com> Closes #17212 from yongtang/SPARK-17979.
* [SPARK-11653][DEPLOY] Allow spark-daemon.sh to run in the foregroundMike Ihbe2016-10-201-0/+1
| | | | | | | | | | | | | | | | ## What changes were proposed in this pull request? Add a SPARK_NO_DAEMONIZE environment variable flag to spark-daemon.sh that causes the process it would run to be run in the foreground. It looks like there has been some prior work in https://github.com/apache/spark/pull/3881, but there was some talk about these being refactored. I'm not sure if that happened or not, but that PR is almost 2 years old at this point so it was worth revisiting. ## How was this patch tested? ./dev/run-tests still seems to work. It doesn't look like these scripts have tests, but if I missed them just let me know. Author: Mike Ihbe <mikejihbe@gmail.com> Closes #15338 from mikejihbe/SPARK-11653.
* [SPARK-13238][CORE] Add ganglia dmax parameterEkasit Kijsipongse2016-08-051-0/+1
| | | | | | | | The current ganglia reporter doesn't set metric expiration time (dmax). The metrics of all finished applications are indefinitely left displayed in ganglia web. The dmax parameter allows user to set the lifetime of the metrics. The default value is 0 for compatibility with previous versions. Author: Ekasit Kijsipongse <ekasitk@gmail.com> Closes #11127 from ekasitk/ganglia-dmax.
* [SPARK-15806][DOCUMENTATION] update doc for SPARK_MASTER_IPbomeng2016-06-121-1/+1
| | | | | | | | | | | | | | ## What changes were proposed in this pull request? SPARK_MASTER_IP is a deprecated environment variable. It is replaced by SPARK_MASTER_HOST according to MasterArguments.scala. ## How was this patch tested? Manually verified. Author: bomeng <bmeng@us.ibm.com> Closes #13543 from bomeng/SPARK-15806.
* [YARN][DOC][MINOR] Remove several obsolete env variables and update the docjerryshao2016-05-271-4/+0
| | | | | | | | | | | | | | | | ## What changes were proposed in this pull request? Remove several obsolete env variables not supported for Spark on YARN now, also updates the docs to include several changes with 2.0. ## How was this patch tested? N/A CC vanzin tgravescs Author: jerryshao <sshao@hortonworks.com> Closes #13296 from jerryshao/yarn-doc.
* [SPARK-14134][CORE] Change the package name used for shading classes.Marcelo Vanzin2016-04-061-2/+2
| | | | | | | | | | | | | | | The current package name uses a dash, which is a little weird but seemed to work. That is, until a new test tried to mock a class that references one of those shaded types, and then things started failing. Most changes are just noise to fix the logging configs. For reference, SPARK-8815 also raised this issue, although at the time it did not cause any issues in Spark, so it was not addressed. Author: Marcelo Vanzin <vanzin@cloudera.com> Closes #11941 from vanzin/SPARK-14134.
* [SPARK-13264][DOC] Removed multi-byte characters in spark-env.sh.templateSasaki Toru2016-02-111-1/+1
| | | | | | | | In spark-env.sh.template, there are multi-byte characters, this PR will remove it. Author: Sasaki Toru <sasakitoa@nttdata.co.jp> Closes #11149 from sasakitoa/remove_multibyte_in_sparkenv.
* [SPARK-12983][CORE][DOC] Correct metrics.properties.templateBenFradet2016-01-271-34/+37
| | | | | | | | There are some typos or plain unintelligible sentences in the metrics template. Author: BenFradet <benjamin.fradet@gmail.com> Closes #10902 from BenFradet/SPARK-12983.
* [SPARK-11929][CORE] Make the repl log4j configuration override the root logger.Marcelo Vanzin2015-11-241-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | In the default Spark distribution, there are currently two separate log4j config files, with different default values for the root logger, so that when running the shell you have a different default log level. This makes the shell more usable, since the logs don't overwhelm the output. But if you install a custom log4j.properties, you lose that, because then it's going to be used no matter whether you're running a regular app or the shell. With this change, the overriding of the log level is done differently; the log level repl's main class (org.apache.spark.repl.Main) is used to define the root logger's level when running the shell, defaulting to WARN if it's not set explicitly. On a somewhat related change, the shell output about the "sc" variable was changed a bit to contain a little more useful information about the application, since when the root logger's log level is WARN, that information is never shown to the user. Author: Marcelo Vanzin <vanzin@cloudera.com> Closes #9816 from vanzin/shell-logging.
* [SPARK-11242][SQL] In conf/spark-env.sh.template SPARK_DRIVER_MEMORY is ↵guoxi2015-10-221-4/+4
| | | | | | | | | | documented incorrectly Minor fix on the comment Author: guoxi <guoxi@us.ibm.com> Closes #9201 from xguo27/SPARK-11242.
* [SPARK-10718] [BUILD] Update License on conf files and corresponding ↵Rekha Joshi2015-09-227-0/+120
| | | | | | | | | | | excludes file update Update License on conf files and corresponding excludes file update Author: Rekha Joshi <rekhajoshm@gmail.com> Author: Joshi <rekhajoshm@gmail.com> Closes #8842 from rekhajoshm/SPARK-10718.
* [SPARK-8118] [SQL] Redirects Parquet JUL logger via SLF4JCheng Lian2015-08-181-0/+2
| | | | | | | | | | Parquet hard coded a JUL logger which always writes to stdout. This PR redirects it via SLF4j JUL bridge handler, so that we can control Parquet logs via `log4j.properties`. This solution is inspired by https://github.com/Parquet/parquet-mr/issues/390#issuecomment-46064909. Author: Cheng Lian <lian@databricks.com> Closes #8196 from liancheng/spark-8118/redirect-parquet-jul.
* [SPARK-9558][DOCS]Update docs to follow the increase of memory defaults.Kousuke Saruta2015-08-031-0/+1
| | | | | | | | | | Now the memory defaults of master and slave in Standalone mode and History Server is 1g, not 512m. So let's update docs. Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp> Closes #7896 from sarutak/update-doc-for-daemon-memory and squashes the following commits: a77626c [Kousuke Saruta] Fix docs to follow the update of increase of memory defaults
* [SPARK-9183] confusing error message when looking up missing function in ↵Yijie Shen2015-07-231-0/+4
| | | | | | | | | | | | | | | | Spark SQL JIRA: https://issues.apache.org/jira/browse/SPARK-9183 cc rxin Author: Yijie Shen <henry.yijieshen@gmail.com> Closes #7613 from yjshen/npe_udf and squashes the following commits: 44f58f2 [Yijie Shen] add jira ticket number 903c963 [Yijie Shen] add explanation comments f44dd3c [Yijie Shen] Change two hive class LogLevel to avoid annoying messages
* [SPARK-3071] Increase default driver memoryIlya Ganelin2015-07-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | I've updated default values in comments, documentation, and in the command line builder to be 1g based on comments in the JIRA. I've also updated most usages to point at a single variable defined in the Utils.scala and JavaUtils.java files. This wasn't possible in all cases (R, shell scripts etc.) but usage in most code is now pointing at the same place. Please let me know if I've missed anything. Will the spark-shell use the value within the command line builder during instantiation? Author: Ilya Ganelin <ilya.ganelin@capitalone.com> Closes #7132 from ilganeli/SPARK-3071 and squashes the following commits: 4074164 [Ilya Ganelin] String fix 271610b [Ilya Ganelin] Merge branch 'SPARK-3071' of github.com:ilganeli/spark into SPARK-3071 273b6e9 [Ilya Ganelin] Test fix fd67721 [Ilya Ganelin] Update JavaUtils.java 26cc177 [Ilya Ganelin] test fix e5db35d [Ilya Ganelin] Fixed test failure 39732a1 [Ilya Ganelin] merge fix a6f7deb [Ilya Ganelin] Created default value for DRIVER MEM in Utils that's now used in almost all locations instead of setting manually in each 09ad698 [Ilya Ganelin] Update SubmitRestProtocolSuite.scala 19b6f25 [Ilya Ganelin] Missed one doc update 2698a3d [Ilya Ganelin] Updated default value for driver memory
* [DOC][Minor]Specify the common sources available for collectingYijie Shen2015-06-051-2/+9
| | | | | | | | | | | I was wondering what else common sources available until search the source code. Maybe better to make this clear. Author: Yijie Shen <henry.yijieshen@gmail.com> Closes #6641 from yijieshen/patch-1 and squashes the following commits: b5b99b4 [Yijie Shen] Make it clear that JvmSource is the only available additional source currently f23140c [Yijie Shen] [DOC][Minor]Specify the common sources available for collecting
* [SPARK-7811] Fix typo on slf4j configuration on metrics.properties.tem…Judy Nash2015-05-241-2/+2
| | | | | | | | | | Fix minor typo on metrics.properties.template where slf4j is incorrectly spelled as sl4j. Author: Judy Nash <judynash@microsoft.com> Closes #6362 from judynash/master and squashes the following commits: c644875 [Judy Nash] SPARK-7811: Fix typo on slf4j configuration on metrics.properties.template
* [SPARK-2691] [MESOS] Support for Mesos DockerInfoChris Heller2015-05-011-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds partial support for running spark on mesos inside of a docker container. Only fine-grained mode is presently supported, and there is no checking done to ensure that the version of libmesos is recent enough to have a DockerInfo structure in the protobuf (other than pinning a mesos version in the pom.xml). Author: Chris Heller <hellertime@gmail.com> Closes #3074 from hellertime/SPARK-2691 and squashes the following commits: d504af6 [Chris Heller] Assist type inference f64885d [Chris Heller] Fix errant line length 17c41c0 [Chris Heller] Base Dockerfile on mesosphere/mesos image 8aebda4 [Chris Heller] Simplfy Docker image docs 1ae7f4f [Chris Heller] Style points 974bd56 [Chris Heller] Convert map to flatMap 5d8bdf7 [Chris Heller] Factor out the DockerInfo construction. 7b75a3d [Chris Heller] Align to styleguide 80108e7 [Chris Heller] Bend to the will of RAT ba77056 [Chris Heller] Explicit RAT exclude abda5e5 [Chris Heller] Wildcard .rat-excludes 2f2873c [Chris Heller] Exclude spark-mesos from RAT a589a5b [Chris Heller] Add example Dockerfile b6825ce [Chris Heller] Remove use of EasyMock eae1b86 [Chris Heller] Move properties under 'spark.mesos.' c184d00 [Chris Heller] Use map on Option to be consistent with non-coarse code fb9501a [Chris Heller] Bumped mesos version to current release fa11879 [Chris Heller] Add listenerBus to EasyMock 882151e [Chris Heller] Changes to scala style b22d42d [Chris Heller] Exclude template from RAT db536cf [Chris Heller] Remove unneeded mocks dea1bd5 [Chris Heller] Force default protocol 7dac042 [Chris Heller] Add test for DockerInfo 5456c0c [Chris Heller] Adjust syntax style 521c194 [Chris Heller] Adjust version info 6e38f70 [Chris Heller] Document Mesos Docker properties 29572ab [Chris Heller] Support all DockerInfo fields b8c0dea [Chris Heller] Support for mesos DockerInfo in coarse-mode. 482a9fd [Chris Heller] Support for mesos DockerInfo in fine-grained mode.
* [SPARK-4286] Add an external shuffle service that can be run as a daemon.Iulian Dragos2015-04-281-1/+2
| | | | | | | | | | | | | | | | | | | | This allows Mesos deployments to use the shuffle service (and implicitly dynamic allocation). It does so by adding a new "main" class and two corresponding scripts in `sbin`: - `sbin/start-shuffle-service.sh` - `sbin/stop-shuffle-service.sh` Specific options can be passed in `SPARK_SHUFFLE_OPTS`. This is picking up work from #3861 /cc tnachen Author: Iulian Dragos <jaguarul@gmail.com> Closes #4990 from dragos/feature/external-shuffle-service and squashes the following commits: 6c2b148 [Iulian Dragos] Import order and wrong name fixup. 07804ad [Iulian Dragos] Moved ExternalShuffleService to the `deploy` package + other minor tweaks. 4dc1f91 [Iulian Dragos] Reviewer’s comments: 8145429 [Iulian Dragos] Add an external shuffle service that can be run as a daemon.
* [SPARK-6758]block the right jetty package in logWangTaoTheTonic2015-04-091-2/+2
| | | | | | | | | | | | https://issues.apache.org/jira/browse/SPARK-6758 I am not sure if it is ok to block them in test resources too (as we shade jetty in assembly?). Author: WangTaoTheTonic <wangtao111@huawei.com> Closes #5406 from WangTaoTheTonic/SPARK-6758 and squashes the following commits: e09605b [WangTaoTheTonic] block the right jetty package
* [SPARK-3619] Part 2. Upgrade to Mesos 0.21 to work around MESOS-1688Jongyoul Lee2015-03-151-1/+1
| | | | | | | | | | | | | | - MESOS_NATIVE_LIBRARY become deprecated - Chagned MESOS_NATIVE_LIBRARY to MESOS_NATIVE_JAVA_LIBRARY Author: Jongyoul Lee <jongyoul@gmail.com> Closes #4361 from jongyoul/SPARK-3619-1 and squashes the following commits: f1ea91f [Jongyoul Lee] Merge branch 'SPARK-3619-1' of https://github.com/jongyoul/spark into SPARK-3619-1 a6a00c2 [Jongyoul Lee] [SPARK-3619] Upgrade to Mesos 0.21 to work around MESOS-1688 - Removed 'Known issues' section 2e15a21 [Jongyoul Lee] [SPARK-3619] Upgrade to Mesos 0.21 to work around MESOS-1688 - MESOS_NATIVE_LIBRARY become deprecated - Chagned MESOS_NATIVE_LIBRARY to MESOS_NATIVE_JAVA_LIBRARY 0dace7b [Jongyoul Lee] [SPARK-3619] Upgrade to Mesos 0.21 to work around MESOS-1688 - MESOS_NATIVE_LIBRARY become deprecated - Chagned MESOS_NATIVE_LIBRARY to MESOS_NATIVE_JAVA_LIBRARY
* [Spark-5708] Add Slf4jSink to Spark MetricsJudy2015-02-241-0/+9
| | | | | | | | | | | | | | | Add Slf4jSink to Spark Metrics using Coda Hale's SlfjReporter. This sends metrics to log4j, allowing spark users to reuse log4j pipeline for metrics collection. Reviewed existing unit tests and didn't see any sink-related tests. Please advise on if tests should be added. Author: Judy <judynash@microsoft.com> Author: judynash <judynash@microsoft.com> Closes #4644 from judynash/master and squashes the following commits: 57ef214 [judynash] doc clarification and indent fixes a751a66 [Judy] Spark-5708: Add Slf4jSink to Spark Metrics
* [SPARK-5422] Add support for sending Graphite metrics via UDPRyan Williams2015-01-311-0/+1
| | | | | | | | | | | Depends on [SPARK-5413](https://issues.apache.org/jira/browse/SPARK-5413) / #4209, included here, will rebase once the latter's merged. Author: Ryan Williams <ryan.blake.williams@gmail.com> Closes #4218 from ryan-williams/udp and squashes the following commits: ebae393 [Ryan Williams] Add support for sending Graphite metrics via UDP cb58262 [Ryan Williams] bump metrics dependency to v3.1.0
* [SPARK-4889] update history server example cmdsRyan Williams2014-12-191-2/+2
| | | | | | | | | | | Author: Ryan Williams <ryan.blake.williams@gmail.com> Closes #3736 from ryan-williams/hist and squashes the following commits: 421d8ff [Ryan Williams] add another random typo fix 76d6a4c [Ryan Williams] remove hdfs example a2d0f82 [Ryan Williams] code review feedback 9ca7629 [Ryan Williams] [SPARK-4889] update history server example cmds
* SPARK-3663 Document SPARK_LOG_DIR and SPARK_PID_DIRAndrew Ash2014-11-141-1/+8
| | | | | | | | | | | | These descriptions are from the header of spark-daemon.sh Author: Andrew Ash <andrew@andrewash.com> Closes #2518 from ash211/SPARK-3663 and squashes the following commits: 058b257 [Andrew Ash] Complete hanging clause in SPARK_PID_DIR description a17cb4b [Andrew Ash] Update docs for default locations per SPARK-4110 af89096 [Andrew Ash] SPARK-3663 Document SPARK_LOG_DIR and SPARK_PID_DIR
* [SPARK-3584] sbin/slaves doesn't work when we use password authentication ↵Kousuke Saruta2014-09-251-0/+0
| | | | | | | | | | | | | | | | | for SSH Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp> Closes #2444 from sarutak/slaves-scripts-modification and squashes the following commits: eff7394 [Kousuke Saruta] Improve the description about Cluster Launch Script in docs/spark-standalone.md 7858225 [Kousuke Saruta] Modified sbin/slaves to use the environment variable "SPARK_SSH_FOREGROUND" as a flag 53d7121 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into slaves-scripts-modification e570431 [Kousuke Saruta] Added a description for SPARK_SSH_FOREGROUND variable 7120a0c [Kousuke Saruta] Added a description about default host for sbin/slaves 1bba8a9 [Kousuke Saruta] Added SPARK_SSH_FOREGROUND flag to sbin/slaves 88e2f17 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into slaves-scripts-modification 297e75d [Kousuke Saruta] Modified sbin/slaves not to export HOSTLIST
* HOTFIX: Minor typo in conf templatePatrick Wendell2014-08-261-1/+1
|
* [SPARK-2849] Handle driver configs separately in client modeAndrew Or2014-08-201-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In client deploy mode, the driver is launched from within `SparkSubmit`'s JVM. This means by the time we parse Spark configs from `spark-defaults.conf`, it is already too late to control certain properties of the driver's JVM. We currently ignore these configs in client mode altogether. ``` spark.driver.memory spark.driver.extraJavaOptions spark.driver.extraClassPath spark.driver.extraLibraryPath ``` This PR handles these properties before launching the driver JVM. It achieves this by spawning a separate JVM that runs a new class called `SparkSubmitDriverBootstrapper`, which spawns `SparkSubmit` as a sub-process with the appropriate classpath, library paths, java opts and memory. Author: Andrew Or <andrewor14@gmail.com> Closes #1845 from andrewor14/handle-configs-bash and squashes the following commits: bed4bdf [Andrew Or] Change a few comments / messages (minor) 24dba60 [Andrew Or] Merge branch 'master' of github.com:apache/spark into handle-configs-bash 08fd788 [Andrew Or] Warn against external usages of SparkSubmitDriverBootstrapper ff34728 [Andrew Or] Minor comments 51aeb01 [Andrew Or] Filter out JVM memory in Scala rather than Bash (minor) 9a778f6 [Andrew Or] Fix PySpark: actually kill driver on termination d0f20db [Andrew Or] Don't pass empty library paths, classpath, java opts etc. a78cb26 [Andrew Or] Revert a few changes in utils.sh (minor) 9ba37e2 [Andrew Or] Don't barf when the properties file does not exist 8867a09 [Andrew Or] A few more naming things (minor) 19464ad [Andrew Or] SPARK_SUBMIT_JAVA_OPTS -> SPARK_SUBMIT_OPTS d6488f9 [Andrew Or] Merge branch 'master' of github.com:apache/spark into handle-configs-bash 1ea6bbe [Andrew Or] SparkClassLauncher -> SparkSubmitDriverBootstrapper a91ea19 [Andrew Or] Fix precedence of library paths, classpath, java opts and memory 158f813 [Andrew Or] Remove "client mode" boolean argument c84f5c8 [Andrew Or] Remove debug print statement (minor) b71f52b [Andrew Or] Revert a few more changes (minor) 7d94a8d [Andrew Or] Merge branch 'master' of github.com:apache/spark into handle-configs-bash 3a8235d [Andrew Or] Only parse the properties file if special configs exist c37e08d [Andrew Or] Revert a few more changes a396eda [Andrew Or] Nullify my own hard work to simplify bash 0effa1e [Andrew Or] Add code in Scala that handles special configs c886568 [Andrew Or] Fix lines too long + a few comments / style (minor) 7a4190a [Andrew Or] Merge branch 'master' of github.com:apache/spark into handle-configs-bash 7396be2 [Andrew Or] Explicitly comment that multi-line properties are not supported fa11ef8 [Andrew Or] Parse the properties file only if the special configs exist 371cac4 [Andrew Or] Add function prefix (minor) be99eb3 [Andrew Or] Fix tests to not include multi-line configs bd0d468 [Andrew Or] Simplify parsing config file by ignoring multi-line arguments 56ac247 [Andrew Or] Use eval and set to simplify splitting 8d4614c [Andrew Or] Merge branch 'master' of github.com:apache/spark into handle-configs-bash aeb79c7 [Andrew Or] Merge branch 'master' of github.com:apache/spark into handle-configs-bash 2732ac0 [Andrew Or] Integrate BASH tests into dev/run-tests + log error properly 8d26a5c [Andrew Or] Add tests for bash/utils.sh 4ae24c3 [Andrew Or] Fix bug: escape properly in quote_java_property b3c4cd5 [Andrew Or] Fix bug: count the number of quotes instead of detecting presence c2273fc [Andrew Or] Fix typo (minor) e793e5f [Andrew Or] Handle multi-line arguments 5d8f8c4 [Andrew Or] Merge branch 'master' of github.com:apache/spark into submit-driver-extra c7b9926 [Andrew Or] Minor changes to spark-defaults.conf.template a992ae2 [Andrew Or] Escape spark.*.extraJavaOptions correctly aabfc7e [Andrew Or] escape -> split (minor) 45a1eb9 [Andrew Or] Fix bug: escape escaped backslashes and quotes properly... 1cdc6b1 [Andrew Or] Fix bug: escape escaped double quotes properly c854859 [Andrew Or] Add small comment c13a2cb [Andrew Or] Merge branch 'master' of github.com:apache/spark into submit-driver-extra 8e552b7 [Andrew Or] Include an example of spark.*.extraJavaOptions de765c9 [Andrew Or] Print spark-class command properly a4df3c4 [Andrew Or] Move parsing and escaping logic to utils.sh dec2343 [Andrew Or] Only export variables if they exist fa2136e [Andrew Or] Escape Java options + parse java properties files properly ef12f74 [Andrew Or] Minor formatting 4ec22a1 [Andrew Or] Merge branch 'master' of github.com:apache/spark into submit-driver-extra e5cfb46 [Andrew Or] Collapse duplicate code + fix potential whitespace issues 4edcaa8 [Andrew Or] Redirect stdout to stderr for python 130f295 [Andrew Or] Handle spark.driver.memory too 98dd8e3 [Andrew Or] Add warning if properties file does not exist 8843562 [Andrew Or] Fix compilation issues... 75ee6b4 [Andrew Or] Remove accidentally added file 63ed2e9 [Andrew Or] Merge branch 'master' of github.com:apache/spark into submit-driver-extra 0025474 [Andrew Or] Revert SparkSubmit handling of --driver-* options for only cluster mode a2ab1b0 [Andrew Or] Parse spark.driver.extra* in bash 250cb95 [Andrew Or] Do not ignore spark.driver.extra* for client mode
* SPARK-1902 Silence stacktrace from logs when doing port failover to port n+1Andrew Ash2014-06-201-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before: ``` 14/06/08 23:58:23 WARN AbstractLifeCycle: FAILED SelectChannelConnector@0.0.0.0:4040: java.net.BindException: Address already in use java.net.BindException: Address already in use at sun.nio.ch.Net.bind0(Native Method) at sun.nio.ch.Net.bind(Net.java:444) at sun.nio.ch.Net.bind(Net.java:436) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) at org.eclipse.jetty.server.nio.SelectChannelConnector.open(SelectChannelConnector.java:187) at org.eclipse.jetty.server.AbstractConnector.doStart(AbstractConnector.java:316) at org.eclipse.jetty.server.nio.SelectChannelConnector.doStart(SelectChannelConnector.java:265) at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64) at org.eclipse.jetty.server.Server.doStart(Server.java:293) at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64) at org.apache.spark.ui.JettyUtils$$anonfun$1.apply$mcV$sp(JettyUtils.scala:192) at org.apache.spark.ui.JettyUtils$$anonfun$1.apply(JettyUtils.scala:192) at org.apache.spark.ui.JettyUtils$$anonfun$1.apply(JettyUtils.scala:192) at scala.util.Try$.apply(Try.scala:161) at org.apache.spark.ui.JettyUtils$.connect$1(JettyUtils.scala:191) at org.apache.spark.ui.JettyUtils$.startJettyServer(JettyUtils.scala:205) at org.apache.spark.ui.WebUI.bind(WebUI.scala:99) at org.apache.spark.SparkContext.<init>(SparkContext.scala:223) at org.apache.spark.repl.SparkILoop.createSparkContext(SparkILoop.scala:957) at $line3.$read$$iwC$$iwC.<init>(<console>:8) at $line3.$read$$iwC.<init>(<console>:14) at $line3.$read.<init>(<console>:16) at $line3.$read$.<init>(<console>:20) at $line3.$read$.<clinit>(<console>) at $line3.$eval$.<init>(<console>:7) at $line3.$eval$.<clinit>(<console>) at $line3.$eval.$print(<console>) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:788) at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1056) at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:614) at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:645) at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:609) at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:796) at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:841) at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:753) at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:121) at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:120) at org.apache.spark.repl.SparkIMain.beQuietDuring(SparkIMain.scala:263) at org.apache.spark.repl.SparkILoopInit$class.initializeSpark(SparkILoopInit.scala:120) at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:56) at org.apache.spark.repl.SparkILoop$$anonfun$process$1$$anonfun$apply$mcZ$sp$5.apply$mcV$sp(SparkILoop.scala:913) at org.apache.spark.repl.SparkILoopInit$class.runThunks(SparkILoopInit.scala:142) at org.apache.spark.repl.SparkILoop.runThunks(SparkILoop.scala:56) at org.apache.spark.repl.SparkILoopInit$class.postInitialization(SparkILoopInit.scala:104) at org.apache.spark.repl.SparkILoop.postInitialization(SparkILoop.scala:56) at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply$mcZ$sp(SparkILoop.scala:930) at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:884) at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:884) at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135) at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:884) at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:982) at org.apache.spark.repl.Main$.main(Main.scala:31) at org.apache.spark.repl.Main.main(Main.scala) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:292) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:55) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) 14/06/08 23:58:23 WARN AbstractLifeCycle: FAILED org.eclipse.jetty.server.Server@7439e55a: java.net.BindException: Address already in use java.net.BindException: Address already in use at sun.nio.ch.Net.bind0(Native Method) at sun.nio.ch.Net.bind(Net.java:444) at sun.nio.ch.Net.bind(Net.java:436) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) at org.eclipse.jetty.server.nio.SelectChannelConnector.open(SelectChannelConnector.java:187) at org.eclipse.jetty.server.AbstractConnector.doStart(AbstractConnector.java:316) at org.eclipse.jetty.server.nio.SelectChannelConnector.doStart(SelectChannelConnector.java:265) at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64) at org.eclipse.jetty.server.Server.doStart(Server.java:293) at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64) at org.apache.spark.ui.JettyUtils$$anonfun$1.apply$mcV$sp(JettyUtils.scala:192) at org.apache.spark.ui.JettyUtils$$anonfun$1.apply(JettyUtils.scala:192) at org.apache.spark.ui.JettyUtils$$anonfun$1.apply(JettyUtils.scala:192) at scala.util.Try$.apply(Try.scala:161) at org.apache.spark.ui.JettyUtils$.connect$1(JettyUtils.scala:191) at org.apache.spark.ui.JettyUtils$.startJettyServer(JettyUtils.scala:205) at org.apache.spark.ui.WebUI.bind(WebUI.scala:99) at org.apache.spark.SparkContext.<init>(SparkContext.scala:223) at org.apache.spark.repl.SparkILoop.createSparkContext(SparkILoop.scala:957) at $line3.$read$$iwC$$iwC.<init>(<console>:8) at $line3.$read$$iwC.<init>(<console>:14) at $line3.$read.<init>(<console>:16) at $line3.$read$.<init>(<console>:20) at $line3.$read$.<clinit>(<console>) at $line3.$eval$.<init>(<console>:7) at $line3.$eval$.<clinit>(<console>) at $line3.$eval.$print(<console>) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:788) at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1056) at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:614) at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:645) at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:609) at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:796) at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:841) at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:753) at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:121) at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:120) at org.apache.spark.repl.SparkIMain.beQuietDuring(SparkIMain.scala:263) at org.apache.spark.repl.SparkILoopInit$class.initializeSpark(SparkILoopInit.scala:120) at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:56) at org.apache.spark.repl.SparkILoop$$anonfun$process$1$$anonfun$apply$mcZ$sp$5.apply$mcV$sp(SparkILoop.scala:913) at org.apache.spark.repl.SparkILoopInit$class.runThunks(SparkILoopInit.scala:142) at org.apache.spark.repl.SparkILoop.runThunks(SparkILoop.scala:56) at org.apache.spark.repl.SparkILoopInit$class.postInitialization(SparkILoopInit.scala:104) at org.apache.spark.repl.SparkILoop.postInitialization(SparkILoop.scala:56) at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply$mcZ$sp(SparkILoop.scala:930) at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:884) at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:884) at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135) at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:884) at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:982) at org.apache.spark.repl.Main$.main(Main.scala:31) at org.apache.spark.repl.Main.main(Main.scala) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:292) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:55) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) 14/06/08 23:58:23 INFO JettyUtils: Failed to create UI at port, 4040. Trying again. 14/06/08 23:58:23 INFO JettyUtils: Error was: Failure(java.net.BindException: Address already in use) 14/06/08 23:58:23 INFO SparkUI: Started SparkUI at http://aash-mbp.local:4041 ```` After: ``` 14/06/09 00:04:12 INFO JettyUtils: Failed to create UI at port, 4040. Trying again. 14/06/09 00:04:12 INFO JettyUtils: Error was: Failure(java.net.BindException: Address already in use) 14/06/09 00:04:12 INFO Server: jetty-8.y.z-SNAPSHOT 14/06/09 00:04:12 INFO AbstractConnector: Started SelectChannelConnector@0.0.0.0:4041 14/06/09 00:04:12 INFO SparkUI: Started SparkUI at http://aash-mbp.local:4041 ``` Lengthy logging comes from this line of code in Jetty: http://grepcode.com/file/repo1.maven.org/maven2/org.eclipse.jetty.aggregate/jetty-all/9.1.3.v20140225/org/eclipse/jetty/util/component/AbstractLifeCycle.java#210 Author: Andrew Ash <andrew@andrewash.com> Closes #1019 from ash211/SPARK-1902 and squashes the following commits: 0dd02f7 [Andrew Ash] Leave old org.eclipse.jetty silencing in place 1e2866b [Andrew Ash] Address CR comments 9d85eed [Andrew Ash] SPARK-1902 Silence stacktrace from logs when doing port failover to port n+1
* [SPARK-1753 / 1773 / 1814] Update outdated docs for spark-submit, YARN, ↵Andrew Or2014-05-122-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | standalone etc. YARN - SparkPi was updated to not take in master as an argument; we should update the docs to reflect that. - The default YARN build guide should be in maven, not sbt. - This PR also adds a paragraph on steps to debug a YARN application. Standalone - Emphasize spark-submit more. Right now it's one small paragraph preceding the legacy way of launching through `org.apache.spark.deploy.Client`. - The way we set configurations / environment variables according to the old docs is outdated. This needs to reflect changes introduced by the Spark configuration changes we made. In general, this PR also adds a little more documentation on the new spark-shell, spark-submit, spark-defaults.conf etc here and there. Author: Andrew Or <andrewor14@gmail.com> Closes #701 from andrewor14/yarn-docs and squashes the following commits: e2c2312 [Andrew Or] Merge in changes in #752 (SPARK-1814) 25cfe7b [Andrew Or] Merge in the warning from SPARK-1753 a8c39c5 [Andrew Or] Minor changes 336bbd9 [Andrew Or] Tabs -> spaces 4d9d8f7 [Andrew Or] Merge branch 'master' of github.com:apache/spark into yarn-docs 041017a [Andrew Or] Abstract Spark submit documentation to cluster-overview.html 3cc0649 [Andrew Or] Detail how to set configurations + remove legacy instructions 5b7140a [Andrew Or] Merge branch 'master' of github.com:apache/spark into yarn-docs 85a51fc [Andrew Or] Update run-example, spark-shell, configuration etc. c10e8c7 [Andrew Or] Merge branch 'master' of github.com:apache/spark into yarn-docs 381fe32 [Andrew Or] Update docs for standalone mode 757c184 [Andrew Or] Add a note about the requirements for the debugging trick f8ca990 [Andrew Or] Merge branch 'master' of github.com:apache/spark into yarn-docs 924f04c [Andrew Or] Revert addition of --deploy-mode d5fe17b [Andrew Or] Update the YARN docs
* [SPARK-1780] Non-existent SPARK_DAEMON_OPTS is lurking aroundAndrew Or2014-05-121-1/+1
| | | | | | | | | | What they really mean is SPARK_DAEMON_***JAVA***_OPTS Author: Andrew Or <andrewor14@gmail.com> Closes #751 from andrewor14/spark-daemon-opts and squashes the following commits: 70c41f9 [Andrew Or] SPARK_DAEMON_OPTS -> SPARK_DAEMON_JAVA_OPTS
* Assorted clean-up for Spark-on-YARN.Patrick Wendell2014-04-221-0/+2
| | | | | | | | | | | | In particular when the HADOOP_CONF_DIR is not not specified. Author: Patrick Wendell <pwendell@gmail.com> Closes #488 from pwendell/hadoop-cleanup and squashes the following commits: fe95f13 [Patrick Wendell] Changes based on Andrew's feeback 18d09c1 [Patrick Wendell] Review comments from Andrew 17929cc [Patrick Wendell] Assorted clean-up for Spark-on-YARN.
* Clean up and simplify Spark configurationPatrick Wendell2014-04-212-12/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Over time as we've added more deployment modes, this have gotten a bit unwieldy with user-facing configuration options in Spark. Going forward we'll advise all users to run `spark-submit` to launch applications. This is a WIP patch but it makes the following improvements: 1. Improved `spark-env.sh.template` which was missing a lot of things users now set in that file. 2. Removes the shipping of SPARK_CLASSPATH, SPARK_JAVA_OPTS, and SPARK_LIBRARY_PATH to the executors on the cluster. This was an ugly hack. Instead it introduces config variables spark.executor.extraJavaOpts, spark.executor.extraLibraryPath, and spark.executor.extraClassPath. 3. Adds ability to set these same variables for the driver using `spark-submit`. 4. Allows you to load system properties from a `spark-defaults.conf` file when running `spark-submit`. This will allow setting both SparkConf options and other system properties utilized by `spark-submit`. 5. Made `SPARK_LOCAL_IP` an environment variable rather than a SparkConf property. This is more consistent with it being set on each node. Author: Patrick Wendell <pwendell@gmail.com> Closes #299 from pwendell/config-cleanup and squashes the following commits: 127f301 [Patrick Wendell] Improvements to testing a006464 [Patrick Wendell] Moving properties file template. b4b496c [Patrick Wendell] spark-defaults.properties -> spark-defaults.conf 0086939 [Patrick Wendell] Minor style fixes af09e3e [Patrick Wendell] Mention config file in docs and clean-up docs b16e6a2 [Patrick Wendell] Cleanup of spark-submit script and Scala quick start guide af0adf7 [Patrick Wendell] Automatically add user jar a56b125 [Patrick Wendell] Responses to Tom's review d50c388 [Patrick Wendell] Merge remote-tracking branch 'apache/master' into config-cleanup a762901 [Patrick Wendell] Fixing test failures ffa00fe [Patrick Wendell] Review feedback fda0301 [Patrick Wendell] Note 308f1f6 [Patrick Wendell] Properly escape quotes and other clean-up for YARN e83cd8f [Patrick Wendell] Changes to allow re-use of test applications be42f35 [Patrick Wendell] Handle case where SPARK_HOME is not set c2a2909 [Patrick Wendell] Test compile fixes 4ee6f9d [Patrick Wendell] Making YARN doc changes consistent afc9ed8 [Patrick Wendell] Cleaning up line limits and two compile errors. b08893b [Patrick Wendell] Additional improvements. ace4ead [Patrick Wendell] Responses to review feedback. b72d183 [Patrick Wendell] Review feedback for spark env file 46555c1 [Patrick Wendell] Review feedback and import clean-ups 437aed1 [Patrick Wendell] Small fix 761ebcd [Patrick Wendell] Library path and classpath for drivers 7cc70e4 [Patrick Wendell] Clean up terminology inside of spark-env script 5b0ba8e [Patrick Wendell] Don't ship executor envs 84cc5e5 [Patrick Wendell] Small clean-up 1f75238 [Patrick Wendell] SPARK_JAVA_OPTS --> SPARK_MASTER_OPTS for master settings 4982331 [Patrick Wendell] Remove SPARK_LIBRARY_PATH 6eaf7d0 [Patrick Wendell] executorJavaOpts 0faa3b6 [Patrick Wendell] Stash of adding config options in submit script and YARN ac2d65e [Patrick Wendell] Change spark.local.dir -> SPARK_LOCAL_DIRS
* Revert "[SPARK-1150] fix repo location in create script"Patrick Wendell2014-03-011-1/+1
| | | | This reverts commit 9aa095711858ce8670e51488f66a3d7c1a821c30.
* [SPARK-1150] fix repo location in create scriptMark Grover2014-03-011-1/+1
| | | | | | | | | | | | | https://spark-project.atlassian.net/browse/SPARK-1150 fix the repo location in create_release script Author: Mark Grover <mark@apache.org> Closes #48 from CodingCat/script_fixes and squashes the following commits: 01f4bf7 [Mark Grover] Fixing some nitpicks d2244d4 [Mark Grover] SPARK-676: Abbreviation in SPARK_MEM but not in SPARK_WORKER_MEMORY
* [SPARK-1041] remove dead code in start script, remind user to set that in ↵CodingCat2014-02-221-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | spark-env.sh the lines in start-master.sh and start-slave.sh no longer work in ec2, the host name has changed, e.g. ubuntu@ip-172-31-36-93:~$ hostname ip-172-31-36-93 also, the URL to fetch public DNS name also changed, e.g. ubuntu@ip-172-31-36-93:~$ wget -q -O - http://instance-data.ec2.internal/latest/meta-data/public-hostname ubuntu@ip-172-31-36-93:~$ (returns nothing) since we have spark-ec2 project, we don't need to have such ec2-specific lines here, instead, user only need to set in spark-env.sh Author: CodingCat <zhunansjtu@gmail.com> Closes #588 from CodingCat/deadcode_in_sbin and squashes the following commits: e4236e0 [CodingCat] remove dead code in start script, remind user set that in spark-env.sh
* Typo: Standlone -> StandaloneAndrew Ash2014-02-141-1/+1
| | | | | | | | | | Author: Andrew Ash <andrew@andrewash.com> Closes #601 from ash211/typo and squashes the following commits: 9cd43ac [Andrew Ash] Change docs references to metrics.properties, not metrics.conf 3813ff1 [Andrew Ash] Typo: mulitcast -> multicast 873bd2f [Andrew Ash] Typo: Standlone -> Standalone
* Make DEBUG-level logs consummable.Patrick Wendell2014-01-101-1/+3
| | | | | | | Removes two things that caused issues with the debug logs: (a) Internal polling in the DAGScheduler was polluting the logs. (b) The Scala REPL logs were really noisy.
* Send logs to stderr by default (instead of stdout).Patrick Wendell2014-01-091-0/+1
|
* add the comments about SPARK_WORKER_DIRCodingCat2014-01-071-1/+1
| | | | this env variable seems to be forgotten …
* Add graphite sink for metricsRussell Cardullo2013-11-081-0/+8
| | | | | | | This adds a metrics sink for graphite. The sink must be configured with the host and port of a graphite node and optionally may be configured with a prefix that will be prepended to all metrics that are sent to graphite.
* Merge pull request #905 from mateiz/docs2Matei Zaharia2013-09-081-10/+8
|\ | | | | Job scheduling and cluster mode docs
| * More fair scheduler docs and property names.Matei Zaharia2013-09-081-10/+8
| | | | | | | | | | Also changed uses of "job" terminology to "application" when they referred to an entire Spark program, to avoid confusion.
* | Fixing package name in template confPatrick Wendell2013-09-081-7/+7
| |
* | Adding more docs and some code cleanupPatrick Wendell2013-09-081-1/+35
|/
* Another fix suggested by PatrickMatei Zaharia2013-08-311-1/+1
|
* Fixes suggested by PatrickMatei Zaharia2013-08-311-1/+1
|
* More updates, describing changes to recommended use of environment varsMatei Zaharia2013-08-311-13/+10
| | | | and new Python stuff
* fixing typoseanm2013-08-151-1/+1
|
* MetricsServlet code refactor according to commentsjerryshao2013-08-121-9/+4
|