path: root/docs/monitoring.md
Commit message (Author, Age, Files, Lines)
* [SPARK-4546] Improve HistoryServer first time user experience (Andrew Or, 2014-11-25, 1 file, -1/+1)
| The documentation points the user to run `sbin/start-history-server.sh`. The first thing this does is throw an exception that complains a log directory is not specified. The exception message itself does not say anything about what to set. Instead, we should have a default and a landing page with a better message. The new default log directory is `file:/tmp/spark-events`.
|
| This is what it looks like as of this PR: ![after](https://issues.apache.org/jira/secure/attachment/12682985/after.png)
|
| Author: Andrew Or <andrew@databricks.com>
|
| Closes #3411 from andrewor14/minor-history-improvements and squashes the following commits:
|
| f33d6b3 [Andrew Or] Point user to set config if default log dir does not exist
| fc4c17a [Andrew Or] Improve HistoryServer UX
|
| (cherry picked from commit 9afcbe494a3535a9bf7958429b72e989972f82d9)
| Signed-off-by: Andrew Or <andrew@databricks.com>
* [SPARK-2098] All Spark processes should support spark-defaults.conf, config file (GuoQiang Li, 2014-10-14, 1 file, -0/+7)
| This is another implementation of #1256.
|
| cc andrewor14 vanzin
|
| Author: GuoQiang Li <witgo@qq.com>
|
| Closes #2379 from witgo/SPARK-2098-new and squashes the following commits:
|
| 4ef1cbd [GuoQiang Li] review commit
| 49ef70e [GuoQiang Li] Refactor getDefaultPropertiesFile
| c45d20c [GuoQiang Li] All Spark processes should support spark-defaults.conf, config file
* Docs: monitoring, streaming programming guide (kballou, 2014-07-31, 1 file, -2/+2)
| Fix several awkward wordings and grammatical issues in the following documents:
|
| - docs/monitoring.md
| - docs/streaming-programming-guide.md
|
| Author: kballou <kballou@devnulllabs.io>
|
| Closes #1662 from kennyballou/grammar_fixes and squashes the following commits:
|
| e1b8ad6 [kballou] Docs: monitoring, streaming programming guide
* [SPARK-1768] History server enhancements. (Marcelo Vanzin, 2014-06-23, 1 file, -6/+15)
| Two improvements to the history server:
|
| - Separate the HTTP handling from history fetching, so that it's easy to add new backends later (thinking about SPARK-1537 in the long run)
| - Avoid loading all UIs in memory. Do lazy loading instead, keeping a few in memory for faster access. This allows the app limit to go away, since holding just the listing in memory shouldn't be too expensive unless the user has millions of completed apps in the history (at which point I'd expect other issues to arise aside from history server memory usage, such as FileSystem.listStatus() starting to become ridiculously expensive).
|
| I also fixed a few minor things along the way which aren't really worth mentioning. I also removed the app's log path from the UI since that information may not even exist depending on which backend is used (even though there is only one now).
|
| Author: Marcelo Vanzin <vanzin@cloudera.com>
|
| Closes #718 from vanzin/hist-server and squashes the following commits:
|
| 53620c9 [Marcelo Vanzin] Add mima exclude, fix scaladoc wording.
| c21f8d8 [Marcelo Vanzin] Feedback: formatting, docs.
| dd8cc4b [Marcelo Vanzin] Standardize on using spark.history.* configuration.
| 4da3a52 [Marcelo Vanzin] Remove UI from ApplicationHistoryInfo.
| 2a7f68d [Marcelo Vanzin] Address review feedback.
| 4e72c77 [Marcelo Vanzin] Remove comment about ordering.
| 249bcea [Marcelo Vanzin] Remove offset / count from provider interface.
| ca5d320 [Marcelo Vanzin] Remove code that deals with unfinished apps.
| 6e2432f [Marcelo Vanzin] Second round of feedback.
| b2c570a [Marcelo Vanzin] Make class package-private.
| 4406f61 [Marcelo Vanzin] Cosmetic change to listing header.
| e852149 [Marcelo Vanzin] Initialize new app array to expected size.
| e8026f4 [Marcelo Vanzin] Review feedback.
| 49d2fd3 [Marcelo Vanzin] Fix a comment.
| 91e96ca [Marcelo Vanzin] Fix scalastyle issues.
| 6fbe0d8 [Marcelo Vanzin] Better handle failures when loading app info.
| eee2f5a [Marcelo Vanzin] Ensure server.stop() is called when shutting down.
| bda2fa1 [Marcelo Vanzin] Rudimentary paging support for the history UI.
| b284478 [Marcelo Vanzin] Separate history server from history backend.
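
The lazy-loading idea described in SPARK-1768 (load a UI only when it is requested, and keep just a few in memory) can be illustrated with a small least-recently-used cache. This is a minimal sketch in Python, not the history server's actual code: the class name `LazyUICache`, the `loader` callback, and the `max_entries` parameter are all hypothetical names chosen for illustration.

```python
from collections import OrderedDict
from typing import Callable, Generic, List, TypeVar

V = TypeVar("V")


class LazyUICache(Generic[V]):
    """Hypothetical cache: builds an application's UI on first access
    and retains only the most recently used entries."""

    def __init__(self, loader: Callable[[str], V], max_entries: int = 3) -> None:
        if max_entries < 1:
            raise ValueError("max_entries must be at least 1")
        self._loader = loader              # assumed callback that builds a UI for an app id
        self._max_entries = max_entries
        self._entries: "OrderedDict[str, V]" = OrderedDict()
        self.loads = 0                     # number of times the loader was invoked

    def get(self, app_id: str) -> V:
        if app_id in self._entries:
            # Cache hit: mark the entry as most recently used.
            self._entries.move_to_end(app_id)
            return self._entries[app_id]
        # Cache miss: load lazily, then evict the least recently used entry if full.
        ui = self._loader(app_id)
        self.loads += 1
        self._entries[app_id] = ui
        if len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)
        return ui

    def cached_ids(self) -> List[str]:
        """Cached app ids, ordered from least to most recently used."""
        return list(self._entries)
```

With `max_entries=2`, requesting `app-1`, `app-2`, `app-1`, `app-3` in that order leaves `app-1` and `app-3` cached, because `app-2` was the least recently used entry when the third application was loaded.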
* [SPARK-1566] consolidate programming guide, and general doc updates (Matei Zaharia, 2014-05-30, 1 file, -1/+1)
| This is a fairly large PR to clean up and update the docs for 1.0. The major changes are:
|
| - A unified programming guide for all languages replaces language-specific ones and shows language-specific info in tabs
| - New programming guide sections on key-value pairs, unit testing, input formats beyond text, migrating from 0.9, and passing functions to Spark
| - Spark-submit guide moved to a separate page and expanded slightly
| - Various cleanups of the menu system, security docs, and others
| - Updated look of title bar to differentiate the docs from previous Spark versions
|
| You can find the updated docs at http://people.apache.org/~matei/1.0-docs/_site/ and in particular http://people.apache.org/~matei/1.0-docs/_site/programming-guide.html.
|
| Author: Matei Zaharia <matei@databricks.com>
|
| Closes #896 from mateiz/1.0-docs and squashes the following commits:
|
| 03e6853 [Matei Zaharia] Some tweaks to configuration and YARN docs
| 0779508 [Matei Zaharia] tweak
| ef671d4 [Matei Zaharia] Keep frames in JavaDoc links, and other small tweaks
| 1bf4112 [Matei Zaharia] Review comments
| 4414f88 [Matei Zaharia] tweaks
| d04e979 [Matei Zaharia] Fix some old links to Java guide
| a34ed33 [Matei Zaharia] tweak
| 541bb3b [Matei Zaharia] miscellaneous changes
| fcefdec [Matei Zaharia] Moved submitting apps to separate doc
| 61d72b4 [Matei Zaharia] stuff
| 181f217 [Matei Zaharia] migration guide, remove old language guides
| e11a0da [Matei Zaharia] Add more API functions
| 6a030a9 [Matei Zaharia] tweaks
| 8db0ae3 [Matei Zaharia] Added key-value pairs section
| 318d2c9 [Matei Zaharia] tweaks
| 1c81477 [Matei Zaharia] New section on basics and function syntax
| e38f559 [Matei Zaharia] Actually added programming guide to Git
| a33d6fe [Matei Zaharia] First pass at updating programming guide to support all languages, plus other tweaks throughout
| 3b6a876 [Matei Zaharia] More CSS tweaks
| 01ec8bf [Matei Zaharia] More CSS tweaks
| e6d252e [Matei Zaharia] Change color of doc title bar to differentiate from 0.9.0
* Modify a typo in monitoring.md (Kousuke Saruta, 2014-05-12, 1 file, -1/+1)
| As I mentioned in SPARK-1765, there is a word 'JXM' in monitoring.md. I think it's a typo for 'JMX'.
|
| Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>
|
| Closes #698 from sarutak/SPARK-1765 and squashes the following commits:
|
| bae9843 [Kousuke Saruta] modified a typoe in monitoring.md
* Spark 1489 Fix the HistoryServer view acls (Thomas Graves, 2014-04-24, 1 file, -0/+13)
| This allows the view acls set by the user to be enforced by the history server. It also fixes filters so that they are applied properly.
|
| Author: Thomas Graves <tgraves@apache.org>
|
| Closes #509 from tgravescs/SPARK-1489 and squashes the following commits:
|
| 869c186 [Thomas Graves] change to either acls enabled or disabled
| 0d8333c [Thomas Graves] Add history ui policy to allow acls to either use application set, history server force acls on, or off
| 65148b5 [Thomas Graves] SPARK-1489 Fix the HistoryServer view acls
* Spark 1490 Add kerberos support to the HistoryServer (Thomas Graves, 2014-04-24, 1 file, -0/+24)
| Here I've added the ability for the history server to log in from a Kerberos keytab file so that the history server can be run as a super user and stay up for a long period of time while reading the history files from HDFS.
|
| Author: Thomas Graves <tgraves@apache.org>
|
| Closes #513 from tgravescs/SPARK-1490 and squashes the following commits:
|
| e204a99 [Thomas Graves] remove extra logging
| 5418daa [Thomas Graves] fix typo in config
| 0076b99 [Thomas Graves] Update docs
| 4d76545 [Thomas Graves] SPARK-1490 Add kerberos support to the HistoryServer
* [Fix #204] Eliminate delay between binding and log checking (Andrew Or, 2014-04-22, 1 file, -4/+15)
| **Bug**: In the existing history server, there is a `spark.history.updateInterval` seconds delay before application logs show up on the UI.
|
| **Cause**: This is because the following events happen in this order: (1) The background thread that checks for logs starts, but realizes the server has not yet bound and so waits for N seconds, (2) server binds, (3) N seconds later the background thread finds that the server has finally bound to a port, and so finally checks for application logs.
|
| **Fix**: This PR forces the log checking thread to start immediately after binding. It also documents two relevant environment variables that are currently missing.
|
| Author: Andrew Or <andrewor14@gmail.com>
|
| Closes #441 from andrewor14/history-server-fix and squashes the following commits:
|
| b2eb46e [Andrew Or] Document SPARK_PUBLIC_DNS and SPARK_HISTORY_OPTS for the history server
| e8d1fbc [Andrew Or] Eliminate delay between binding and checking for logs
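
The ordering problem described in the fix for #204 can be reproduced with a simplified timing model. The sketch below is an assumption-laden illustration in Python, not the history server's code: it assumes the background thread wakes at fixed multiples of the update interval starting at time 0, and the function name and parameters are hypothetical.

```python
import math


def first_log_check_time(bind_time: float, update_interval: float,
                         check_on_bind: bool) -> float:
    """Time at which application logs are first checked (simplified model).

    bind_time       -- when the server finishes binding to its port
    update_interval -- N, the polling period of the background thread
    check_on_bind   -- True models the fix: check immediately after binding
    """
    if update_interval <= 0:
        raise ValueError("update_interval must be positive")
    if check_on_bind:
        # Fixed behavior: the check is triggered as soon as the server binds.
        return bind_time
    # Old behavior: the thread wakes at 0, N, 2N, ... and only checks logs
    # on the first wake-up at which the server is already bound.
    wakeups_needed = math.ceil(bind_time / update_interval)
    return wakeups_needed * update_interval
```

Under this model, a server that binds 0.5 seconds after startup with a 10-second interval first checks logs at 10 seconds without the fix and at 0.5 seconds with it.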
* [SPARK-1276] Add a HistoryServer to render persisted UI (Andrew Or, 2014-04-10, 1 file, -5/+65)
| The new feature of event logging, introduced in #42, allows the user to persist the details of his/her Spark application to storage, and later replay these events to reconstruct an after-the-fact SparkUI. Currently, however, a persisted UI can only be rendered through the standalone Master. This greatly limits the use case of this new feature as many people also run Spark on YARN / Mesos.
|
| This PR introduces a new entity called the HistoryServer, which, given a log directory, keeps track of all completed applications independently of a Spark Master. Unlike Master, the HistoryServer need not be running while the application is still running. It is relatively light-weight in that it only maintains static information of applications and performs no scheduling.
|
| To quickly test it out, generate event logs with `spark.eventLog.enabled=true` and run `sbin/start-history-server.sh <log-dir-path>`. Your HistoryServer awaits on port 18080.
|
| Comments and feedback are most welcome.
|
| ---
|
| A few other changes introduced in this PR include refactoring the WebUI interface, which is beginning to have a lot of duplicate code now that we have added more functionality to it. Two new SparkListenerEvents have been introduced (SparkListenerApplicationStart/End) to keep track of application name and start/finish times. This PR also clarifies the semantics of the ReplayListenerBus introduced in #42.
|
| A potential TODO in the future (not part of this PR) is to render live applications in addition to just completed applications. This is useful when applications fail, a condition that our current HistoryServer does not handle unless the user manually signals application completion (by creating the APPLICATION_COMPLETION file). Handling live applications becomes significantly more challenging, however, because it is now necessary to render the same SparkUI multiple times. To avoid reading the entire log every time, which is inefficient, we must handle reading the log from where we previously left off, but this becomes fairly complicated because we must deal with the arbitrary behavior of each input stream.
|
| Author: Andrew Or <andrewor14@gmail.com>
|
| Closes #204 from andrewor14/master and squashes the following commits:
|
| 7b7234c [Andrew Or] Finished -> Completed
| b158d98 [Andrew Or] Address Patrick's comments
| 69d1b41 [Andrew Or] Do not block on posting SparkListenerApplicationEnd
| 19d5dd0 [Andrew Or] Merge github.com:apache/spark
| f7f5bf0 [Andrew Or] Make history server's web UI port a Spark configuration
| 2dfb494 [Andrew Or] Decouple checking for application completion from replaying
| d02dbaa [Andrew Or] Expose Spark version and include it in event logs
| 2282300 [Andrew Or] Add documentation for the HistoryServer
| 567474a [Andrew Or] Merge github.com:apache/spark
| 6edf052 [Andrew Or] Merge github.com:apache/spark
| 19e1fb4 [Andrew Or] Address Thomas' comments
| 248cb3d [Andrew Or] Limit number of live applications + add configurability
| a3598de [Andrew Or] Do not close file system with ReplayBus + fix bind address
| bc46fc8 [Andrew Or] Merge github.com:apache/spark
| e2f4ff9 [Andrew Or] Merge github.com:apache/spark
| 050419e [Andrew Or] Merge github.com:apache/spark
| 81b568b [Andrew Or] Fix strange error messages...
| 0670743 [Andrew Or] Decouple page rendering from loading files from disk
| 1b2f391 [Andrew Or] Minor changes
| a9eae7e [Andrew Or] Merge branch 'master' of github.com:apache/spark
| d5154da [Andrew Or] Styling and comments
| 5dbfbb4 [Andrew Or] Merge branch 'master' of github.com:apache/spark
| 60bc6d5 [Andrew Or] First complete implementation of HistoryServer (only for finished apps)
| 7584418 [Andrew Or] Report application start/end times to HistoryServer
| 8aac163 [Andrew Or] Add basic application table
| c086bd5 [Andrew Or] Add HistoryServer and scripts ++ Refactor WebUI interface
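
The future TODO in SPARK-1276 mentions reading a log "from where we previously left off". One way to picture that idea is to remember a byte offset between reads. The Python sketch below is only an illustration under simplifying assumptions: the log is a local, seekable file with one newline-terminated event per line, which sidesteps the input-stream difficulties the commit message warns about. The names `read_new_events` and `demo` are hypothetical.

```python
import os
import tempfile
from typing import List, Tuple


def read_new_events(path: str, offset: int) -> Tuple[List[str], int]:
    """Return the complete lines appended since `offset`, plus the new offset."""
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    # Consume only up to the last newline, so that a partially written
    # event is left in place and picked up by a later call.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8").split("\n")[:-1]
    return lines, offset + end


def demo() -> Tuple[List[str], List[str]]:
    """Write a log in two steps and read it incrementally."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "events.log")
        with open(path, "wb") as f:
            # One complete event, then one that is still being written.
            f.write(b'{"Event":"A"}\n{"Event":"B"')
        first, offset = read_new_events(path, 0)
        with open(path, "ab") as f:
            f.write(b"}\n")  # the writer finishes the second event
        second, offset = read_new_events(path, offset)
        return first, second
```

In `demo`, the first read returns only the completed event, and the second read returns the event that was unfinished earlier, without rereading the start of the file.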
* SPARK-1167: Remove metrics-ganglia from default build due to LGPL issues... (Patrick Wendell, 2014-03-11, 1 file, -1/+12)
| This patch removes Ganglia integration from the default build. It allows users willing to link against LGPL code to use Ganglia by adding build flags or linking against a new Spark artifact called spark-ganglia-lgpl.
|
| This brings Spark in line with the Apache policy on LGPL code enumerated here: https://www.apache.org/legal/3party.html#options-optional
|
| Author: Patrick Wendell <pwendell@gmail.com>
|
| Closes #108 from pwendell/ganglia and squashes the following commits:
|
| 326712a [Patrick Wendell] Responding to review feedback
| 5f28ee4 [Patrick Wendell] SPARK-1167: Remove metrics-ganglia from default build due to LGPL issues.
* Typo: Standlone -> Standalone (Andrew Ash, 2014-02-14, 1 file, -3/+3)
| Author: Andrew Ash <andrew@andrewash.com>
|
| Closes #601 from ash211/typo and squashes the following commits:
|
| 9cd43ac [Andrew Ash] Change docs references to metrics.properties, not metrics.conf
| 3813ff1 [Andrew Ash] Typo: mulitcast -> multicast
| 873bd2f [Andrew Ash] Typo: Standlone -> Standalone
* Updated docs for SparkConf and handled review comments (Matei Zaharia, 2013-12-30, 1 file, -1/+2)
|
* Add graphite sink for metrics (Russell Cardullo, 2013-11-08, 1 file, -0/+1)
| This adds a metrics sink for Graphite. The sink must be configured with the host and port of a Graphite node, and may optionally be configured with a prefix that is prepended to all metrics sent to Graphite.
* Change port from 3030 to 4040 (Patrick Wendell, 2013-09-11, 1 file, -3/+3)
|
* Merge pull request #905 from mateiz/docs2 (Matei Zaharia, 2013-09-08, 1 file, -9/+21)
|\
| | Job scheduling and cluster mode docs
| * Added cluster overview doc, made logo higher-resolution, and added more (Matei Zaharia, 2013-09-08, 1 file, -9/+21)
| | details on monitoring
* | Adding more docs and some code cleanup (Patrick Wendell, 2013-09-08, 1 file, -0/+9)
|/
* Docs describing Spark monitoring and instrumentation (Patrick Wendell, 2013-09-06, 1 file, -0/+49)