path: root/docs/tuning.md
* [SPARK-1439, SPARK-1440] Generate unified Scaladoc across projects and Javadocs (Matei Zaharia, 2014-04-21; 1 file changed, -2/+2)

  I used the sbt-unidoc plugin (https://github.com/sbt/sbt-unidoc) to create a unified Scaladoc of our public packages, and generate Javadocs as well. One limitation is that I haven't found an easy way to exclude packages in the Javadoc; there is a SBT task that identifies Java sources to run javadoc on, but it's been very difficult to modify it from outside to change what is set in the unidoc package. Some SBT-savvy people should help with this.

  The Javadoc site also lacks package-level descriptions and things like that, so we may want to look into that. We may decide not to post these right now if it's too limited compared to the Scala one.

  Example of the built doc site: http://people.csail.mit.edu/matei/spark-unified-docs/

  Author: Matei Zaharia <matei@databricks.com>

  This patch had conflicts when merged, resolved by Committer: Patrick Wendell <pwendell@gmail.com>

  Closes #457 from mateiz/better-docs and squashes the following commits:

  a63d4a3 [Matei Zaharia] Skip Java/Scala API docs for Python package
  5ea1f43 [Matei Zaharia] Fix links to Java classes in Java guide, fix some JS for scrolling to anchors on page load
  f05abc0 [Matei Zaharia] Don't include java.lang package names
  995e992 [Matei Zaharia] Skip internal packages and class names with $ in JavaDoc
  a14a93c [Matei Zaharia] typo
  76ce64d [Matei Zaharia] Add groups to Javadoc index page, and a first package-info.java
  ed6f994 [Matei Zaharia] Generate JavaDoc as well, add titles, update doc site to use unified docs
  acb993d [Matei Zaharia] Add Unidoc plugin for the projects we want Unidoced

* Update tuning.md (Andrew Ash, 2014-04-10; 1 file changed, -2/+3)

  http://stackoverflow.com/questions/9699071/what-is-the-javas-internal-represention-for-string-modified-utf-8-utf-16

  Author: Andrew Ash <andrew@andrewash.com>

  Closes #384 from ash211/patch-2 and squashes the following commits:

  da1b0be [Andrew Ash] Update tuning.md

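The linked question concerns Java's internal String representation (UTF-16, roughly two bytes per character), which is what the tuning guide's string-overhead figures rest on. A small sketch, assuming only that Spark is on the classpath, that checks a string's footprint with Spark's SizeEstimator:

```scala
import org.apache.spark.util.SizeEstimator

object StringFootprint {
  def main(args: Array[String]): Unit = {
    // Strings are stored as UTF-16 internally, so expect roughly two bytes per
    // character, plus object and char-array headers.
    val s = "a" * 1000
    println(s"Estimated in-memory size of a 1000-char String: ${SizeEstimator.estimate(s)} bytes")
  }
}
```
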
* SPARK-929: Fully deprecate usage of SPARK_MEM (Aaron Davidson, 2014-03-09; 1 file changed, -1/+1)

  (Continued from old repo, prior discussion at https://github.com/apache/incubator-spark/pull/615)

  This patch cements our deprecation of the SPARK_MEM environment variable by replacing it with three more specialized variables: SPARK_DAEMON_MEMORY, SPARK_EXECUTOR_MEMORY, and SPARK_DRIVER_MEMORY. The creation of the latter two variables means that we can safely set driver/job memory without accidentally setting the executor memory. Neither is public.

  SPARK_EXECUTOR_MEMORY is only used by the Mesos scheduler (and set within SparkContext). The proper way of configuring executor memory is through the "spark.executor.memory" property.

  SPARK_DRIVER_MEMORY is the new way of specifying the amount of memory used by jobs launched by spark-class, without possibly affecting executor memory.

  Other memory considerations:
  - The repl's memory can be set through the "--drivermem" command-line option, which really just sets SPARK_DRIVER_MEMORY.
  - run-example doesn't use spark-class, so the only way to modify examples' memory is actually an unusual use of SPARK_JAVA_OPTS (which is normally overridden in all cases by spark-class).

  This patch also fixes a lurking bug where spark-shell misused spark-class (the first argument is supposed to be the main class name, not java options), as well as a bug in the Windows spark-class2.cmd. I have not yet tested this patch on either Windows or Mesos, however.

  Author: Aaron Davidson <aaron@databricks.com>

  Closes #99 from aarondav/sparkmem and squashes the following commits:

  9df4c68 [Aaron Davidson] SPARK-929: Fully deprecate usage of SPARK_MEM

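A minimal sketch of the property-based approach the commit points to; the master URL, app name, and memory values are placeholders:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object ExecutorMemoryExample {
  def main(args: Array[String]): Unit = {
    // Executor memory comes from the "spark.executor.memory" property rather
    // than SPARK_MEM; driver memory is configured separately (SPARK_DRIVER_MEMORY).
    val conf = new SparkConf()
      .setMaster("local[2]")                // placeholder; use a real cluster URL in practice
      .setAppName("ExecutorMemoryExample")
      .set("spark.executor.memory", "2g")   // per-executor heap size
    val sc = new SparkContext(conf)
    try {
      println(sc.parallelize(1 to 1000).count())
    } finally {
      sc.stop()
    }
  }
}
```
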
* update proportion of memory (Chen Chao, 2014-03-03; 1 file changed, -2/+2)

  The default value of "spark.storage.memoryFraction" has been changed from 0.66 to 0.6, so the guide should say that 60% of the memory is used for caching while 40% is used for task execution.

  Author: Chen Chao <crazyjvm@gmail.com>

  Closes #66 from CrazyJvm/master and squashes the following commits:

  0f84d86 [Chen Chao] update proportion of memory

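A small sketch of the split described above; the heap size is illustrative, and in practice the cache budget is computed against the JVM heap with an additional safety fraction:

```scala
import org.apache.spark.SparkConf

object MemoryFractionExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .set("spark.executor.memory", "10g")
      .set("spark.storage.memoryFraction", "0.6") // the new default
    // With a 10 GB executor heap, roughly 6 GB is reserved for cached RDD
    // blocks and about 4 GB is left for task execution.
    val heapGb = 10.0
    val fraction = conf.get("spark.storage.memoryFraction").toDouble
    println(f"cache ~ ${heapGb * fraction}%.1f GB, execution ~ ${heapGb * (1 - fraction)}%.1f GB")
  }
}
```
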
* Include reference to twitter/chill in tuning docs (Andrew Ash, 2014-02-24; 1 file changed, -3/+6)

  Author: Andrew Ash <andrew@andrewash.com>

  Closes #647 from ash211/doc-tuning and squashes the following commits:

  b87de0a [Andrew Ash] Include reference to twitter/chill in tuning docs

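Spark's KryoSerializer is built on top of twitter/chill. A sketch of the setup the tuning guide describes, with a hypothetical application class and registrator:

```scala
import com.esotericsoftware.kryo.Kryo
import org.apache.spark.SparkConf
import org.apache.spark.serializer.KryoRegistrator

// Hypothetical application class to register with Kryo.
case class ClickEvent(userId: Long, url: String)

class MyRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo): Unit = {
    kryo.register(classOf[ClickEvent])
  }
}

object KryoSetup {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .set("spark.kryo.registrator", "MyRegistrator")
    println(conf.get("spark.serializer"))
  }
}
```
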
* remove "-XX:+UseCompressedStrings" option (CrazyJvm, 2014-01-15; 1 file changed, -2/+1)

  Remove the "-XX:+UseCompressedStrings" option from the tuning guide, since JDK 7 no longer supports it.

* Updated docs for SparkConf and handled review comments (Matei Zaharia, 2013-12-30; 1 file changed, -10/+11)

* Update tuning.md (Andrew Ash, 2013-11-25; 1 file changed, -1/+2)

  Clarify when the serializer is used, based on a recent user@ mailing list discussion.

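A sketch of the distinction being clarified: the configured serializer is used for shuffle data and for serialized storage levels, not for RDDs cached as plain deserialized Java objects. Names and sizes are placeholders:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._
import org.apache.spark.storage.StorageLevel

object WhenSerializerIsUsed {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setMaster("local[2]").setAppName("serializer-demo"))

    val asObjects = sc.parallelize(1 to 100000).map(i => (i % 10, i))
      .persist(StorageLevel.MEMORY_ONLY)      // deserialized Java objects; serializer not involved
    val asBytes = sc.parallelize(1 to 100000).map(i => (i % 10, i))
      .persist(StorageLevel.MEMORY_ONLY_SER)  // serialized bytes; the configured serializer is used

    println(asObjects.reduceByKey(_ + _).count()) // shuffle output is always serialized
    println(asBytes.count())
    sc.stop()
  }
}
```
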
* Fix Kryo Serializer buffer inconsistency (Neal Wiggins, 2013-11-20; 1 file changed, -1/+1)

  The documentation here is inconsistent with the coded default and other documentation.

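For reference, a sketch of raising the Kryo buffer; the key shown is the one from this era, and later releases renamed it (spark.kryoserializer.buffer and spark.kryoserializer.buffer.max), so treat the name and value as illustrative:

```scala
import org.apache.spark.SparkConf

object KryoBufferExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      // Must be large enough to hold the largest single object you will serialize.
      .set("spark.kryoserializer.buffer.mb", "64")
    println(conf.get("spark.kryoserializer.buffer.mb"))
  }
}
```
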
* Add docs for standalone scheduler fault tolerance (Aaron Davidson, 2013-10-08; 1 file changed, -1/+1)

  Also fix a couple of HTML/Markdown issues in other files.

* Move some classes to more appropriate packages (Matei Zaharia, 2013-09-01; 1 file changed, -5/+5)

  - RDD, *RDDFunctions -> org.apache.spark.rdd
  - Utils, ClosureCleaner, SizeEstimator -> org.apache.spark.util
  - JavaSerializer, KryoSerializer -> org.apache.spark.serializer

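For user code, the visible effect is new import paths; a sketch using the public classes (Utils and ClosureCleaner also moved, but they are internal to Spark and not imported here):

```scala
import org.apache.spark.rdd.{PairRDDFunctions, RDD}
import org.apache.spark.serializer.{JavaSerializer, KryoSerializer}
import org.apache.spark.util.SizeEstimator

object NewPackages {
  def main(args: Array[String]): Unit = {
    // Print the relocated fully-qualified names.
    println(classOf[RDD[_]].getName)                  // org.apache.spark.rdd.RDD
    println(classOf[PairRDDFunctions[_, _]].getName)  // org.apache.spark.rdd.PairRDDFunctions
    println(classOf[JavaSerializer].getName)          // org.apache.spark.serializer.JavaSerializer
    println(classOf[KryoSerializer].getName)          // org.apache.spark.serializer.KryoSerializer
    println(SizeEstimator.estimate("spark"))          // SizeEstimator now lives in org.apache.spark.util
  }
}
```
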
* More fixes (Matei Zaharia, 2013-09-01; 1 file changed, -6/+7)

* Made use of spark.executor.memory setting consistent and documented it (Matei Zaharia, 2013-06-30; 1 file changed, -3/+3)

  Conflicts:
    core/src/main/scala/spark/SparkContext.scala

* Update tuning.md (Andrew Ash, 2013-03-28; 1 file changed, -1/+1)

  Make the example more compilable.

* Merge branch 'master' into bettersplits (Stephen Haberman, 2013-02-24; 1 file changed, -1/+1)

  Conflicts:
    core/src/main/scala/spark/RDD.scala
    core/src/main/scala/spark/scheduler/cluster/StandaloneSchedulerBackend.scala
    core/src/test/scala/spark/ShuffleSuite.scala

* Fixed a 404 -- missing '.html' (Mark Hamstra, 2013-02-10; 1 file changed, -1/+1)

* Update default.parallelism docs, have StandaloneSchedulerBackend use it. (Stephen Haberman, 2013-02-16; 1 file changed, -4/+4)

  Only brand new RDDs (e.g. parallelize and makeRDD) now use default parallelism; everything else uses its largest parent's partitioner or partition size.

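A sketch of the behavior described here, with a local master as a placeholder: a brand-new RDD picks up spark.default.parallelism, while a derived RDD keeps its parent's partition count rather than re-reading the setting:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object DefaultParallelismExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setMaster("local[4]")                    // placeholder master
      .setAppName("default-parallelism-demo")
      .set("spark.default.parallelism", "8")
    val sc = new SparkContext(conf)

    val fresh = sc.parallelize(1 to 1000)       // brand new RDD: 8 partitions from the setting
    val narrowed = fresh.coalesce(3)            // explicitly 3 partitions
    val derived = narrowed.map(_ * 2)           // derived RDD: inherits 3 from its parent, not 8

    println(Seq(fresh, narrowed, derived).map(_.partitions.length)) // List(8, 3, 3)
    sc.stop()
  }
}
```
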
* Updated Kryo documentation for Kryo version update. (Reynold Xin, 2012-12-21; 1 file changed, -14/+16)

* Updates to documentation (Matei Zaharia, 2012-10-09; 1 file changed, -44/+61)

  - Edited quick start and tuning guide to simplify them a little
  - Simplified top menu bar
  - Made private a SparkContext constructor parameter that was left as public
  - Various small fixes

* Adds liquid variables to docs templating system so that they can be used throughout the docs: SPARK_VERSION, SCALA_VERSION, and MESOS_VERSION (Andy Konwinski, 2012-10-08; 1 file changed, -8/+8)

  To use them, write e.g. {{site.SPARK_VERSION}}. Also removes uses of {{HOME_PATH}}, which were being resolved to "" by the templating system anyway.

* Some additions to the Tuning Guide. (Patrick Wendell, 2012-10-03; 1 file changed, -7/+12)

  1. Slight change in organization
  2. Added pre-requisites
  3. Made a new section about determining memory footprint of an RDD
  4. Other small changes

* First cut at adding documentation for GC tuning (Shivaram Venkataraman, 2012-10-02; 1 file changed, -5/+63)

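As an aside, the usual first step in GC tuning is collecting GC logs from the executors. The config key below (spark.executor.extraJavaOptions) arrived in a later release; in the era of this commit the same JVM flags were passed through the SPARK_JAVA_OPTS environment variable, so treat this as a sketch:

```scala
import org.apache.spark.SparkConf

object GcLoggingExample {
  def main(args: Array[String]): Unit = {
    // Ask executor JVMs to print GC details so pauses and collection
    // frequency can be inspected when tuning memory settings.
    val conf = new SparkConf()
      .set("spark.executor.extraJavaOptions",
           "-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps")
    println(conf.get("spark.executor.extraJavaOptions"))
  }
}
```
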
* More updates to docs, including tuning guide (Matei Zaharia, 2012-09-26; 1 file changed, -0/+168)