author    Lukas Rytz <lukas.rytz@gmail.com>  2016-06-01 10:20:59 +0200
committer Lukas Rytz <lukas.rytz@gmail.com>  2016-06-01 10:20:59 +0200
commit    20dd825ec6b5d4015ce36cf4373ba1c1d917da94 (patch)
tree      8853ed0b590f37607fdedaed36a74ca6fada34c8 /test/benchmarks
parent    697158414f563a9b43ada696a1bb949eab96208b (diff)
parent    cba585d41b9c2c47f256cbce45115bb205ae58a2 (diff)
Merge commit 'cba585d' into merge-2.11-to-2.12-june-1
Diffstat (limited to 'test/benchmarks')
-rw-r--r--  test/benchmarks/.gitignore                                                          14
-rw-r--r--  test/benchmarks/README.md                                                          105
-rw-r--r--  test/benchmarks/build.sbt                                                           11
-rw-r--r--  test/benchmarks/project/plugins.sbt                                                  2
-rw-r--r--  test/benchmarks/src/main/scala/benchmark/JmhRunner.scala                            16
-rw-r--r--  test/benchmarks/src/main/scala/scala/collection/mutable/OpenHashMapBenchmark.scala 204
-rw-r--r--  test/benchmarks/src/main/scala/scala/collection/mutable/OpenHashMapRunner.scala    105
7 files changed, 457 insertions(+), 0 deletions(-)
diff --git a/test/benchmarks/.gitignore b/test/benchmarks/.gitignore
new file mode 100644
index 0000000000..ce4d893417
--- /dev/null
+++ b/test/benchmarks/.gitignore
@@ -0,0 +1,14 @@
+/project/project/
+/project/target/
+/target/
+
+# what appears to be a Scala IDE-generated file
+.cache-main
+
+# standard Eclipse output directory
+/bin/
+
+# sbteclipse-generated Eclipse files
+/.classpath
+/.project
+/.settings/
diff --git a/test/benchmarks/README.md b/test/benchmarks/README.md
new file mode 100644
index 0000000000..370d610bc4
--- /dev/null
+++ b/test/benchmarks/README.md
@@ -0,0 +1,105 @@
+# Scala library benchmarks
+
+This directory is a standalone SBT project within the Scala project
+that uses the [SBT plugin](https://github.com/ktoso/sbt-jmh) for [JMH](http://openjdk.java.net/projects/code-tools/jmh/).
+
+## Running a benchmark
+
+The benchmarks require first building Scala into `../../build/pack` with `ant`.
+If you want to build with `sbt dist/mkPack` instead,
+you'll need to change `scalaHome` in this project.
+
+You'll then need to know the fully-qualified name of the benchmark runner class.
+The benchmarking classes are organized under `src/main/scala`,
+in the same package hierarchy as the classes that they test.
+Assuming that we're benchmarking `scala.collection.mutable.OpenHashMap`,
+the benchmark runner would likely be named `scala.collection.mutable.OpenHashMapRunner`.
+Using this example, one would simply run
+
+ jmh:runMain scala.collection.mutable.OpenHashMapRunner
+
+in SBT.
+SBT should be run _from this directory_.
+
+The JMH results can be found under `target/jmh-results/`.
+`target` gets deleted on an SBT `clean`,
+so you should copy these files out of `target` if you wish to preserve them.
+
+## Creating a benchmark and runner
+
+The benchmarking classes use the same package hierarchy as the classes that they test
+in order to make it easy to expose, in package scope, members of the class under test,
+should that be necessary for benchmarking.
+
+There are two types of classes in the source directory:
+those suffixed `Benchmark` and those suffixed `Runner`.
+The former are benchmarks that can be run directly using `jmh:run`;
+however, they are normally run from a corresponding class of the latter type,
+which is run using `jmh:runMain` (as described above).
+This `…Runner` class is useful for setting appropriate JMH command-line options,
+and for processing the JMH results into files that can be read by other tools, such as Gnuplot.
+
+The `benchmark.JmhRunner` trait should be woven into any runner class, for the standard behavior that it provides.
+This includes creating output files in a subdirectory of `target/jmh-results`
+derived from the fully-qualified package name of the `Runner` class.
+
+## Some useful HotSpot options
+Adding these to the `jmh:run` or `jmh:runMain` command line may help if you're using a HotSpot (Oracle, OpenJDK) JVM.
+They require prefixing with `-jvmArgs`.
+See [the Java documentation](http://docs.oracle.com/javase/8/docs/technotes/tools/unix/java.html) for more options.
+
+### Viewing JIT compilation events
+Adding `-XX:+PrintCompilation` shows when Java methods are being compiled or deoptimized.
+At the most basic level,
+these messages will tell you whether the code that you're measuring is still being tuned,
+so that you know whether you're running enough warm-up iterations.
+See [Kris Mok's notes](https://gist.github.com/rednaxelafx/1165804#file-notes-md) to interpret the output in detail.
+
+### Observing GC events
+If you're not explicitly performing `System.gc()` calls outside of your benchmarking code,
+you should add the JVM option `-verbose:gc` to understand the effect that GCs may be having on your tests.
+
+### "Diagnostic" options
+These require the `-XX:+UnlockDiagnosticVMOptions` JVM option.
+
+#### Viewing inlining events
+Add `-XX:+PrintInlining`.
+
+#### Viewing the disassembled code
+If you're running an OpenJDK or Oracle JVM,
+you may need to install the disassembler library (`hsdis-amd64.so` for the `amd64` architecture).
+In Debian, this is available in
+<a href="https://packages.debian.org/search?keywords=libhsdis0-fcml">the `libhsdis0-fcml` package</a>.
+For an Oracle (or other compatible) JVM not set up by your distribution,
+you may also need to copy or link the disassembler library
+to the `jre/lib/`_`architecture`_ directory inside your JVM installation directory.
+
+To show the assembly code corresponding to the code generated by the JIT compiler for specific methods,
+add `-XX:CompileCommand=print,scala.collection.mutable.OpenHashMap::*`,
+for example, to show all of the methods in the `scala.collection.mutable.OpenHashMap` class.
+
+To show it for _all_ methods, add `-XX:+PrintAssembly`.
+(This is usually excessive.)
+
+## Useful reading
+* [OpenJDK advice on microbenchmarks](https://wiki.openjdk.java.net/display/HotSpot/MicroBenchmarks)
+* Brian Goetz's "Java theory and practice" articles:
+ * "[Dynamic compilation and performance measurement](http://www.ibm.com/developerworks/java/library/j-jtp12214/)"
+ * "[Anatomy of a flawed benchmark](http://www.ibm.com/developerworks/java/library/j-jtp02225/)"
+* [Doug Lea's JSR 166 benchmarks](http://gee.cs.oswego.edu/cgi-bin/viewcvs.cgi/jsr166/src/test/loops/)
+* "[Measuring performance](http://docs.scala-lang.org/overviews/parallel-collections/performance.html)" of Scala parallel collections
+
+## Legacy frameworks
+
+An older version of the benchmarking framework is still present in this directory, in the following locations:
+
+<dl>
+<dt><code>bench</code></dt>
+<dd>A script to run the old benchmarks.</dd>
+<dt><code>source.list</code></dt>
+<dd>A temporary file used by <code>bench</code>.</dd>
+<dt><code>src/scala/</code></dt>
+<dd>The older benchmarks, including the previous framework.</dd>
+</dl>
+
+Another, older set of benchmarks is present in `../benchmarking/`.
diff --git a/test/benchmarks/build.sbt b/test/benchmarks/build.sbt
new file mode 100644
index 0000000000..4806ecdde8
--- /dev/null
+++ b/test/benchmarks/build.sbt
@@ -0,0 +1,11 @@
+scalaHome := Some(file("../../build/pack"))
+scalaVersion := "2.11.8"
+scalacOptions ++= Seq("-feature", "-Yopt:l:classpath")
+
+lazy val root = (project in file(".")).
+ enablePlugins(JmhPlugin).
+ settings(
+ name := "test-benchmarks",
+ version := "0.0.1",
+ libraryDependencies += "org.openjdk.jol" % "jol-core" % "0.4"
+ )
diff --git a/test/benchmarks/project/plugins.sbt b/test/benchmarks/project/plugins.sbt
new file mode 100644
index 0000000000..e11aa29f3b
--- /dev/null
+++ b/test/benchmarks/project/plugins.sbt
@@ -0,0 +1,2 @@
+addSbtPlugin("com.typesafe.sbteclipse" % "sbteclipse-plugin" % "4.0.0")
+addSbtPlugin("pl.project13.scala" % "sbt-jmh" % "0.2.6")
diff --git a/test/benchmarks/src/main/scala/benchmark/JmhRunner.scala b/test/benchmarks/src/main/scala/benchmark/JmhRunner.scala
new file mode 100644
index 0000000000..cc75be529d
--- /dev/null
+++ b/test/benchmarks/src/main/scala/benchmark/JmhRunner.scala
@@ -0,0 +1,16 @@
+package benchmark
+
+import java.io.File
+
+/** Common code for JMH runner objects. */
+trait JmhRunner {
+ private[this] val parentDirectory = new File("target", "jmh-results")
+
+ /** Return the output directory for this class, creating the directory if necessary. */
+ protected def outputDirectory: File = {
+ val subdir = getClass.getPackage.getName.replace('.', File.separatorChar)
+ val dir = new File(parentDirectory, subdir)
+ if (!dir.isDirectory) dir.mkdirs()
+ dir
+ }
+}
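As a standalone illustration of what `outputDirectory` does, the sketch below derives the `target/jmh-results` subdirectory path from a package name. `OutputDirSketch` and `subdirFor` are illustrative names (not part of the trait), and the package name is passed in explicitly rather than taken from `getClass.getPackage` as the real trait does:

```scala
import java.io.File

// A minimal, self-contained sketch of the path derivation performed by
// `JmhRunner.outputDirectory`: the runner class's package name becomes a
// subdirectory path under `target/jmh-results`.
object OutputDirSketch {
  /** Map a dotted package name to a relative subdirectory path. */
  def subdirFor(packageName: String): String =
    packageName.replace('.', File.separatorChar)

  def main(args: Array[String]): Unit = {
    // e.g. "target/jmh-results/scala/collection/mutable" on Unix-like systems
    val dir = new File(new File("target", "jmh-results"), subdirFor("scala.collection.mutable"))
    println(dir.getPath)
  }
}
```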
diff --git a/test/benchmarks/src/main/scala/scala/collection/mutable/OpenHashMapBenchmark.scala b/test/benchmarks/src/main/scala/scala/collection/mutable/OpenHashMapBenchmark.scala
new file mode 100644
index 0000000000..26e26b3065
--- /dev/null
+++ b/test/benchmarks/src/main/scala/scala/collection/mutable/OpenHashMapBenchmark.scala
@@ -0,0 +1,204 @@
+package scala.collection.mutable
+
+import java.util.concurrent.TimeUnit
+import org.openjdk.jmh.annotations._
+import org.openjdk.jmh.infra.Blackhole
+import org.openjdk.jmh.infra.BenchmarkParams
+import org.openjdk.jol.info.GraphLayout
+import org.openjdk.jol.info.GraphWalker
+import org.openjdk.jol.info.GraphVisitor
+import org.openjdk.jmh.infra.IterationParams
+import org.openjdk.jmh.runner.IterationType
+
+/** Utilities for the [[OpenHashMapBenchmark]].
+ *
+ * The method calls are tested by looping to the size desired for the map,
+ * rather than relying on the JMH harness alone, which iterates for a fixed length of time.
+ */
+private object OpenHashMapBenchmark {
+ /** State container for the `put()` bulk calling tests.
+ *
+ * Provides an array of adequately-sized, empty maps to each invocation,
+ * so that hash table allocation won't be done during measurement.
+ * Provides enough maps to make each invocation long enough to avoid timing artifacts.
+ * Performs a GC after re-creating the empty maps before every invocation,
+ * so that only the GCs caused by the invocation contribute to the measurement.
+ *
+ * Records the memory used by all the maps in the last invocation of each iteration.
+ */
+ @State(Scope.Thread)
+ @AuxCounters
+ class BulkPutState {
+ /** A lower-bound estimate of the number of nanoseconds per `put()` call */
+ private[this] val nanosPerPut: Double = 5
+
+ /** Minimum number of nanoseconds per invocation, so as to avoid timing artifacts. */
+ private[this] val minNanosPerInvocation = 1000000 // one millisecond
+
+ /** Size of the maps created in this trial. */
+ private[this] var size: Int = _
+
+ /** Total number of entries in all of the `maps` combined. */
+ var mapEntries: Int = _
+
+ /** Number of operations performed in the current invocation. */
+ var operations: Int = _
+
+ /** Bytes of memory used in the object graphs of all the maps. */
+ var memory: Long = _
+
+ var maps: Array[OpenHashMap[Int,Int]] = null
+
+ @Setup
+ def threadSetup(params: BenchmarkParams) {
+ size = params.getParam("size").toInt
+ val n = math.ceil(minNanosPerInvocation / (nanosPerPut * size)).toInt
+ mapEntries = size * n
+ maps = new Array(n)
+ }
+
+ @Setup(Level.Iteration)
+ def iterationSetup {
+ operations = 0
+ }
+
+ @Setup(Level.Invocation)
+ def setup(params: IterationParams) {
+ for (i <- 0 until maps.length) maps(i) = new OpenHashMap[Int,Int](size)
+
+ if (params.getType == IterationType.MEASUREMENT) {
+ operations += mapEntries
+ System.gc() // clean up after last invocation
+ }
+ }
+
+ @TearDown(Level.Iteration)
+ def iterationTeardown(params: IterationParams) {
+ if (params.getType == IterationType.MEASUREMENT) {
+ // limit to smaller cases to avoid OOM
+ memory = if (mapEntries <= 1000000) GraphLayout.parseInstance(maps(0), maps.tail).totalSize else 0
+ }
+ }
+ }
+
+ /** State container for the `get()` bulk calling tests.
+ *
+ * Provides a thread-scoped map of the expected size.
+ * Performs a GC after loading the map.
+ */
+ @State(Scope.Thread)
+ class BulkGetState {
+ val map = new OpenHashMap[Int,Int].empty
+
+ /** Load the map with keys from `1` to `size`. */
+ @Setup
+ def setup(params: BenchmarkParams) {
+ val size = params.getParam("size").toInt
+ put_Int(map, 1, size)
+ System.gc()
+ }
+ }
+
+ /** State container for the `get()` bulk calling tests with deleted entries.
+ *
+ * Provides a thread-scoped map of the expected size, from which entries have been removed.
+ * Performs a GC after loading the map.
+ */
+ @State(Scope.Thread)
+ class BulkRemovedGetState {
+ val map = new OpenHashMap[Int,Int].empty
+
+ /** Load the map with keys from `1` to `size`, removing half of them. */
+ @Setup
+ def setup(params: BenchmarkParams) {
+ val size = params.getParam("size").toInt
+ put_remove_Int(map, size)
+ System.gc()
+ }
+ }
+
+ /** Put elements into the given map. */
+ private def put_Int(map: OpenHashMap[Int,Int], from: Int, to: Int) {
+ var i = from
+ while (i <= to) { // using a `for` expression instead adds significant overhead
+ map.put(i, i)
+ i += 1
+ }
+ }
+
+ /** Put elements into the given map, removing half of them as they're added.
+ *
+ * @param size number of entries to leave in the map on return
+ */
+ def put_remove_Int(map: OpenHashMap[Int,Int], size: Int) {
+ val blocks = 50 // should be a factor of `size`
+ val totalPuts = 2 * size // add twice as many, because we remove half of them
+ val blockSize: Int = totalPuts / blocks
+ var base = 0
+ while (base < totalPuts) {
+ put_Int(map, base + 1, base + blockSize)
+
+ // remove every other entry
+ var i = base + 1
+ while (i <= base + blockSize) {
+ map.remove(i)
+ i += 2
+ }
+
+ base += blockSize
+ }
+ }
+
+ /** Get elements from the given map. */
+ def get_Int(map: OpenHashMap[Int,Int], size: Int, bh: Blackhole) {
+ var i = 1
+ while (i <= size) {
+ bh.consume(map.get(i).getOrElse(0))
+ i += 1
+ }
+ }
+}
+
+/** Benchmark for the library's [[OpenHashMap]]. */
+@BenchmarkMode(Array(Mode.AverageTime))
+@Fork(6)
+@Threads(1)
+@Warmup(iterations = 20)
+@Measurement(iterations = 6)
+@OutputTimeUnit(TimeUnit.NANOSECONDS)
+@State(Scope.Benchmark)
+class OpenHashMapBenchmark {
+ import OpenHashMapBenchmark._
+
+ @Param(Array("25", "50", "100", "250", "1000", "2500", "10000", "25000", "100000", "250000", "1000000", "2500000",
+ "5000000", "7500000", "10000000", "25000000"))
+ var size: Int = _
+
+ /** Test putting elements to a map of `Int` to `Int`. */
+ @Benchmark
+ def put_Int(state: BulkPutState) {
+ var i = 0
+ while (i < state.maps.length) {
+ OpenHashMapBenchmark.put_Int(state.maps(i), 1, size)
+ i += 1
+ }
+ }
+
+ /** Test putting and removing elements to a growing map of `Int` to `Int`. */
+ @Benchmark
+ def put_remove_Int(state: BulkPutState) {
+ var i = 0
+ while (i < state.maps.length) {
+ OpenHashMapBenchmark.put_remove_Int(state.maps(i), size)
+ i += 1
+ }
+ }
+
+ /** Test getting elements from a map of `Int` to `Int`. */
+ @Benchmark
+ def put_get_Int(state: BulkGetState, bh: Blackhole) = OpenHashMapBenchmark.get_Int(state.map, size, bh)
+
+ /** Test getting elements from a map of `Int` to `Int` from which elements have been removed. */
+ @Benchmark
+ def put_remove_get_Int(state: BulkRemovedGetState, bh: Blackhole) = OpenHashMapBenchmark.get_Int(state.map, size, bh)
+}
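The block-wise put/remove interleaving used by `put_remove_Int` above can be exercised on its own. The sketch below mirrors that loop structure but runs against the standard-library `mutable.HashMap`, so it compiles without JMH or the `OpenHashMap` under test; `PutRemoveSketch` is an illustrative name, and the block count of 50 is copied from the benchmark:

```scala
import scala.collection.mutable.HashMap

// Self-contained sketch of the put/remove interleaving in `put_remove_Int`:
// insert 2*size entries in blocks, removing every other entry per block,
// so that `size` entries remain when the method returns.
object PutRemoveSketch {
  def putRemove(map: HashMap[Int, Int], size: Int): Unit = {
    val blocks = 50                 // should be a factor of `size`
    val totalPuts = 2 * size        // add twice as many, because half are removed
    val blockSize = totalPuts / blocks
    var base = 0
    while (base < totalPuts) {
      // fill one block of entries
      var i = base + 1
      while (i <= base + blockSize) { map.put(i, i); i += 1 }
      // remove every other entry in the block
      i = base + 1
      while (i <= base + blockSize) { map.remove(i); i += 2 }
      base += blockSize
    }
  }

  def main(args: Array[String]): Unit = {
    val m = HashMap.empty[Int, Int]
    putRemove(m, 1000)
    println(m.size) // half of the 2 * size insertions remain
  }
}
```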
diff --git a/test/benchmarks/src/main/scala/scala/collection/mutable/OpenHashMapRunner.scala b/test/benchmarks/src/main/scala/scala/collection/mutable/OpenHashMapRunner.scala
new file mode 100644
index 0000000000..1a58b18ee9
--- /dev/null
+++ b/test/benchmarks/src/main/scala/scala/collection/mutable/OpenHashMapRunner.scala
@@ -0,0 +1,105 @@
+package scala.collection.mutable
+
+import java.io.BufferedWriter
+import java.io.File
+import java.io.FileOutputStream
+import java.io.OutputStreamWriter
+import java.io.PrintWriter
+import scala.collection.JavaConversions
+import scala.language.existentials
+import org.openjdk.jmh.results.RunResult
+import org.openjdk.jmh.runner.Runner
+import org.openjdk.jmh.runner.options.CommandLineOptions
+import org.openjdk.jmh.runner.options.Options
+import benchmark.JmhRunner
+import org.openjdk.jmh.runner.options.OptionsBuilder
+import org.openjdk.jmh.runner.options.VerboseMode
+import org.openjdk.jmh.results.Result
+
+/** Replacement JMH application that runs the [[OpenHashMap]] benchmark.
+ *
+ * Outputs the results in a form consumable by a Gnuplot script.
+ */
+object OpenHashMapRunner extends JmhRunner {
+ /** File that will be created for the output data set. */
+ private[this] val outputFile = new File(outputDirectory, "OpenHashMap.dat")
+
+ /** Qualifier to add to the name of a memory usage data set. */
+ private[this] val memoryDatasetQualifier = "-memory"
+
+ private[this] implicit class MyRunResult(r: RunResult) {
+ /** Return the dataset label. */
+ def label = r.getPrimaryResult.getLabel
+
+ /** Return the value of the JMH parameter for the number of map entries per invocation. */
+ def size: String = r.getParams.getParam("size")
+
+ /** Return the operation counts. */
+ def operations = Option(r.getSecondaryResults.get("operations"))
+
+ /** Return the number of map entries. */
+ def entries = r.getSecondaryResults.get("mapEntries")
+
+ /** Return the memory usage. */
+ def memory = Option(r.getSecondaryResults.get("memory"))
+ }
+
+ /** Return the statistics of the given result as a string. */
+ private[this] def stats(r: Result[_]) = r.getScore + " " + r.getStatistics.getStandardDeviation
+
+
+ def main(args: Array[String]) {
+ import scala.collection.JavaConversions._
+ import scala.language.existentials
+
+ val opts = new CommandLineOptions(args: _*)
+ var builder = new OptionsBuilder().parent(opts).jvmArgsPrepend("-Xmx6000m")
+ if (!opts.verbosity.hasValue) builder = builder.verbosity(VerboseMode.SILENT)
+
+ val results = new Runner(builder.build).run()
+
+ // Sort the results
+
+ /** Map from data set name to data set. */
+ val datasetByName = Map.empty[String, Set[RunResult]]
+
+ /** Ordering for the results within a data set. Orders by increasing number of map entries. */
+ val ordering = Ordering.by[RunResult, Int](_.size.toInt)
+
+ def addToDataset(key: String, result: RunResult): Unit =
+ datasetByName.getOrElseUpdate(key, SortedSet.empty(ordering)) += result
+
+ results.foreach { result =>
+ addToDataset(result.label, result)
+
+ // Create another data set for trials that track memory usage
+ if (result.memory.isDefined)
+ addToDataset(result.label + memoryDatasetQualifier, result)
+ }
+
+ //TODO Write out test parameters
+ // val jvm = params.getJvm
+ // val jvmArgs = params.getJvmArgs.mkString(" ")
+
+ val f = new PrintWriter(outputFile, "UTF-8")
+ try {
+ datasetByName.foreach(_ match { case (label: String, dataset: Iterable[RunResult]) => {
+ f.println(s"# [$label]")
+
+ val isMemoryUsageDataset = label.endsWith(memoryDatasetQualifier)
+ dataset.foreach { r =>
+ f.println(r.size + " " + (
+ if (isMemoryUsageDataset)
+ stats(r.entries) + " " + stats(r.memory.get)
+ else
+ stats(r.operations getOrElse r.getPrimaryResult)
+ ))
+ }
+
+ f.println(); f.println() // data set separator
+ }})
+ } finally {
+ f.close()
+ }
+ }
+}
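The grouping step in `main` above (bucketing results by label into size-ordered sets via `getOrElseUpdate`) can be sketched without JMH. Here plain `(label, size)` pairs stand in for `RunResult`, and `GroupingSketch` is an illustrative name, not part of the runner:

```scala
import scala.collection.mutable

// Self-contained sketch of the runner's `addToDataset` logic: group results
// by data-set name, keeping each data set sorted by increasing size.
object GroupingSketch {
  type Result = (String, Int) // (label, size) stands in for JMH's RunResult

  def group(results: Seq[Result]): mutable.Map[String, mutable.SortedSet[Result]] = {
    // Order results within a data set by increasing number of map entries.
    val ordering = Ordering.by[Result, Int](_._2)
    val datasetByName = mutable.Map.empty[String, mutable.SortedSet[Result]]
    for (r <- results)
      datasetByName.getOrElseUpdate(r._1, mutable.SortedSet.empty(ordering)) += r
    datasetByName
  }

  def main(args: Array[String]): Unit = {
    val grouped = group(Seq(("put", 100), ("put", 25), ("get", 50)))
    println(grouped("put").toList) // prints List((put,25), (put,100))
  }
}
```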