I’m working on some benchmarks of MethodHandles usage. So far I haven’t gained anything with MethodHandles, but my newest benchmarks show that megamorphic call sites which additionally aren’t well predicted by the CPU’s branch predictor lead to a significant increase in running time (e.g. 50%), even when the calls are wrapped in Futures.
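For context, a minimal sketch (hypothetical names, not taken from the benchmark below) of what makes a call site megamorphic: one call site that dispatches to many different Function1 implementations, so the JIT can’t inline it, and when the target sequence is also irregular the CPU’s indirect-branch predictor misses as well.

```scala
object MegamorphicSketch {
  // Four distinct Function1 classes behind one call site.
  val fns: Array[Int => Int] = Array(_ + 1, _ * 2, _ - 3, _ ^ 5)

  def run(indices: Array[Int], x0: Int): Int = {
    var acc = x0
    var i = 0
    while (i < indices.length) {
      // Megamorphic call site: 4 receiver classes observed here.
      acc = fns(indices(i))(acc)
      i += 1
    }
    acc
  }
}
```

With a fixed index the same site is monomorphic and cheap; with random indices it pays both the virtual dispatch and the branch mispredictions.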
Benchmark code:
import java.util.concurrent.TimeUnit
import org.openjdk.jmh.annotations._
import org.openjdk.jmh.infra.Blackhole
import pl.tarsa.megamorphic_overhead.jmh.FutureMapFunCallOverhead._
import scala.annotation.tailrec
import scala.concurrent.duration.Duration
import scala.concurrent.{Await, ExecutionContext, Future, Promise}
import scala.util.Random
@State(Scope.Benchmark)
@BenchmarkMode(Array(Mode.AverageTime))
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@Timeout(time = 10, timeUnit = TimeUnit.SECONDS)
@Warmup(iterations = 1)
@Measurement(iterations = 3)
@Fork(value = 1,
jvmArgsAppend = Array("-Xmx1G", "-Xms1G", "-XX:+AlwaysPreTouch"))
@Threads(value = 1)
class FutureMapFunCallOverhead {
@Param(Array[String]("global", "parasitic", "trivial"))
var ecName: String = _
var ec: ExecutionContext = _
@Param(Array[String]("fixed1", "regular2", "regular8", "random2", "random8"))
var indicesType: String = _
var indices: Array[Int] = _
@Setup
def setup(): Unit = {
ec = ecName match {
case "global" => ExecutionContext.global
case "parasitic" => ExecutionContext.parasitic
case "trivial" =>
new ExecutionContext {
override def execute(runnable: Runnable): Unit = runnable.run()
override def reportFailure(cause: Throwable): Unit = ???
}
}
indices = indicesType match {
case "fixed1" => indices1
case "regular2" => regularIndices2
case "regular8" => regularIndices8
case "random2" => randomIndices2
case "random8" => randomIndices8
}
}
@tailrec
private def go(
previousFuture: Future[Int],
iterations: Int,
indices: Array[Int])(implicit ec: ExecutionContext): Future[Int] = {
if (iterations > 0) {
val index = indices(iterations - 1)
go(previousFuture.map(closures(index)(iterations)),
iterations - 1,
indices)
} else {
previousFuture
}
}
@Benchmark
def pre(bh: Blackhole): Int = {
val promise = Promise.successful(5)
val resultFut = go(promise.future, depth, indices)(ec)
Await.result(resultFut, Duration.Inf)
}
@Benchmark
def post(bh: Blackhole): Int = {
val promise = Promise[Int]()
val resultFut = go(promise.future, depth, indices)(ec)
promise.success(5)
Await.result(resultFut, Duration.Inf)
}
}
object FutureMapFunCallOverhead {
val depth = 1000
val random = new Random(0)
import random.nextInt
/* Array(3, 3, 3, ...) */
val indices1: Array[Int] = Array.fill[Int](depth)(3)
/* Array(6, 7, 6, 7, ...) */
val regularIndices2: Array[Int] = Array.tabulate[Int](depth)(_.%(2).+(6))
/* Array(0, 1, 2, 3, 4, 5, 6, 7, 0, 1, 2, 3, ...) */
val regularIndices8: Array[Int] = Array.tabulate[Int](depth)(_ % 8)
/* Array(2 or 5, 2 or 5, 2 or 5, 2 or 5, 2 or 5, ...), randomly chosen */
val randomIndices2: Array[Int] =
Array.fill[Int](depth)(if (nextInt(2) == 0) 2 else 5)
/* Array(0 to 7, 0 to 7, 0 to 7, 0 to 7, 0 to 7, ...), randomly chosen */
val randomIndices8: Array[Int] = Array.fill[Int](depth)(nextInt(8))
private val closures: Array[Int => Int => Int] = Array(
a => b => a * 2452 + b * 3242,
a => b => a * 7643 + b * 3567,
a => b => a * 9287 + b * 3623,
a => b => a * 6234 + b * 7343,
a => b => a * 6224 + b * 5321,
a => b => a * 3432 + b * 5212,
a => b => a * 5323 + b * 5312,
a => b => a * 6123 + b * 5132,
)
}
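As a sanity check, one could also measure the same kind of closure chain without any Future machinery; a hypothetical direct-call baseline (not part of the results below) might look like this:

```scala
object DirectCallBaseline {
  private val closures: Array[Int => Int => Int] = Array(
    a => b => a * 2452 + b * 3242,
    a => b => a * 7643 + b * 3567,
    a => b => a * 9287 + b * 3623,
    a => b => a * 6234 + b * 7343
  )

  // Applies the closures eagerly in a plain loop: no Future allocation,
  // no scheduling, so any remaining difference between index patterns
  // comes from dispatch and branch prediction alone.
  def run(indices: Array[Int], seed: Int): Int = {
    var acc = seed
    var i = 0
    while (i < indices.length) {
      acc = closures(indices(i))(i + 1)(acc)
      i += 1
    }
    acc
  }
}
```

Comparing this baseline across the same index patterns would separate the Future overhead from the megamorphic-dispatch overhead.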
Results on Ubuntu, Core i5-4670, Oracle Java 1.8u201:
[info] Benchmark (ecName) (indicesType) Mode Cnt Score Error Units
[info] FutureMapFunCallOverhead.post global fixed1 avgt 3 77,079 ± 9,585 us/op
[info] FutureMapFunCallOverhead.post global regular2 avgt 3 79,084 ± 1,767 us/op
[info] FutureMapFunCallOverhead.post global regular8 avgt 3 94,292 ± 11,221 us/op
[info] FutureMapFunCallOverhead.post global random2 avgt 3 83,378 ± 9,528 us/op
[info] FutureMapFunCallOverhead.post global random8 avgt 3 122,085 ± 10,965 us/op
[info] FutureMapFunCallOverhead.post parasitic fixed1 avgt 3 38,998 ± 0,505 us/op
[info] FutureMapFunCallOverhead.post parasitic regular2 avgt 3 40,317 ± 0,436 us/op
[info] FutureMapFunCallOverhead.post parasitic regular8 avgt 3 47,622 ± 1,134 us/op
[info] FutureMapFunCallOverhead.post parasitic random2 avgt 3 42,660 ± 0,819 us/op
[info] FutureMapFunCallOverhead.post parasitic random8 avgt 3 69,109 ± 2,144 us/op
[info] FutureMapFunCallOverhead.post trivial fixed1 avgt 3 39,661 ± 0,393 us/op
[info] FutureMapFunCallOverhead.post trivial regular2 avgt 3 40,372 ± 0,455 us/op
[info] FutureMapFunCallOverhead.post trivial regular8 avgt 3 44,751 ± 0,376 us/op
[info] FutureMapFunCallOverhead.post trivial random2 avgt 3 40,763 ± 5,830 us/op
[info] FutureMapFunCallOverhead.post trivial random8 avgt 3 63,548 ± 1,991 us/op
[info] FutureMapFunCallOverhead.pre global fixed1 avgt 3 80,742 ± 0,074 us/op
[info] FutureMapFunCallOverhead.pre global regular2 avgt 3 87,824 ± 3,529 us/op
[info] FutureMapFunCallOverhead.pre global regular8 avgt 3 66,557 ± 3,282 us/op
[info] FutureMapFunCallOverhead.pre global random2 avgt 3 91,869 ± 3,204 us/op
[info] FutureMapFunCallOverhead.pre global random8 avgt 3 75,667 ± 17,878 us/op
[info] FutureMapFunCallOverhead.pre parasitic fixed1 avgt 3 33,588 ± 1,901 us/op
[info] FutureMapFunCallOverhead.pre parasitic regular2 avgt 3 34,134 ± 0,116 us/op
[info] FutureMapFunCallOverhead.pre parasitic regular8 avgt 3 44,550 ± 0,659 us/op
[info] FutureMapFunCallOverhead.pre parasitic random2 avgt 3 36,047 ± 0,642 us/op
[info] FutureMapFunCallOverhead.pre parasitic random8 avgt 3 58,970 ± 4,753 us/op
[info] FutureMapFunCallOverhead.pre trivial fixed1 avgt 3 29,749 ± 1,475 us/op
[info] FutureMapFunCallOverhead.pre trivial regular2 avgt 3 29,908 ± 0,415 us/op
[info] FutureMapFunCallOverhead.pre trivial regular8 avgt 3 36,730 ± 0,485 us/op
[info] FutureMapFunCallOverhead.pre trivial random2 avgt 3 30,461 ± 0,198 us/op
[info] FutureMapFunCallOverhead.pre trivial random8 avgt 3 47,008 ± 0,822 us/op
Running on global introduces quite a bit of variability and confusion (random8 is faster than random2, and regular8 is faster than regular2 and fixed1), so I’ll probably continue with some more predictable monad than Future.
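One possible shape for such a “more predictable monad” (a hypothetical sketch, not any library’s API): a synchronous map-chain that always evaluates on the calling thread and in a fixed order, unlike Future, whose callbacks may run either at registration or at completion depending on timing.

```scala
sealed trait Chain[A] {
  def map[B](f: A => B): Chain[B] = Mapped(this, f)
}
final case class Pure[A](value: A) extends Chain[A]
final case class Mapped[A, B](src: Chain[A], f: A => B) extends Chain[B]

object Chain {
  // Iterative interpreter: unrolls the map nodes into a list (innermost
  // function first) and applies them in a loop, so deep chains run in
  // constant stack space with deterministic evaluation order.
  def run[A](chain: Chain[A]): A = {
    var fns = List.empty[Any => Any]
    var cur: Chain[_] = chain
    while (cur.isInstanceOf[Mapped[_, _]]) {
      val m = cur.asInstanceOf[Mapped[Any, Any]]
      fns = m.f :: fns
      cur = m.src
    }
    var acc: Any = cur.asInstanceOf[Pure[Any]].value
    fns.foreach(f => acc = f(acc))
    acc.asInstanceOf[A]
  }
}
```

Benchmarking the same closures through such a chain should remove the ExecutionContext as a variable entirely, leaving only the dispatch cost under test.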