new-Vector performance on Scala.js

My feeling is it shouldn’t block the rollout. Not only is JVM Scala much bigger, I would also argue:

  • The use cases are also a lot more performance sensitive. Frontend code tends to be more limited by complexity than perf

  • And a lot of browser work that is performance sensitive is not bound on Scala.js execution speed: script download speed, script parsing speed, JS library speed (e.g. React), network round trips (front code tends to be ajaxy), DOM perf, CSS rendering perf (my personal most recent Scala.js perf battle)

  • Browser perf is a mercurial thing. While the JVM behaves the same yesterday as today, browser optimizations change rapidly and perf for the same code can swing around massively over time, nevermind different browsers. IME the standard perf advice is to try to optimize code “semantically”, but not to micro-optimize with the browser optimizer in mind as such wins or losses tend to be fleeting.

4 Likes

This is pure speculation, but a possible reason for the performance regression is the new design with 7 subclasses. Vector is final in 2.13.1, so operations on a vector (e.g., appended) have a single implementation. In the new design, each subclass overrides appended.

flatMap is not overridden in Vector, the implementation uses the new VectorBuilder which was heavily optimized and brings significant perormance benefits compared to the old vector on JVM. This seems to translate to JS too.

More concerning than the performance regression is the TypeError: this.Zd.c[c] is undefined that shows up on one of the benchmarks – @sjrd, what are your thoughts on this?

1 Like

Doesn’t Scala.js already replace some small parts of the std library with custom implementations? But maintaining a custom implementation of something as complex as Vector might not be feasible.

I don’t know enough about the implementations (old and new) of Vector to make any reasonable guess.

In general, significant performance profile differences between JVM and JS can be observed for two main reasons:

  • Preventing Scala.js from inlining and stack allocating. The JVM does this at run-time based on profiles. Scala.js can only do that at link time using static information, but when it can, it’s better at it. Sometimes a method wouldn’t be inline by the JVM, possibly preventing class allocation, but Scala.js would inline it. Making the code profile different could still not inline on the JVM but also would prevent Scala.js from inlining as well. This can result in performance drops in Scala.js that are not observable on the JVM.
  • Type tests on interfaces. They are slow on JS. Every time you do a type test with an interface type to “optimize” some code paths, it makes it worse on JS.

More concerning than the performance regression is the TypeError: this.Zd.c[c] is undefined that shows up on one of the benchmarks – @sjrd, what are your thoughts on this?

Is this only in fullOpt mode? What’s the corresponding error message in fastOpt mode? The message would more intelligible. This looks like an ArrayIndexOutOfBoundsException to me, but the fastOpt code would show that for sure (because they’re checked in fastOpt mode).

I suppose it’s not an option to have both implementations in stdblib, and maybe even somehow default JVM to one and JS to the other (but both implementations exist everywhere and can be accessed directly if needed)?

No, it’s not an option, because I don’t have the bandwidth nor knowledge to maintain an entirely different version of Vector.

1 Like

Huh! :scream: It appears the new Vector intentionally catches ArrayIndexOutOfBoundsExceptions. That is a big big no-no in Scala.js! It won’t work. AIIOBEs are undefined behavior. This is absolutely, utterly terrible.

Please don’t ship that!

Also, this can explain the performance regressions in JS, because try..catches are expensive in JS, even when they don’t actually catch anything.

1 Like

OK, let me tone that down a bit. It only catches ArrayIndexOutOfBoundsExceptions to then throw more IndexOutOfBoundsExceptions and NoSuchElementExceptions. So cases for UBE in Scala.js would throw UBEs in fastOpt (instead of IIOBE) but that can be considered acceptable. It does mean that the contract of Vector is worse in Scala.js: calling .head on an empty vector is UB, now, instead of reliably throwing NoSuchElementException. The error message of the UB is also worse because it mentions internal implementation details of an array index exception, instead of a NoSuchElementException. NoSuchElementExceptions were never considered UB before, so this is a breach of portability contract nevertheless.

It could still explain the performance regressions.

Another possibility is that mega-morphic dispatch of the new Vector is problematic on some JIT’s but less so on others. The old vector implementation specifically avoided generating subclasses for simple cases in order not to run into that problem. Maybe the JVM has gotten better at megamorphic dispatch (Graal certainly falls into that camp), but JS JITs haven’t? I am only guessing here.

1 Like

I submitted the issue https://github.com/scala/bug/issues/11920 as well as two PRs https://github.com/scala/scala/pull/8827 and https://github.com/scala/scala/pull/8828.

2 Likes

artificial intelligence’s out of body experience

Not knowing anything about JS performance tuning, this is my best guess, too. There is no exception handling involved in appended and prepended. Both are as straight-forward as they could be in the new implementation: a single megamorphic call up front (which wasn’t the case in the old implementation) and then some auxiliary calls to small methods which are all monomorphic and inlinable.

What does “undefined behaviour” mean in this case? Your further explanations seem to indicate that it reliably throws an exception but a different kind. Is this correct? Why can’t it throw the right one?

Could we have separate code paths for JVM and Scala.js? (Hey, wasn’t there some kind of preprocessor SIP for this kind of thing? :thinking:)

It looks like performance on JVM and Scala.js is very much at odds here and there is no single solution that will be fast on both. On the JVM an exception handler is essentially free but we save a few percent (in apply performance tests) for not doing the bounds checking twice.

BTW, does Scala.js honor @inline annotations? All JVM benchmarks were done with full inlining (as used by a normal Scala release build) and we rely on inlining by the Scala compiler for small methods for best performance.

That is explained at https://www.scala-js.org/doc/semantics.html#undefined-behaviors. In fastOpt mode (only), it reliably throws a org.scalajs.linker.runtime.UndefinedBehaviorError, whose getCause() is the exception that should have been thrown. This is only for debugging purposes. In fullOpt mode, however, this becomes unchecked for performance reasons, and the optimizer/compiler is allowed whatever it pleases with it (including removing the code, or let it pass without exception, or whatever really).

That increases maintenance burden and decreases testability. Every time we have a different path in the JVM and in JS, that increases the chances that a bug in one goes undetected and that libraries are not portable.

Scala.js honors @inline, even more than Scala/JVM. And that is always, not just using some flags. Our optimizer is better than scalac’s optimizer, because we always have a closed world assumption.

3 Likes

The implementation was now changed to no longer catch ArrayIndexOOBE (#8827 and #8829). It would be really nice to check if that fixes the TypeError and changes the performance (bounds checks might be cheaper than exception handlers on JS, according to @sjrd).

@japgolly could you maybe do the magic again? The current version number is 2.13.2-bin-ca30256.

@lrytz no worries, updated! https://japgolly.github.io/scalajs-benchmark

Also did we find out whether v :+ a / a +: v experience a big performance regression on the JVM too?

2 Likes

@japgolly Could you double-check that the website was indeed updated? It still mentions the commit b4428c8, which was before the recent changes?

@sjrd Try a force-reload. I think Github Pages doesn’t do caching properly. I can see ca30256 when I view it.

Hum, I had already tried force-reload. I tried again and it didn’t change anything. I see ca30256 at https://japgolly.github.io/scalajs-benchmark/, but when I click on it I get to https://japgolly.github.io/scalajs-benchmark/res/scala-2.13.2.html which mentions b4428c8, and executing Vector index still reports the JavaScriptException. I also tried with another browser, which has a clean cache, and the same thing happens.

@sjrd My apologies! The GP caching problem affects me so often that I just force-refreshed the index, saw the update and thought it was good. Instead I’d put the results in the wrong directory :grimacing: I’ve fixed, re-uploaded and confirmed all the way through now. Please try again and sorry about that.

Cool, thanks! Now it works. I’m not sure how performance changed, but at least the JavaScriptException is gone :slight_smile:

1 Like

Appending and prepending individual elements is much faster with the new Vector on the JVM. I think this was from the last complete run that I made and should reflect the current status for these operations: http://szeiger.de/tmp/alice_bignvector_opt_jdk8/vector-results.html

2 Likes