Scala Native Next Steps

sjrd · May 5, 2020, 8:31am

@kostaskougios Hum, I’m confused by your experience. Scala Native does have a garbage collector. If there really are memory leaks for simple transformations on lists, that would definitely be a bug. I have not heard other reports like that, so if you could provide a reproduction, that would be very helpful.

kostaskougios · May 5, 2020, 10:25am

Oh yes sorry you’re right, just managed to use the scala lib’s List and saw that native uses a GC.

ekrich · May 5, 2020, 3:46pm

The recommended version is 0.4.0-M2 because it is much better than the 0.3.x series even though it is pre-release.
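
For anyone wanting to try the milestone, a minimal sbt setup might look like the sketch below. The plugin coordinates are the published ones; the `scalaVersion` line is an assumption (the milestone targeted the Scala 2.11 line as far as I know), so check the release notes before copying it.

```scala
// project/plugins.sbt
addSbtPlugin("org.scala-native" % "sbt-scala-native" % "0.4.0-M2")

// build.sbt
enablePlugins(ScalaNativePlugin)

// Assumption: a 2.11.x compiler; adjust to whatever the release notes list.
scalaVersion := "2.11.12"
```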

markehammons · May 5, 2020, 4:21pm

Super excited to hear this. I’ve gone back and forth on whether to use Scala Native or jextract for a project I’m working on, and always landed on jextract because Scala Native appeared fairly dead.

Sciss · May 5, 2020, 10:19pm

Anything on WebAssembly support?

sjrd · May 6, 2020, 6:58am

No, WebAssembly is not directly in our plans. Wasm today still lacks the primitives required for an efficient implementation of GCs, so it’s not worth putting core resources into it. Experimentation by third parties is welcome, of course.

ekrich · May 6, 2020, 4:08pm

There is a demo here for WebAssembly if you would like to experiment.

hepin1989 · May 6, 2020, 4:50pm

There is a keynote by sjrd, you can check it out on youtube.

Sciss · May 6, 2020, 6:40pm

Thanks, I am aware of this, but I cannot build a project on something that is uncertain to work in the future, so I was hoping there would be some sentiment for putting wasm support into mainline SN. But then of course it would be competition for sjs.

Sciss · May 6, 2020, 6:49pm

You can think of wasm applications which do not require a GC, just like people were using SN for terminal applications with No-GC. I don’t see how this is a prohibitive criterion. By this argument, Rust or Kotlin for wasm wouldn’t have a reason to exist, either?

schrepfler · May 6, 2020, 7:04pm

Yes, but Rust does its own memory management, no?

nafg · May 6, 2020, 7:44pm

Rust doesn’t need a GC. But for Scala, no GC only makes sense for short-lived applications. Web pages can be long-lived.

That said, there are use cases for WASM other than the browser, so it does have uses.

People in this thread interested in WASM support, what is your intended use case?

Sciss · May 6, 2020, 10:37pm

I would be interested in real-time audio applications in the browser, interfacing to AudioWorklet. This could even be dual-native, running as well outside wasm with a native audio API back-end.

sjrd · May 7, 2020, 2:30am

I’d like to point out that, so far, there is no evidence that Scala Native compiled to wasm with a custom GC (using slowed down encodings to work around the limitations of wasm) would be any faster than Scala.js using the built-in GC and JIT of the browsers.

rssh · May 7, 2020, 10:41am

Btw, WASM is a target for smart-contract bytecode in proposed Ethereum-2.0, EOS, Polkadot edgeware, and several other blockchain systems. It looks like WASM will become a de-facto standard for smart-contract VM.

scalavision · May 8, 2020, 4:00pm

I think this is amazing news!! I currently work in bioinformatics, and the ability to interop with C is essential for performance reasons.

I also think lambda functions in the cloud could benefit from this, but I don’t know much about that.

tarsa · May 9, 2020, 4:30pm

While a custom tracing GC implemented in WASM MVP would surely be slow, there is a way to work around that. Scala Native could implement an unmanaged mode (for lack of a better word) in which access to the managed heap is restricted (and also causes some temporary slowness). That unmanaged mode could be modeled after the Java platform’s JNI, or something simpler could be devised.

First let me describe how I understand how OpenJDK works:

it has multiple tracing GCs to choose from
such GCs can move objects throughout their lifetimes
it supports multiple threads so it has to ensure thread safety of its operations
to reduce GC overhead in a multithreaded environment, stop-the-world (STW) pauses are employed
to achieve an STW pause, all threads must be stopped
threads are stopped at so-called safepoints, which are littered throughout the native code generated by the JIT compiler
when all application threads are stopped, the GC is free to move objects around and rewrite references to them without worrying about data races
safepoints are efficiently implemented using page fault trapping: a trap is very expensive when it fires, but since safepoints are rarely triggered, the amortized cost of safepoint checks is very low
some threads can be executing unmanaged native code (let’s consider only code invoked through the JNI interface) while e.g. the GC tries to reach an STW pause
it turns out that threads executing native code don’t have to be stopped right away during an STW pause, because unmanaged native code can’t directly access managed Java objects
unmanaged native code can access managed Java objects only through the JNI API, which takes care of safepoints (i.e. it waits for the STW pause to end) and the overall safety of such operations
the JNI API is relatively slow, but the idea of JNI is to have long-running unmanaged code that makes few or no JNI API calls
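
The safepoint idea in the list above can be sketched on the JVM as follows. This is only an illustration under loud assumptions: `Safepoint`, `poll`, `requestPause` and `resume` are made-up names, and a volatile flag check stands in for the page-fault trick that OpenJDK actually uses. It also leaves out the part where the collector waits for every thread to arrive.

```scala
// Hypothetical model of cooperative safepoints; not OpenJDK or Scala Native code.
object Safepoint {
  @volatile private var pauseRequested = false
  private val lock = new Object

  // Managed code would call this at loop back-edges, method returns, etc.
  // The fast path is a single volatile read.
  def poll(): Unit = {
    if (pauseRequested) {
      lock.synchronized {
        // Park the thread until the collector ends the pause.
        while (pauseRequested) lock.wait()
      }
    }
  }

  // Called by the collector before it starts moving objects.
  def requestPause(): Unit = {
    pauseRequested = true
  }

  // Called by the collector once references have been rewritten.
  def resume(): Unit = lock.synchronized {
    pauseRequested = false
    lock.notifyAll()
  }
}
```

A thread that never calls `poll()`, like the unmanaged native code described above, simply keeps running through the pause, which is safe only as long as it cannot touch managed objects directly.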

Scala Native could have something similar, i.e. a split between managed native code and unmanaged native code. Let’s see what would be similar to the situation in OpenJDK if we wanted to implement a JNI-like mechanism in Scala Native:

there would be separate managed and unmanaged heaps
safepoints and direct managed heap allocation would only be present in managed code
unmanaged code would be free of safepoints and direct managed heap allocations
unmanaged code that doesn’t access the managed heap would run as fast as if there were no GC present
unmanaged code would need some ugly boilerplate (or sophisticated trickery) to access managed objects
accessing managed objects from unmanaged code would entail relatively high overhead, but as in JNI the point of unmanaged code is to mostly avoid accessing the managed heap
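
The "ugly boilerplate" for reaching managed objects from unmanaged code could resemble JNI-style handles. The sketch below is a guess at the shape, not a Scala Native API: `HandleTable`, `register`, `withObject` and `release` are all hypothetical, and the `synchronized` blocks stand in for "wait until no STW pause is in progress".

```scala
import scala.collection.mutable

// Hypothetical handle table: unmanaged code holds only opaque Long handles,
// so a moving GC can relocate the object and update the slot behind its back.
object HandleTable {
  private val slots = mutable.Map.empty[Long, AnyRef]
  private var next = 0L

  // Managed side: publish an object and hand out an opaque handle.
  def register(obj: AnyRef): Long = synchronized {
    next += 1
    slots(next) = obj
    next
  }

  // Unmanaged side: every access pays for a lookup plus synchronization,
  // which is the "relatively high overhead" mentioned above.
  def withObject[A](handle: Long)(f: AnyRef => A): A = synchronized {
    f(slots(handle))
  }

  // Drop the handle so the object can become garbage again.
  def release(handle: Long): Unit = synchronized {
    slots.remove(handle)
    ()
  }
}
```

For example, `val h = HandleTable.register("hello")` followed by `HandleTable.withObject(h)(_.toString.length)` reads the object through the handle, and `HandleTable.release(h)` gives it back.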

The above scheme has one big drawback: almost everything in the Scala library expects a GC, as objects are never directly freed (well, managed platforms prohibit explicitly deleting managed objects anyway). Therefore unmanaged Scala Native code would need a separate standard library, plenty of macros to rewrite e.g. for comprehensions into code that doesn’t generate garbage, etc.

Given the complexity of the above scheme, I propose a much simpler one. First, forget everything above as it’s not relevant anymore. Second, let’s see what the new idea looks like:

there’s still a split between managed code and unmanaged code (e.g. unmanaged code could be marked with an @unmanaged annotation)
unmanaged code doesn’t contain any safepoints, memory barriers, pointer healing or any other GC-related awareness (except the new keyword, which obviously needs to allocate in the managed heap that is under GC control)
there’s no penalty (neither in ugliness nor in performance) for accessing managed objects from unmanaged code
most important point: the only difference that unmanaged code brings is that it disables garbage collection entirely (on the first transition from managed code to unmanaged code) until the end of that unmanaged code
when no unmanaged code is running and there are currently no stack frames related to unmanaged code, the GC is enabled again and can collect any garbage
no garbage collection during unmanaged code execution means no need for thread synchronization, so code can run at full speed, but we also risk running out of memory
unmanaged code is reentrant and there is a thread-local count of sequences of unmanaged stack frames; only when that count goes back to zero (for all threads) is the GC enabled again
turning garbage collection on and off (globally) is costly, so it shouldn’t be done often; this is similar to JNI, where very short JNI calls have too high an overhead to be profitable at all

Example:

object Main {
  val uCnt = new ThreadLocal[Int] { override def initialValue(): Int = 0 } // unmanagedCount

  def main(args: Array[String]): Unit = {
    // uCnt == 0, GC is turned on
    val greeting = prepareGreeting(args.head)
    // still uCnt == 0

    // uCnt += 1 due to transition from managed code to unmanaged code
    // since uCnt went from 0 to 1 we must call the GC,
    // in this case to block it from collecting garbage
    printLn(greeting)
    // uCnt -= 1 due to transition from unmanaged code to managed code
    // since uCnt went from 1 back to 0 we must call the GC,
    // in this case to inform it that the current thread no longer requires blocking of GC activity

    // uCnt == 0, we're entering this method with GC turned on
    managedA()
  }

  def prepareGreeting(who: String): String =
    s"Hello, $who"

  @unmanaged
  def printLn(line: String): Unit =
    println(line)

  def managedA(): Unit = {
    // incrementing uCnt on transition from managed to unmanaged
    // uCnt: 0 --> 1 : GC must be blocked, we need to call GC to do that
    unmanagedB()
    // decrementing uCnt on transition from unmanaged to managed
    // uCnt: 1 --> 0 : GC was blocked and current thread doesn't need that blocking now
    //  we need to call the GC to let it know that
    //  if all threads have uCnt == 0 then GC can and should be enabled again
  }

  @unmanaged
  def unmanagedB(): Unit = {
    // calling managed code doesn't change uCnt
    // uCnt: 1 --> 1 : GC was blocked and must stay blocked, so no GC call needed
    managedC()
  }

  def managedC(): Unit = {
    // incrementing uCnt on transition from managed to unmanaged
    // uCnt: 1 --> 2 : GC was blocked and must stay blocked, no GC call needed
    unmanagedD()
    // decrementing uCnt on transition from unmanaged to managed
    // uCnt: 2 --> 1 : GC was blocked and must stay blocked, no GC call needed
  }

  @unmanaged
  def unmanagedD(): Unit = {
    // uCnt stays the same as we're calling unmanaged code from unmanaged code
    unmanagedE()
    // when inlining managed code to unmanaged code, the managed one becomes unmanaged
    //   as we know that GC is stopped anyway
    managedF()
  }

  @unmanaged
  def unmanagedE(): Unit = {
    ... // something
  }

  inline def managedF(): Unit = {
    ... // something
  }
}
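
The counter bookkeeping in the example can be approximated in plain Scala as below. This is a simplified sketch, not a proposal for the real runtime: `GcGate`, `blocking` and `gcAllowed` are hypothetical names, nothing here actually pauses a collector, and every nested `blocking` call bumps the per-thread depth, whereas the scheme above only counts managed-to-unmanaged transitions.

```scala
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical runtime hook; no such object exists in Scala Native.
object GcGate {
  // Number of threads that currently have at least one GC-blocking frame.
  private val blockedThreads = new AtomicInteger(0)

  // Per-thread nesting depth of blocking sections (plays the role of uCnt).
  private val depth = new ThreadLocal[Int] {
    override def initialValue(): Int = 0
  }

  // The collector would consult this before starting a cycle.
  def gcAllowed: Boolean = blockedThreads.get() == 0

  // Run `body` with collection blocked for the duration of the call.
  def blocking[A](body: => A): A = {
    val d = depth.get()
    depth.set(d + 1)
    // 0 -> 1: the only transition that has to notify the collector.
    if (d == 0) blockedThreads.incrementAndGet()
    try body
    finally {
      depth.set(d)
      // 1 -> 0: this thread no longer needs the collector blocked.
      if (d == 0) blockedThreads.decrementAndGet()
    }
  }
}
```

Only the outermost transition on each thread touches the shared counter, which matches the point that nested calls need no GC call at all.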

Tell me if that makes any sense and if I understood the problem correctly.

Update:
Actually, @gcBlocking would be a better annotation name than @unmanaged in that second (simpler) proposal.

sjrd · May 9, 2020, 4:37pm

I don’t see anything obviously wrong with what you suggest. However, that would be a very significant departure from Scala Native’s core design and philosophy. That is not a direction I am willing to take, but if someone would like to try and research that direction, they can do so.

tarsa · May 10, 2020, 3:08pm

Where are Scala Native’s core design and philosophy documented? How are they violated by my second, simpler proposal (i.e. @gcBlocking methods)?

Those @gcBlocking methods would be faster (not counting the transitions between enabled and disabled GC) not only under a WASM environment, but also in regular non-sandboxed ones (i.e. native x86, ARM, etc. code). They would provide a speedup regardless of the GC implementation (given it is a tracing GC, which is practically almost always the case under JIT compilers).

texasbruce · May 17, 2020, 12:21am

Is there any plan for Windows support?

It would also be awesome to have a GUI library. There’s an SN port for GTK, but it seems to be abandoned.