Design of -Xasync

bmeesters · July 14, 2020, 9:29am

First of all I want to thank everyone that is actively helping with the Scala releases. I am thoroughly enjoying Scala and really appreciate what you are doing.

Without being too demanding, I hope I can ask for some clarification regarding scala-async. To be honest, it doesn’t feel very ‘Scala’-like to make a ‘one-off’ feature. Something like Monadless seems more like Scala in trying to provide an abstraction that works for anything that can be mapped and flatMapped over. It would be nice to get some clarification on why this decision was made the way it was, because I (and others) are missing why a more general approach is less favorable.

If this is the wrong place to ask, feel free to branch it out into its own topic.

sjrd · July 14, 2020, 12:19pm

I believe that what made it into the compiler, under -Xasync is actually very similar to Monadless. There’s nothing in there that is specific to Futures. All it works on is a pair of marker methods (like lift/unlift in Monadless, but customizable to have more meaningful names), and a structural interface of methods called for some shapes of trees (like rescue/ensure in Monadless). The unit tests contain at least two different implementations: one for Futures with async/await and one for Option with optionally/value. So IIUC your comment, your wish has already been fulfilled.
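To make the Option case a bit more concrete, here is a rough hand-written sketch of what a block such as optionally { value(a) + value(b) } has to be equivalent to. It is not the compiler's output and not the source of the test DSL; OptionSketch and sumBoth are names made up for illustration.

```scala
object OptionSketch {
  // Hand-written equivalent of a hypothetical `optionally { value(a) + value(b) }`:
  // the whole block yields None as soon as any awaited Option is empty.
  def sumBoth(a: Option[Int], b: Option[Int]): Option[Int] =
    a.flatMap(x => b.map(y => x + y))
}
```

With this shape, sumBoth(Some(1), Some(2)) gives Some(3), and any None input short-circuits the rest of the block.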

bmeesters · July 14, 2020, 2:41pm

Thanks @sjrd for the quick reply. It seems indeed that this is the generality I expect and love about Scala. I was a bit put off by the name (specifically named the async DSL), since it doesn’t convey the generality you describe. Since this is still experimental I understand this takes time. But some documentation about this setup would be great.

sjrd · July 14, 2020, 2:57pm

The release notes for 2.12.12 are a bit terse on the topic. The earlier release notes for 2.13.3 say a little more: the feature is undocumented for now, but the compiler team will publish a separate blog post with more documentation. Hopefully that will help.

Experimental -Xasync

This successor to scala-async allows usage with other effect systems besides scala.concurrent.Future.

Compiler support for scala-async; enable with -Xasync (#8816)

We will publish a blog post with more detail on this work by @retronym, building on his earlier collaboration with @phaller. In the meantime, see the PR description.

This feature will also be included in the 2.12.12 release.

djspiewak · July 15, 2020, 4:54pm

What made it into the compiler is still substantially incompatible with what Monadless does. The mechanism in the compiler makes the assumption that the F[_] type under transformation at the very least behaves like Future in the sense that it must run eagerly (or approximate such) and memoize (or emulate memoization). Lazy types in general do not work for this, and types which cannot be run to “a value or an error” also do not comply.
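A small sketch of the distinction, in case it helps. Lazy below is a hypothetical stand-in for a lazy effect type (it is not Cats Effect's IO), and the object and method names are invented for the example. The only claim is about the standard library Future: its body starts when the Future is created and its result is cached.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import java.util.concurrent.atomic.AtomicInteger

object EagerVsLazy {
  // Hypothetical minimal lazy effect: nothing runs until `run()` is called,
  // and every call to `run()` repeats the work.
  final class Lazy[A](thunk: () => A) {
    def run(): A = thunk()
  }

  // Future: the body is scheduled at construction and the result is memoized.
  def futureRuns(): Int = {
    val counter = new AtomicInteger(0)
    val f = Future(counter.incrementAndGet())
    Await.result(f, 5.seconds)
    Await.result(f, 5.seconds) // second observation reuses the cached result
    counter.get()
  }

  // Lazy: constructing the value runs nothing; each `run()` re-executes the thunk.
  def lazyRuns(): Int = {
    val counter = new AtomicInteger(0)
    val l = new Lazy(() => counter.incrementAndGet())
    l.run()
    l.run()
    counter.get()
  }
}
```

Here futureRuns() returns 1 while lazyRuns() returns 2, which is the "eager and memoized" behaviour the state machine leans on.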

For example, it is possible to define a Cats Effect adapter for -Xasync (Jason has a prototype of such), but you need Effect to make it work due to the fact that the state machine needs to run the effect. For those unfamiliar with the Cats Effect hierarchy, Effect is the “yeah you’re abstract, but secretly you’re actually just IO” typeclass. It is a constraint which is very rarely present in real Cats Effect-using codebases, and for very good reason.

All of this means that -Xasync is effectively not applicable to things other than Future. It is applicable to other types, but those types need to themselves be Future or something isomorphic to Future.

Now, Monadless also has its own series of problems with this. It handles fewer constructs than scala-async does (partially for this reason), and it doesn’t correctly deal with side effects. In theory, a scala-async adapter for Cats Effect should require only Monad or Applicative for most restructurings (this is what Monadless does now), and should require either Sync or Async whenever side-effects are suspended.

The fundamental difference here is that scala-async runs the effect and then attempts to re-wrap the running. There are a litany of problems with this from a thread pool prioritization standpoint, but it mostly works with Future which is already playing fast-and-loose with pools. Monadless simply restructures the code to build a new effect which can then be run. There’s no reason why scala-async can’t do this, but it certainly can’t do it with the current finite state machine API, which is heavily biased towards Future-like datatypes.

To be clear, I really like -Xasync and I want it to work with Cats Effect (and other similar types), it just doesn’t right now. I think it could, but it would require reworking the adapter API to remove assumptions about execution model, which is possible (as Monadless proves) but certainly non-trivial.

Edit: If none of what I said above regarding the finite state machine mechanism remains true, then I would absolutely love to be corrected. I’m basing this on the limited documentation available, Jason’s older work, and the source code of the scalac PR that landed in 2.13.3.

schrepfler · July 15, 2020, 5:24pm

There was also this interesting discussion on Twitter where @fwbrasil suggests an alternative implementation using statement rewrite-rules https://twitter.com/flaviusbraz/status/1276525386491297794 and considering his experience in optimising and implementing a highly efficient Future I’d like to hear more about this approach as well.
Is it fair to say the state machine based approach was inspired by Kotlin?

djspiewak · July 15, 2020, 5:26pm

My impression based on looking at the code is that the state machine approach was inspired by Future itself. It honestly looks like something that was written to make Future work in the most direct way possible, then moderately generalized to support things that aren’t nominally Future, but have the same semantics.

nafg · July 15, 2020, 5:46pm

Javascript async await is implemented with a state machine. Or did you mean the specifics of the particular state machine approach?

djspiewak · July 15, 2020, 5:52pm

The specifics of this particular approach. There’s going to be a state machine either way, but this is a very imperative state machine that relies on eager evaluation and memoization. That’s exactly what Future does in general, but not what functional effects like IO do.
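For readers who have not looked at the generated code, this is roughly the flavour of state machine I mean, written by hand for async { await(fa) + await(fb) }. It is an approximation for illustration only, not what scalac emits, and the names (HandWrittenStateMachine, sum, step, demo) are invented. Note how it depends on fa and fb already running and on being able to attach a callback to them.

```scala
import scala.concurrent.{Await, ExecutionContext, Future, Promise}
import scala.concurrent.duration._
import scala.util.{Failure, Success, Try}

object HandWrittenStateMachine {
  // Hand-written approximation of `async { await(fa) + await(fb) }`.
  def sum(fa: Future[Int], fb: Future[Int])(implicit ec: ExecutionContext): Future[Int] = {
    val result = Promise[Int]()
    var state = 0 // which await we are currently suspended on
    var a = 0     // local variable that must survive across the suspension

    // Called each time an awaited Future completes; resumes at the saved state.
    def step(tr: Try[Int]): Unit = tr match {
      case Failure(e) => result.failure(e)
      case Success(v) =>
        state match {
          case 0 =>
            a = v
            state = 1
            fb.onComplete(step) // suspend again until fb completes
          case _ =>
            result.success(a + v)
        }
    }

    fa.onComplete(step)
    result.future
  }

  def demo(): Int = {
    import ExecutionContext.Implicits.global
    Await.result(sum(Future.successful(1), Future.successful(2)), 5.seconds)
  }
}
```

demo() returns 3. Everything here is driven by onComplete and a Promise, which is natural for Future but has no direct counterpart for a value that merely describes a computation.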

szeiger · July 15, 2020, 7:54pm

That’s right. The original Scala Async 1 already had support for different Future or Future-like types so you could support 3rd-party Future implementations and also have custom implementations for the unit tests. It requires more than just the basic monad functions to generate simpler and more efficient code. Simpler code was also the main goal of Async 2 (integrated into the compiler). Generating the state machine later keeps the AST smaller during the intermediate phases.

djspiewak · July 15, 2020, 8:10pm

That doesn’t surprise me at all, but it’s important to understand that the approach chosen makes it impossible to support anything that isn’t Future or all-but identical to it. This is exceptionally restrictive. Monadless represents a relatively decent existence proof that a straightforward transformation is possible even with just the basic monad operators. Is there a more specific example of why such an approach was rejected?

I’m not trying to run down any of the excellent work done here. My hope is just that we can take this opportunity while it’s behind an -X flag to generalize the mechanism so that it can benefit more than just a very narrowly-defined slice of the ecosystem.

fwbrasil · July 15, 2020, 9:58pm

The FSM approach is flawed at its core:

- it makes unreasonable assumptions about how the execution will happen
- it’s considerably harder to extend. Compare https://github.com/scala/scala/blob/2.13.x/src/partest/scala/tools/partest/async/OptionDsl.scala with https://github.com/monadless/monadless/blob/master/monadless-stdlib/src/main/scala/io/monadless/stdlib/MonadlessOption.scala as an example
- it can have worse performance because it introduces more indirection, making the work of the JIT compiler harder

I still don’t understand how it landed in an official language release without a broader discussion with the interested parties in the community. Considering the feedback on Twitter, several people would prefer a more generic solution like Monadless: @djspiewak (Cats), @alexandru (Monix), @jdegoes (Zio), and myself (Twitter Future).

Now, Monadless also has its own series of problems with this. It handles fewer constructs than scala-async does (partially for this reason)

@djspiewak could you elaborate? Monadless supports a superset of the constructs supported by scala-async. A few examples are short-circuiting boolean logic, try/catch, functions, classes, methods, and others. For more details see the Monadless README on GitHub (monadless/monadless: Syntactic sugar for monad composition in Scala).

and it doesn’t correctly deal with side effects

could you expand on this?

djspiewak · July 15, 2020, 10:48pm

Okay, to head off the firestorm a bit: there was broader discussion, it just wasn’t trumpeted super-loudly. Jason had a discussion with Alexandru and myself on public GitHub that started with a prototype of a scala-async adapter for Monix Task and ultimately included a prototype of an implementation for any Cats Effect. So there was discussion, it was just further under the radar than you might have expected. I gave much of this same feedback in that discussion, but I think Jason didn’t have time to address it. I certainly was surprised when this landed in the official compiler without any further comment; I don’t blame anyone, I just wish it had been handled a bit differently.

I’m getting out a bit over my skis here. Please correct me where I’m in err. My understanding is that scala-async was able to handle certain higher-order function cases that Monadless couldn’t (since not all things are Traverse).

could you expand on this?

Something I’ve been thinking about is the reason people want to use async/await. The answer to that is basically that they want to have an imperative control flow within an execution environment which is callback-oriented. -Xasync and Monadless both assume the CPS control flow is encoded by a monad (in the case of scala-async, specifically Future). But this means that they probably want to squeeze effects in here somewhere:

async {
  writeToFile(await(fa) + await(fb))
  println("done!")
  await(finalizeF)
}

I dunno. I’m making up examples here. The point being that I think the body of the async needs to be treated as a place wherein effects may need to be captured. Right now, I would assume that most of this with monadless falls into accidentally-lazy land within map and flatMap statements? The more correct thing to do, if we buy into my assumption, is to take a Sync instance when such effects may need to be wrapped. So rather than effectively turning writeToDatabase(...) into pure(writeToDatabase(...)), you would turn it into delay(writeToDatabase(...)).
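To spell out the pure versus delay difference, here is a minimal sketch. The IO class below is a hypothetical stand-in defined just for this example, not the Cats Effect API; the only things borrowed from Cats Effect are the ideas that pure takes an already-evaluated value and delay takes a by-name block.

```scala
object PureVsDelay {
  // Hypothetical stand-in for a lazy effect type; not the Cats Effect API.
  final class IO[A](val unsafeRun: () => A)

  object IO {
    // By-value parameter: the argument is evaluated at the call site,
    // so any side effect happens before the IO value even exists.
    def pure[A](a: A): IO[A] = new IO(() => a)
    // By-name parameter: the argument is suspended until the IO is run.
    def delay[A](a: => A): IO[A] = new IO(() => a)
  }

  def demo(): (Int, Int, Int) = {
    var writes = 0
    def writeToDatabase(): Unit = writes += 1

    val eager = IO.pure(writeToDatabase())      // the write happens right here
    val afterPure = writes
    val suspended = IO.delay(writeToDatabase()) // nothing happens yet
    val afterDelay = writes
    suspended.unsafeRun()                       // the write happens now
    eager.unsafeRun()                           // no further write: the value was precomputed
    (afterPure, afterDelay, writes)
  }
}
```

demo() returns (1, 1, 2): the pure version performs the write while the program is still being built, whereas the delay version performs it only when run.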

This would also open the door to taking an Async and allowing for a third construct wherein people await a callback. I’m not sure if that’s a good idea or not; just spitballing.

fwbrasil · July 15, 2020, 11:10pm

@djspiewak thanks for the clarifications

I’m getting out a bit over my skis here. Please correct me where I’m in err. My understanding is that scala-async was able to handle certain higher-order function cases that Monadless couldn’t (since not all things are Traverse ).

It’s been some time since I developed Monadless so my memory might be failing but afaik there isn’t a construct that is supported by scala-async that isn’t supported by Monadless.

The more correct thing to do, if we buy into my assumption, is to take a Sync instance when such effects may need to be wrapped. So rather than effectively turning writeToDatabase(...) into pure(writeToDatabase(...)) , you would turn it into delay(writeToDatabase(...)) .

I think it’s more a question of how you set up your Monadless instance. It should be possible to change the Cats integration in Monadless to behave differently if Sync is available in the implicit scope. I still don’t see any issues with how Monadless deals with side effects, though.

I find Monadless’ transformation code quite readable in case you want to understand how it works: monadless-core/src/main/scala/io/monadless/impl/Transformer.scala in the monadless/monadless repository on GitHub.

sideeffffect · July 16, 2020, 12:06am

I think that whenever we discuss async/await-like programming or for-comprehension in Scala, we should take a step back and have a look at F#'s Computation Expressions.

They are a variant of Haskell’s do-notation or of Scala’s for-comprehension, but it’s much more generalized and customizable. It doesn’t work only with flatMap <- (as if “monadic” val x = ...) and pure yield, but also works with for/while loops, sequences, pattern matches and resource disposal and try-catch-finally analogs from the “imperative” world.

Here’s a hypothetical example to illustrate the point:

let fetchAndDownload url =
  async {                                         // marks the start of the computational expression, like `for` in Scala, but async is an object which defines how it's all wired up together
    let urlStripped = strip url                   // usual variable binding; it's nice you can do that even as the first thing -- you can't do that in Scala
    let! data = downloadData urlStripped          // `let! x = y` is like `x <- y` in Scala, but it's much visually closer to its imperative cousin, which in Scala would be `val x = y`
    let processedData = processData data          // another usual variable binding
    use! monitor = createMonitor processedData    // like `let! x = ...`, but the resource is disposed of at the end of the (otherwise asynchronous) block, think of cats-effect's `Resource.use` (`use x = ...` is F# normal resource acquisition)
    do! notifyMonitor                             // like `let! () = x`, but nicer syntax than `_ <- x` which is what Scala forces you to do
    return processedData                          // like `pure`, serves the same purpose as Scala's `yield` block
  }

The most important part (even before the generality and customizability) is that the syntax is intentionally made similar to the “imperative counterparts”. There’s just a ! added at the end of the keyword. This makes the concept easy to learn and the code fluent to write and read.
Playing with other syntax constructs programmers are familiar with from the imperative world, like resource acquisition/disposal or exceptions (try-catch-finally), is also handy in practice.

The other useful thing to have, besides looking like imperative code, would be debugging like imperative code, where one can nicely see the stacktrace as is logically expected (even though that’s not how it actually is).

And OCaml has recently gained syntax even for applicative composition.

Whichever path Scala takes, be it the improvement of for-comprehension based on callbacks or this new async/await with state machines transformation, I hope it will align well with the rest of Scala’s syntax and that it will be more general and work with more than just Futures. Currently, this is Scala’s weak spot, but we can learn from other languages’ successes.

More info and examples on Computation Expressions

djspiewak · July 16, 2020, 2:29pm

Just to clarify, async/await (and also the F# syntax) are not directly comparable to for-comprehensions. for-comprehensions are special syntax for certain function calls, whereas async/await (and similar) change the semantics of existing constructs. Critically, they change the semantics of constructs which have no bearing on each other in normal syntax. For example:

async {
  val a = await(fa)
  val b = await(fb)
  ...
}

// vs

for {
  a <- fa
  b <- fb
  ...
} yield ...

The critical example here is reordering:

async {
  val b = await(fb)
  val a = await(fa)
  ...
}

Referential transparency says that in all cases, you can reorder independent expressions and the resulting program will be the same. Note that reordering for-comprehensions isn’t reordering independent expressions, because it’s flatMaps under the surface, and the syntax is very explicitly different so as to convey that.

With async/await, though, pure expressions get restructured to be non-independent, since a and b are related by a flatMap. In other words, you get all the perils of imperative code tangling within a lexical scope, but applied to monads. This is actually the whole point of the construct (imperative logic, with all the familiarity and dangers), but it also means that it’s 100% not something that everyone will want, and definitely not something that is always applicable. Thus, not something that will replace for-comprehensions.
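A sketch of why the order is observable once flatMap is involved. The IO class is again a hypothetical minimal lazy effect written for this example (not a real library type), and the log is only there to record which effect ran first.

```scala
import scala.collection.mutable.ListBuffer

object OrderMatters {
  // Hypothetical minimal lazy effect with just enough API for a for-comprehension.
  final class IO[A](val unsafeRun: () => A) {
    def map[B](f: A => B): IO[B] = new IO(() => f(unsafeRun()))
    def flatMap[B](f: A => IO[B]): IO[B] = new IO(() => f(unsafeRun()).unsafeRun())
  }

  def demo(): (List[String], List[String]) = {
    val log = ListBuffer.empty[String]
    val fa = new IO(() => { log += "a"; 1 })
    val fb = new IO(() => { log += "b"; 2 })

    // Desugars to fa.flatMap(a => fb.map(b => a + b))
    val ab = for { a <- fa; b <- fb } yield a + b
    // Desugars to fb.flatMap(b => fa.map(a => a + b))
    val ba = for { b <- fb; a <- fa } yield a + b

    ab.unsafeRun()
    val first = log.toList
    log.clear()
    ba.unsafeRun()
    (first, log.toList)
  }
}
```

demo() returns (List("a", "b"), List("b", "a")). The same dependency exists between the two val lines inside an async block; it is just no longer visible in the syntax.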

Though with that said, I really wish for-comprehensions were improved. By like a lot. Better-monadic-for helps a lot in Scala 2, and Scala 3 is getting some improvements (e.g. Guillaume is looking at removing trailing identity maps in the case that it doesn’t change the type), but even still it’s not what it could be as a construct.

littlenag · July 16, 2020, 2:54pm

I’ve not run across Computation Expressions before and will need to check it out.

As for myself, I would rather that Scala get coroutines, à la Kotlin, rather than things hyper-specialized to particular Monads, or Monads in general.

From my perspective, coroutines are a much more general and useful construct. Coroutines can work with Monads/Effects but without being locked into that structure.

For example, coroutines can express async/await, which is how Kotlin implements these notions as I understand. The POC from a few years back (http://storm-enroute.com/coroutines/) has an example implementing async/await via coroutines. Best I can tell the FSM transform that scala-async does is probably already 90% of what coroutines would need.

Coroutines are usually viewed as living on the imperative side of the divide, but I think their applicability to FP is understated. One perspective is that for-comprehensions and async/await are effect generators, but without the regular control flow statements that we normally use. A coroutine can yield effects, but have access to all the same control flow statements.

Obviously coroutines have their own set of challenges. I would hope these could be dealt with.

djspiewak · July 16, 2020, 4:08pm

Coroutines do not compose in the way that effects do. You cannot enrich coroutines with dependency injection, or separate error channels, or tracing semantics, or alternative resource management. This is basically the argument for monads abstracting over coroutines, which is exactly what IO is. Once you have a coroutine monad, you can compose it in a general way, which gives you considerably more power. for-comprehensions and async/await recover the syntactic side of things, bringing the imperative usability of monadic interfaces on par with coroutines, while leaving intact the advantages of a monadic API from the standpoint of enabling combinators and abstraction.

djspiewak · July 16, 2020, 5:22pm

Here’s maybe a better example:

async {
  val a = await(fa)
  throw new RuntimeException(a.toString)
}

What happens here? Where does the exception go? Do you require a MonadError[F, Throwable]? Do you just throw it and hope for the best? Cats Effect guarantees that exceptions are caught in Sync#delay, but it makes no guarantees about flatMap/map (and it is in fact a violation of the functor laws to catch within those combinators, though in practice I think all of the practical IOs do it anyway).
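The standard library already shows both answers, which is part of why the question matters. In this sketch (object and method names invented for illustration), Future#map captures the exception as a failed Future, while Option#map has no error channel and simply lets it propagate.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.Try

object WhereDoesItGo {
  // Future#map catches non-fatal exceptions and turns them into a failed Future.
  def futureCaptures(): Boolean = {
    val f = Future.successful(1).map[Int](a => throw new RuntimeException(a.toString))
    Await.ready(f, 5.seconds).value.exists(_.isFailure)
  }

  // Option#map has nowhere to put the error, so the exception reaches the caller.
  def optionThrows(): Boolean =
    Try(Option(1).map[Int](a => throw new RuntimeException(a.toString))).isFailure
}
```

Both methods return true. An adapter for some other F[_] has to pick one of these behaviours, or demand something like MonadError to make the choice explicit.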

littlenag · July 16, 2020, 5:55pm

I’m not arguing that coroutines compose in the way that effects do. They obviously don’t, but for my purposes that is tangential.

Instead, I’m arguing that the same code transform that scala-async does is almost exactly what would be necessary to implement coroutines, with only suspend/yield as the missing piece. So instead of limiting that transformation logic to transforming Futures/IOs why not go further, support coroutines, and then implement async/await in terms of the coroutine?

Then we get two spiffy new tools instead of one.

Of course that only works if the semantics of async/await don’t need further enrichment beyond what a coroutine could do. I can see Computation Expressions as being like that.
