Pre SIP: `for` with Control Flow (an alternative to Scala Async, Scala Continuations and Scala Virtualized)

odersky · July 29, 2020, 8:51am

In the .Net world F# was the first mover for async syntax, but then C# won with async/await. At least that’s what I see. Scala does have generalized async/await now. Furthermore, the JVM might get something like native coroutine support with Loom. So my conclusion is that adding new syntax to the language now is premature. Let’s see first how async/await and Loom play out.

lihaoyi · July 29, 2020, 10:28am

My personal opinion here is that “virtualized Scala” or “lifted Scala” would be incredibly valuable, but the design space is large enough I would like to see an implementation become popular “in the wild” via a macro or compiler plugin before considering it for the core language.

Among the possible alternatives:

Purely applicative lifting, like Mill does
Mostly applicative lifting, like SBT does
“One-shot” monads, like scala-async does
Generalized monads, like Monadless does
Full-syntax virtualization, like F# or DSL.scala do

Each of these points in the design space have different tradeoffs, benefits and limitations. Exploring the viability and usability and utility of even one of the above alternatives is already a huge task, hence why I do not believe a single SIP discussion would be enough to design on a design.

We already have an implementation for DSL.scala, but it has not gotten broadly popular enough that I would be comfortable considering it for “upstreaming” in the language. Others like the Mill or SBT implementations similarly remain niche. Like it or not, these kinds of language features are complex enough that no matter how much we discuss the design, it’s only through broad experience using them that their true utility and limitations become apparent.

Scala-Async has a long history of broad discussions and usage, and is a known quantity. While I’m not totally comfortable with finalizing it without further discussion, I do think it has a good claim to being broadly accepted and useful. For other ideas, as much as I think they show potential, I would like to see some broader community usage before we have enough experience to discuss and compare them in an informed manner

yangbo · July 30, 2020, 12:40am

Macro or compiler plugin based implementation is impossible in Scala 3, due to lack of the ability to produce untyped ASTs.

Due to lack of the ability to produce untyped ASTs, there is no way to port Scala.Async (name based, supporting flat-mappable types other than Future) to Scala 3. @odersky, do you suggest the Scala community do nothing except waiting for Loom? If it is the case, that would be a good reason to not upgrade to Scala 3, at least until Loom has been deployed.

yangbo · July 30, 2020, 4:34am

@odersky, what do you think if I create a PR to Dotty as an experiment. After all, this is the only way to start the experiment in Dotty since we cannot generate untyped AST from compiler plugins or macros.

rssh · July 30, 2020, 7:01am

Currently dotty-cps-async supports types other than Future (and event interoperability between different types of monads and automatic colouring, which is impossible in untyped case).
If exists some corner cases, where something is impossible to express in typed form – it would be great to look and analyze them. I can’t find one.

//about proposed language changes itself:

the part about if in for is great, it’s intuitive, can clear semantics for collections and can be implemented without risk. (although I prefer to have 2 translations for sync and async conditions)
in match I afraid about nesting .right expression ‘k-times’.
block part looks unfinished or too restrictive (not specified, how to handle ← expression inside stats).
next steps having x ← y instead x = await(x) wrapped in some async expression: the value for developer will be the value of avoiding àsync brackets on top of each function.
This can be an interesting theme for research in the middle-term, two things which come to mind:
– a) if we want ‘lift’ type of monad to the type of DSL expression, then we would have untrivial inference with typing;
– b) proposal as specified less powerful than existing async wrappers, where while-s and nested blocks are supported.
This can be fixed in the long run after research.
also specifying concrete rules in specification prevent optimizations (for example, dotty-cps-async have different patterns when all subexpression are async or some not, so we have no more than minimum nesting of monads applications. )

If anybody want to count my opinion in the short term — I see adopting ìf` can be factored out to small and nice change.

odersky · July 30, 2020, 9:13am

I doubt that. Scala async is a compiler phase that’s running on typed trees late in the pipeline. It could probably be a plugin in Dotty, or else a compiler phase enabled by a -Y option, the way it is done now. I see no evidence that any of this has to run before typer.

@odersky, what do you think if I create a PR to Dotty as an experiment. After all, this is the only way to start the experiment in Dotty since we cannot generate untyped AST from compiler plugins or macros.

A PR that ported Scala async to Dotty would be awesome.

Related: A project that re-does for expressions to drop withFilter and use a zero instead would also be very welcome. This would fix the long standing issue Scala Wart: Convoluted de-sugaring of for-comprehensions · Issue #2573 · lampepfl/dotty · GitHub. I guess it’s not going to be easy to wean Scala off withFilter. But we’d have to do this work before we can consider any further extensions to for expressions.

yangbo · July 30, 2020, 6:50pm

I did not mean it runs before typer. The current Scala 2 implementation is a macro that generates untyped tree, even though the input expressions are typed. In current Dotty it’s impossible because lack of ability of retype-check, which is required because we don’t have a universal type signature of flatMap.

yangbo · July 31, 2020, 4:12am

Why?

That should follow the existing for comprehension translation rules, since the proposed rules are just some addition to existing rules.

That’s exact the approach that has been implemented in Dsl.scala’s ResetEverywhere plugin. It might be a good idea.

I would suggest to use Dsl type class instead of Monad, because Dsl has been proven it works well with Scala type inference, even in complex use cases.

Why do you say that? I think this proposal also addresses while and nested blocks.

I disagree. Virtualized control flow provides the ability of optimization in runtime.

rssh · August 1, 2020, 6:02pm

afraid about nesting .right expression ‘k-times’.

Why ?

different shapes when using and when implementing DSL, which looks unintuitive for me. For match handling, I expect something which accepts a partial function shape, where all branches have equivalent status. Also, using number of çalls for the index will provoke DSL implementors to write low-level code, which always sequentelly process cases one by one in runtime. Of course, this is ‘stylistic’ preferences, which hightly opinioned.

Looking again: wrap { ...stats; r } is expanded to for { ...stats } yield r … oops, during first read I missed for, I.e. have read this as wrap { ...stats; r } is expanded for { ...stats } yield r [i.e. missed for in formula ]. This was the source of confusion, sorry.
Now, as I understand:

  for{ a <- b
         _ <-  { d <- e; x }
       }

will always be equivalent of

 for(a <- b;
      d <- e;
      _ <- x;)

and

 for(a <- b;
       {  d <- e;
        _ <- x; 
       } }

not supported, because block should be always in the right part. Ok, looks like I got this, still unsure, when it will be useful.

block: ok, I misinterpret one. But using only in the right parts inside for-s looks like restrictions.
while: restriction that it should belong to for loop. Cps transformers usually translate all control flow, regardless of they are situated inside for or not.

It depends. Sometimes, in runtime, you have no needed information. I.e. if we have ifThenElse(cond,a,b) and cond is synchronous, and we pass to ``ÌfThebElse cond.pure```, than interpreter see Future (or other monad) and can’t get value without switching context.

Btw, can DSL be universally typed in some structure of hight-kind types(?), to be able to use reflection (but outside of for syntax, because in macros you will see flatMap-s.).
With monads this works.

yangbo · August 3, 2020, 6:16am

That’s exactly the “naive use case” that I mentioned in the proposal, where you can simply implement right and left as identity.

rssh:

Looking again: wrap { ...stats; r } is expanded to for { ...stats } yield r … oops, during first read I missed for, I.e. have read this as wrap { ...stats; r } is expanded for { ...stats } yield r [i.e. missed for in formula ]. This was the source of confusion, sorry.
Now, as I understand:
  for{ a <- b
         _ <-  { d <- e; x }
       }
will always be equivalent of
 for(a <- b;
      d <- e;
      _ <- x;)
and
 for(a <- b;
       {  d <- e;
        _ <- x; 
       } }
not supported, because block should be always in the right part. Ok, looks like I got this, still unsure, when it will be useful.

Firstly, for without a loop or yield is not supported from this proposal. I have an idea to support the syntax, but not now. If we add a yield "result" clause to the expression, for example:

for {
  a <- b
  _ <-  { d <- e; x }
} yield "result"

should translate to

for {
  a <- b
  _ <- for { d <- e } yield x
} yield "result"

Note that d is a local value inside the nested for comprehension, and will not be accessible by the outer for comprehension.

I disagree. Explicit async / await has been adopted by many main languages. This proposal basically reuse for as the alternative to async in other languages, and reuse <- as the alternative to await in other languages. In practice, other language’s while statements with await must be present in an async function. It’s a requirement, not a restriction.

That’s the pitfall of the Monad-like approach. We don’t have to use Monad. In the refiable control flow approach, keywords.pure(cond) does not have to return a Future[Boolean], instead it’s just an opaque alias type Pure[Boolean] if you define Pure as opaque type Pure[A] = A.

I don’t know what exactly you were referring. Note that the definition of DSL control flow differs from the Dsl type class. The latter Dsl is a general type class for control flow introduced by the library Dsl.scala, while DSL control flow is defined in this proposal, denoting control flows that contain <- and will be translated to name based, signature free calls to some control flows functions. There are couple of options to implement these functions:

Creating functions that return specific concrete types, e.g. Future or Option.
Creating functions that return delimited continuations. This is a general solution because you can then evaluate continuations to concrete types.
Creating functions that delegate to Monad. This is a half-baked solution because you can’t simply use two monads in a single DSL block. For example, awaiting a Future in a for loop. There are monad transformers and Eff, built on top of Monad, to handle multiple DSLs. Unfortunately, none of those solution are transparent to users.
Creating functions that delegate to the Dsl type class. Unlike Monads, Dsl allows for multiple DSLs in a single DSL block.
Creating functions that reify control flow to data structures according to the technology described in the “reifiable control flow” section of this proposal.

yangbo · August 3, 2020, 6:53am

Awesome. I am so happy to know you are open to a PR, especially I like Scala Async’s approach, which translates from typed tree to name based calls instead of symbol based calls, unlike dotty-cps-async, which tied to Monad, excluding more general type classes or reifiable control flows.

Let me double confirm, I will work on a PR for dotty, which translates async / await to name based function calls. The possible approaches are:

A desugar transformation between parser and typer. Since it’s before symbol resolution, we have to introduce some new language keywords or reuse existing language keywords. The original idea of this proposal is to reuse for and <- instead of async/await, but I am also open to turn async/await into keywords.
Some hooks that override Typer.typedApply and replace the untyped tree and then continue type checking. Since it’s inside the typer, the symbol information is available, and no language keywords need to be introduced.
A separate phase after typer. Since name based function calls need retypecheck, this solution might reintroduce the infamous resetLocalAttrs bugs.

@odersky, what’s you opinion? Since you don’t like the approach 1 that changes the grammar, would you mind considering to merge a PR in approach 2? I promise all the transformation will be in a separate trait DesugarAsync, which can be enabled under a compiler flag, and is mixed into Typer. It should be totally optional and the impact to the typer is minimal.

rssh · August 3, 2020, 1:11pm

Ok. I belong to camp, which thinks that one control-flow is enough, and having all inside for will cause difficulties when programmer needs a ‘real for’ instead ‘control-flow for,’ but I know that second camp exists and can understand your point

yet one note: a <- b is alternative to a = await(b), not await(b). This simplifies translation for apply (which was for me one of the most complex parts of the cps) but limits the expressivity.

I have thought about the possibility of a recreation DSL idea in the macro variant. Since typing is doing before macroses, than we can define a DSL=algebra (set of types and rules on top of scala types) in the same way, as Monads.

I guess, that if we have syntax patterns, then we have appropriative typing, in some sense, because in general, we can define type T[A,B, … C], such as

[exression(a:A, b:B, .. c:C):T] == [transfom(expression(a,b, ... c)) can be successfully typed].

Thought, this definition will make type system Turing-complete and undedicable. This this is like non-formal intuition, that typing should exists.
I guess that for concrete rules of control-flow mangling we can search such algebra in the existing scala type system, and chances that exist one general HK-typing, understandable to developers, are quite big.

In monad this looks like:
for any T we can define F[T] which have map: F[T], T=>S => F[S], flatMap… etc (then may-be restore, transformers, etc). For desugaring explore properties of this monad algebra.
Other frameworks can add additional structure to this monad. (error handling, brackets, conversions ,… etc)

for DSL, we can think in that direction:
let we have F[] with (map, … )
then we can define DSL[F[], B, T], such as
exists:
- monadic structure over F (for for desugaring).
- sequential composition: andThen(a,b): F[A], F[B] => F[B]
- ifThenElse: F[Boolean], F[A], F[C] => F[A | C]
- while: (F[Boolean], F[Unit]) => F[Unit]
for handling more than one monad you can have something like convert: DSL[F[], … ] => DSL[G[], … ] (or define virtualized flow methods as ifThenElse[F[_],G[_],H[_]) on top of F,G,H, but I guess that this will end with lattice over F[_]-s, so monad conversion will be enough).

We will receive something like a subset of scala-virtualized, as F in role of Rep. I’m pretty sure that if you start typing DSL control-flow, than you will deduce generic typing. F[_]
And this will be possible to implement without touching scala-core, in the same manner, as dotty-cps-async (or even add handlers there).

But this will not work inside for loops, because macros do not see for, but map,
flapMaps instead.

This is a related idea (not precisely you proposing), maybe it will be useful if you want to adopt DSL-over-monad on practice before patching compiler. (because the number of people, who use the library will always be bigger, than who use the patched compiler). But if be ‘inside for’ is significant, that such macro-based approach will not be able to reflect original proposal

yangbo · August 3, 2020, 6:47pm

That’s possible and I have created a macro for your idea earlier this year at this branch: GitHub - Atry/Dsl.scala at 2.x-wip . The macro was completed but the runtime is WIP. There are some minor difference from your types, especially:

F[_] is not ideal. For example, you cannot do char <- "string", because "string" does not unify to F[_]. The better type signature is just If[Condition, Then, Else], instead of the higher-kinded types. See Dsl.scala/keywords-If/src/main/scala/com/thoughtworks/dsl/keywords/If.scala at 4124db849f7f692e56de21764c10337c8c825268 · Atry/Dsl.scala · GitHub

I am glad to see you now realized why we need keywords.right / keywords.left things.

The problem of the approach translating to symbol based function calls is that it tied to the interface. Even though I thought Dsl type class is a more general solution than Monad, I don’t want to persuade everyone.

The name based approach give the choice of optimization / implementation to the runtime.

The real problem is the priority of map / flatMap. I can provide an extension method flatMap to virtualize for comprehension, unfortunately it only works for those types that do not have their own flapMap, for example java.util collections. If for comprehension is translated to keywords.flatMap(fa) { ... } instead of fa.flatMap { ... }, I can easily support all types.

If we don’t change the for comprehension translation rule, a workaround is wrapping Scala collections to another type, which enables the virtualized flatMap, e.g. in Dsl.scala, Each is used for virtualized collections comprehension, and there is a to method to evaluate the virtualized code to regular collections: ThoughtWorksInc/Dsl.scala 2.0.0-M0+1-b691cde8 - com.thoughtworks.dsl.comprehension

odersky · August 4, 2020, 3:44pm

Having a port for Scala async to Dotty would be very nice! Now, how to do it? I think it should model closely what’s done in the existing 2.13 implementation that just landed. I don’t know that implementation in detail, but do know that a lot of thought went into it.

In terms of type checking I still don’t see the problem. I assume that async and await are defined in some library and imported into the user program. They would identify themselves to the compiler by some annotation that tells the compiler that code in this async needs a state transform. The user program has to typecheck with the types of async and await as given, but the exact types of these methods are arbitrary.

Then the translation would generate new code at some later phase, which again uses an untyped interface. At that point, the introduced low-level calls do need a type-check, but the compiler does that
all the time anyway. Have a look at tpd and TypeAssigner, they allow arbitrary tree constructions but typecheck the trees at the moment they are constructed.

The whole thing could either be done in a plugin or as an optional compiler phase under an -X flag, which is what 2.13 async now uses.

rssh · August 5, 2020, 4:36am

Hmm, Note, that async problem is already solved. (dotty-cps-async: https://github.com/rssh/dotty-cps-async ). implements absolutely all functionality of existing scala-async now, with the quite similar developer interface.

Of course, many competing solutions is better than none. But for having progress, we need a open field of competition [when solutions are different enough to have space to innovate], not just recreating,

Throwing resources into duplication of work which already done by somebody else, will create the most inefficient form of quasi-market, with big fragmentation (which force customer to make an unnecessary choice).

Such competition can have a sense in markets, connected with money, where each product receives a compensation proportionally to market share, but with open source, when monetization is very weak, this is just a waste of each other’s time.

So, I’m quite surprised with the conscious creation of such type of competition … quite interesting.

Regards.

odersky · August 6, 2020, 9:56am

I see your points. But the fact is that Scala async is a part of Scala 2.13 and is heavily used in large Scala shops. So we basically need to support it in Scala 3, it the same way we need to support the Scala 2 standard library. If there is a consensus that another solution such as dotty-cps-async is superior, then we can discuss migration strategies. But the consensus has to be there first. And, most importantly, it should not be tied into the 2 to 3 transition. Managing all the 2 to 3 changes is extremely hard as it is; we do not want to switch core libraries at the same time.

rssh · August 6, 2020, 1:23pm

Ah, transition - I see. Yes, you are right - if we want bug-to-bug compatibility, then the only possible way - is to translate the existing plugin with minimal changes.
From one side - maybe this is not so hard, from the other side – it’s far from origin @yangbo proposal and looks more like an ordinary commercial task. From the yet another point - if big Scala shops expect bug-to-bug compatibility from an open-source project without paying for it, then what is a long-term model of this game (?). Maybe the right way is to set up something like ‘compatibility-fund,’ which provides estimations and actual needs, where interested parties can donate money or time?

yangbo · August 9, 2020, 5:37am

Great! So I assume the current error in retypechecking is a bug, right? I create a bug report accordingly.

github.com/lampepfl/dotty

Cannot retypecheck after frontend phase

opened 05:32AM - 09 Aug 20 UTC

closed 12:17PM - 09 Aug 20 UTC

Atry

itype:bug

## Minimized code ```bash git clone https://github.com/Atry/dotty-retypechec…k.git && cd dotty-retypecheck && sbt Test/compile ``` ## Output ``` Cloning into 'dotty-retypecheck'... remote: Enumerating objects: 22, done. remote: Counting objects: 100% (22/22), done. remote: Compressing objects: 100% (14/14), done. remote: Total 22 (delta 0), reused 19 (delta 0), pack-reused 0 Unpacking objects: 100% (22/22), done. [info] welcome to sbt 1.3.13 (Oracle Corporation Java 10.0.1) [info] loading settings for project global-plugins from console.sbt ... [info] loading global plugins from /Users/atry/.sbt/1.0/plugins [info] loading settings for project dotty-retypecheck-build from plugins.sbt ... [info] loading project definition from /private/tmp/dotty-retypecheck/project [warn] There may be incompatibilities among your library dependencies; run 'evicted' to see detailed eviction warnings. [info] loading settings for project dotty-retypecheck from build.sbt ... [info] set current project to dotty-retypecheck (in build file:/private/tmp/dotty-retypecheck/) [info] Compiling 2 Scala sources to /private/tmp/dotty-retypecheck/target/scala-0.26/classes ... [info] Compiling 1 Scala source to /private/tmp/dotty-retypecheck/target/scala-0.26/test-classes ... [info] result of /private/tmp/dotty-retypecheck/src/test/scala/mypackage/Main.scala after typer: [info] package mypackage { [info] final lazy module val Main: mypackage.Main$ = new mypackage.Main$() [info] final module class Main$() extends Object(), _root_.scala.Serializable { [info] this: mypackage.Main.type => [info] def main(args: Array[String]): Unit = [info] if math.random().>(0.5) then [info] { [info] println(identity[String]("then")) [info] } [info] else [info] { [info] println(identity[String]("else")) [info] } [info] } [info] } [info] result of /private/tmp/dotty-retypecheck/src/test/scala/mypackage/Main.scala after inlinedPositions: [info] Unchanged since typer [error] -- Error: /private/tmp/dotty-retypecheck/src/test/scala/mypackage/Main.scala:4:12 [error] 4 | println(identity("then")) [error] | ^^^^^^^^ [error] | cannot infer type; expected type <?> is not fully defined [info] result of /private/tmp/dotty-retypecheck/src/test/scala/mypackage/Main.scala after MyCompilerPlugin: [info] package mypackage { [info] final lazy module val Main: mypackage.Main$ = new mypackage.Main$() [info] final module class Main$() extends Object(), _root_.scala.Serializable { [info] this: mypackage.Main.type => [info] def main(args: Array[String]): Unit = [info] mypackage.MyHooks.ifThenElse[Unit](math.random().>(0.5), [info] { [info] println( [info] identity[ [info] <error cannot infer type; expected type <?> is not fully defined> [info] ]("then") [info] ) [info] } [info] , [info] { [info] println(identity[<error unspecified error>]("else")) [info] } [info] ) [info] } [info] } [error] one error found [error] (Test / compileIncremental) Compilation failed [error] Total time: 4 s, completed Aug 8, 2020, 10:31:48 PM ``` ## Expectation It should compile.

yangbo · August 9, 2020, 5:12pm

yangbo:

Let me double confirm, I will work on a PR for dotty, which translates async / await to name based function calls. The possible approaches are:

A desugar transformation between parser and typer. Since it’s before symbol resolution, we have to introduce some new language keywords or reuse existing language keywords. The original idea of this proposal is to reuse for and <- instead of async / await , but I am also open to turn async / await into keywords.

Some hooks that override Typer.typedApply and replace the untyped tree and then continue type checking. Since it’s inside the typer, the symbol information is available, and no language keywords need to be introduced.

A separate phase after typer. Since name based function calls need retypecheck, this solution might reintroduce the infamous resetLocalAttrs bugs .

According to @smarter’s comment, approach 3 is impossible. @odersky, would you mind if I go approach 2?

smarter · August 9, 2020, 5:34pm

This is not what I said, I said that retypechecking was possible but requires using an instance of ReTyper. I also said that it’s something that should be avoided if possible. In this particular case, the most important thing is that we match the existing Scala 2 implementation as closely as possible to make it easier to compare them and to keep them compatible, and as far as I can tell, the Scala 2 implementation does not do retypechecking and does use a separate phase: Compiler support for scala-async, with a state machine transform phase running after erasure [ci: last-only] by retronym · Pull Request #8816 · scala/scala · GitHub