Pre SIP: `for` with Control Flow (an alternative to Scala Async, Scala Continuations and Scala Virtualized)

You are basically saying Python, C#, ECMAScript, Kotlin, and even C++ 20 are all wrong, and Scala will never be a language for those developers who are familiar with generators and coroutines.

The purpose of this proposal is to provide a standardized rewrite rule as an alternative to Scala Async. The only reason why val! is not desirable is that mainstream languages don’t use that. Despite of the complexity of the runtime and rewrite rules (including the ability of reifiable control flow), the users should be able to create coroutines and generators just like other mainstream languages with the help of this proposal.

Nope, not saying that at all. I’m not saying anything is wrong, just offering that I’ve been down the road of “monadic things are hard, let’s make them look like they’re not hard” starting with writing Future-wrapping DSLs on top of the original CPS plugin, which seems like 100 years ago. It was a lot of effort, and it turned out that understanding the compile-failure modes of the resulting downstream code was a lot more effort than just learning about Future and “monadic”(-ish) syntax.

I do understand that, and I think that’s great. From what I understand, though (and this is what I was trying to say), there’s already a de-facto standard rewrite rule that’s been merged into the language. It claims to be flexible to various F[_]s, so it might be the place to start. Certainly if you’re not happy with it, I’d be interested to hear some noise about that, since I’ve no idea what its rewrite rules are and they don’t seem particularly documented nor were they discussed in a meaningful way on this forum. I guess what I’m trying to say is – I think we all have the same complaint, but let’s all complain about the same thing :slight_smile:

4 Likes

Note, “complain about the same thing” meaning, here – there’s a flag which does some purportedly general “non-monadic to monadic” transformation which, given that it’s merged into the language, should presumably be the one way to accomplish such a thing. But it’s not general if it’s not documented – since nobody else can implement it – and there was no opportunity to discuss what the transformation should be in order to generalize most completely.

1 Like

One more thing @yangbo – I get that there’s more to what you’re proposing, in that it also requests some ability to treat raw ASTs (pre-typing) in order to virtualize them. I really don’t think that’s going to happen, although I can think of a lot of great use cases myself – Scala isn’t going to become racket. There’s already some pretty powerful tools for monadic DSLs, and if there’s a general-purpose transformation of imperative to monadic – as long as it’s documented and accessible – I think that would answer a lot of such use cases (at least, when IDEs catch up with it)

Virtualizing AST commonly happens. Existing for comprehension, pattern matching, and scala.Dynamic are all AST virtualization. Even though we did not specify the translation rules in -XAsync, it is definitely translating AST to name based function calls.

In the .Net world F# was the first mover for async syntax, but then C# won with async/await. At least that’s what I see. Scala does have generalized async/await now. Furthermore, the JVM might get something like native coroutine support with Loom. So my conclusion is that adding new syntax to the language now is premature. Let’s see first how async/await and Loom play out.

3 Likes

My personal opinion here is that “virtualized Scala” or “lifted Scala” would be incredibly valuable, but the design space is large enough I would like to see an implementation become popular “in the wild” via a macro or compiler plugin before considering it for the core language.

Among the possible alternatives:

  • Purely applicative lifting, like Mill does
  • Mostly applicative lifting, like SBT does
  • “One-shot” monads, like scala-async does
  • Generalized monads, like Monadless does
  • Full-syntax virtualization, like F# or DSL.scala do

Each of these points in the design space have different tradeoffs, benefits and limitations. Exploring the viability and usability and utility of even one of the above alternatives is already a huge task, hence why I do not believe a single SIP discussion would be enough to design on a design.

We already have an implementation for DSL.scala, but it has not gotten broadly popular enough that I would be comfortable considering it for “upstreaming” in the language. Others like the Mill or SBT implementations similarly remain niche. Like it or not, these kinds of language features are complex enough that no matter how much we discuss the design, it’s only through broad experience using them that their true utility and limitations become apparent.

Scala-Async has a long history of broad discussions and usage, and is a known quantity. While I’m not totally comfortable with finalizing it without further discussion, I do think it has a good claim to being broadly accepted and useful. For other ideas, as much as I think they show potential, I would like to see some broader community usage before we have enough experience to discuss and compare them in an informed manner

16 Likes

Macro or compiler plugin based implementation is impossible in Scala 3, due to lack of the ability to produce untyped ASTs.

Due to lack of the ability to produce untyped ASTs, there is no way to port Scala.Async (name based, supporting flat-mappable types other than Future) to Scala 3. @odersky, do you suggest the Scala community do nothing except waiting for Loom? If it is the case, that would be a good reason to not upgrade to Scala 3, at least until Loom has been deployed.

@odersky, what do you think if I create a PR to Dotty as an experiment. After all, this is the only way to start the experiment in Dotty since we cannot generate untyped AST from compiler plugins or macros.

1 Like

Currently dotty-cps-async supports types other than Future (and event interoperability between different types of monads and automatic colouring, which is impossible in untyped case).
If exists some corner cases, where something is impossible to express in typed form – it would be great to look and analyze them. I can’t find one.

//about proposed language changes itself:

  • the part about if in for is great, it’s intuitive, can clear semantics for collections and can be implemented without risk. (although I prefer to have 2 translations for sync and async conditions)
  • in match I afraid about nesting .right expression ‘k-times’.
  • block part looks unfinished or too restrictive (not specified, how to handle <- expression inside stats).
  • next steps having x <- y instead x = await(x) wrapped in some async expression: the value for developer will be the value of avoiding àsync brackets on top of each function.
  • This can be an interesting theme for research in the middle-term, two things which come to mind:
    – a) if we want ‘lift’ type of monad to the type of DSL expression, then we would have untrivial inference with typing;
    – b) proposal as specified less powerful than existing async wrappers, where while-s and nested blocks are supported.
    This can be fixed in the long run after research.
  • also specifying concrete rules in specification prevent optimizations (for example, dotty-cps-async have different patterns when all subexpression are async or some not, so we have no more than minimum nesting of monads applications. )

If anybody want to count my opinion in the short term — I see adopting ìf` can be factored out to small and nice change.

1 Like

I doubt that. Scala async is a compiler phase that’s running on typed trees late in the pipeline. It could probably be a plugin in Dotty, or else a compiler phase enabled by a -Y option, the way it is done now. I see no evidence that any of this has to run before typer.

@odersky, what do you think if I create a PR to Dotty as an experiment. After all, this is the only way to start the experiment in Dotty since we cannot generate untyped AST from compiler plugins or macros.

A PR that ported Scala async to Dotty would be awesome.

Related: A project that re-does for expressions to drop withFilter and use a zero instead would also be very welcome. This would fix the long standing issue https://github.com/lampepfl/dotty/issues/2573. I guess it’s not going to be easy to wean Scala off withFilter. But we’d have to do this work before we can consider any further extensions to for expressions.

I did not mean it runs before typer. The current Scala 2 implementation is a macro that generates untyped tree, even though the input expressions are typed. In current Dotty it’s impossible because lack of ability of retype-check, which is required because we don’t have a universal type signature of flatMap.

1 Like

Why?

That should follow the existing for comprehension translation rules, since the proposed rules are just some addition to existing rules.

That’s exact the approach that has been implemented in Dsl.scala’s ResetEverywhere plugin. It might be a good idea.

I would suggest to use Dsl type class instead of Monad, because Dsl has been proven it works well with Scala type inference, even in complex use cases.

Why do you say that? I think this proposal also addresses while and nested blocks.

I disagree. Virtualized control flow provides the ability of optimization in runtime.

afraid about nesting .right expression ‘k-times’.

Why ?

  • different shapes when using and when implementing DSL, which looks unintuitive for me. For match handling, I expect something which accepts a partial function shape, where all branches have equivalent status. Also, using number of çalls for the index will provoke DSL implementors to write low-level code, which always sequentelly process cases one by one in runtime. Of course, this is ‘stylistic’ preferences, which hightly opinioned.

Looking again: wrap { ...stats; r } is expanded to for { ...stats } yield r … oops, during first read I missed for, I.e. have read this as wrap { ...stats; r } is expanded for { ...stats } yield r [i.e. missed for in formula ]. This was the source of confusion, sorry.
Now, as I understand:

  for{ a <- b
         _ <-  { d <- e; x }
       }

will always be equivalent of

 for(a <- b;
      d <- e;
      _ <- x;)

and

 for(a <- b;
       {  d <- e;
        _ <- x; 
       } }

not supported, because block should be always in the right part. Ok, looks like I got this, still unsure, when it will be useful.

  • block: ok, I misinterpret one. But using only in the right parts inside for-s looks like restrictions.
  • while: restriction that it should belong to for loop. Cps transformers usually translate all control flow, regardless of they are situated inside for or not.

It depends. Sometimes, in runtime, you have no needed information. I.e. if we have ifThenElse(cond,a,b) and cond is synchronous, and we pass to ``ÌfThebElse cond.pure```, than interpreter see Future (or other monad) and can’t get value without switching context.

Btw, can DSL be universally typed in some structure of hight-kind types(?), to be able to use reflection (but outside of for syntax, because in macros you will see flatMap-s.).
With monads this works.

That’s exactly the “naive use case” that I mentioned in the proposal, where you can simply implement right and left as identity.

Firstly, for without a loop or yield is not supported from this proposal. I have an idea to support the syntax, but not now. If we add a yield "result" clause to the expression, for example:

for {
  a <- b
  _ <-  { d <- e; x }
} yield "result"

should translate to

for {
  a <- b
  _ <- for { d <- e } yield x
} yield "result"

Note that d is a local value inside the nested for comprehension, and will not be accessible by the outer for comprehension.

I disagree. Explicit async / await has been adopted by many main languages. This proposal basically reuse for as the alternative to async in other languages, and reuse <- as the alternative to await in other languages. In practice, other language’s while statements with await must be present in an async function. It’s a requirement, not a restriction.

That’s the pitfall of the Monad-like approach. We don’t have to use Monad. In the refiable control flow approach, keywords.pure(cond) does not have to return a Future[Boolean], instead it’s just an opaque alias type Pure[Boolean] if you define Pure as opaque type Pure[A] = A.

I don’t know what exactly you were referring. Note that the definition of DSL control flow differs from the Dsl type class. The latter Dsl is a general type class for control flow introduced by the library Dsl.scala, while DSL control flow is defined in this proposal, denoting control flows that contain <- and will be translated to name based, signature free calls to some control flows functions. There are couple of options to implement these functions:

  1. Creating functions that return specific concrete types, e.g. Future or Option.
  2. Creating functions that return delimited continuations. This is a general solution because you can then evaluate continuations to concrete types.
  3. Creating functions that delegate to Monad. This is a half-baked solution because you can’t simply use two monads in a single DSL block. For example, awaiting a Future in a for loop. There are monad transformers and Eff, built on top of Monad, to handle multiple DSLs. Unfortunately, none of those solution are transparent to users.
  4. Creating functions that delegate to the Dsl type class. Unlike Monads, Dsl allows for multiple DSLs in a single DSL block.
  5. Creating functions that reify control flow to data structures according to the technology described in the “reifiable control flow” section of this proposal.

Awesome. I am so happy to know you are open to a PR, especially I like Scala Async’s approach, which translates from typed tree to name based calls instead of symbol based calls, unlike dotty-cps-async, which tied to Monad, excluding more general type classes or reifiable control flows.

Let me double confirm, I will work on a PR for dotty, which translates async / await to name based function calls. The possible approaches are:

  1. A desugar transformation between parser and typer. Since it’s before symbol resolution, we have to introduce some new language keywords or reuse existing language keywords. The original idea of this proposal is to reuse for and <- instead of async/await, but I am also open to turn async/await into keywords.
  2. Some hooks that override Typer.typedApply and replace the untyped tree and then continue type checking. Since it’s inside the typer, the symbol information is available, and no language keywords need to be introduced.
  3. A separate phase after typer. Since name based function calls need retypecheck, this solution might reintroduce the infamous resetLocalAttrs bugs.

@odersky, what’s you opinion? Since you don’t like the approach 1 that changes the grammar, would you mind considering to merge a PR in approach 2? I promise all the transformation will be in a separate trait DesugarAsync, which can be enabled under a compiler flag, and is mixed into Typer. It should be totally optional and the impact to the typer is minimal.

4 Likes

Ok. I belong to camp, which thinks that one control-flow is enough, and having all inside for will cause difficulties when programmer needs a ‘real for’ instead ‘control-flow for,’ but I know that second camp exists and can understand your point :wink:

yet one note: a <- b is alternative to a = await(b), not await(b). This simplifies translation for apply (which was for me one of the most complex parts of the cps) but limits the expressivity.

I have thought about the possibility of a recreation DSL idea in the macro variant. Since typing is doing before macroses, than we can define a DSL=algebra (set of types and rules on top of scala types) in the same way, as Monads.

I guess, that if we have syntax patterns, then we have appropriative typing, in some sense, because in general, we can define type T[A,B, … C], such as

[exression(a:A, b:B, .. c:C):T] == [transfom(expression(a,b, ... c)) can be successfully typed].

Thought, this definition will make type system Turing-complete and undedicable. This this is like non-formal intuition, that typing should exists.
I guess that for concrete rules of control-flow mangling we can search such algebra in the existing scala type system, and chances that exist one general HK-typing, understandable to developers, are quite big.

In monad this looks like:
for any T we can define F[T] which have map: F[T], T=>S => F[S], flatMap… etc (then may-be restore, transformers, etc). For desugaring explore properties of this monad algebra.
Other frameworks can add additional structure to this monad. (error handling, brackets, conversions ,… etc)

for DSL, we can think in that direction:
let we have F[] with (map, … )
then we can define DSL[F[
], B, T], such as
exists:
- monadic structure over F (for for desugaring).
- sequential composition: andThen(a,b): F[A], F[B] => F[B]
- ifThenElse: F[Boolean], F[A], F[C] => F[A | C]
- while: (F[Boolean], F[Unit]) => F[Unit]
for handling more than one monad you can have something like convert: DSL[F[], … ] => DSL[G[], … ] (or define virtualized flow methods as ifThenElse[F[_],G[_],H[_]) on top of F,G,H, but I guess that this will end with lattice over F[_]-s, so monad conversion will be enough).

We will receive something like a subset of scala-virtualized, as F[X] in role of Rep[X]. I’m pretty sure that if you start typing DSL control-flow, than you will deduce generic typing. F[_]
And this will be possible to implement without touching scala-core, in the same manner, as dotty-cps-async (or even add handlers there).

But this will not work inside for loops, because macros do not see for, but map,
flapMaps instead.

This is a related idea (not precisely you proposing), maybe it will be useful if you want to adopt DSL-over-monad on practice before patching compiler. (because the number of people, who use the library will always be bigger, than who use the patched compiler). But if be ‘inside for’ is significant, that such macro-based approach will not be able to reflect original proposal

That’s possible and I have created a macro for your idea earlier this year at this branch: https://github.com/Atry/Dsl.scala/tree/2.x-wip . The macro was completed but the runtime is WIP. There are some minor difference from your types, especially:

F[_] is not ideal. For example, you cannot do char <- "string", because "string" does not unify to F[_]. The better type signature is just If[Condition, Then, Else], instead of the higher-kinded types. See https://github.com/Atry/Dsl.scala/blob/4124db849f7f692e56de21764c10337c8c825268/keywords-If/src/main/scala/com/thoughtworks/dsl/keywords/If.scala#L7

I am glad to see you now realized why we need keywords.right / keywords.left things.

The problem of the approach translating to symbol based function calls is that it tied to the interface. Even though I thought Dsl type class is a more general solution than Monad, I don’t want to persuade everyone.

The name based approach give the choice of optimization / implementation to the runtime.

The real problem is the priority of map / flatMap. I can provide an extension method flatMap to virtualize for comprehension, unfortunately it only works for those types that do not have their own flapMap, for example java.util collections. If for comprehension is translated to keywords.flatMap(fa) { ... } instead of fa.flatMap { ... }, I can easily support all types.

If we don’t change the for comprehension translation rule, a workaround is wrapping Scala collections to another type, which enables the virtualized flatMap, e.g. in Dsl.scala, Each is used for virtualized collections comprehension, and there is a to method to evaluate the virtualized code to regular collections: https://javadoc.io/page/com.thoughtworks.dsl/dsl_2.12/latest/com/thoughtworks/dsl/comprehension$.html

Having a port for Scala async to Dotty would be very nice! Now, how to do it? I think it should model closely what’s done in the existing 2.13 implementation that just landed. I don’t know that implementation in detail, but do know that a lot of thought went into it.

In terms of type checking I still don’t see the problem. I assume that async and await are defined in some library and imported into the user program. They would identify themselves to the compiler by some annotation that tells the compiler that code in this async needs a state transform. The user program has to typecheck with the types of async and await as given, but the exact types of these methods are arbitrary.

Then the translation would generate new code at some later phase, which again uses an untyped interface. At that point, the introduced low-level calls do need a type-check, but the compiler does that
all the time anyway. Have a look at tpd and TypeAssigner, they allow arbitrary tree constructions but typecheck the trees at the moment they are constructed.

The whole thing could either be done in a plugin or as an optional compiler phase under an -X flag, which is what 2.13 async now uses.

Hmm, Note, that async problem is already solved. (dotty-cps-async: https://github.com/rssh/dotty-cps-async ). implements absolutely all functionality of existing scala-async now, with the quite similar developer interface.

Of course, many competing solutions is better than none. But for having progress, we need a open field of competition [when solutions are different enough to have space to innovate], not just recreating,

Throwing resources into duplication of work which already done by somebody else, will create the most inefficient form of quasi-market, with big fragmentation (which force customer to make an unnecessary choice).

Such competition can have a sense in markets, connected with money, where each product receives a compensation proportionally to market share, but with open source, when monetization is very weak, this is just a waste of each other’s time.

So, I’m quite surprised with the conscious creation of such type of competition … quite interesting.

Regards.