PRE-SIP: Suspended functions and continuations

It’s interesting that people here are talking about the UI-thread coding issue, but in my view this is already a solved problem in Scala, as mentioned earlier: you can make your own ExecutionContext that points to the Swing event dispatch thread.

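import java.util.concurrent.Executor
import javax.swing.SwingUtilities
import scala.concurrent.ExecutionContext

object SwingExecutionContext {
  // Every task submitted to this context is queued on the Swing event dispatch thread
  val executionContext: ExecutionContext = ExecutionContext.fromExecutor(new Executor {
    def execute(command: Runnable): Unit = SwingUtilities.invokeLater(command)
  })
}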

And then, using standard Future composition, you can just do this:

for {
  _ <- something
  _ <- something
  _ <- something
  _ <- Future.successful(()).flatMap { _ =>
    for {
      _ <- showGui()
      _ <- setProgress()
    } yield ()
  }(SwingExecutionContext.executionContext)
} yield ()

The Future.successful is a bit weird, but it’s written that way because you are not passing any data from another Future to render on the UI thread; something like this is more realistic:

for {
  _ <- something
  _ <- something
  _ <- something
  _ <- getCurrentProgress().flatMap { progress =>
    for {
      _ <- showGui()
      _ <- setProgress(progress + 10)
    } yield ()
  }(SwingExecutionContext.executionContext)
} yield ()

Other IO types like Task/IO can also do this. The example shows how you can run computations off the main UI/event dispatch thread (with Future this is typically a ForkJoinPool; in the example above it would be whatever implicit ExecutionContext you have in scope) and then explicitly provide your own ExecutionContext for just the parts that must execute on the UI thread, since you don’t want to overload it.
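To make the thread hand-off observable without a GUI, here is a minimal sketch that swaps the Swing event dispatch thread for a single named thread. The names `UiThreadSketch`, `fake-ui-thread` and the body of `getCurrentProgress` are assumptions made for illustration; only the `Future`/`ExecutionContext` calls are standard library API.

```scala
import java.util.concurrent.{Executors, ThreadFactory}
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object UiThreadSketch {
  // Stand-in for the Swing event dispatch thread: one dedicated, named daemon thread.
  private val uiExecutor = Executors.newSingleThreadExecutor(new ThreadFactory {
    def newThread(r: Runnable): Thread = {
      val t = new Thread(r, "fake-ui-thread")
      t.setDaemon(true)
      t
    }
  })
  val uiContext: ExecutionContext = ExecutionContext.fromExecutor(uiExecutor)

  // Hypothetical background computation; runs on whatever context is passed implicitly.
  def getCurrentProgress()(implicit ec: ExecutionContext): Future[Int] = Future(40)

  // Returns the name of the thread that ran the "UI update" plus the new progress value.
  def run(): (String, Int) = {
    import ExecutionContext.Implicits.global
    val result = getCurrentProgress().map { progress =>
      (Thread.currentThread().getName, progress + 10)
    }(uiContext) // explicit context: only this callback runs on the fake UI thread
    Await.result(result, 5.seconds)
  }
}
```

Calling `UiThreadSketch.run()` should report that the callback ran on `fake-ui-thread`. One thing to keep in mind with this style: every combinator takes its own ExecutionContext, so any nested `map`/`flatMap` inside the callback uses whichever context is implicitly in scope there, unless it is also given the UI context explicitly.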

3 Likes

Automatic batching for parallel execution is done by static analysis; at least, that is what is written about the Yona execution model:

Another important concept here is that the order of aliases defined in the let expression does matter. Yona doesn’t just randomly re-arrange them based on dependencies alone. If it did, it could for example close those files before they are ever read. That would be incorrect. Yona just uses static analysis of this expression to determine which aliases can be “batched” and actually batches the execution if they provide underlying Promises. Then the whole expression is transformed into something like this:

I’m not sure how it works, though.

Scala 3 probably could use Capture Checking to determine subsequences of async computations inside which computations can be run in parallel, e.g. if computations capture disjoint capabilities then they can be run in parallel.

Note: I haven’t spent much time checking whether my idea would work properly, but if the yona-lang authors made something that works, then maybe Scala can do a similar thing too.

1 Like

There is also HVM which does automatic parallelisation of purely functional programs https://github.com/Kindelia/HVM which you may find interesting.

Doing something like this in Scala would require a way of tracking purity, i.e. the compiler needs to know that functions are pure/side-effect free, which currently isn’t possible (you can use types to designate that computations are pure, however the compiler just sees it as a type and nothing more).

Finally, you would need to see if it’s possible for the JVM to show the same performance characteristics that HVM does via clang.

2 Likes

Well, it’s interesting, but limiting automatic parallelism to only purely functional code reduces its applicability too much.

Adding a soft keyword to solve a problem that will stop being relevant soon after Loom is released may not be a good idea.

1 Like

I think I would frame the question differently; at least for me “suspension” sounds quite abstract and hard to reflect in the real world.

Maybe I’m oversimplifying, but isn’t the crux of the problem answering the question whether we want side-effecting and “normal” methods to have the same signature? (I tried asking the same on twitter some time ago, but without conclusive answers :wink: ).

If the signatures should be different - then the second step would be considering specific solutions. Coloring using IO, coloring using suspend or coloring using capabilities - I think these are the propositions on the table.

If however the method signature shouldn’t tell us whether the method is side effecting, and if we are targeting a Loom runtime - well then neither suspensions, nor capabilities are going to be useful.

Not that I know the “right” answers, however I am leaning towards having side-effecting and “normal” methods distinguished by the type system. The reason is simple, and I think quite well-known in the literature as the RPC fallacy. People did attempt to make a remote call look as if it was a local call, and as far as I know, they all failed. To the point that it’s now pretty well established that it’s a bad idea. To give more context, take a look at Jonas Boner’s presentation. (Maybe Lightbend wasn’t so wrong about async after all :wink: But then, maybe they are talking about “async in the large”, not “async in the small”?)

It’s all about failure modes - the ways in which a remote call can fail are vastly different from the ways a local call can. And by the way - all file-reading operations are network calls as well. Shouldn’t we be tracking this in our type system? The side-effecting operations mentioned here would probably be more or less what today we know as “blocking” operations, but as blocking is no longer an issue with Loom, we need to make our focus more precise (and that’s a good thing!).

These side-effecting capabilities can be more or less fine-grained, but a direct consequence of a capability being required or not in a function is how the function might fail: which errors need to be handled, and how. And this brings direct runtime consequences.

In other words: I would first establish what kind of properties we want to track through the type system in Scala. Reading through the (very interesting) proposal and discussion, I think there are quite diverging opinions. But only once a goal is set (not necessarily to unanimous applause) as to how far the type system should go can we consider the (secondary to safety) syntactical approaches: using either wrapper, suspended or direct style.

8 Likes

The interesting point to make here is that, by coincidence, in almost all cases async tasks also happen to be side-effecting; that is, pretty much every IO/file/network operation also happens to be a side effect. What this means in practice is that even if you deliberately avoid the function color problem for async tasks (i.e. you make no distinction between asynchronous and synchronous computations), if you still care about strongly typing your side effects then you pretty much end up re-creating the red-blue color problem anyway.

This actually describes the history and design of Haskell. That is, even if Haskell didn’t have virtual/green threads and solved the IO-bound vs CPU-bound computation problem in a different way, it would still have IO, because that’s how Haskell is able to solve the “representing side effects in a purely functional language” problem.

I would argue that this is why making a big deal out of the red-blue/color function problem is a bit overblown: if you accept the proposition that a significant portion of Scala programmers track side effects via types, then you end up, by accident, marking your computations as asynchronous anyway. Which brings us to the final point: if there is significant (in practice usually pretty much complete) overlap between marking async functions and marking side-effecting functions, doesn’t it make sense to take advantage of this, since we are solving 2 “problems” at once?

3 Likes

This is something commonly heard, but it doesn’t pass the sniff test.

  • The aws CLI command is called the same way as local CLI commands
  • The boto3 Python library is called the same way as local Python libraries
  • requests.get calls in Python look the same as any other method calls

Yes, treating RPCs the same as normal methods can fail: in high-performance or high-concurrency scenarios, where the thread overhead is unacceptable, or in high-reliability scenarios, where the novel failure modes become significant.

But to say “they all failed” is absurd. There are more people happily using Python’s requests alone than there are in the entire Scala community. Probably the majority of the world is treating RPCs like normal method calls, and it generally works reasonably well.

Sure, sometimes treating RPCs as normal methods has caveats and overheads, and sometimes it falls apart, but that’s not unique to RPCs: every abstraction has caveats and overheads, and scenarios where it fails. But that doesn’t mean they’re failures in general; it just means that specialized use cases sometimes call for specialized tools or techniques.

5 Likes

I think you might be comparing apples and oranges here.

CLI commands have a single (rather coarse-grained) path to handling failures (die with some error code), so the difference between aws and less failing is much less relevant than the difference in failure modes between a pure function and a database query.

The Python libraries are also not really equivalent comparisons, for a similar reason: idiomatic error handling in Python is to just throw an exception, so two Python functions which both have a return type and may-or-may-not throw exceptions (but you’d better assume they do) aren’t a great analog for how failure modes are handled in idiomatic Scala.

It would make sense that, if async computations can be made so performant that the difference between a pure function and side-effecting network call can be made invisible on the JVM, that this would be a boon to Java applications, and in this context, Loom replacing Future in Java applications makes a lot of sense.

However: having recently had to try to answer the question, “how many ways can critical method X fail”, in a part of a codebase that (while written in Scala) used the Exception-first style, I can say with certainty that moving to this sort of style would be a mistake.

4 Likes

Note that despite the sensationalist claims of its author, this thing is not viable at all. It fails catastrophically on certain program shapes. The author says a type system could rule out these program shapes, but no type system that does something like that has been demonstrated so far AFAIK.

2 Likes

I don’t think so. This is a non-goal for many in the Scala community, including I believe Odersky (citation needed). Capture tracking’s notion of purity does not correspond to functional programming’s notion of purity. Indeed, neither concept embeds the other completely, so while they overlap in some cases, they are genuinely different concepts.

There is probably no future Scala version (from EPFL) that tracks what functional programmers mean by ‘purity’. Scala is a hybrid language with a user base beyond pure functional developers, and an implicit goal to capture Python-like markets, which entails an embrace of procedural programming.

Java already makes them identical via various RPC frameworks. The main problem is inefficiency. Loom allows you to make them look identical while still retaining efficiency.

There are extremely compelling reasons to do so:

  • Handle RPC errors with try/catch/finally (the value of this CANNOT be overstated)
  • Abstract over both local and remote implementations
  • Write resource-safe code using ordinary language mechanisms (try-with-resources, try/finally, etc.)
  • Single-colored functions

Perhaps in a new programming language designed for cloud-native computation, one would have some differences (to be proposed) between local and remote computations.

But for ordinary programming languages designed prior to the advent of cloud-native systems, the pros of having a uniform computation model vastly outweigh the cons (indeed, the uniformity is a primary driver of adoption for functional effect systems!).

Moreover, the drawbacks have been overstated. There are two main drawbacks to RPC-calls-as-ordinary-function-calls:

  1. Failure with new error types. RPC calls may fail in new ways that application code may not anticipate or necessarily know how to deal with. I think this is largely solvable without new language constructs, by better design of RPCs.
  2. More seriously, timeout and retry behavior. RPC calls are flakier than local calls and subject to significantly longer delays. However, these have robust solutions that work across both local procedure calls and remote procedure calls: retry strategies and timeout policies. Retry strategies properly apply to recoverable errors and are useful in local and remote contexts; timeout policies too, are useful in both local and remote contexts. Frameworks (or, to take a more extreme point of view, libraries and even programming languages) should take special care to separate recoverable and non-recoverable errors and provide compositional ways of applying both retry and timeout policies.
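As a rough illustration of what compositional retry and timeout policies over plain (direct-style) calls could look like, here is a minimal sketch. The names `Policies`, `Recoverable`, `retry` and `withTimeout` are hypothetical and not taken from any framework mentioned in this thread; the same wrappers apply to local and remote computations alike.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._
import scala.util.{Failure, Success, Try}

object Policies {
  // Hypothetical marker type: only errors of this kind are considered worth retrying.
  final class Recoverable(msg: String) extends RuntimeException(msg)

  // Run op up to maxAttempts times; retry only on Recoverable, rethrow everything else.
  def retry[A](maxAttempts: Int)(op: => A): A =
    Try(op) match {
      case Success(a)                                 => a
      case Failure(_: Recoverable) if maxAttempts > 1 => retry(maxAttempts - 1)(op)
      case Failure(e)                                 => throw e
    }

  // Bound how long the caller waits for op; throws java.util.concurrent.TimeoutException
  // on expiry. Note: this only stops waiting, it does not cancel the running computation.
  def withTimeout[A](limit: FiniteDuration)(op: => A)(implicit ec: ExecutionContext): A =
    Await.result(Future(op), limit)
}
```

The two policies nest like ordinary function calls, for example `Policies.retry(3)(Policies.withTimeout(2.seconds)(callService()))`, where `callService` stands for any local or remote call.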

Currently Loom does not provide a lot of machinery to help with (2). However, it provides a solid foundation for library authors to develop their own approaches to solving these challenges, based on underlying language primitives that are proven and familiar to developers.

More precisely, today we have “async blocking”, which happens when a fiber / virtual thread suspends, awaiting external re-activation, and we also have “sync blocking”, which happens when a physical thread suspends, awaiting external re-activation. What Loom is doing is upgrading almost all “sync blocking” to “async blocking”. Semantically, they’re all blocking, it’s just a question of efficiency: async blocking is vastly more efficient than sync blocking, so it’s merely a sort of optimization applied retroactively to the masses of synchronous code that have already been written.

I do not think that question will ever have agreement, which maybe argues Scala should be more opinionated so as to select for a user base compatible with its goals. But it is clear that no official answers will be forthcoming until the capability-based research program is closer to completion (ETA: 5 years). And until then it is extremely risky to modify the language, especially in ways that import already-obsolete Kotlin designs into the much more modern Scala 3 programming language.

3 Likes

I don’t know about the general use case, but ZIO solved that issue pretty well. It creates a very insightful error trace, showing what code would have been executed next (in the context of the app, not the internal fiber management weaving). Very actionable: debugging is (almost) as simple as in single-threaded code.
And if I followed things correctly, in ZIO it’s even cheap (in runtime performance; almost free, even).

3 Likes

Yes, although one of these two problems, on one of the platforms (JVM/Loom) is set to disappear. Hence my proposition to shift the focus of the problem on something that isn’t platform-dependent :slight_smile:

I didn’t say I want to track purity :wink: That’s probably too much. Writing to mutable state? Probably not. Performing a network call? Probably yes. Maybe tracking non-local computation would be a good, precise term?

(in fact you propose the same in the next section, as I now see)

Java already makes them identical via various RPC frameworks.

Not always - you often get different checked exceptions, which is a way of “marking” a method as side-effecting. Where we have IO[], Java often has throws IOException - both influence the signatures. But again, given history, we might be looking for better solutions than checked exceptions (I think in general in Scala we are looking for better solutions to various problems :slight_smile: ).

Do we want the compiler to point out that we might not be handling all the error cases that we should (which could lead to applying e.g. a retry/timeout strategy)? I think in a typed language the answer might be “yes”.
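To make the “compiler points out unhandled error cases” idea concrete, here is a minimal sketch using a sealed error type and Either. All names (`RemoteError`, `fetchUser`, `describe`) are hypothetical and only illustrate the technique; this is one possible encoding, not a proposal from this thread.

```scala
object TypedFailures {
  // Hypothetical failure modes of a remote call, made visible in the signature.
  sealed trait RemoteError
  case object TimedOut extends RemoteError
  case object ConnectionRefused extends RemoteError
  final case class BadResponse(status: Int) extends RemoteError

  // Hypothetical remote lookup; a real one would perform network I/O.
  def fetchUser(id: Int): Either[RemoteError, String] =
    if (id == 1) Right("alice") else Left(BadResponse(404))

  // Because RemoteError is sealed, the compiler checks this match for exhaustiveness:
  // removing one of the Left cases produces a "match may not be exhaustive" warning.
  def describe(result: Either[RemoteError, String]): String = result match {
    case Right(name)               => s"found $name"
    case Left(TimedOut)            => "retry later"
    case Left(ConnectionRefused)   => "service unavailable"
    case Left(BadResponse(status)) => s"unexpected status $status"
  }
}
```

The trade-off is the one debated here: the signature of `fetchUser` now differs from that of a purely local function, which is exactly the “marking” that checked exceptions or IO[] also introduce.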

I do not think that question will ever have agreement, which maybe argues Scala should be more opinionated so as to select for a user base compatible with its goals.

There definitely won’t be agreement, but luckily we have EPFL and Martin, who pick the direction in which Scala should be headed (with input from the community of course, but ultimately somebody has to make some choices from time to time).

1 Like

I think they failed as ultimately you do need to tackle the fact that an RPC call fails differently from a local call. Now this might be done with discipline (in Python) or with the help of a compiler (in Scala) - that’s a dynamic vs static typed discussion, people have different preferences, and that’s completely fine.

But there are no magic solutions which make RPC calls behave just as local calls. You need different code when doing an RPC than when doing a local call. (Note that this code might be far away from the invocation site, somewhere in an error handler, but it still needs to be there.)

3 Likes

That is what we call “effects”. Interaction between an automaton and its environment: I highly recommend Oleg Kiselyov's talk titled "Having an Effect"[0] in which he... | Hacker News

You seem to want an effect system.

Thanks, I’ll take a look.

I might be indeed looking for what’s known as an “effect system” in literature, however I have the feeling that outside of academia, the term “effect tracking” is an overloaded term, with many possible meanings (covering mutable state, async, remote computations etc.). So maybe a more precise one would suit our communication better.

There’s something that still isn’t clear to me from the discussion. Does Loom somehow solve the classic N+1 problem? I.e. let’s say I have a function that does a Google search: def google(str: String): List[URL]. Now I try doing this:

val list : List[String] = ???
list.map(str => google(str))

How does Loom ensure this is done efficiently, i.e. by spawning one thread per element of list?

If we tracked in types that google can perform a costly block, we would be able to use that information to, perhaps, forbid the above piece of code. Perhaps there should be a variant of map which always spawns a thread per element and allows blocking operations.

Regardless of what the exact solution is, tracking in types that google can block seems better than the situation where it’s easy to run into performance problems when using it. Sure, similar problems would occur with computation-intensive functions as well, but I feel like they occur much more easily once we start doing async programming and a single function call could suddenly take 100ms, or whatever the local Google roundtrip time is.

1 Like

You are conflating concurrency with asynchronicity.

Loom does not change the semantics of your code: in particular, it does not automatically insert any concurrent operations, nor does such a thing make sense in general (see above academic references on auto-parallelization, which is fraught with known issues).

Loom merely takes your synchronous code (that is, code formerly using physical threads and operations like IO or locks that “sync block” those threads) and makes it fully asynchronous (using virtual threads and “async blocking”, which is more efficient than “sync blocking”).

As such, maybe in your code base, you have some code like list.map(str => google(str)), where each invocation of google blocks a physical thread. Under Loom, the code has the same meaning and will produce the same result, only google can now be fully asynchronous (which does not imply it is concurrent with respect to the thread executing the List#map, because it is NOT concurrent), which means you get the same behavior as before, but it runs more efficiently.

Loom is all about efficiency, not concurrency, per se: taking the same programs and making them work better. As a consequence, you can now do “async operations” (i.e. “efficient operations”) anywhere without having wrapper types like Future, including in List#map.
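The distinction can be sketched in code. In the example below, `google` is a hypothetical stand-in that just sleeps, and `SequentialVsConcurrent` is an illustrative name; the point is only that a plain `map` stays sequential, while concurrency has to be requested explicitly (here with standard `Future.traverse`).

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object SequentialVsConcurrent {
  // Hypothetical stand-in for the blocking `google` call from the question.
  def google(query: String): List[String] = {
    Thread.sleep(50) // pretend network round trip
    List(s"https://example.com/?q=$query")
  }

  // Plain map: the calls happen one after another on the calling thread.
  // Cheaper blocking does not change this ordering.
  def sequential(queries: List[String]): List[List[String]] =
    queries.map(google)

  // Concurrency is a separate, explicit decision: one Future per element.
  def concurrent(queries: List[String])(implicit ec: ExecutionContext): List[List[String]] =
    Await.result(Future.traverse(queries)(q => Future(google(q))), 10.seconds)
}
```

Both functions return the same result; only the second overlaps the waiting. On a JDK with virtual threads, the ExecutionContext passed to `concurrent` could presumably be backed by a virtual-thread-per-task executor, so that each blocked call is cheap, but that is a choice of executor, not something `List#map` would do on its own.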

4 Likes

Ok, so you want to track “local” computation versus “remote” computation. First off, that would not be related to async versus sync tracking: both sync and async can do remote computation, the only difference is efficiency.

Second, in the era of cloud-native applications, the cloud itself has become a sort of standard library: every other call is to some microservice or GraphQL or REST API. Our applications are the glue that hold together operations implemented in the cloud. So tracking “remote” computation may be increasingly and incredibly noisy, as we enter a future in which nearly all calls might be “remote”.

Third, and in my opinion, it is very important to not be obsessed with “tracking” things for the sake of academic novelty (which is good for obtaining grant money but bad for commercial software). Tracking information using types involves considerable effort for developers, who have to type more characters and wrestle with more mistakes (see also: uninferrable exception lists in Java). You can, like Odersky is trying to do, reduce the cost of tracking–preferably NOT via inserting more magic fraught with edge cases that works in unexpected ways with other language features, such as “auto-adaptation” in context functions–but fundamentally, you must still acknowledge it has a cost.

To pay for itself, you have to demonstrate that the information is (a) actionable, and (b) so frequently actionable that the costs of universal tracking are outweighed by the proven benefits.

I have not even heard a hand-wavy argument on remote vs local being actionable: what would a developer do differently, knowing that “doX()” is a remote call versus a local call? What would the developer do differently, knowing that “doX()” is a local call versus a remote call? Not abstractly, but what concrete code would a developer write knowing such a difference?

I have argued above that the steps a developer would and should take for flaky computations always involve retries, and the steps a developer would and should take for long-running computations always involve timeouts. Although remote computations are more likely to be flaky and long-running, it is only a correlation, and many local computations can be both flaky and long-running. So the mere presence or absence of a “remote bit” is likely to be insufficient information to be actionable.

If I am wrong, then it should be possible to provide some evidence that:

  1. Developers know to do and actually do something radically different based on the “remote bit”, such that it significantly affects correctness or performance or some other metric that matters to the business.
  2. Developers do this so often that it overwhelms the significant drawbacks to infecting every type signature across the entire code base with a “remote bit” (or at least, infecting either all remote code, or all local code, with such a bit, if you can infer its negation by its absence).

Ultimately, my stance is that “effect tracking” is a distraction and a waste of resources, hence my blog post, Effect Tracking Is Commercially Worthless.

That dynamic could change in a future in which tracking things is cost-free or super-low-cost and completely automatic (fully type-inferred), but until that point arrives, if it ever does, I will always be asking proponents of effect tracking to demonstrate (a) actionability of information, and (b) pervasiveness of need, such that benefits clearly outweigh costs. To my knowledge, no one has demonstrated this in the case of remote vs local, and it cannot be demonstrated at all in the case of sync vs async.

4 Likes