"for" syntax for parallel computations (afor / applicative for comprehension))

A new syntax for for comprehensions that deals with applicatives has been discussed before, but it has been three years so I’d like to revive the discussion. I wont linger on the details, but I feel it’s necessary to give a concrete example for reference for those of you that have not read the original thread, or are unfamiliar with applicatives. so I’ll give a quick refresher. (My explanation will be as entry-level as possible, so as to encourage as much discussion as possible. Anyone who already knows what “Applicative” means can probably skip it):

Current situation

current for comprehensions are desugared into .flatMap() and .map() calls. So each item in a for comprehension has access to the “results” of the previous items. This is very powerful; we can use it in conjunction with an effect monad to neatly handle combining computations (e.g. db queries). For example:

def getUserData(uid: UserID): IO[UserData] =
    for {
      user <- getUser(uid) //IO[User]
      userSettings <- getUserSettings(user) //IO[UserSettings]
    } yield UserData(user, userSettings) 

The IO type holds all of our effects (and usually errors as well) in a nice neat package, but these computations are guaranteed to be sequential. Meaning the userSettings are retrieved after the user. This can be a problem, for example, in high performance applications that need to grab a lot of unrelated data quickly. In order to combine many computations in parallel, we need to use something else.

Many different functional programming libraries support doing computations in parallel with all of the same nice qualities as our friend IO up there (ZIO and cats-effect to name a few). The solution that these libraries provide all look something like this (I’m using scalaz because it’s what I know):

def getUserData(uid: UserID): ParIO[UserData] = 
   )(case (user, userSettings) => UserData(user, userSettings))

This isn’t quite as pretty as the previous example but it doesn’t look like too much of a hassle, and it runs everything in parallel just like we wanted. Unfortunately, since applyN has to be implemented with tuples, things get hairy when you have to add many things together. I’ve seen some that are pretty nasty to work with:

    case (
        (dear, lord, this, is, too)
        (many, tuples, for, the, human)
        (brain, to, process)
    )  => InsanelyLongConstructor(...)

I think it’s pretty obvious that this syntax can get out of hand quickly, and it’s clunky to use. So it would be nice to have a way to represent the same idea with much less code.

The Proposed Change

As a Haskell guy, the most tempting solution is to follow Haskell’s lead. Haskell has “do” statements that behave the same way as scala “for” comprehensions. Haskell also has “ado” (applicative do) statements, so the best solution in my eyes would be to create an “afor” comprehension, where there is some “magic method” (I’ll call it apply() from here on out, since it is called that in scalaz) defined for anything that uses it that handles combining the computations (ParIO in our example).

def getUserData(uid: UserID): ParIO[UserData] =
    afor {
        user <- getUser(uid)
        userSettings <- getUserSettingsWithUserID(uid) 
    } yield UserData(user, userSettings) 

Problems with that

  1. Includes a new keyword
  2. reuses the for syntax, but changes how the scope works. Since these computations are parallel underneath the hood, they must be kept totally separate. i.e
afor {
    user <- getUser(uid)
    userSettings <- getUserSettings(user) //compilation error, since user is not accesible until after yield 
} yield UserData(user, userSettings) 

Other Solutions

The original thread suggested using the same for syntax, but with some modifications

for {
  a <- aa with  //parallel
  b <- bb with  //parallel
  c <- cc 
  d <- dd
} yield a + b + c + d

I personally think we shouldn’t mix the applicative syntax in with the normal for syntax. Mostly because in functional terms, all Monads are also Applicatives, so the Applicative apply (scalaz.Applyl.apply() in our examples) should be able to be derived from the Monad’s .flatMap(). But flatMap() forces the computation to be sequential, so any Applicative that is also a Monad could not combine computations in parallel (Assuming we keep the “All Monads are Applicatives” constraint).

In simpler words: we need one type constructor over the whole for comprehension. Let’s call it M[_].

  • if M is going to handle parallel computations, it needs an apply method that looks like this:
    apply(f: M[A => B], self: M[A]): M[B]
  • if M is in a for comprehension, it needs a flatMap()
    flatMap(f: A => M[B], self: M[A]): M[B]

given that these two things are true, we can actually make the apply() method directly from flatMap and map like so:

def: apply(fn: M[A => B], self: M[A]): M[B] =
    fn.flatMap(f => self.map(a => f(a))

but using flatMap() forces the computations to be sequential, which is undesireable. However if we implement a custom apply(), then we have two routes to get to the same type that have fundamentally different consequences, which is definitely not desireable.

This is why we dont want one M[_] for both the applicative and regular “for” syntax.

I’m curious to hear if anyone is interested, and if this gets a lot of attention I might make a PR myself.


That syntax allows for parallel and sequential computations in the same for-comprehension and still allows to pass values from earlier steps to later ones. Therefore you can do things like that cleanly:

for {
  a <- aa with // parallel
  b <- bb with // parallel
  c <- cc
  d <- dd(a, b)
  e <- ee(d, c)
} yield a + b + c + d + e

This looks rather convoluted. What’s wrong with the cats variatation

(QueryA, QueryB, QueryC, QueryD, QueryE).mapN {
  case (a, b, c, d, e) => ...

Or in the other example

1 Like

the cats-effect mapN is the same as the scalaz applyN. They are both limited by tuple size. (cats-effect goes to 22 while scalaz goes to 12). So both will require nesting with tupleN if too many things get put together. I have seen applicative instances of custom JSON decoders that can have hundreds of fields.

That’s a good point; for smaller numbers of combined computations that will work well, and with dotty removing the tuple22 limit, we dont have to nest them. But I think it’s a less-than-ideal solution. If we combine a lot of things together, it can still get pretty noisy:

    ) <-
        query1 zip
        query2 zip
        query3 zip
} yield ...

If scalaz has the same syntax as cats (it’s in core, it doesn’t require the effect library), than you’re not representing it well. scalaz.Apply[ParIO].apply3(queryA, queryB, queryC){case (a, b, c) => ...} is a lot more verbose than (queryA, queryB, queryC).mapN { case (a, b, c) => ... }. Using the far more verbose syntax as a motivating example makes it harder to look at what you present and trusting you give a fair representation.

This is fixed in scala3.

Yeah, it does get pretty noisy, but so do all alternative. If you have a boatload of generators, it’s going to look bad regardless of syntax and whether they are monadic or applicative. A comparison on merit with what the same example looks like in both cases would be better than giving the streamlined example to one syntax and the messy one to the other and point at how much more streamlined the simpler case is.

For very large tuples, it’s a shame the generator isn’t next to its corresponding value, so that’s nice of the apfor proposal. However, having such large tuples is IMO a yellow flag, and maybe you should have already done some logical grouping. If the motivating example is tuples with hundreds of fields, I’m not sold on the motivation.


There doesn’t need to be boatload of generators to see the difference. Following code shape is rather common:

for {
  user        <- userService.fetchUser(userEmail)
  trade       <- tradeService.fetchNextTrade(accountName)
  constraints <- constaintsService.fetchConstraints(accountName)
  buyerInfo   <- partiesService.fetchInfo(trade.buyer)
  sellerInfo  <- partiesService.fetchInfo(trade.seller)
} yield {
  // use generated values

With specialied for comprehension for applicatives code would still look good:

for {
  user             <- userService.fetchUser(userEmail)
  with trade       <- tradeService.fetchNextTrade(accountName)
  with constraints <- constaintsService.fetchConstraints(accountName)
  buyerInfo        <- partiesService.fetchInfo(trade.buyer)
  with sellerInfo  <- partiesService.fetchInfo(trade.seller)
} yield {
  // use generated values

But tuples looks bad (unless I got something wrong and there is some way to retain readability):

for {
  ((user, trade), constraints) <- userService.fetchUser(userEmail)
  (buyerInfo, sellerInfo) <- partiesService.fetchInfo(trade.buyer)
} yield {
  // use generated values

At least format the motivating example for both

for {
  ((user, trade), constraints) <- userService.fetchUser(userEmail) zip
                                  tradeService.fetchNextTrade(accountName) zip
  (buyerInfo, sellerInfo)      <- partiesService.fetchInfo(trade.buyer) zip
} yield {
  // use generated values


for {
  user             <- userService.fetchUser(userEmail)
  with trade       <- tradeService.fetchNextTrade(accountName)
  with constraints <- constaintsService.fetchConstraints(accountName)
  buyerInfo        <- partiesService.fetchInfo(trade.buyer)
  with sellerInfo  <- partiesService.fetchInfo(trade.seller)
} yield {
  // use generated values

You won’t hear me claim that the second isn’t nicer, but not by as much as when you don’t align it.


Stealing Borrowing ~ from anorm can make working with the deeply linked tuples less noisy:

final case class ~[+A, +B](_1: A, _2: B)
for {
  user ~ trade ~ constraints <- userService.fetchUser(userEmail) zip
                                tradeService.fetchNextTrade(accountName) zip
  buyerInfo ~ sellerInfo     <- partiesService.fetchInfo(trade.buyer) zip
} yield {
  // use generated values

Live Version

1 Like

Thanks for suggestions. In such simple cases it seems zipping doesn’t look terribly wrong, but notice some important details:

  • names for generated values are short
  • parallelism level is medium (max 3)
  • because of the above identation is not ridicuously big, but in real life I often prefer using longer variable names
  • generators are one liners, so matching a generator with generated value is somewhat feasible

What I see in production code is rather something like that (only if parallel queries are important enough to warrant such gymnastics):

// Scala Futures are eager so below 3 lines start 3 futures at once
val userFut = userService.fetchUser(userEmail)
val tradeFut =  tradeService.fetchNextTrade(accountName)
val constraintsFut = constaintsService.fetchConstraints(accountName)
for {
  user         <- userFut
  trade        <- tradeFut
  constraints  <- constraintsFut
  buferInfoFut  = partiesService.fetchInfo(trade.buyer)
  sellerInfoFut = partiesService.fetchInfo(trade.seller)
  buyerInfo    <- buyerInfoFut
  sellerInfo   <- sellerInfoFut
} yield {
  // use generated values

We can avoid extra names and bring rest of the names close to generators even today. As an exercise I’ve done the following (a long time ago, so code doesn’t match previous examples):

import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration
import scala.concurrent.{Await, Future}

object FutSeqMain {
  import FutSeq._

  def main(args: Array[String]): Unit = {
    val x = Future(9)
    val result: Future[Int] =
      // no explicit tuples handling in this for-comprehension
      for {
        a  <- Future(5)
        b  = Future(8)
        c  = Future(a + 6)
        ev <- Completer(b)(c).build
        d  <- Future(ev(c) - ev(b))
      } yield {
        a + ev(b) + ev(c) * d // `+ ev(x)` does not compile, good!
    println(Await.result(result, Duration.Inf))

object FutSeq {
  import HList.{HCons, HNil}

  sealed trait HList

  object HList {
    sealed trait HCons[H, +T <: HList] extends HList

    sealed trait HNil extends HList

  sealed class Completer[H <: HList] private (futures: List[Future[Any]]) {
    import Completer.Evidence

    def apply[T](future: Future[T]): Completer[HCons[future.type, H]] =
      new Completer(future :: futures)

    def build: Future[Evidence[H]] =
      Future.foldLeft(futures.reverse)(Evidence[H])((ev, _) => ev)

  object Completer extends Completer[HNil](Nil) {
    final class Evidence[H <: HList] private {
      def apply[T](future: Future[T])(implicit witness: In[future.type, H]): T =

    private object Evidence {
      private val evidence = new Evidence[Nothing]

      def apply[H <: HList]: Evidence[H] =

  final class In[V, H <: HList] private {
    def apply(value: V): V =

  object In {
    implicit def inHead[V, H <: HCons[V, HList]]: In[V, H] =
      new In[V, H]

    implicit def inTail[V, U, T <: HList](
        implicit in: In[V, T]): In[V, HCons[U, T]] =
      new In[V, HCons[U, T]]

Works for Scala Futures. After modifications it probably could work for other IO monads too, but types in Completer would be much more complex. I haven’t tested the whole idea in actual application so I don’t know if it’s worthwhile. Anyway, it’s still inferior to enhanced for-comprehension proposals.

You can also use -> as an extractor.


That’s a very fair point. I was just using an example inspired from the code base I work in. We tend to have problems with implicits in these specific circumstances due to the idiosyncrasies of of our code structure (would love to change it but it’s not MVP of course). The core scalaz syntax works very similarly to cats, as you said. Apologies for not pointing that out.

so given the suggestions from some of the replies, here’s what we can currently make it look like with scala 3 in the worst cases:

  (thisIsSomeDataA ->
    thisIsSomeDataB ->
    thisIsSomeDataC ->
    thisIsSomeDataD ->
    ... ->
    thisIsSomeDataZ) <-
      serviceMethodForGrabbingA zip
      serviceMethodForGrabbingB zip
      serviceMethodForGrabbingC zip
      serviceMethodForGrabbingD zip
      ... zip
} yield someFunction(thisIsSomeDataA,...)

which is definitely better than the examples that I posted, but I think it falls short of expressing the simple intent like an afor comprehension would:

afor {
  thisIsSomeDataA <- serviceMethodForGrabbingA
  thisIsSomeDataB <- serviceMethodForGrabbingB
  thisIsSomeDataC <- serviceMethodForGrabbingC
  thisIsSomeDataD <- serviceMethodForGrabbingD
  thisIsSomeDataZ <- serviceMethodForGrabbingZ
} yield someFunc(thisIsSomeDataA, ...)

I understand making a case class for ~, but how did you get that -> extractor?

You can construct tuples with ->. Since 2.13 (https://github.com/scala/scala/pull/6304) you can use it for destructuring/matching as well.

val (a, b) = (1, 2)
val a -> b = 1 -> 2

By the way I think you could employ the same strategy as in that PR for your own custom ~ extractor.

val ~ = Tuple2

This works because matching uses (at least conceptually… I think it’s optimized away) the Tuple2.unapply method and now you have ~.unapply too.


When scala 3 does eventually come out, another option would be to do something like this:

def getStuff: ParIO[Stuff] = {
   def makeStuff = (Stuff.apply _).curried //or any curried function that makes stuff

   makeStuff mapFlipped
      getPieceOfData1() apply
      getPieceOfData2() apply
      getPieceOfData3() apply

I’d say that this is a fairly common pattern in haskell (or purescript, which is what I use most).

  <$> thing1
  <*> thing2
  <*> thingRed
  <*> thingBlue

There are infix operators that can almost do this in scalaz (<*>, and fmap), but the arguments are in the opposite order, so you would just need to make an implicit somewhere that defines new infix operators with the arguments flipped.

The only thing I worry about with this is transparency.

Wouldn’t it make sense to call it “async for” so that we don’t have to introduce another keyword?


Calling is async would be super weird when using applicative for with other applicatives, like e.g. validations. What validations have to do with async?


Besides that, parallel does not have to be asynchronous and asynchronous does not have to be parallel, so it doesn’t even describe that case.


Hi, sorry to revive this topic but looking at the applicative do documentation, it seems like the compiler could automatically desugar into applicative-style using zipWith with no change in semantics. zipWith subsumes zip:

abstract class Foo[A]:
  def zipWith[B, C](that: Foo[B])(f: (A, B) => C): Foo[C]
  def zip(that: Foo[B]): Foo[(A, B)] = zipWith(that)(_ -> _)

Example 1

  x <- a
  y <- b
  yield f(x, y)

desugars to


Example 2

  x <- a
  y <- b
  r <- f(x, y)
  yield r



Example 3

    x1 <- A
    x2 <- B
    x3 <- C(x1)
    x4 <- D(x2)
    yield (x1,x2,x3,x4)

turns into

A.zipWith(B)((x1, x2) => C(x1).zipWith(D(x2))((x3, x4) => (x1, x2, x3, x4))).flatten

Example 4

     x <- A
     y <- B(x)
     z <- C
     return f(x, y, z)


(for { x <- A; y <- B(x) yield (x, y)).zipWith(C)(case ((x, y), z) => f(x, y, z))

Example from the OP

  x1 <- queryA
  x2 <- queryB
  x3 <- queryC
  x4 <- queryD
  x5 <- queryE
  xN <- queryN
  yield InsanelyLongConstructor(x1, x2, ..., xN)


queryA.zipWith(queryB, queryC, ..., queryN)(InsanelyLongConstructor.apply)

A version of zipWith with an arity larger than 1 can be synthesized from the vanilla version by the compiler, if not manually written by the author of the structure.


You are missing the [B] appended to the line def zip(... in Foo[A].

Here’s the fix in Scastie: