A new syntax for for comprehensions that deals with applicatives has been discussed before, but it has been three years, so I’d like to revive the discussion. I won’t linger on the details, but I feel it’s necessary to give a concrete example for those of you who have not read the original thread or are unfamiliar with applicatives, so I’ll give a quick refresher. (My explanation will be as entry-level as possible, so as to encourage as much discussion as possible. Anyone who already knows what “Applicative” means can probably skip it.)
Current situation
Current for comprehensions are desugared into .flatMap() and .map() calls, so each item in a for comprehension has access to the “results” of the previous items. This is very powerful; we can use it in conjunction with an effect monad to neatly combine computations (e.g. db queries). For example:
def getUserData(uid: UserID): IO[UserData] =
  for {
    user <- getUser(uid) // IO[User]
    userSettings <- getUserSettings(user) // IO[UserSettings]
  } yield UserData(user, userSettings)
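To make the desugaring concrete, here is a minimal sketch of the same comprehension written both ways. The IO type below is a toy stand-in I made up for illustration (not the one from scalaz, cats-effect or ZIO), and the Int user id, the stub queries and the "dark" setting are assumptions as well.

```scala
object DesugarDemo {
  // Toy effect type, assumed for illustration only.
  final case class IO[A](run: () => A) {
    def map[B](f: A => B): IO[B] = IO(() => f(run()))
    def flatMap[B](f: A => IO[B]): IO[B] = IO(() => f(run()).run())
  }

  final case class User(id: Int)
  final case class UserSettings(theme: String)
  final case class UserData(user: User, settings: UserSettings)

  // Hypothetical stub queries.
  def getUser(uid: Int): IO[User] = IO(() => User(uid))
  def getUserSettings(user: User): IO[UserSettings] =
    IO(() => UserSettings("dark"))

  // The for comprehension...
  def viaFor(uid: Int): IO[UserData] =
    for {
      user <- getUser(uid)
      userSettings <- getUserSettings(user)
    } yield UserData(user, userSettings)

  // ...is rewritten by the compiler into roughly this. The second query
  // lives inside the function passed to flatMap, so it cannot start
  // before the first one has produced a user.
  def viaFlatMap(uid: Int): IO[UserData] =
    getUser(uid).flatMap { user =>
      getUserSettings(user).map { userSettings =>
        UserData(user, userSettings)
      }
    }
}
```

The nesting is what gives later lines access to earlier results, and it is also what forces the sequential order.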
The IO type holds all of our effects (and usually errors as well) in a nice neat package, but these computations are guaranteed to be sequential, meaning the userSettings are retrieved only after the user. This can be a problem in, for example, high-performance applications that need to grab a lot of unrelated data quickly. In order to combine many computations in parallel, we need to use something else.
Many functional programming libraries support doing computations in parallel with all of the same nice qualities as our friend IO up there (ZIO and cats-effect, to name a few). The solutions that these libraries provide all look something like this (I’m using scalaz because it’s what I know):
def getUserData(uid: UserID): ParIO[UserData] =
  scalaz.Apply[ParIO].apply2(
    getUser(uid),
    getUserSettingsWithUserId(uid)
  )((user, userSettings) => UserData(user, userSettings))
This isn’t quite as pretty as the previous example, but it doesn’t look like too much of a hassle, and it runs everything in parallel just like we wanted. Unfortunately, since applyN has to be implemented with tuples, things get hairy when you have to combine many things. I’ve seen some that are pretty nasty to work with:
scalaz.Apply[ParIO].apply3(
  scalaz.Apply[ParIO].tuple5(
    queryA,
    queryB,
    queryC,
    queryD,
    queryE
  ),
  scalaz.Apply[ParIO].tuple5(...),
  scalaz.Apply[ParIO].tuple3(...)
) {
  case (
    (dear, lord, this, is, too),
    (many, tuples, for, the, human),
    (brain, to, process)
  ) => InsanelyLongConstructor(...)
}
I think it’s pretty obvious that this syntax can get out of hand quickly, and it’s clunky to use. So it would be nice to have a way to represent the same idea with much less code.
The Proposed Change
As a Haskell guy, the most tempting solution is to follow Haskell’s lead. Haskell has “do” notation, which behaves the same way as Scala “for” comprehensions. GHC also offers applicative do (the ApplicativeDo extension; PureScript spells it “ado”), so the best solution in my eyes would be to create an “afor” comprehension, where there is some “magic method” (I’ll call it apply() from here on out, since it is called that in scalaz) defined for anything that uses it, which handles combining the computations (ParIO in our example).
def getUserData(uid: UserID): ParIO[UserData] =
  afor {
    user <- getUser(uid)
    userSettings <- getUserSettingsWithUserId(uid)
  } yield UserData(user, userSettings)
Problems with that
- Includes a new keyword.
- Reuses the for syntax, but changes how the scope works. Since these computations are parallel under the hood, they must be kept totally separate, i.e.
afor {
  user <- getUser(uid)
  userSettings <- getUserSettings(user) // compilation error, since user is not accessible until after yield
} yield UserData(user, userSettings)
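The scoping rule falls out naturally if you picture what “afor” might expand to. The proposal does not pin down a desugaring, so the following is only a sketch under my own assumptions: ParIO here is a toy type, apply2 is a hand-written helper modelled on the scalaz method of the same name, and the stub queries are hypothetical.

```scala
object AforSketch {
  // Toy stand-in for an applicative effect type (not scalaz's ParIO).
  final case class ParIO[A](run: () => A)

  // Combines two independent computations. This sketch runs them one
  // after the other; a real implementation would be free to run them
  // in parallel, because neither argument depends on the other.
  def apply2[A, B, C](fa: ParIO[A], fb: ParIO[B])(f: (A, B) => C): ParIO[C] =
    ParIO(() => f(fa.run(), fb.run()))

  final case class User(id: Int)
  final case class UserSettings(theme: String)
  final case class UserData(user: User, settings: UserSettings)

  // Hypothetical stub queries.
  def getUser(uid: Int): ParIO[User] = ParIO(() => User(uid))
  def getUserSettingsWithUserId(uid: Int): ParIO[UserSettings] =
    ParIO(() => UserSettings("dark"))

  // afor {
  //   user <- getUser(uid)
  //   userSettings <- getUserSettingsWithUserId(uid)
  // } yield UserData(user, userSettings)
  //
  // could expand to something like:
  def getUserData(uid: Int): ParIO[UserData] =
    apply2(getUser(uid), getUserSettingsWithUserId(uid)) {
      (user, userSettings) => UserData(user, userSettings)
    }
}
```

Both right-hand sides are plain arguments to apply2, built before any result exists, so the names user and userSettings are only bound inside the function that plays the role of yield.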
Other Solutions
The original thread suggested using the same for syntax, but with some modifications:
for {
  a <- aa with // parallel
  b <- bb with // parallel
  c <- cc
  d <- dd
} yield a + b + c + d
I personally think we shouldn’t mix the applicative syntax in with the normal for syntax, mostly because in functional terms all Monads are also Applicatives, so the Applicative apply (scalaz.Apply.apply() in our examples) should be derivable from the Monad’s .flatMap(). But flatMap() forces the computation to be sequential, so any Applicative that is also a Monad could not combine computations in parallel (assuming we keep the “All Monads are Applicatives” constraint).
In simpler words: we need one type constructor over the whole for comprehension. Let’s call it M[_].
- if M is going to handle parallel computations, it needs an apply method that looks like this:
apply(f: M[A => B], self: M[A]): M[B]
- if M is in a for comprehension, it needs a flatMap() that looks like this:
flatMap(f: A => M[B], self: M[A]): M[B]
Given that these two things are true, we can actually make the apply() method directly from flatMap and map like so:
def apply[A, B](fn: M[A => B], self: M[A]): M[B] =
  fn.flatMap(f => self.map(a => f(a)))
But using flatMap() forces the computations to be sequential, which is undesirable. However, if we implement a custom apply(), then we have two routes to get to the same type that have fundamentally different consequences, which is definitely not desirable.
This is why we don’t want one M[_] for both the applicative and regular “for” syntax.
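The “two routes” problem is easier to observe with error handling than with parallelism, so here is a small sketch by analogy. It uses Either with a list of error messages as M; the names Result, apViaFlatMap and apCustom are mine, and the accumulating behaviour is an assumption chosen to stand in for “does not wait for the first computation”.

```scala
object TwoRoutes {
  type Errors = List[String]
  type Result[A] = Either[Errors, A]

  // Route 1: apply derived from flatMap and map.
  // It never looks at `self` when `fn` has already failed.
  def apViaFlatMap[A, B](fn: Result[A => B], self: Result[A]): Result[B] =
    fn.flatMap(f => self.map(a => f(a)))

  // Route 2: a custom apply that inspects both sides independently
  // and keeps the errors from each.
  def apCustom[A, B](fn: Result[A => B], self: Result[A]): Result[B] =
    (fn, self) match {
      case (Right(f), Right(a)) => Right(f(a))
      case (Left(e1), Left(e2)) => Left(e1 ++ e2)
      case (Left(e1), _)        => Left(e1)
      case (_, Left(e2))        => Left(e2)
    }

  val badFn: Result[Int => Int] = Left(List("no function"))
  val badArg: Result[Int] = Left(List("no argument"))

  // Same inputs, same result type, different answers:
  val derived: Result[Int] = apViaFlatMap(badFn, badArg) // Left(List("no function"))
  val custom: Result[Int] = apCustom(badFn, badArg)      // Left(List("no function", "no argument"))
}
```

When both inputs succeed the two routes agree, which is exactly why the difference is easy to miss. A single M[_] with both a flatMap and a custom apply would leave it unclear which behaviour a given comprehension gets.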
I’m curious to hear if anyone is interested, and if this gets a lot of attention I might make a PR myself.