Ah I see you beat me to it @oleg-py
I have been looking at this as time allowed over the last couple of weeks.
Here’s a quick implementation of the elimination of the final map call in scalac.
Here’s a quick change which implements a simplification of value bindings in scalac.
However, while I was implementing these I ran into a couple of things which are worth discussing here.
Eliminating `map` from for-comprehension
This is a more complicated change than it might seem in terms of downstream breakage.
Consider the following code:
val xs = for { x <- List(1, 2, 3); if x > 3 } yield x
xs.length
Once the `map` call has been eliminated, we have a problem. `withFilter` returns a `FilterMonadic` in Scala <= 2.12 and a `WithFilter` in Scala 2.13. The terminal `map` call is usually what converts the `WithFilter` back into the target collection type, so without it `xs` is no longer a `List` and `xs.length` does not compile. I suspect this change breaks lots of existing code.
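To make the typing issue concrete, here is a minimal sketch, assuming Scala 2.13 and only the standard library. The object and value names are mine, purely for illustration, and I changed the guard to `x > 1` so that the result is non-empty.

```scala
object WithFilterDemo {
  // What the for-comprehension produces today: the trailing map turns the
  // lazy filter wrapper back into a List.
  val viaFor: List[Int] = for { x <- List(1, 2, 3); if x > 1 } yield x

  // The current desugaring, written out by hand.
  val desugared: List[Int] = List(1, 2, 3).withFilter(x => x > 1).map(x => x)

  // Without the final map we are left holding the wrapper
  // (scala.collection.WithFilter[Int, List] on 2.13), which is not a List
  // and has no length method.
  val unmapped = List(1, 2, 3).withFilter(x => x > 1)

  // Calling map is what recovers the collection.
  val recovered: List[Int] = unmapped.map(x => x)
}
```

Uncommenting something like `WithFilterDemo.unmapped.length` should fail to compile, which is the kind of breakage I am worried about.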
We need to decide what to do about that. Should we disable the optimization when there is a filter? Change the filter to a strict `filter` call when it is the final method call in the for comprehension? What if the user code has not implemented `filter` for that type?
The other thing to consider here is whether users will have written code which follows the identity law (for a monad this follows from right identity), i.e. such that `map(x => x)` actually is a no-op. I’m sure that there is user code out there which does not, and which would be affected by this change.
I suspect we would need to run a community build to estimate the amount of breakage.
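As an illustration of code that would notice the difference, here is a small sketch with a made-up `Counted` type (not from any real library) whose `map` is deliberately not a no-op for the identity function:

```scala
object MapIdentityDemo {
  // Hypothetical container that records how many times map was called.
  // It breaks the identity law on purpose: map(x => x) bumps the counter.
  final case class Counted[A](value: A, steps: Int) {
    def map[B](f: A => B): Counted[B] = Counted(f(value), steps + 1)
  }

  val start: Counted[Int] = Counted(1, 0)

  // Today this desugars to start.map(x => x), so the counter is incremented.
  val current: Counted[Int] = for { x <- start } yield x

  // With the trailing map eliminated, the result would simply be start.
  val withoutMap: Counted[Int] = start
}
```

Here `current` and `withoutMap` differ in their `steps` field, so any code relying on that side channel would change behaviour silently rather than fail to compile.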
Simplifying desugaring of value bindings
As @oleg-py says, this is probably less of a priority, but nevertheless there is a problem with this change too. The idea is to alter the desugaring of code like the following:
for {
  x <- List(1, 2, 3)
  y = x + 1
  z <- List(4, 5, 6)
} yield (x + y + z)
from this:
List(1, 2, 3).map { x =>
  val y = x + 1
  (x, y)
}.flatMap {
  case (x, y) =>
    List(4, 5, 6).map { z =>
      x + y + z
    }
}
to this:
List(1, 2, 3).flatMap { x =>
  val y = x + 1
  List(4, 5, 6).map { z =>
    x + y + z
  }
}
In other words, we are trying to change value bindings from a tuple that gets threaded through each call to `map` or `flatMap` to a simple `val` that is in scope for nested `map` and `flatMap` calls.
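For `List` the two desugarings agree on the result, which a quick sketch can check (the object and value names are just for illustration):

```scala
object ValueBindingDemo {
  // The for-comprehension as the user writes it.
  val viaFor: List[Int] =
    for {
      x <- List(1, 2, 3)
      y = x + 1
      z <- List(4, 5, 6)
    } yield x + y + z

  // Current desugaring: y travels alongside x in a tuple.
  val tupled: List[Int] =
    List(1, 2, 3)
      .map { x => val y = x + 1; (x, y) }
      .flatMap { case (x, y) => List(4, 5, 6).map(z => x + y + z) }

  // Proposed desugaring: y is a plain val, in scope for the nested calls.
  val nested: List[Int] =
    List(1, 2, 3).flatMap { x =>
      val y = x + 1
      List(4, 5, 6).map(z => x + y + z)
    }
}
```

The difference is only in the intermediate allocations: the tupled form builds a `List[(Int, Int)]` first, while the nested form does not.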
However, this causes problems for code like the following:
for {
  x <- List(1, 2, 3)
  y = x + 1
  if y > 0
  z <- List(4, 5, 6)
} yield (x + y + z)
The new desugaring causes problems like this:
List(1, 2, 3).withFilter( /* what is y here? how do we write the filter condition? */ ).flatMap { x =>
  val y = x + 1
  List(4, 5, 6).map { z =>
    x + y + z
  }
}
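For comparison, this is roughly how the current tuple-based desugaring copes with a guard after a value binding: `y` is part of the tuple, so `withFilter` can see it. A sketch, with the guard changed to `y > 2` so that the filtering is observable (names are illustrative only):

```scala
object GuardAfterBindingDemo {
  // The for-comprehension as the user writes it.
  val viaFor: List[Int] =
    for {
      x <- List(1, 2, 3)
      y = x + 1
      if y > 2
      z <- List(4, 5, 6)
    } yield x + y + z

  // Current desugaring written by hand: because y is threaded through the
  // tuple, the filter condition has something to refer to.
  val tupled: List[Int] =
    List(1, 2, 3)
      .map { x => val y = x + 1; (x, y) }
      .withFilter { case (_, y) => y > 2 }
      .flatMap { case (x, y) => List(4, 5, 6).map(z => x + y + z) }
}
```

In the simplified desugaring `y` only exists inside the `flatMap` body, which is exactly why there is no obvious place left to call `withFilter`.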
We would need to decide what to do with `if` after value bindings if we proceed with that change.
Personally I would suggest that `if` should only be allowed immediately after a `<-` generator, and I think @odersky has mentioned this solution previously, but once again this seems like it would break a lot of existing code.
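To give an idea of what that restriction would mean for users, here is a sketch of one way affected code might have to be rewritten. Both forms are valid today; the rewrite (my assumption about what a migration could look like, not a proposal) repeats the bound expression inside the guard so that the `if` directly follows the generator:

```scala
object GuardMigrationDemo {
  // Today's style: the guard refers to a value bound after the generator.
  val before: List[Int] =
    for {
      x <- List(1, 2, 3)
      y = x + 1
      if y > 2
      z <- List(4, 5, 6)
    } yield x + y + z

  // Hypothetical migrated style: the guard comes straight after the
  // generator, at the cost of computing x + 1 twice.
  val after: List[Int] =
    for {
      x <- List(1, 2, 3)
      if x + 1 > 2
      y = x + 1
      z <- List(4, 5, 6)
    } yield x + y + z
}
```

The duplication is harmless here, but it would be a real cost if the bound expression were expensive or side-effecting.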
Alternatives
- Should we change the `for` comprehension at all? Perhaps we should investigate whether we can repurpose `do`, as it seems to see little use (though my perception about this might be totally wrong).
- Should we investigate whether we could add a completely different kind of comprehension syntax? There are lots of interesting ideas in the FP space now, such as F# computation expressions and OCaml’s new comprehension syntax.