Pre-SIP: if-else Guards in For-Comprehensions with Custom Error Values

Motivation

Either - if guards are unsupported

In for comprehensions over Either, there is no idiomatic way to short-circuit the expression to a Left if a predicate does not hold. The current way to do this in stdlib looks something like this:

for {
  x <- getX
  _ <- if (x % 2 == 0) Right(()) else Left("x must be even")
} yield x      

This requires a dummy _ binding and a Right(()) acting as a ‘success sentinel’, both of which obscure the purpose of the expression and the predicate itself.

A clearer way to express this could look like:

for {
  x <- getX 
  if x % 2 == 0 else "x must be even"
} yield x      

Try - if guards produce non-descriptive failures

Guards are supported in for comprehensions over Try, as Try implements withFilter. So it is possible to write:

for {
  x <- tryX
  if x % 2 == 0
} yield x

However, they produce a generic Failure when the predicate is false:

Failure(NoSuchElementException("Predicate does not hold for x"))

This is a generic, non-descriptive error that gives the caller no information about which predicate failed or why. If the same if pred else error pattern were made available for Try, a more informative failure could be returned, for example:

for {
  x <- tryX
  if x % 2 == 0 else IllegalArgumentException("x must be even")
} yield x
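For comparison, both behaviours can be reproduced with today's stdlib. The sketch below uses hypothetical helper names (guardedTry, descriptiveTry); the second helper falls back on the dummy-binding workaround, since the if ... else form is only a proposal:

```scala
import scala.util.{Failure, Success, Try}

// Existing guard: Try.withFilter chooses the failure, not the caller.
def guardedTry(tryX: Try[Int]): Try[Int] =
  for {
    x <- tryX
    if x % 2 == 0
  } yield x

// Workaround available today: a dummy binding that carries a descriptive failure.
def descriptiveTry(tryX: Try[Int]): Try[Int] =
  for {
    x <- tryX
    _ <- if (x % 2 == 0) Success(()) else Failure(new IllegalArgumentException("x must be even"))
  } yield x
```

With Success(3) as input, guardedTry fails with a NoSuchElementException, while descriptiveTry fails with the caller's IllegalArgumentException.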

Background

In for-expressions, if (predicate) is desugared to .withFilter(predicate). withFilter is defined for collections, Option, and Try, but not for Either.
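As a small illustration of that desugaring (the value names here are only for the example), a guard over a List and its hand-written withFilter equivalent produce the same result:

```scala
val xs = List(1, 2, 3, 4)

// Guard written with for-comprehension syntax.
val viaFor: List[Int] = for { x <- xs if x % 2 == 0 } yield x * 10

// Roughly what the compiler generates for the line above.
val viaWithFilter: List[Int] = xs.withFilter(x => x % 2 == 0).map(x => x * 10)

// The same guard over an Either is rejected at compile time,
// because Either has no withFilter member.
```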

Both Try and Either represent computations that can produce a failure. The key difference is that Try represents computations that may throw exceptions, capturing them as Throwable, while Either[L, R] supports custom error types L.

Therefore, Either cannot support withFilter because a predicate failure would require producing a value of type L, but withFilter has no context-independent way to construct such an L.

Meanwhile, withFilter on Try can only produce a generic Throwable, again because withFilter carries no information about what failure to produce when the predicate returns false.

Proposed improvement

Language change

The key change would be to extend the for comprehension grammar to allow if pred else error and define its desugaring to .withFilterOrElse(pred, error). The existing rule for if without an else would not need to change, making it backwards-compatible.

stdlib change

.withFilterOrElse(pred, error) would need to be defined on Either and Try in the stdlib. This could be implemented as follows for both:

// Either
def withFilterOrElse[A1 >: A](pred: B => Boolean, error: => A1): WithFilterOrElse[A1, B]    

// Try
def withFilterOrElse(pred: T => Boolean, error: => Throwable): WithFilterOrElse[T]
                                                                                      

In each case, the function returns a WithFilterOrElse object that supports map, flatMap, foreach, and withFilterOrElse, composed with filterOrElse(pred, error) to align with the existing WithFilter implementation and avoid intermediate allocations. For Try, it would also need to support the existing withFilter method.
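To make the shape of this concrete, here is a minimal sketch for Either, written as an extension so it compiles against the current stdlib. The class name EitherWithFilterOrElse and the helper getX are assumptions made for illustration, and the sketch simply re-applies filterOrElse on each call rather than attempting the allocation savings described above:

```scala
// Hypothetical sketch: none of these names exist in the stdlib today.
final class EitherWithFilterOrElse[A, B](self: Either[A, B], pred: B => Boolean, error: () => A) {
  // Apply the pending guard by delegating to the existing Either.filterOrElse.
  private def filtered: Either[A, B] = self.filterOrElse(pred, error())

  def map[B1](f: B => B1): Either[A, B1] = filtered.map(f)
  def flatMap[A1 >: A, B1](f: B => Either[A1, B1]): Either[A1, B1] = filtered.flatMap(f)
  def foreach[U](f: B => U): Unit = filtered.foreach(f)
  def withFilterOrElse[A1 >: A](p: B => Boolean, e: => A1): EitherWithFilterOrElse[A1, B] =
    new EitherWithFilterOrElse[A1, B](filtered, p, () => e)
}

extension [A, B](self: Either[A, B]) {
  def withFilterOrElse[A1 >: A](pred: B => Boolean, error: => A1): EitherWithFilterOrElse[A1, B] =
    new EitherWithFilterOrElse[A1, B](self, pred, () => error)
}

// Hypothetical input for the example.
def getX: Either[String, Int] = Right(3)

// What `for { x <- getX; if x % 2 == 0 else "x must be even" } yield x`
// would desugar to under the proposal.
val desugared: Either[String, Int] =
  getX.withFilterOrElse(x => x % 2 == 0, "x must be even").map(x => x)
```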

For Either, filterOrElse already exists in stdlib, so WithFilterOrElse can delegate to it directly. For Try, an equivalent method would need to be added first:

def filterOrElse(pred: T => Boolean, error: => Throwable): Try[T] = this match {
  case Success(t) if !pred(t) => Failure(error)
  case _                      => this
}
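The method can be tried out today as an extension, since Try itself cannot be edited from user code. This is only a stand-in for the proposed member, and unlike Try.filter it does not catch exceptions thrown by the predicate:

```scala
import scala.util.{Failure, Success, Try}

extension [T](self: Try[T]) {
  // Stand-in for the proposed Try.filterOrElse.
  def filterOrElse(pred: T => Boolean, error: => Throwable): Try[T] = self match {
    case Success(t) if !pred(t) => Failure(error)
    case _                      => self
  }
}

val rejected: Try[Int] =
  Success(3).filterOrElse(_ % 2 == 0, new IllegalArgumentException("x must be even"))
```

The error argument is by-name, so the exception is only constructed when the predicate actually fails.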

While this proposal is limited to stdlib’s Either and Try, the desugaring change would enable any type that defines .withFilterOrElse to use the if-else syntax in for comprehensions. For example, it would be a natural addition to the MonadError typeclass in Cats, or ZIO’s IO type.

Why don’t you just

for {
  x <- getX
  _ <- Either.cond(x % 2 == 0, (), "x must be even")
} yield x

I generally do use Either.cond or Either.raiseUnless from Cats in practice, but I intentionally restricted my example to stdlib methods.
Even then, you need to use a generator expression and bind the result to a placeholder value _ <- ....
To me this is still less clear than the proposed solution, as the generator syntax is for binding a value, rather than validating a condition.

In practice I almost never use if-guards in for comprehensions; now that you bring it up, this gap may be part of why I trained myself out of doing so – the results tend to be insufficient for serious work.

It’s a really intriguing proposal – especially if Cats and ZIO were adapted to make use of it, I could imagine using this pattern fairly routinely…

+1 on the lack of this making for-comprehensions hard to use with types that need to produce an error instead of filtering. Either.cond exists, but as it stands there isn’t a standard formulation for this method unless you bring in another library. And having if/else syntax feels pretty natural.

btw, conceptually the same as ‘Step 2’ in From freeing \leftarrow. to better for and unification of direct and monadic style

I think the problem with the suggested approach is that it can be semantically ambiguous. For example:

def getX: Either[Int, Int] = ???

for
  x <- getX
  y <- if x > 0 else 0
yield y

That can be perceived as this (the proposed approach):

getX.flatMap:
  case x if x > 0 => Right(x)
  case _          => Left(0)

or simply like this:

getX.map:
  case x if x > 0 => x
  case _          => 0

Both of them may make sense in certain contexts. However, the latter perception looks more generic because it can work with types like Option, Seq, etc., whereas the former one works for a limited set of types only. Therefore the syntax can be quite confusing without some kind of disambiguation.
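The two readings can be written out with today's API to see where they diverge. The function names asGuard and asDefault are made up for this comparison:

```scala
// Reading 1 (the proposal): a failed predicate short-circuits to Left.
def asGuard(e: Either[Int, Int]): Either[Int, Int] =
  e.flatMap(x => if (x > 0) Right(x) else Left(0))

// Reading 2: a failed predicate substitutes a default and stays on the Right.
def asDefault(e: Either[Int, Int]): Either[Int, Int] =
  e.map(x => if (x > 0) x else 0)
```

For the input Right(-5), asGuard gives Left(0) and asDefault gives Right(0), so the same surface syntax would have to pick one of two observably different results.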

That said, I believe Scala should get more syntactic conveniences for “for” comprehensions. Currently it is very limited. The question is how to make it generic enough so that it would work for a broad range of types including Future, Stream, IO, ZIO, etc.

The current if/withFilter guard only works for types that represent “non empty” or “empty,” such as collections (including Stream) and Option.

Similarly, the proposed if-else/withFilterOrElse guard would only work for types that represent “succeeded” or “failed”: Either, Try, Future, IO, ZIO etc.

These two categories cover the types where a conditional guard in a for comprehension would make sense, and I agree that making the if-else syntax generic over both would make the syntax ambiguous. The type that the for comprehension is over provides the disambiguation, just as it does for flatMap and map.

I should note that I overlooked Future in the original proposal. It actually has the same limited withFilter implementation as Try: it produces a generic NoSuchElementException if the predicate is false.
I doubt if guards are used much in practice on either Try or Future for this reason. withFilterOrElse would fit Future in the same way.
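The Future behaviour can be checked with a short stdlib-only snippet. The value names are illustrative, and the global execution context and the 5 second timeout are arbitrary choices for the example:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.*
import scala.util.Try

// A guard over Future desugars to Future.withFilter, just as for Try.
val guardedFuture: Future[Int] =
  for { x <- Future.successful(3) if x % 2 == 0 } yield x

// Await rethrows the failure; wrapping it in Try lets us inspect it.
val outcome: Try[Int] = Try(Await.result(guardedFuture, 5.seconds))
```

Here outcome is a Failure holding a NoSuchElementException, with nothing in it that identifies the predicate.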

I agree that this does not address all the current limitations of for comprehensions, and something like @rssh’s suggestion could provide more benefits. However, this change is narrow and follows existing patterns, and I think it could be valuable as-is.

My biggest problem with for-comprehensions is that they have their own weird rules, and this is another weird rule: right <- monad if p(right) else newLeft. I would rather extend Boolean with |? and write

for
  right <- sumtype
  _ <- p(right) |? Left(newLeft)
yield f(right)
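One possible shape for such an operator is sketched below. This is an assumption about how |? could be defined, not the implementation from the linked example, and the check function is only there to exercise it:

```scala
extension (cond: Boolean) {
  // Hypothetical |?: continue with Right(()) when cond holds, otherwise use the given Left.
  def |?[L](left: => Left[L, Nothing]): Either[L, Unit] =
    if (cond) Right(()) else left
}

def check(sumtype: Either[String, Int]): Either[String, Int] =
  for {
    right <- sumtype
    _ <- (right > 0) |? Left("must be positive")
  } yield right * 2
```

Because the parameter is typed as Left, passing a Right or an arbitrary Either on the failure branch is a compile error.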

But even more than that, I’m not sure that this is the way to lean for the language as a whole, because the direct version works with normal rules instead of for-comprehension rules:

sumtype.boundary:
  val right = monad.?               // Normal assignment
  if !p(right) then Left(newLeft).? // Normal conditional
  f(right)                          // Final value is answer
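A rough equivalent can be built on scala.util.boundary, assuming Scala 3.3 or later. The .? extension and the process function below are hypothetical stand-ins written for this sketch, not the library referred to in this post:

```scala
import scala.util.boundary
import scala.util.boundary.{Label, break}

extension [L, R](e: Either[L, R]) {
  // Hypothetical .?: unwrap a Right, or exit the enclosing boundary with the Left.
  def ?(using Label[Either[L, Nothing]]): R = e match {
    case Right(r) => r
    case Left(l)  => break[Either[L, Nothing]](Left(l))
  }
}

def process(sumtype: Either[String, Int]): Either[String, Int] =
  boundary[Either[String, Int]] {
    val right = sumtype.?                               // normal assignment
    if (right <= 0) Left("must be positive").?          // normal conditional
    Right(right * 2)                                    // final value is the answer
  }
```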

I code with a library that enables this, and I find that having only one set of rules to think about simplifies things significantly. The main problem with this is that you could jam a future into the middle of it and it wouldn’t know to not allow the exit attempt with Left(newLeft).? whereas with the for comprehension you’d get a type error. Capability tracking can potentially solve this. Even without it solved, it’s still awfully useful.

So, while I don’t want to tell people who love for comprehensions that they should have a worse experience than necessary, I would encourage thinking hard about (1) how can you solve the problem at the library level (e.g. my |? which you could rename orLeft if you don’t want to wrap in Left and don’t like symbols), and (2) whether adding yet more ways that for comprehensions are their own thing with different rules is actually good for the language.

(Runnable example of alternate solutions.)

Thank you for the alternative suggestions.
For the first suggestion, I had already considered something similar: specifically, an extension on Boolean for ApplicativeError types from Cats:


extension (b: Boolean)
  infix def orRaise[F[_], E](e: => E)(implicit F: ApplicativeError[F, E]): F[Unit] = F.raiseUnless(b)(e)

This would enable the syntax you suggested for any ApplicativeError type, such as Either, Try, or IO, without needing to write Left / Failure / IO.raiseError explicitly:

for
   success <- applicative
   _       <- p(success) orRaise newError
yield success

This still uses a dummy binding, which was one of the motivations for the original proposal. Moreover, I think starting with ‘if’ would make the intention of this statement clearer. But I do find it nicer than the current way of doing things like using Either.cond, and this can be achieved in a library, so I think it’s worth considering.

For the second alternative, I’m not very familiar with direct style, so that is something I will look into. I am put off by the lack of type safety when mixing multiple effect types, but as you say this could be solved in the future.

I agree that making a lot of special rules for for-comprehensions makes them harder to learn and reason about, and is not the right direction to go in. And I agree that if desugaring to withFilter inside for-comprehensions is a weird rule. However, this rule isn’t going away, and what I’m suggesting is an extension to it that makes it useful across more types, rather than an entirely new rule to learn.
In any case, making for comprehensions more expressive (even if this means introducing rules specifically for them) does not preclude direct style from becoming the main idiom in the future.

I thought more about this, and I agree that the example you shared is ambiguous. But I would argue that the ambiguity comes from having an Either[A,A] which is ambiguous on its own.

Normally it’s not ambiguous at all! That’s a key property of sum types (as opposed to union types): the two sides add! So Left(a) is not Right(a) and it’s only ambiguous when you try to take shortcuts that hide which side is which.

Unfortunately, this is what has to be kept clear–Either[A, B] is not simply a long way to spell A | B. And that’s where the ambiguity comes from:

right <- sumtype if p(right) else leftValue
right <- sumtype if p(right) else defaultRightValue
right <- sumtype if p(right) else backupSumtype

All of these seem pretty natural to me. Which one it actually means is something one has to learn and remember. (The second two rebind the right name so as to avoid error-prone clutter in later steps. If you didn’t know, you can actually rebind names in a for comprehension already, because it desugars to lambdas with names that shadow each other.)

If you like the word if, even though it’s not necessary for the logic if we’re creating a new unnamed item, you can create that too (see example 3). I chose the no-helper-object flavor (orLeft) because it seemed more compact. The opaque type helper object pattern can be used for a variety of syntaxes (e.g. continue (p) stop -1, or inverted as stop (-1) unless p).

Anyway, given how much flexibility Scala has, I think you can get to an 80% solution just by employing regular features; and then the question is whether the extra flexibility is worth it. Just noting that for could do more is important but not enough, I think. It needs to be that the very best alternatives that can be managed still fall short in critical ways.

That’s certainly true! But making for comprehensions more tricky is something that needs to be tackled by everyone who reads code, even if direct style grows in prevalence. So it reduces but does not eliminate the concern.

Personally, I’m fine with the syntax addition. I mainly wanted to make sure we had all the obvious-to-me alternatives on the table.

Thank you for clarifying the misconception on my part, and I now concede that the syntax in its proposed form is ambiguous. The example you provided makes that clear.
For Either specifically, this ambiguity could be avoided with the following type signature for withFilterOrElse:

def withFilterOrElse[A1 >: A](pred: B => Boolean, error: => Left[A1, B])

Then in your example:

right <- sumtype if p(right) else Left(leftValue) // compiles
right <- sumtype if p(right) else Right(defaultRightValue) // doesn't compile 
right <- sumtype if p(right) else backupSumtype // doesn't compile 
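The effect of the Left-typed parameter can be tried with an eager extension method. The name guardOrLeft is hypothetical and stands in for the proposed withFilterOrElse, whose return type is not spelled out above:

```scala
extension [A, B](self: Either[A, B]) {
  // Eager stand-in: the failure branch must be a Left, enforced by the parameter type.
  def guardOrLeft[A1 >: A](pred: B => Boolean, error: => Left[A1, B]): Either[A1, B] =
    self match {
      case Right(b) if !pred(b) => error
      case other                => other
    }
}

val sumtype: Either[String, Int] = Right(-2)

val guarded: Either[String, Int] = sumtype.guardOrLeft(_ > 0, Left("not positive"))

// Rejected by the type checker, since neither argument is a Left:
//   sumtype.guardOrLeft(_ > 0, Right(0))
//   sumtype.guardOrLeft(_ > 0, sumtype)
```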

For types that have Throwable as an error type, like Future, Try and IO, this ambiguity should never arise, so their implementations of withFilterOrElse can be as originally described.

To give another example for ZIO, which is also a sum type but lacks distinct subtypes like Left / Right, you could require the error parameter type to match the return type of ZIO.fail:

def withFilterOrElse(pred: A => Boolean, error: => IO[E, Nothing]): ZIO[R, E, A]

Then for example:

io <- zio if p(io) else ZIO.fail(errorValue) // compiles
io <- zio if p(io) else ZIO.succeed(successValue) // doesn't compile
io <- zio if p(io) else backupZio // doesn't compile 

In any case, it’s up to each library to decide whether to add this kind of disambiguation, or to simply assume that the error and success types are distinct (as they are for Throwable-based types). Within stdlib, the Left disambiguation is an improvement on the original proposal for Either, and I appreciate you and @satorg pointing out the need for it.

The argument still needs to be made that

for
  right <- sumtype if p(right) else Left(leftValue)

is so much better than

for
  right <- sumtype
  _ <- If(p(right)) Else Left(leftValue)

that the former needs to be added, even though the latter works at the library level already. Especially since you can pick less ambiguous names for the latter, e.g. pass/fail.