I’m opening this thread to discuss improvements we should consider w.r.t. the constructs we can use in Scala to express control flow.
Since my list of grievances is quite long, I expect the features pitched in this thread to turn into one or several (or none!) [Pre-]SIPs, depending on feedback. Further, while my post does propose solutions at the end, I’m more interested to know whether the community agrees that there is something to be improved before I invest more time into making sure we can come up with changes having the right syntax and semantics.
I note that most of the ideas I’m about to present have been workshopped with @EugeneFlesselle.
Motivation
This section makes the case for defining new constructs to express conditional control flow in Scala. I will start by relating pattern matching to conditional expressions to justify the general need for “concise” match expressions. Then I’ll present use cases for new constructs.
The case for conditional expression
Let me start by doing something a little weird: justify the existence of if!
Pattern matching is one of the most expressive forms of conditional control flow. As such, it can be used as a building block to express many other forms of conditional statements. For example, if a then foo else bar is merely sugar:
a match
  case true => foo
  case _ => bar
As we can see, there is a trivial way to write a conditional expression as a match over a Boolean scrutinee and vice-versa. Given this observation, one may then wonder why we have conditional expressions in the first place. There are several answers to this question:
First, a conditional expression is just more concise than its equivalent match expression, and concision matters because of language ergonomics. When we see if a then 1 else 2, we can immediately recognize that we’re looking at a dichotomous choice. The match expression is more convoluted in contrast.
Second, a conditional expression is more “structured”. There are only two cases: either a holds or it does not, the branch that comes first (i.e., right after the condition) is always the one running if the condition holds, and the other branch is always the one running otherwise. Hence, we simply have less to think about because the syntax of the construct rules out many scenarios that a general-purpose match doesn’t.
Third, a conditional expression doesn’t have to have an else branch:
def computeThings(): T =
  if shouldLog then println("about to compute things")
  proceed()
The equivalent match expression has a case false that just produces a unit. Writing such a case would simply introduce noise in our code.
Lastly, let’s recognize the convenience of “else if”. For example, consider the following conditional expression:
if a1 then
  foo
else if a2 then
  bar
else
  zim
There are two ways to rewrite it as match expressions:
// (1)
a1 match
  case true => foo
  case _ =>
    a2 match
      case true => bar
      case _ => zim

// (2)
(a1, a2) match
  case (true, _) => foo
  case (_, true) => bar
  case _ => zim
The first case isn’t ideal because it introduces excessive nesting, making it difficult to quickly identify control flow paths. The situation is manageable here but imagine that we had 3 other else if branches attached to the original expression!
The second case isn’t great either because it causes the reader to go back to the expression of the scrutinee for every case. Using else if, one can more easily focus on just the condition of the current branch while reading code from top to bottom. Further, the second version has slightly different semantics from the first one. In particular, a2 is evaluated “no matter what”, which may be undesirable because of performance and/or side effects.
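To make the difference in evaluation concrete, here is a minimal sketch. The cond helper, the evaluated log, and the string results are assumptions introduced for illustration only; by-name parameters stand in for arbitrary condition expressions.

```scala
// Hypothetical log of which conditions were evaluated, in order.
var evaluated: List[String] = Nil

// Hypothetical helper: records that the condition `name` was evaluated.
def cond(name: String, value: Boolean): Boolean =
  evaluated = evaluated :+ name
  value

// (1) Nested matches: `a2` is evaluated only if `a1` is false.
def nested(a1: => Boolean, a2: => Boolean): String =
  a1 match
    case true => "foo"
    case _ =>
      a2 match
        case true => "bar"
        case _ => "zim"

// (2) Tuple scrutinee: both `a1` and `a2` are evaluated before any case is tried.
def tupled(a1: => Boolean, a2: => Boolean): String =
  (a1, a2) match
    case (true, _) => "foo"
    case (_, true) => "bar"
    case _ => "zim"
```

Calling nested(cond("a1", true), cond("a2", true)) leaves only "a1" in the log, whereas the same call to tupled records both "a1" and "a2", even though both calls produce the same result.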
Patterns in conditional expressions
Now that we’re hopefully convinced that conditional expressions are useful despite being merely syntax sugar, let’s talk about conditions. After all, not all scrutinees are Boolean values.
Let’s imagine we have a type Version defined as follows:
enum Version:
  case Legacy
  case Stable(major: Int, minor: Int)
Now, imagine we want to run code only for a stable version greater than or equal to 2:
v match
  case Version.Stable(major, _) if major >= 2 => foo
  case _ => bar
We had no choice but to use a match expression here. The problem is that our “condition” is not a simple Boolean value, and therefore we can no longer sweeten our program with those tasty conditional expressions. That is unfortunate because all the reasons we used to justify them in the previous section still apply here.
Though it is not a Boolean value, the pattern is notionally a predicate. In fact, we could rewrite the match as below, so that we could go back to a conditional expression:
extension (self: Version)
  def >= (m: Int): Boolean =
    self match
      case Version.Stable(major, _) if major >= m => true
      case _ => false

if v >= 2 then foo else bar
Sadly the “trick” is quite underwhelming as we’ve done nothing more than moving code around. Further, not having access to major in the then branch might be a loss, as we will discuss later.
Refining matches
One fundamental problem that match addresses and if doesn’t is that it lets us extract a value from the scrutinee and give it a name for use in another expression. For example:
v match
  case Version.Stable(major, minor) if major >= 2 =>
    println(s"running version ${major}.${minor}")
  case _ => ()
But again, this match expression would be arguably better expressed as a conditional expression. In particular, since nothing is executed unless the condition holds, the “catch-all” branch is just pure noise.
It is interesting to observe that the inability to bind values in conditional expressions also impedes expressiveness in pattern guards. For example, assuming v is an instance of Option[Version]:
v match
  case Some(x) =>
    arbitraryComputation(x) match
      case Version.Stable(major, minor) if major >= 2 =>
        println(s"running version ${major}.${minor}")
      case _ => ()
  case _ => ()
Because the version is wrapped in an Option, we now have to introduce additional nesting to first extract it and give it a name. We need this name because arbitraryComputation is just a function and not an extractor. And obviously, the problem grows with the number of values that must be unwrapped in general!
I suppose we could use some more dark extractor magic to use an extractor in this specific case. However, that would be “costly” for something that could be a one-liner with language support, as we’ll see later. Further, I am not convinced this magic would be powerful enough to deal with variables matched from different levels of nested matches.
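For reference, here is a minimal sketch of what such an extractor could look like in this specific case. The Computed object and the describe function are hypothetical names, and arbitraryComputation is given an assumed identity body only so that the example is self-contained.

```scala
enum Version:
  case Legacy
  case Stable(major: Int, minor: Int)

// Stand-in for the function from the text; its body is an assumption.
def arbitraryComputation(x: Version): Version = x

// Hypothetical extractor that applies `arbitraryComputation` before the
// nested pattern is tried, removing the need for a nested match.
object Computed:
  def unapply(x: Version): Some[Version] = Some(arbitraryComputation(x))

def describe(v: Option[Version]): String =
  v match
    case Some(Computed(Version.Stable(major, minor))) if major >= 2 =>
      s"running version $major.$minor"
    case _ => "nothing to report"
```

The nesting is gone, but only at the price of one dedicated extractor object per function that we want to apply in the middle of a pattern.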
Generalizing from the above example use case, we remark that we sometimes need to write match expressions that “refine” the result of a previous match. For example:
v match
  case Some(x) if arbitraryPredicate(x) =>
    arbitraryComputation(x) match
      case Version.Stable(major, minor) if major >= 2 =>
        foo(major, minor)
      case Version.Legacy =>
        bar
      case _ =>
        zim
  case _ =>
    zim
The above example is a slight variation of the one shown in the previous section. The most significant difference is that we actually want the conditional behavior on x to be expressed as a match expression as it involves a non-dichotomous choice. There are two cases of interest:
- v is Some(x) and x is a stable version greater than or equal to 2.0
- v is Some(x) and x is a legacy version
In any other case, we run zim, which here stands in for some kind of default behavior.
In addition to the excessive indentation caused by the inner match, the problem with this expression is that it needs two occurrences of zim. It is manageable in this small example, but zim could be a much larger expression, potentially spanning many lines of code. Further, the problem grows with the number of non-exhaustive nested matches that we introduce, as zim will have to appear in all “catch-all” cases.
Also note that while all “catch-all” cases are stacked at the end in the above example, it is easy to imagine situations where one zim would end up in the middle of the expression:
n match
  case A(x) =>
    f(x) match
      case Z => foo
      case X => bar
      case _ => zim // <- `zim` is in the middle
  case B(x) =>
    f(x) match
      case Y => foo
      case _ => zim
  case _ => zim
One solution is to refactor zim into a function that gets called in all “catch-all” cases. Unfortunately, that is inconvenient if zim uses many variables from the local scope and impractical if it modifies local state (ah, if only Scala supported call-by-value-result!)
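As a minimal sketch of that workaround, the example above can be rewritten with a local function. The Node and Tag enums, the classifier f, and the string results are all assumptions made so that the example is self-contained.

```scala
enum Node:
  case A(x: Int)
  case B(x: Int)
  case C

enum Tag:
  case X, Y, Z

// Hypothetical classifier standing in for `f` in the text.
def f(x: Int): Tag =
  if x < 0 then Tag.X else if x == 0 then Tag.Y else Tag.Z

def handle(n: Node): String =
  // The body of `zim` is written once, but a call still has to appear in
  // every "catch-all" case.
  def zim(): String = "zim"
  n match
    case Node.A(x) =>
      f(x) match
        case Tag.Z => "foo"
        case Tag.X => "bar"
        case _ => zim()
    case Node.B(x) =>
      f(x) match
        case Tag.Y => "foo"
        case _ => zim()
    case _ => zim()
```

A local function can see the definitions of the enclosing method, but anything bound by the patterns of an individual case (such as x) would have to be passed explicitly as an argument.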
Bailing out
It sometimes happens that we want to test for some conditions at the top of a function before running the remainder of its body. For example, we may want to check for some preconditions:
/** Computes very important things.
 *
 *  @param v The client's version, must be greater than or equal to 2.0.
 *  @param w The server's version, must be greater than or equal to the client's.
 */
def computeThings(v: Version, w: Version): T =
  v match
    case Version.Stable(client, _) if client >= 2 =>
      if w >= v then
        // many lines
        if client >= 3 then
          newTechnique()
        else
          oldTechnique()
      else throw InvalidServerVersion(w)
    case _ => throw InvalidClientVersion(v)
The function starts by verifying that its preconditions are satisfied, throwing exceptions if they aren’t. Then, it goes on to run its logic, which happens to require a value extracted from v.
Having the code written that way is sub-optimal for two reasons:
- Most of the function’s body is nested in a match case and excessive indentation is harmful.
- The handling of precondition failures is far from precondition checks.
As we can see, both of these issues grow with the number of precondition checks. The body of the function is nested at least once per test (twice in the case of a match) and it gets difficult to relate the exception thrown to the corresponding test. Note also that both conditional and match expressions introduce these issues.
The setup of computeThings makes it very difficult to refactor. Because we need the client value to run its core logic, there’s at least one level of indentation that is unavoidable. The conditional expression could be written differently though:
if w < v then throw InvalidServerVersion(w)
// many lines
By inverting the condition, we can “bail out” early and write the remainder of the function without additional nesting. That works because throwing an exception is a way to give up control flow. I’ll show how this idea can be generalized later.
For comprehension to the rescue?
At first glance, it may seem like at least some of the issues discussed above are addressed by for comprehensions. For example, one can eliminate match expressions used to extract values from options:
def g(v: Option[Int], w: Option[Int]): Option[Int] =
  for x <- v; y <- w if y >= x yield
    x + y
Unfortunately, there are still quite a few problems that for does not solve:
- We can’t write an “else” branch that would be executed if one of the enumerators does not produce a value;
- We can’t use a for comprehension as a pattern guard;
- Filtering does not generalize well to all patterns.
The last issue is particularly vexing because it is quite tempting to write the following:
for case Version.Stable(major, minor) <- v if (major >= 2) do
  println(s"running version ${major}.${minor}")
Note that it is possible to get this expression working if v is wrapped in a sequence. But this trick reveals that for, even with a couple of changes, is perhaps ill-suited to fill the design gaps that I’ve discussed so far. AFAIU, the feature is fundamentally about handling enumerators rather than performing arbitrary pattern matching.
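A minimal sketch of the sequence-wrapping trick follows. The report function is a hypothetical name, and the example uses yield instead of do so that the result can be inspected; the generator and guard are otherwise the same as above.

```scala
enum Version:
  case Legacy
  case Stable(major: Int, minor: Int)

// Wrapping `v` in a single-element list turns it into something `for` can
// enumerate; the `case` generator then filters out non-matching values.
def report(v: Version): List[String] =
  for
    case Version.Stable(major, minor) <- List(v)
    if major >= 2
  yield s"running version $major.$minor"
```

The result is an empty list when the pattern or the guard fails, which illustrates the point: the construct enumerates and filters a collection rather than choosing between branches.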
What have others done?
Let’s make a detour to other languages before we discuss possible solutions for Scala.
Patterns in conditional expressions
In Swift, we can use patterns in conditional expressions, which then serve as conditions. The bindings introduced by the patterns are defined only in the first branch. Multiple conditions can appear separated by a comma, and mixing pattern matches and Boolean expressions is allowed.
if case Version.stable(let major, _) = v, major >= 2 {
  foo()
} else {
  bar()
}
Note that extracted values can be bound to let or var (which are equivalent to Scala’s val and var, respectively) to control the mutability of the local variable.
Rust also admits patterns in conditional expressions, although multiple conditions are not supported. The language therefore suffers the same issue as Scala with respect to nested matches: zim has to be repeated.
if let Version::Stable(major, _) = v {
  if major >= 2 { foo() } else { zim() }
} else {
  zim()
}
In this paper, Lionel Parreaux describes the so-called “ultimate conditional syntax” for MLscript that supports patterns in conditional expressions. Multiple conditions can appear separated by and:
if v is Stable(major, _) and major >= 2
  then foo()
  else bar()
Refining matches
Since Parreaux’s conditional unifies the syntax of conditional and match expressions, nested patterns are expressed as naturally as and-separated sequences of conditions in a “multi-way conditional expression”:
if v is
  Some(x) and x is
    Stable(major, minor) then foo(major, minor)
    Legacy then bar()
  else zim()
Bailing out
In Swift, a guard statement can be used to test for a condition and break control flow if it does not hold. It has a single branch, introduced by else.
func computeThings(_ x: Int) throws -> T {
  guard x >= 0 else { throw InvalidArgument(x) }
  // many lines
  return T()
}
Just like in regular conditional expressions, the condition can contain patterns. Bindings introduced by these patterns are available after the guard statement, but not in its branch.
func computeThings(v: Version, w: Version) throws -> T {
  guard case Version.stable(let client, _) = v, client >= 2 else { throw InvalidClientVersion(v) }
  guard w >= v else { throw InvalidServerVersion(w) }
  // many lines
  return if client >= 3 {
    newTechnique()
  } else {
    oldTechnique()
  }
}
Rust has a similar construction. Sadly it also suffers from the inability to write multiple conditions.
fn computeThings(v: Version, w: Version) -> i64 {
  let Version::Stable(major, _) = v else { panic!("invalid client version"); };
  // ...
}
Proposed solutions
Let’s now examine possible solutions to address the three shortcomings introduced in the first section. These are pretty straightforward and directly inspired by precedent in other languages.
Patterns in conditional expressions
In both Swift and Rust, pattern matches in conditional expressions require an inversion of the pattern and the scrutinee. For example, in Swift:
// match form
switch v {
case Version.stable(let major, _) where major >= 2:
  print(major)
default:
  break
}

// conditional form
if case Version.stable(let major, _) = v, major >= 2 {
  print(major)
}
This inversion may look surprising. Hence, it may be reasonable to try to avoid it in Scala, just like in Parreaux’s “ultimate conditional syntax”:
// without pattern/scrutinee inversion
with v match case Version.Stable(major, _) if major >= 2 then
  println(major)
The reason for introducing the construct with with is that an expression of the form if v match case ... is currently valid Scala, provided that the cases are wrapped in braces, only with different semantics. The additional condition on the result of the match is introduced with if for consistency with pattern guards.
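To illustrate the existing meaning of that form, here is a minimal sketch in current Scala 3 syntax. The check function and the string results are assumptions made for illustration.

```scala
enum Version:
  case Legacy
  case Stable(major: Int, minor: Int)

// Today, the whole match is the condition of the `if`: it has to produce a
// Boolean, and `major` is not in scope in the `then` branch.
def check(v: Version): String =
  if v match { case Version.Stable(major, _) => major >= 2; case _ => false } then "foo" else "bar"
```

Here the match only computes the condition, so the bindings cannot be used in either branch, which is why reusing if as the introducer would clash with code that is already accepted.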
If we conceded the inversion of the pattern and the scrutinee, then we could use a syntax similar to that of Swift and Rust:
// with pattern/scrutinee inversion
if val Version.Stable(major, _) = v if major >= 2 then
  println(major)
This form has a few advantages that perhaps make up for the surprising inversion:
- if is a more recognizable introducer for a conditional expression;
- val and var can be used to introduce immutable and mutable bindings, respectively;
- the syntax is consistent with that of vals and vars introduced with irrefutable patterns;
- both additional matches and Boolean expressions are introduced with the same keyword.
Refining matches
A case in a match expression can be followed by another match expression, introduced with with to avoid ambiguity with pattern guards:
v match
  case Some(x) with x match
    case Version.Stable(major, minor) =>
      foo(major, minor)
    case Version.Legacy =>
      bar
  case _ =>
    zim
The cases of the nested expressions need not be exhaustive. If none of them match, then control flow returns to the outer expression and proceeds as though the current case had not been matched.
The nested matches could obviously introduce side effects, but that wouldn’t be new. One can already write arbitrary side effects in a pattern guard (and you had better make sure they don’t change the scrutinee!)
Bailing out
A val or var introduced with a refutable pattern can be followed by an expression returning control flow, introduced by else. Conditions and nested matches can follow the pattern, introduced by if and with, respectively. The expression in the else branch must produce Nothing.
def computeThings(v: Version, w: Version): T =
  val Version.Stable(client, _) = v if client >= 2 else throw InvalidClientVersion(v)
  if w < v then throw InvalidServerVersion(w)
  // many lines
  if client >= 3 then
    newTechnique()
  else
    oldTechnique()