QoL: Sound binding in pattern alternatives

Sporarum · June 19, 2023, 1:01pm

(This is about pattern alternatives, not union types !)

enum Tree:
  case Wrap(t: Tree)
  case Node(t1: Tree, t2: Tree)
  case Leaf(value: Int)
import Tree.*

def rightmostValue(tree: Tree): Int =
  tree match
    case Wrap(subtree) | Node(_, subtree) => rightmostValue(subtree)
    case Leaf(value) => value

The code above is unambiguous, right ?
We extract a value subtree of type Tree, and recurse on it

But this is not allowed:

Illegal variable subtree in pattern alternative

It is said as much in the Scala 2 spec:
“They may not bind variables other than wildcards.”
https://www.scala-lang.org/files/archive/spec/2.11/08-pattern-matching.html#pattern-alternatives

My somewhat crude proposal is the following:

Binding identifier id is allowed in pattern alternatives iff all the following conditions are true:

each alternative i extracts id with type T_i
effect: id has type T = T_1 | ... | T_n
if an alternative or one of its sub-patterns is id: U_i, then T must be =:= U_i
id cannot be bound in a “pattern binder” id@Extractor(...)

The last condition could also be replaced by “if id appears in a pattern binder, then the pattern must be the same in all aternatives”

Examples:

tree match
  case Wrap(subtree) | Leaf(subtree) =>
    // body where subtree has type Tree | Int

tree match
  case Wrap(subtree) | Leaf(subtree: Int) =>
    // error: subtree is bound to type Tree | Int, not Int

tree match
  case Wrap(subtree@Wrap(_)) | Leaf(subtree) =>
    // error: multi-bound variables cannot be part of pattern binders

tree match
  case Wrap(subtree@Wrap(_)) | Node(_, subtree@Wrap(_)) =>
    // Might be okay ?

This has the impact that P1 | P2 is not always equivalent to P2 | P1:

tree match
  case Wrap(t) | t: Tree =>
    // not equivalent to
  case t: Tree | Wrap(t) =>

Questions for you:

Is this sound ?
Does this have good UX ?
Would this be useful ?
Suggestions ?

TomasMikula · June 19, 2023, 7:07pm

It can be implemented as syntactic sugar (am I right?), and thus will not introduce any new unsoundness.

seigert · June 20, 2023, 4:42pm

It was discussed a little here: QoL: Sound binding in pattern alternatives

Sporarum · June 22, 2023, 11:05am

I’m sorry I don’t understand what you want to say, this is a link to the current discussion

mbovel · June 22, 2023, 1:44pm

See also this archived feature request: Support Variables in pattern alternatives · Issue #12 · lampepfl/dotty-feature-requests · GitHub.

Sporarum · June 22, 2023, 2:24pm

And Allow variables in pattern alternatives using union types · Issue #175 · lampepfl/dotty-feature-requests · GitHub
(Which was just flagged as a duplicate of yours)

So it seems this is already “common sense”

Sporarum · June 22, 2023, 2:37pm

I think the most straightforward way to implement this is to desugar

tree match
  case Wrap(subtree: Int | Tree) | Leaf(subtree: Int | Tree) =>
    body

into

tree match
  case Wrap(subtree: Int | Tree) =>
    body
  case Leaf(subtree: Int | Tree) =>
    body

(Avoids having to deal with symbols being defined twice)

For this to be predictable, the proposal needs to be restricted like so:
“”"
Binding identifier id is allowed in pattern alternatives iff there is a unique T such that:
each alternative extracts id with type T (through the extractor’s return type and/or through narrowing (id: T))
“”"
I believe this is acceptable

Downside to this simplicity is it can lead to code explosion:

treePair match
  case ( 
      (Wrap(subtree1: Int | Tree) | Leaf(subtree1: Int | Tree)), 
      (Wrap(subtree2: Int | Tree) | Leaf(subtree2: Int | Tree)) 
  ) =>
    body

will become

tree match
  case (Wrap(subtree1: Int | Tree), Wrap(subtree2: Int | Tree)) =>
    body
  case (Wrap(subtree1: Int | Tree), Leaf(subtree2: Int | Tree)) =>
    body
  case (Leaf(subtree1: Int | Tree), Wrap(subtree2: Int | Tree)) =>
    body
  case (Leaf(subtree1: Int | Tree), Leaf(subtree2: Int | Tree)) =>
    body

sjrd · June 22, 2023, 2:43pm

Duplicating arbitrary user code is a big no-no. So to be supported, it needs to go deep into actual semantics, not just a desugaring.

There’s nothing fundamentally against this support. My understanding is that the cost/benefit ratio has always been considered too high.

Sporarum · June 22, 2023, 2:45pm

(Except with inline parameters)

Sporarum · June 22, 2023, 2:47pm

This idea was also mentionned here:

github.com/lampepfl/dotty

Comment by smarter to Fix #10050: Use lowercase pattern variable syntax for quoted patterns

lampepfl:master ← dotty-staging:fix-#10050

> ``case '{ type t; $x: Map[`t`, `t`] } => '{ val y: t = .. .} // t form within …the quotes`` Would the following be possible? ```scala case '{ $x: Map[t, t] } => '{ val y: t = .. .} // t form within the quotes ``` Semantically this makes sense I think: we accept a type in all these positions, it's just that it has to be the same one. Normal pattern syntax does not currently allow referring to the same pattern variable multiple times in a pattern, but I recall @LPTK had some ideas on how this could be useful (e.g., when using alternatives).

But the goal there would be for things like

pair match
  case (a, a) => 
  case _ =>

to work

Which is similar but completely orthogonal

seigert · June 26, 2023, 11:07am

My bad, I’ve posted the wrong link from mobile. What I meant to post is
Could it be possible to allow variable binging in patmat alternatives for Scala 3.x?

There is also a link from @LPTK to some prototype for this feature.

Sporarum · June 26, 2023, 12:03pm

Thank you !
I had searched if someone had already opened something here, but didn’t find it