Pattern matching with named fields

To me this won’t do; a large point of pattern matching is the symmetry between constructing and destructuring the data types, so breaking that symmetry makes the whole thing feel wrong. (There are already places where symmetry breaks down, e.g. around how infix methods work, and those places already do feel wrong.)

Lookahead/backtracking in the parser is not important, given how minimal it is here. We already require a lot of lookahead (or equivalent hacks) when parsing lambdas and such, so there’s precedent elsewhere in the language, and it seems to be working OK (in contrast to e.g. Python or F#, which use the lambda and fun keywords to avoid lookahead/backtracking).

This seems fixable by defining positional arguments to reference the _n methods that the ProductN traits define, or productElement(n: Int), and making case classes implement ProductN. That way we could have the same def unapply(...): Option[Foo] method work for both positional and named bindings.
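For reference, here is a minimal sketch of the positional accessors this idea would build on, using only the standard Product API (productElementNames needs Scala 2.13 or later). The Person class and its values are made up for illustration:

```scala
// Hypothetical case class, used only to illustrate the Product API.
case class Person(name: String, age: Int)

val ada = Person("Ada", 36)

// Untyped positional access, available on every case class via Product.
val first: Any = ada.productElement(0)
val second: Any = ada.productElement(1)

// Field names, in declaration order.
val names: List[String] = ada.productElementNames.toList

// Tuples additionally expose typed _1, _2, ... selectors.
val pair: (String, Int) = ("Ada", 36)
val typedFirst: String = pair._1
```

Note that productElement returns Any, which is why the typed _n selectors matter for typing the bound variables.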

Sorry for not replying to this idea, I’m a little bit afraid of posting too many posts here.

Personally I found the proposal of having a type in the unapply object with a tuple of names a much simpler approach.

Do you mean like:

object User:
   type Unapply = ("name", "age", "city")
   def unapply(user: User): Some[(String, Int, String)] = ???

Which seems quite doable. Unapply could be defined in a trait of User. I wonder what a creative mind would do with this mechanism.

Or do you mean more like named tuples?

object User:
   def unapply(user: User): Some[(name: String, age: Int, city: String)] = ???

Why not apply the following dead-simple desugaring:

  case Foo(a, b){ u = c, v = d } => ...

  // desugars to:

  case x @ Foo(a, b) => val u = x.u; val v = x.v; ...

(Note that in Scala, the x in x @ Foo(...) gets the more precise type taken by Foo's unapply.)
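Spelled out in today's Scala, the right-hand side of that desugaring looks roughly like the sketch below. Foo and its members u and v are hypothetical and only serve the illustration:

```scala
// Hypothetical case class with two extra members besides its constructor fields.
case class Foo(a: Int, b: Int) {
  val u: Int = a + b
  val v: Int = a * b
}

def describe(value: Any): String = value match {
  // x is typed as Foo here, so x.u and x.v are accessible.
  case x @ Foo(a, b) =>
    val u = x.u
    val v = x.v
    s"a=$a b=$b u=$u v=$v"
  case _ => "no match"
}
```

The scrutinee is deliberately typed as Any to show that the binder x still gets the narrower type.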

Distinguishing named fields from extractor arguments seems like the right thing to do. I dislike the idea of conflating the two, which leads to a weird inconsistency/discontinuity:

// Assume Foo takes 2 patterns

Foo(a, b, c)  // error
Foo(a, b)     // ok
Foo(a)        // error
Foo()         // error

Foo(x = a, y = b, x = c) // ok (but looks wrong)
Foo(x = a, y = b)        // ok
Foo(x = a)               // ok (but looks wrong)
Foo()                    // error (discontinuity!)

To home in on why it looks wrong, consider these uses of familiar extractors:

  case Some(value = x, size = s) =>  // Some now takes two patterns?

  case List(x, y, length = l) =>     // Looks like l is part of the list

The approach would also allow for name punning, and in fact it could work with any pattern, not just unapply patterns.

case class Person(name: String, age: Int) { lazy val uid = ... }

...

  case Person(n, a){ uid = u } => ... n ... a ... u ...

  case Person(n, a){ uid } => ... n ... a ... uid ...

  case (_: Person){ age, name, uid } => ... name ... age ... uid ...

Okay, I think for mixed usage, we do mean basically the same here. I updated my post, after I better understood name based matches.

Yeah, we need some kind of mechanism to prevent methods like hashCode or getClass from becoming visible in the pattern. I went with the underscore prefix, but val should do the trick too. Annotations would be more flexible, but unscalaish? I think the need for some kind of restriction is also part of the critique from @LPTK?

That wouldn’t work for patterns like Foo(y = Some(a)), for a Foo with two patterns. Such patterns are, at least for me, the main appeal of pattern matching with named fields. I think the trade-off of having a new syntax just to shorten x.u isn’t very good. I never had an issue with writing x.u by hand.

Actually, in my proposal I meant for u and v to of course also be able to be patterns. So it’s true that the desugaring would be more involved than what I showed.

You could do either:

  case Foo(_, _){ y = Some(a) } =>

or:

  case (_: Foo){ y = Some(a) } =>
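For comparison, one way to get a similar effect in current Scala, without new syntax, is a small extractor per field. This is only a sketch with hypothetical names (Foo, its field y, and the helper object FooY), not something proposed in this thread:

```scala
case class Foo(x: Int, y: Option[String])

// Hypothetical helper: exposes only the y field of a Foo to pattern matching.
object FooY {
  def unapply(foo: Foo): Some[Option[String]] = Some(foo.y)
}

def yValue(value: Any): String = value match {
  // Matches any Foo whose y is defined, ignoring x.
  case FooY(Some(a)) => a
  case _             => "no y"
}
```

The cost is one boilerplate object per field, which is the overhead a named-field syntax would remove.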

My interpretation of those examples would be:

// Assume Foo takes 2 patterns

Foo(a, b, c)  // error
Foo(a, b)     // ok
Foo(a)        // error
Foo()         // error

Foo(x = a, y = b, x = c) // error (and good point)
Foo(x = a, y = b)        // ok
Foo(x = a)               // ok (and is the point of the proposal)
Foo()                    // error (discontinuity!)

To fix the discontinuity, we could add syntax to ignore missing patterns:

Foo(a)                // error
Foo(a, <keyword>)     // ok
Foo(<keyword>)        // ok
Foo()                 // error

Foo(x = a)            // error
Foo(x = a, <keyword>) // ok
Foo(<keyword>)        // ok
Foo()                 // error

But I would argue that named patterns would lead to <keyword> most of the time, and that using <keyword> with only positional arguments is an accident waiting to happen. We would also have to pick and learn a keyword or symbol for <keyword>, which could be quite a hassle. So I favor the idea of ignoring missing patterns as soon as a named pattern shows up, and going with the discontinuity.

In this case I would prefer your idea from above, because it leads to better structured programs. The ifs separate the deconstruction quite nicely. (Although I think a different keyword is needed to distinguish boolean predicates and patterns, but details.)

I think Li Haoyi’s suggestion would work really well if we limit it to case classes, and make case classes a little more special than they are right now, instead of “special casing” tuples.

Now unapply methods return tuples and you can only use positional arguments in pattern matching. If unapplies could return any case class (like Scala 3 already supports apparently) then you could simply use the names of the case class fields as well as the positions.

I don’t see how this would make pattern matching with opaque types any more or less cumbersome than it is right now, since now your unapply already had to return a tuple which is simply a special case of a case class.

My thought was on the first one. A simple Unapply type that names the values.

Then we might as well use a (different) case class for that. Let me explain:

Right now the situation in Scala 3 is as follows: A pattern match will resolve to an unapply that either returns a Product type, or a type that has isEmpty and get methods (for instance, Option) and where the type of the get is either a simple type or a Product type. Product types have selectors _1, _2, and so on, and these determine what fills the pattern slots. Case classes are Product types. They define unapply methods that return the case class instance itself.

For case classes the compiler does know which field defines a selector, so it would have everything it needs to implement named patterns. But for general products that information is not available at compile time.
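To make the resolution rules above concrete, here is a sketch of an extractor whose result is neither an Option nor a Product. In current compilers the emptiness check is a member named isEmpty. The names Person and PersonView are hypothetical:

```scala
class Person(val name: String, val age: Int)

// Hypothetical result type: not an Option and not a Product,
// but it has the isEmpty and get members the compiler looks for.
final class PersonView(p: Person) {
  def isEmpty: Boolean = false
  // The result of get is a Product type (a tuple) with selectors _1 and _2.
  def get: (String, Int) = (p.name, p.age)
}

object Person {
  def unapply(p: Person): PersonView = new PersonView(p)
}

def describe(p: Person): String = p match {
  // n is filled from _1 and a from _2 of the tuple returned by get.
  case Person(n, a) => s"$n is $a"
}
```

Nothing in PersonView tells the compiler that _1 corresponds to a field called name, which is the missing compile-time information mentioned above.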

One important principle of Scala’s pattern matching design is that case classes should be abstractable. I.e. we want to be able to represent the abilities of a case class without exposing the case class itself. That also allows code to evolve from a case class to a regular class or a different case class while maintaining the interface of the old case class.

Take for instance the following case class

case class Person(name: String, age: Int)

Say we want to go to a regular class with 3 parameters while also maintaining the old interface. Right now we would do something roughly like this:

class Person(val name: String, phoneNumber: String, val age: Int):
  def this(name: String, age: Int) = this(name, "", age)
object Person:
  def unapply(p: Person): Option[(String, Int)] = Some((p.name, p.age))

In the prototype that would lose the ability to use name and age for named pattern arguments. But we could recover it like this:

case class NameAndAge(name: String, age: Int)
object Person:
  def unapply(p: Person): Option[NameAndAge] = Some(NameAndAge(p.name, p.age))
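As a sketch of the positional half of this scheme, the following is expected to compile with Scala 3, where case classes provide the _1 and _2 selectors. The named form, case Person(name = n, age = a), is what the proposal would add and is not shown. The describe helper is made up for the example:

```scala
class Person(val name: String, phoneNumber: String, val age: Int) {
  def this(name: String, age: Int) = this(name, "", age)
}

// Carrier for the old two-field interface of Person.
case class NameAndAge(name: String, age: Int)

object Person {
  def unapply(p: Person): Option[NameAndAge] = Some(NameAndAge(p.name, p.age))
}

def describe(p: Person): String = p match {
  // Positional slots are filled from NameAndAge's _1 and _2 selectors.
  case Person(n, a) => s"$n is $a years old"
  case _            => "unknown"
}
```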

The only point where that scheme breaks down is if the case class has exactly one argument. Example:

case class Name(name: String)
object Person:
  def unapply(p: Person): Option[Name] = Some(Name(p.name))

Then case Person(n) would bind n to the Name wrapper class, not to the String argument of that class. But presumably case Person(name = n) would bind n to the string.
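The positional behaviour can be checked with a small sketch in current Scala. The extracted helper is hypothetical and returns Any only so that the bound value can be inspected:

```scala
class Person(val name: String)

// Single-field wrapper returned by the extractor.
case class Name(name: String)

object Person {
  def unapply(p: Person): Option[Name] = Some(Name(p.name))
}

def extracted(p: Person): Any = p match {
  // With a single pattern, n is bound to the result of get, i.e. the Name wrapper.
  case Person(n) => n
  case _         => "none"
}
```

Here extracted(new Person("Ada")) yields Name("Ada") rather than the string "Ada".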

A somewhat drastic but simple cure for this discrepancy would be to allow named pattern matches only for case classes with at least two arguments. If there’s only one argument, you should not need a named match anyway. So while we lose generality in that way, we gain in simplicity and uniformity of code use. It’s a tradeoff to be considered.

But doesn’t this discrepancy already exist? If you have def unapply(p: Person): Option[Name] the semantics will change if Name has 1 field vs if it has 2 fields. But when you have def unapply(p: Person): Name that’s not the case anymore.

I think this discrepancy comes from the fact that extractors with 1 argument are represented as Option[Foo] instead of the more correct Option[Tuple1[Foo]].

Yes, that’s where it comes from. My argument was that that’s what prevents us from defining watertight abstractions for case classes with one element, if named parameters are allowed. Without named parameters it’s fine. A case class with a single String field can be emulated by an unapply with result Option[String]. But then the name of the field gets lost.

Actually n would be a Name in case Person(name = n). The prototype just shuffles the argument trees around, so typing stays unaffected. Your examples in general are not intended to work, but do because the prototype has ‘some’ bugs.

Anyways, I think I implemented the general idea from @Katrix, with one deviation: I use an annotation, not a type, for the names. I think it makes more sense to put this information directly on the method, instead of in an unrelated type. (You can blame my Lombok exposure for that :wink: ) It doesn’t look quite scalaish, but pattern matching under the hood rarely does.

class Age(val hidden: Int)

object Age:
  @patternNames("years")
  def unapply(age: Age): Option[Int] = Some(age.hidden)

It doesn’t show any particular behaviour for extractors with a single element.

So, should we expect this feature soon?

The topic is only 3 years old, you have to let these matters simmer awhile.

I see my previous comment was:

The linked scala-debate thread is so old, someone was posting from their blackberry.

At this point, I would have to time travel just to understand the reference.

So I finally got around to polishing the SIP and updating my prototype. I would like to have some feedback on the SIP and a partner in crime to finish the prototype; it’s quite ugly now.

Maybe we get this done this decade.

Thank you @Jentsch for writing the SIP! As far as I know, the process is dormant for now but it should be resurrected soon. Please be patient.