Pattern matching with named fields

I’m not sure a separate unapplyNamed method would be a good idea. If we want to allow mixed usage, like Foo(1, x = 2), both the positional and the named patterns should be handled by the same unapply. To avoid clashes with preexisting methods, maybe the names should be prefixed with an underscore, e.g. _city, just like _1.

Following the example of the specs, I’d like to propose very boldly:

class FirstChars(s: String):
  def _1 = s.charAt(0)
  def _first = _1
  def _2 = s.charAt(1)
  def _second = _2

  def _get(i: Int) = if i < s.length then Some(s(i)) else None

  def isEmpty = false
  def get = this

object FirstChars:
  def unapply(s: String): FirstChars = new FirstChars(s)

"Hi!" match
  case FirstChars(char1, char2 = second, char3 <- 2) =>
    println(s"First: $char1; Second: $char2; Third: $char3")

The rules for patterns become:

  • <pattern> at position n resolves to a pattern on _<n>, just like before
  • <pattern> = <name> resolves to a pattern for the field _<name>.
    Important: I swapped the position of name and pattern compared to my previous proposal. This seems counterintuitive, as it destroys the perfect alignment with the constructor. But in Scala an equals sign always means: set the (new) name left of the sign to the result of the expression right of the sign. It also avoids a look-ahead in the parser.
    Notice that the restriction to only allow names is arbitrary. It’s only there to prevent people from shooting themselves in the foot.
  • <pattern> <- <expr> resolves to a pattern for _get(<expr>)
    This is a new capability and could be left out. But it shows the similarity to for-comprehensions.
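
For illustration, here is a minimal sketch of what the three rules could expand to when written out by hand in today's Scala 3. The describe helper and its length guard are assumptions added for this example; only _1, _second and _get come from the proposal, and the named and indexed pattern syntax itself does not exist in the compiler.

```scala
// Trimmed-down version of the FirstChars class from the proposal.
final class FirstChars(s: String):
  def _1: Char = s.charAt(0)
  def _second: Char = s.charAt(1)
  def _get(i: Int): Option[Char] =
    if i >= 0 && i < s.length then Some(s.charAt(i)) else None

// Hypothetical helper: a hand-written expansion of a pattern with one
// positional, one named and one indexed sub-pattern.
def describe(s: String): Option[String] =
  if s.length < 2 then None          // guard added so _1 and _second cannot throw
  else
    val fc = new FirstChars(s)
    val char1 = fc._1                // positional rule: position 1 reads _1
    val char2 = fc._second           // named rule: the name `second` reads _second
    fc._get(2).map { char3 =>        // indexed rule: reads _get(2), may fail to match
      s"First: $char1; Second: $char2; Third: $char3"
    }
```

Under these assumptions describe("Hi!") yields a value and describe("Hi") does not, because the indexed sub-pattern has nothing to bind.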

I’ll try to update the prototype and the proposal, but maybe this is more than I can chew.

Not sure about how I like tying it to class fields or method names. First off I feel that it just limits the scope of what can be done with named fields in patterns. For example, it prevents me from defining an unapply for an opaque type, that might have a different representation behind the scenes. It might also expose some things I don’t want available in a pattern.

Personally I found the proposal of having a type in the unapply object with a tuple of names a much simpler approach.

Honestly I’d leave this kind of thing out for now, or discuss it in a separate thread. If we want to add names to pattern matching, we should stick with that and try to get it over the finish line, rather than risk conflating multiple novel ideas in one proposal that we can’t get consensus on.

To me this won’t do; a large point of pattern matching is the symmetry between constructing and destructuring the data types, so breaking that symmetry makes the whole thing feel wrong. (There are already places where symmetry breaks down, e.g. around how infix methods work, and those places already do feel wrong)

Lookahead/backtracking in the parser is not important, given how minimal it is here. We already require a lot of lookahead (or equivalent hacks) parsing lambdas and such, so there’s precedent elsewhere in the language and it seems to be working OK (contrast with e.g. Python or F#, which use the lambda or fun keywords to avoid lookahead/backtracking).

This seems fixable by defining positional arguments to reference the _n methods that ProductN traits define, or productElement(n: Int), and making case classes implement ProductN. That way we could have the same def unapply(...): Option[Foo] method work for both positional and named bindings
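
As a rough runtime analogue of that idea, the sketch below reads the same case class instance both by position and by name through the standard Product API. The byPosition and byName helpers are hypothetical names for this example; a real implementation would resolve names at compile time rather than by searching at runtime.

```scala
case class User(name: String, age: Int, city: String)

// Positional lookup: the value a pattern at (0-based) index n would see.
def byPosition(p: Product, n: Int): Any = p.productElement(n)

// Named lookup: pair each element name with its value and pick the match.
// productElementNames is available on Product since Scala 2.13.
def byName(p: Product, name: String): Option[Any] =
  p.productElementNames.zip(p.productIterator).find(_._1 == name).map(_._2)
```

Both helpers work on the one value that a def unapply(...): Option[User] would return, which is the point of the suggestion: a single extractor result can serve positional and named bindings.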

Sorry for not replying to this idea, I’m a little bit afraid of posting too many posts here.

Personally I found the proposal of having a type in the unapply object with a tuple of names a much simpler approach.

Do you mean like:

object User:
   type Unapply = ("name", "age", "city")
   def unapply(user: User): Some[(String, Int, String)] = ???

Which seems quite doable. Unapply could be defined in a trait of User. I wonder what a creative mind would do with this mechanism.
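
To show that such a type really does carry the names, here is a small sketch that turns the literal types back into strings with scala.compiletime.constValueTuple. The User class and the fieldNames value are assumptions for this example, and nothing here says how the compiler would actually consume Unapply; it only demonstrates that the names are recoverable from the type.

```scala
import scala.compiletime.constValueTuple

class User(val name: String, val age: Int, val city: String)

object User:
  // Names of the extracted values, encoded as literal singleton types.
  type Unapply = ("name", "age", "city")
  def unapply(u: User): Some[(String, Int, String)] =
    Some((u.name, u.age, u.city))

// Materialise the type-level names as ordinary strings.
val fieldNames: List[String] =
  constValueTuple[User.Unapply].productIterator.map(_.toString).toList
```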

Or do you mean more like named tuples?

object User:
   def unapply(user: User): Some[(name: String, age: Int, city: String)] = ???

Why not apply the following dead-simple desugaring:

  case Foo(a, b){ u = c, v = d } => ...

  // desugars to:

  case x @ Foo(a, b) => val u = x.u; val v = x.v; ...

(Note that in Scala, the x in x @ Foo(...) gets the more precise type taken by Foo's unapply.)
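
The right-hand side of that desugaring is already valid Scala, so it can be tried directly. In this sketch the members u and v of Foo are assumptions made up for the example.

```scala
case class Foo(a: Int, b: Int):
  def u: Int = a + b   // hypothetical extra member
  def v: Int = a * b   // hypothetical extra member

def show(value: Any): String = value match
  case x @ Foo(a, b) =>
    // x is typed as Foo here even though the scrutinee is Any,
    // so the member selections below type-check.
    val u = x.u
    val v = x.v
    s"a=$a b=$b u=$u v=$v"
  case _ => "no match"
```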

Distinguishing named fields from extractor arguments seems like the right thing to do. I dislike the idea of conflating the two, which leads to a weird inconsistency/discontinuity:

// Assume Foo takes 2 patterns

Foo(a, b, c)  // error
Foo(a, b)     // ok
Foo(a)        // error
Foo()         // error

Foo(x = a, y = b, x = c) // ok (but looks wrong)
Foo(x = a, y = b)        // ok
Foo(x = a)               // ok (but looks wrong)
Foo()                    // error (discontinuity!)

To home in on why it looks wrong, consider these uses of familiar extractors:

  case Some(value = x, size = s) =>  // Some now takes two patterns?

  case List(x, y, length = l) =>     // Looks like l is part of the list

The approach would also allow for name punning, and in fact it could work with any pattern, not just unapply patterns.

case class Person(name: String, age: Int) { lazy val uid = ... }

...

  case Person(n, a){ uid = u } => ... n ... a ... u ...

  case Person(n, a){ uid } => ... n ... a ... uid ...

  case (_: Person){ age, name, uid } => ... name ... age ... uid ...

Okay, I think for mixed usage, we do mean basically the same here. I updated my post, after I better understood name based matches.

Yeah, we need some kind of mechanism to prevent methods like hashCode or getClass from becoming visible in the pattern. I went with the underscore prefix, but val should do the trick too. Annotations would be more flexible, but unscalaish? I think the need for some kind of restriction is also part of the critique from @LPTK?

That wouldn’t work for patterns like Foo(y = Some(a)), for a Foo with two patterns. Such patterns are, at least for me, the main appeal of pattern matching with named fields. I think the trade-off of having a new syntax just to shorten x.u isn’t very good. I never had an issue with writing x.u by hand.

Actually, in my proposal I meant for u and v to of course also be able to be patterns. So it’s true that the desugaring would be more involved than what I showed.

You could do either:

  case Foo(_, _){ y = Some(a) } =>

or:

  case (_: Foo){ y = Some(a) } =>
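
Something close to this can be emulated today with one small extractor per field, which also shows why the desugaring is more involved than a plain val: when the nested pattern fails, matching has to fall through to the next case. The FooY object and the pick function are hypothetical names for this sketch.

```scala
case class Foo(x: Int, y: Option[String])

// Hypothetical helper: exposes the field `y` as an extractor,
// so that a nested pattern can be applied to it.
object FooY:
  def unapply(foo: Foo): Some[Option[String]] = Some(foo.y)

def pick(foo: Foo): String = foo match
  case FooY(Some(a)) => s"y holds $a"     // nested pattern on the field y
  case Foo(x, _)     => s"no y, x = $x"   // reached when the nested pattern fails
```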

My interpretation of those examples would be:

// Assume Foo takes 2 patterns

Foo(a, b, c)  // error
Foo(a, b)     // ok
Foo(a)        // error
Foo()         // error

Foo(x = a, y = b, x = c) // error (and good point)
Foo(x = a, y = b)        // ok
Foo(x = a)               // ok (and is the point of the proposal)
Foo()                    // error (discontinuity!)

To fix the discontinuity, we could add syntax to ignore missing patterns:

Foo(a)                // error
Foo(a, <keyword>)     // ok
Foo(<keyword>)        // ok
Foo()                 // error

Foo(x = a)            // error
Foo(x = a, <keyword>) // ok
Foo(<keyword>)        // ok
Foo()                 // error

But I would argue that named patterns lead to <keyword> most of the time, and that using <keyword> with only positional arguments is an accident waiting to happen. We would also have to pick and learn a keyword or symbol for <keyword>, which could be quite a hassle. So I favor the idea of ignoring missing patterns as soon as a named pattern shows up, and living with the discontinuity.

In this case I would prefer your idea from above, because it leads to better structured programs. The ifs separate the deconstruction quite nicely. (Although I think a different keyword is needed to distinguish boolean predicates and patterns, but details.)

I think Li Haoyi’s suggestion would work really well if we limit it to case classes, and make case classes a little more special than they are right now, instead of “special casing” tuples.

Now unapply methods return tuples and you can only use positional arguments in pattern matching. If unapplies could return any case class (like Scala 3 already supports apparently) then you could simply use the names of the case class fields as well as the positions.

I don’t see how this would make pattern matching with opaque types any more or less cumbersome than it is right now, since your unapply already has to return a tuple, which is simply a special case of a case class.

My thought was on the first one. A simple Unapply type that names the values.

Then we might as well use a (different) case class for that. Let me explain:

Right now the situation in Scala 3 is as follows: A pattern match will resolve to an unapply that either returns a Product type, or a type that has isEmpty and get methods (for instance, Option) and where the type of the get is either a simple type or a Product type. Product types have selectors _1, _2, and so on, and these determine what fills the pattern slots. Case classes are Product types. They define unapply methods that return the case class instance itself.

For case classes the compiler does know which field defines a selector, so it would have everything it needs to implement named patterns. But for general products that information is not available at compile time.

One important principle of Scala’s pattern matching design is that case classes should be abstractable. I.e. we want to be able to represent the abilities of a case class without exposing the case class itself. That also allows code to evolve from a case class to a regular class or a different case class while maintaining the interface of the old case class.

Take for instance the following case class

case class Person(name: String, age: Int)

Say we want to go to a regular class with 3 parameters while also maintaining the old interface. Right now we would do something roughly like this:

class Person(val name: String, phoneNumber: String, val age: Int):
  def this(name: String, age: Int) = this(name, "", age)
object Person:
  def unapply(p: Person): Option[(String, Int)] = Some((p.name, p.age))

In the prototype that would lose the ability to use name and age for named pattern arguments. But we could recover it like this:

case class NameAndAge(name: String, age: Int)
object Person:
  def unapply(p: Person): Option[NameAndAge] = Some(NameAndAge(p.name, p.age))
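
Put together as one compilable unit, the scheme looks like the sketch below. The greet function is an assumption added to exercise the extractor; positional matching through NameAndAge works in current Scala 3, while the named form would still need the proposal.

```scala
class Person(val name: String, phoneNumber: String, val age: Int):
  // Old two-argument interface, kept for compatibility.
  def this(name: String, age: Int) = this(name, "", age)

// Carrier case class: its field names are what named patterns would use.
case class NameAndAge(name: String, age: Int)

object Person:
  def unapply(p: Person): Option[NameAndAge] = Some(NameAndAge(p.name, p.age))

// The two pattern slots are filled from NameAndAge's _1 and _2 selectors.
def greet(p: Person): String = p match
  case Person(n, a) => s"$n is $a"
```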

The only point where that scheme breaks down is if the case class has exactly one argument. Example:

case class Name(name: String)
object Person:
  def unapply(p: Person): Option[Name] = Some(Name(p.name))

Then case Person(n) would bind n to the Name wrapper class, not to the String argument of that class. But presumably case Person(name = n) would bind n to the string.
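
The positional half of that claim can be checked with current Scala 3. In this sketch the bound function is a hypothetical helper; its declared result type Name only compiles because n is typed as the wrapper and not as String.

```scala
class Person(val name: String)

case class Name(name: String)

object Person:
  def unapply(p: Person): Option[Name] = Some(Name(p.name))

// With exactly one sub-pattern, the slot is filled with the result of `get`,
// so n is the Name wrapper itself.
def bound(p: Person): Name = p match
  case Person(n) => n
```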

A somewhat drastic but simple cure for this discrepancy would be to allow named pattern matches only for case classes with at least two arguments. If there’s only one argument, you should not need a named match anyway. So while we lose generality in that way, we gain in simplicity and uniformity of code use. It’s a tradeoff to be considered.

But doesn’t this discrepancy already exist? If you have def unapply(p: Person): Option[Name] the semantics will change if Name has 1 field vs if it has 2 fields. But when you have def unapply(p: Person): Name that’s not the case anymore.

I think this discrepancy comes from the fact that extractors with 1 argument are represented as Option[Foo] instead of the more correct Option[Tuple1[Foo]].

Yes, that’s where it comes from. My argument was that that’s what prevents us from defining watertight abstractions for case classes with one element, if named parameters are allowed. Without named parameters it’s fine. A case class with a single String field can be emulated by an unapply with result Option[String]. But then the name of the field gets lost.

Actually n would be a Name in case Person(name = n). The prototype just shuffles the argument trees around, so typing stays unaffected. Your examples in general are not intended to work, but do because the prototype has ‘some’ bugs.

Anyways, I think I implemented the general idea from @Katrix, with one deviation: I use an annotation, not a type, for the names. I think it makes more sense to put this information directly on the method, instead of in an unrelated type. (You can blame my Lombok exposure for that :wink: ) It doesn’t look quite scalaish, but pattern matching under the hood rarely does.

class Age(val hidden: Int)

object Age:
  @patternNames("years")
  def unapply(age: Age): Option[Int] = Some(age.hidden)

It doesn’t show any particular behaviour for extractors with a single element.
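
For anyone who wants to compile the snippet without the prototype, here is a sketch with a locally defined stand-in for the annotation. The patternNames class below is an assumption: on a stock compiler it only records the names and has no effect on matching, so only the positional form can be exercised.

```scala
import scala.annotation.StaticAnnotation

// Hypothetical stand-in for the prototype's annotation; a stock compiler
// ignores it during pattern matching.
class patternNames(names: String*) extends StaticAnnotation

class Age(val hidden: Int)

object Age:
  @patternNames("years")
  def unapply(age: Age): Option[Int] = Some(age.hidden)

// Positional matching works today; a named form such as Age(years = y)
// would need the prototype compiler.
def years(age: Age): Int = age match
  case Age(y) => y
```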

So, should we expect this feature soon?

The topic is only 3 years old, you have to let these matters simmer awhile.

I see my previous comment was:

The linked scala-debate thread is so old, someone was posting from their blackberry.

At this point, I would have to time travel just to understand the reference.
