Pre SIP: Demote match keyword to a method


#22

We have two variants of the question:

  • can we actually move the implementation of pattern matching to libraries? I still think the answer is (very likely to be) no.
  • can we pretend that at least x.match looks like a normal method call? Yes if Dotty can be patched for your variant of the syntax — whether you turn it into a match call or just bring the grammars closer…

First, the spec on meta is outdated. And I don’t think pattern matching could be implemented in a Dotty library (macros or not) — if it is, I feel it’s likely a bug.

That’s because pattern matching is supported both in the typechecker, a compiler phase for translation relying on some lower-level nodes (our version of labeled defs — labeled blocks by @sjrd, but the difference shouldn’t concern us), and handling in the phases after translation for the new generated constructs.

While I’m still the youngest Dotty contributor inside EPFL, here’s (part of) the rationale for the new metaprogramming designs as far as I can tell:

  • The Dotty typechecker is more robust than Scalac’s, but a few things are still subtle. Among those, preventing cycles is still delicate business — and that’s complexity essential to the Scala spec. To remove it, you’d just have to limit mutual recursion, but that would be a different language.
  • Scala 2 whitebox macros are very expressive, but they allow any library to extend the typechecker — which has frozen all its internals. People at Lightbend aren’t maintaining Scalac, they’re maintaining Scalac, macros-paradise, all macros in the community build, and probably more ones. They have to, lest the ecosystem stops working. That’s not sustainable, and has prevented many bigger changes in Scalac. The strong feeling is that we don’t want to repeat that mistake.
  • Limiting access to the typechecker has been an explicit design goal for the whole time — the new macros will run after typechecking is done and types are set in stone. More controlled mechanisms are possible (see match types and all that), but nothing like whitebox macros. Even usual compiler plugins are more limited — “research plugins” have fewer limitations but are currently disabled on stable builds.
  • There is an in-progress lower-level macro API, but that’s run after typechecking and cannot affect types, and the current design gives you a snapshot of Tasty trees, so I’d even expect no access to labeled blocks. Even then, you don’t have full access to the internals, just restricted access to a controlled subset.
  • Since people can be determined, I suspect you could maybe break some of the restrictions of the lower-level macro API by linking against the compiler (unsafely) and using enough reflection, but there’s no support and no binary compatibility guarantees for internal compiler APIs. No scala-reflect with stronger guarantees for half of the internals.

EDIT: To conclude: does that prove you can’t implement pattern matching in a library somehow? Not quite — I haven’t checked if there’s some clever way to encode parts of it via match types and whatnot. But it’s not pattern matching as we have it today.
EDIT2: what has not been discussed are macros before Namer and Typer, that is, simple desugarings. They’d be extremely limited


#23

I think the point was to treat pattern matching closures as the primitive form; then match can indeed be a method. It probably still needs to be treated specially in the compiler to avoid the actual creation of a closure, but that’s a minor point.

However, I am not sure it’s worth the change, for two reasons:

  • you lose syntax highlighting

  • even if we made match a method, the main usage would still be like this

    x match { ... }
    

    rather than

    x.match { ... }
    

    It just looks better. But that would then rely on alphanumeric methods being written in infix form, and that’s what we might actually want to get away from since it gives a confusing, and mostly pointless, choice to people. I am not sure about this yet, but just to get more experience with it I have taken to write now

    x `eq` y
    

instead of

   x eq y

and so on. As I said, so far it’s an experiment, nothing more. I am not sure we will reach a decision anytime soon. But since it affects the question of match I would like to clarify what Scala 3 should do about infix operators before deciding on a change of the status quo.


#24

Assuming that the pattern-matcher would generate the appropriate logic for creating PartialFunctions, and match being a method which takes a function from the “MyType” of the receiver to some type X, if the match method is marked as inline, and with an optimizer which can inline PartialFunction literals, wouldn’t we (at least theorhetically) be able to replace match as a keyword with match as a method, without necessarily incurring any performance implications?


#25

The problem is, what would it look like after inlining a match? At some point you need a primitive match expression. Except if the pattern matcher does the inlining and eliminates the closure; that would work. But, either way, that’s a minor issue.


#26

I’m not sure I understand the question, could you elaborate?

What I mean is that for a “partial function literal”, ex:

{ case Foo(_) => “bar” }

the compiler will generate an apply-method which performs the pattern match. Now, in order to reduce overhead, let’s say that match is loosely defined as follows:

trait Any { type Self =>
@inline def match[T](expr: PartialFunction[Self, T]): T = expr(this)
// you could technically have the following as well:
@inline def pipe[T](f: Self => T): T = f(this)
}

Now, a sufficiently advanced compiler would look at the following code:

1 match {
case 0 => “foo”
case _ => “bar”
}

And it would do the following rewrite:

({
case 0 => “foo”
case _ => “bar”
})(1)

Then during optimization, it would want to avoid megamorphic calls, or even invokeinterface ones, and it’d see that it’s a PF literal which is anonymous and only used once, and it would instead elide the allocation of the PartialFunction and apply (sic!) the body of its apply directly to its parameter.
Upon seeing that it is a primitive, it could elide all boxing and in the end you’d end up with something as performant and straightforward as the current match keyword.

All of the above makes a lot of assumptions, but at least it should outline the gist of my train of thought.


#27

Exactly. But what does the result of this look like? It would look like our good friend, the match expression unless you map at the same time to a lower level. These are all compiler games which can be played. Like I said, that would not be the problem.


#28

If match were a method I would expect it to be overridable. I haven’t thought through this but I’m sure someone could come up for some bad ways to override match.


#29

FWIW you could define match to take a regular function, since the MatchError behavior will be the same in either case


#30

Would I be able to define my favorite symbolic extension method and have no overhead when I write:

x ~ { case X(_) => ??? }


#31

Yes, if my assumptions hold.


#32

Absolutely. I guess the difference would be that there would be no match keyword or expressions needed in the language spec as it has been unified with the “Partial Function literal” syntax.

Edit:

One big difference though is that PartialFunctions are composable whereas match expressions arent. (Unless you sort of surround them to catch MatchError and delegate to another match expression.)


#33

Opinion: I disagree here. While I do think that point-free syntax is often overused by beginners, I really like the ability to use it for more DSL-ish applications, and do so pretty often for certain use cases. I think it’s worth keeping.


#34

(Minor nitpick: point-free style is a whole different thing, so let’s call it “infix” vs “dot-and-paren” notation.)

I agree with @jducoeur in terms of Scala’s ability to extend language-construct-like method being a strength of the language, especially the way the collection library demonstrates it.

To me, the infix makes feels natural when the argument is a lambda, which is also consistent with match:

List(1, 2, 3) filter { _ % 2 == 0 }
Option(1) match { case _ => "foo" }
List(1, 2, 3) map { case 1 => 1; case n => n + 1 }

I’d switch to dot-and-paren when you want to chain more than 1 (, which cannot do with match now).


#35

pointed vs pointless?

Edit: I just had that weird experience where the word point looks like a non-word or misspelling.