Two proposed changes for extension methods

Ichoran · October 28, 2020, 12:44am

I guess I don’t see how this is awkward. It seems a very tidy mapping between syntactic sugar and underlying constructs.

All there ever is is extension_fun. There isn’t any real fun. If you want to write it that way yourself, go for it. extension (x: A) def fun(y: B) : C is just because it looks nicer than def extension_fun(x: A)(y: B): C. If we want it to be weirder, use $ instead of _, and allow it to be spelled extension.fun in Scala code. (Java has to use $.)

So the compiler doesn’t get confused because there isn’t a separate thing to get confused by; you don’t need to prevent people from writing extension_foo themselves because it’s exactly the same thing!

You import extension_fun, always, and overriding works just like always. The only extra sugar is that when you see a.foo(b) and a doesn’t have a method foo on it, check to see if extension_foo is in scope and with the right type parameters.

There may be good reasons to have extension be a hard keyword anyway, but I don’t see how the name-mangling is anything but simplifying and empowering.

(Aside: you can tell extension(f: Foo) as a statement apart from extension(f: Foo) def…you just need at least a LALR(2) parser. There’s the usual problem of inferring semicolons, but that’s no different from various other things like whether an extra (bippy) on the next line is another parameter block, or an independent statement.)

japgolly · October 28, 2020, 1:08am

Ambiguous though. I don’t think an extension method of fun should prevent a fun method on the parent object itself, or vice versa. For example, I think the following should be legal, and it should be possible for users to import one or the other:

object obj:
  val fun = 1
  extension (x: A) def fun(y: B): C

smarter · October 28, 2020, 1:33am

it should be possible for users to import one or the other:

Why? We don’t allow importing one overload but not the other.

LPTK · October 28, 2020, 8:17am

Because unlike overloads, these two definitions appear unrelated in the source.

odersky · October 31, 2020, 9:52am

Some new thoughts on the second issue: what syntax to choose for a direct call to an extension method. To recap: Given

object obj:
  extension (x: A) def fun (y: B) = ...

How do I call obj.fun directly, without using it as an extension method? This matters for two reasons:

It’s a way to disambiguate things for the programmer if the extension method is not found at all, or the wrong one is found.
It’s a way for the language definition to describe what an extension method application means.

Elaborate extension method call syntaxes like the ones we have been discussing address only the first aspect. They don’t solve the second, since we still have to explain what those elaborate syntaxes mean.

By contrast, the extension_fun name mangling provides a solution for both aspects. On the other hand, the extension_ name mangling has problems of its own. When do you use the extension_ name, and when the normal one? It’s all a bit arbitrary.

So maybe we can do without extension_? An alternative rule would simply state that an extension method like fun above would translate to a method

object obj:
  <extension> def fun(x: A)(y: B) = ...

The <extension> modifier is not accessible to user programs. Such methods can be called like any other methods. So a.fun(b) would translate to obj.fun(a)(b), which is by itself a legal expression.
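For illustration, here is a minimal sketch of that translation. It assumes the behaviour of released Scala 3 versions, where (as far as I know) an extension method is an ordinary method whose first parameter list is the receiver, so a qualified direct call is legal. The names `Circle`, `geometry` and `scaled` are made up for this example.

```scala
final case class Circle(radius: Double)

object geometry:
  // Sugar for a method with the receiver as its first parameter list,
  // roughly: def scaled(c: Circle)(factor: Double): Circle
  extension (c: Circle)
    def scaled(factor: Double): Circle = Circle(c.radius * factor)

import geometry.*

// Extension-call syntax: a.fun(b)
val viaExtension: Circle = Circle(1.0).scaled(2.0)

// Qualified direct call: obj.fun(a)(b)
val viaDirectCall: Circle = geometry.scaled(Circle(1.0))(2.0)
```

Both values should be equal, since the first form is only sugar for the second.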

There are several tricky aspects about this, but I believe they can be solved.

We have to make sure that overrides respect extensionality. Only extension methods can override other extension methods, and all overrides of extension methods must again be extension methods.
Direct calls to extension methods must be selections with a qualifier and a dot. To see why, consider a collective extension like this one:
```
 object obj:
   extension (x: A) 
     def fun (y: B) = ...
     def other = fun(B())
```
Here, the fun(B()) call in other expands to x.fun(B()). So it cannot be a direct call. To make it a direct call, you’d have to write obj.fun(a)(B()).
This means we cannot call an extension method directly at all if it is locally defined. Take the example above but now in a def instead of in an object:
```
 def outer(a: A, b: B) = 
   extension (x: A) 
     def fun (y: B) = ...
   a.fun(b)    // has no direct equivalent
```
We cannot call fun directly, since there is no prefix from which we could select a fun. This is a hassle primarily for the language definition, where we’ll have to do some handwaving or resort to more awkward notation. For programming I don’t think it will matter, since it is an edge case of an edge case of an edge case. Direct calls of extension methods are already the rare exception. Local extension methods will also be quite rare, though I can see them being useful on occasion. But if I have a local extension method, I don’t usually need to call it directly, since it is the thing that shadows all other possibilities anyway. So I can’t really see a scenario where one would write a local extension method that needs to be called directly. In that case, one should have defined the method as a normal method in the first place.
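A hedged sketch of the local case, to show what such a method looks like in practice. The names `totalLetters` and `letters` are hypothetical; the point is only that the extension lives inside a def, so there is no enclosing object that could serve as a qualifier.

```scala
def totalLetters(words: List[String]): Int =
  // Local extension: visible only inside totalLetters.
  extension (s: String)
    def letters: Int = s.count(_.isLetter)

  // Used through the extension-call syntax, which is the normal way
  // to reach a local extension method.
  words.map(_.letters).sum
```

For example, `totalLetters(List("ab1", "c"))` counts two letters in the first word and one in the second.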

If we follow this route, we still have to decide whether extension should be a hard or a soft keyword. I am sitting on the fence here. On the one hand, making it a hard keyword is cleaner since it makes every word that can start a definition a hard keyword. On the other hand, this will cause considerable breakage (starting with all code that calls extension on a file or a path). So, not sure about this one.

schrepfler · October 31, 2020, 11:46am

Adding “extension” at the call site feels like something which is an implementation detail and not relevant from the point of view of the implementing logic. Is it not possible to just use the dot notation or some other syntax notation which would keep the code simple?

Sciss · October 31, 2020, 2:16pm

So the contents of obj could also be imported to write fun(a)(b), right? This could be useful for “function” style, like

object math:
  <extension> def min(x: Double)(y: Double) = if x < y then x else y

import math._
4.0.min(5.6)
min(4.0)(5.6)  // alternative way of calling

What if the two arg lists were combined in one? Then one would have the alternative forms

4.0.min(5.6)
min(4.0, 5.6)

odersky · October 31, 2020, 2:37pm

So the contents of obj could also be imported to write fun(a)(b) , right?

No. A direct reference is only legal if it is a qualified selection. We want to avoid offering multiple ways to call the same thing.

Katrix · October 31, 2020, 5:17pm

That seems more like an exception to keep in mind than a good design decision to me. Why can you import normal methods and call them without a qualifier just fine, but not extension methods, when they look exactly the same with the qualifier?

japgolly · October 31, 2020, 10:46pm

I’m a bit worried about where this discussion is going. For example, the following code wouldn’t violate the Scala (3) spec so it should compile:

object Blah:
  def x = 1

  extension (a: Any)(using Blah1):
    def x = 2

  extension (a: Any)(using Blah2):
    def x = 3

  extension [A](a: A)(using Blah3[A]):
    def x = 4
    def y = 4

  extension [A <: X1](a: A):
    def x = 5

  extension [A <: X2](a: A):
    def x = 6

Overloads are not a capable solution to disambiguate the above (right?).

I propose that the compiler should associate a fresh/unique prefix per extension (an established practice in other parts of the language). So the above would be translated into:

object Blah:
  def x = 1

  def extension$1$x(a: Any)(using Blah1) = 2

  def extension$2$x(a: Any)(using Blah2) = 3

  def extension$3$x[A](a: A)(using Blah3[A]) = 4
  def extension$3$y[A](a: A)(using Blah3[A]) = 4

  def extension$4$x[A <: X1](a: A) = 5

  def extension$5$x[A <: X2](a: A) = 6

We have to have a solid foundation before we start building derivative features and the above gives you that.

Another solution would be requiring a unique name per extension block, just like we do with methods and classes, and those names become prefixes.

The proposed extension_ prefix doesn’t give us a solid foundation because it’s so easy to write code that conforms to the spec but results in the compiler generating code that doesn’t compile. The cases we can imagine today might be dismissed as esoteric, but once it’s in use, people are going to accidentally end up finding all kinds of cases that are perfectly valid but fail to compile.

As to how to call methods directly…

If we require unique names for extensions (like class names) then it’s trivial and we get it for free
If we go with the fresh name approach then we have to come up with a new language change, maybe something like Blah.extension(_: Any)(using Blah1).x or even Blah.extension(Any)(using Blah1).x which the compiler then translates to Blah.extension$1$x.

But the important points about what I’m proposing are:

unambiguous sibling extensions shouldn’t prevent compilation of each other
we have an unambiguous means of accessing methods directly

If we allow a solution that doesn’t satisfy the above two properties then you can bet $$$ that over the years, bugs/tickets are going to be raised against Scala and then we’ll probably end up changing the implementation 6 years from now. We need to choose an implementation that always works and avoids surprises.

odersky · November 1, 2020, 6:52pm

@sciss @Katrix I found a tweak to allow unqualified direct references to extension methods.

The new scheme in https://github.com/lampepfl/dotty/pull/10128 works as follows.

When resolving a simple identifier f we go as usual from inner scopes to outer scopes. If the search yields an extension method in the same collective extension as the reference, we treat the reference as a recursive call with the same extension parameter. In all other cases we treat it as a direct reference. So it’s still true that in a situation like

 object obj:
   extension (x: A) 
     def fun (y: B) = ...
     def other = fun(B())

the fun(B()) call in other expands to x.fun(B()) , so it is not a direct call. To make it a direct call, you’d have to write obj.fun(a)(B()) . But in general, direct references to extension methods via simple identifiers are now possible.
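A small sketch of the rule just described, assuming the scheme from the pull request is what the compiler implements (it matches released Scala 3 versions as far as I can tell). The names `text`, `shout` and `shoutTwice` are invented for this example.

```scala
object text:
  extension (s: String)
    def shout: String = s.toUpperCase + "!"

    // `shout` is found in the same collective extension, so the bare
    // identifier is read as s.shout (same extension parameter).
    def shoutTwice: String = shout + " " + shout

import text.*

// Ordinary extension-call syntax.
val viaExtension: String = "hi".shoutTwice

// Outside the collective extension, a simple identifier is a direct reference.
val unqualified: String = shoutTwice("hi")

// A qualified selection is a direct reference as well.
val qualified: String = text.shout("hi")
```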

etorreborre · November 1, 2020, 9:43pm

LPTK:

It looks like we have gone full circle back to the original extension class syntax. It looks similar when defining it:
  extension(x: A) {
    def fun(y: B): C = ...
  }
  // vs
  implicit class extension(x: A) {
    def fun(y: B): C = ...
  }

After changing many, many of the implicit class definitions in specs2, this really strikes a chord with me. While I like the occasional one-off extension syntax, I’ve had some difficulty using the new syntax instead of implicit class because:

I have shared code between extension methods
I am massively overloading some terms
I’m using additional type parameters for some extension methods
I need to override some definitions
when I override definitions I can extend a parent implicit class and call super to re-use its implementation

I can possibly work around all of these issues but the idea of writing extension class as proposed by @LPTK is very tempting :-).
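To make the last two points of that list concrete, here is a hedged sketch of the implicit class pattern being described: a child implicit class that extends a parent wrapper and calls super. The names `syntax`, `BaseOps`, `LoudOps` and `describe` are hypothetical and not taken from specs2.

```scala
object syntax:
  // Parent wrapper holding the shared implementation.
  class BaseOps(s: String):
    def describe: String = s"text($s)"

  // Child implicit class: overrides describe and reuses the parent via super.
  implicit class LoudOps(s: String) extends BaseOps(s):
    override def describe: String = super.describe.toUpperCase

import syntax.LoudOps

// String has no describe member, so the implicit class wraps the receiver.
val described: String = "ab".describe
```

This is the kind of reuse that has no direct counterpart inside a single extension clause, which is why the override has to move to the enclosing class or object instead.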

etorreborre · November 1, 2020, 9:50pm

Maybe I should add an example of something which is possible with implicit class but not possible with extension. This syntax is working with implicit class:

// definition 
implicit class Reference(alias: String):
  def ~(s: =>SpecificationStructure, tooltip: String): Fragment = ???

// usage
"user guide" ~ (userGuide, "this one")

Whereas with the current extension methods I have to write:

// definition 
extension (alias: String):
  def ~(s: =>SpecificationStructure, tooltip: String): Fragment = ???

// usage
"user guide".~(userGuide, "this one")

Otherwise I get:

[error]    |      value ~ is not a member of String.
[error]    |      An extension method was tried, but could not be fully constructed:
[error]    |
[error]    |          this.extension_~()

odersky · November 2, 2020, 8:28am

I just tried that on latest master and it seems to work.

odersky · November 2, 2020, 9:00am

Thanks for drawing up this list of difficulties. It’s very useful as a checklist for evaluation.

I have shared code between extension methods

I believe that is still mostly possible. Extension methods are just methods in some enclosing scope and can share code there. Also, you can make extension methods private. So the only thing that’s not possible is to share a value definition that exists per call instance. You have to turn it into a shared def and instantiate it in each method.
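A minimal sketch of that workaround, with made-up names (`slugs`, `normalize`, `slug`, `sameAs`): the shared computation becomes a private def in the enclosing scope and each extension method invokes it on its own receiver.

```scala
object slugs:
  // Shared helper in the enclosing scope. It replaces what would have been
  // a per-instance val in an implicit class.
  private def normalize(s: String): String = s.trim.toLowerCase

  extension (s: String)
    // Each method calls the helper itself, since there is no wrapper
    // instance in which to cache the result.
    def slug: String = normalize(s).replace(' ', '-')
    def sameAs(other: String): Boolean = normalize(s) == normalize(other)

import slugs.*

val example: String = "  Hello World ".slug
```

The cost is that `normalize` runs once per call rather than once per wrapped value, which is the trade-off mentioned above.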

I am massively overloading some terms

And you get “have the same erasure” problems afterwards? Yes, we should work out a systematic solution for this.

I’m using additional type parameters for some extension methods

That one is a real shortcoming today. I have experienced the same issues. But it will be fixed once we allow multiple type parameter lists in methods.
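For reference, released Scala 3 versions let an extension method declare its own type parameters in addition to those of the extension clause, to my knowledge. A minimal sketch, modelled on the `sumBy` example in the Scala 3 reference documentation:

```scala
// A belongs to the extension clause, B to the method itself.
extension [A](xs: List[A])
  def sumBy[B: Numeric](f: A => B): B = xs.map(f).sum

// B is inferred from the function argument.
val inferred: Int = List("a", "bb", "ccc").sumBy(_.length)

// B can also be passed explicitly, like any method type argument.
val explicit: Int = List("a", "bb", "ccc").sumBy[Int](_.length)
```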

I need to override some definitions
when I override definitions I can extend a parent implicit class and call super to re-use its implementation

That’s an interesting use case. I believe you can do everything “one level further out”. I.e. put the extension methods in objects and classes that have inheritance relationships between them. For instance:

class A:
  extension (s: String)
    def len: Int = s.length

object B extends A:
  extension (s: String)
    override def len: Int = s.length + 1

@main def Test =
  import B._
  println("abc".len)

If you want to make the extensions visible outside that class hierarchy you can use export to achieve that.

In the end it’s a tradeoff. Extension methods are just methods in the enclosing scope. That makes it easy to define them. For instance, I’m using them without a hitch from early on in the revised Scala 3 MOOCs. I would never have seriously considered introducing implicit classes at such an early point. Extension methods also work really well with typeclasses. Any proposal that claims implicit classes can do the same would have to provide a detailed argument and in particular would have to show how all the complexity introduced by simulacrum is unnecessary. After all, people have tried really hard to make implicit classes and type classes work together and the result was not pretty.

LPTK · November 2, 2020, 9:44am

I do not claim that implicit classes do the same, but I propose to use a construct closer to implicit classes. The main differences would be: lighter syntax, no implicit conversion behavior, and automatically defining the methods as methods in the parent object.

class A:
  extension Ops(self: B):
    def foo(x: C): D

would be encoded as:

class A:
  final class Ops(private val self: B) extends AnyVal:
    inline def foo(x: C): D = Ops$extension$foo(this)(x)
  def Ops$extension$foo(ops: Ops)(x: C): D

(It does not have to actually use AnyVal behind the scenes, but I use it here to express the fact that it would not need to allocate a wrapper.)

and an implementation of it, as in:

class AA extends A:
  extension Ops(self: B):
    def foo(x: C): D = self.n + x.m

would be encoded as:

class AA extends A:
  def Ops$extension$foo(ops: Ops)(x: C): D = ops.self.n + x.m

The explicit call syntax would be as usual, aa.Ops(b).foo(c). There would be no special behavior around the prefix aa.Ops(b), which would have the straightforward, well-defined semantics of the corresponding AnyVal class.

This approach would allow immediately reaping all the advantages of implicit classes (behaves like any other class-like construct: can extend things, override, provides real name-spacing without colliding with sibling definitions), but without the downsides (no implicit conversion behavior, no clunky syntax, possibility to define the methods in children of the parent class).

PS: I find that the name-spacing aspect is also valuable in itself. It avoids forcing every method into a top-level class scope, when doing mixin composition. Maybe the same extension concept could be used to allow making things look like they are name-spaced inside objects, while still being overridable:

class A:
  extension Blah:
    def foo: B = ...

object AA extends A:
  extension Blah:
    override def foo: B = ... super.foo ...

AA.Blah.foo  // internally calls AA.Blah$extension$foo

odersky · November 2, 2020, 9:50am

This comes close. But the problem is with overrides. We’d have to invent a new system to do something like virtual classes. This is far from a solved problem.

Also, there’s the issue that you have to name the extension, and have to use the extension name instead of the method name for overrides. I believe this would turn out to be quite a bit heavier than what we have.

odersky · November 5, 2020, 3:37pm

We have a solution now: https://github.com/lampepfl/dotty/pull/10149 allows you to use a @targetName annotation to avoid conflicting method definitions that have the same erasure.
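A hedged sketch of how this can look in practice, using `scala.annotation.targetName` with invented names (`describeOps`, `describe`). The two extension methods take `List[Int]` and `List[String]` and return the same type, so they would have the same erased signature; the annotation gives each a distinct name in the generated code.

```scala
import scala.annotation.targetName

object describeOps:
  extension (xs: List[Int])
    @targetName("describeInts")
    def describe: String = "ints: " + xs.sum

  extension (xs: List[String])
    @targetName("describeStrings")
    def describe: String = "strings: " + xs.mkString(",")

import describeOps.*

// In Scala source both are still called describe; overload resolution
// picks the alternative matching the receiver type.
val ints: String = List(1, 2, 3).describe
val strings: String = List("a", "b").describe
```

The target names only affect the generated method names, so callers in Scala code never see them.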

etorreborre · November 26, 2020, 2:02pm

Thanks @odersky. I just came across the “same erasure” problem after upgrading to Scala 3.0.0-M2 on the following pattern (it used to work with 3.0.0-M1):

object ResultExecution:
  outer:  ResultExecution =>

  extension (r: =>Result):
      def execute: Result = 
        outer.execute(r)

  def execute(r: =>Result): Result = ???

Using a @targetName("executePostfix") annotation indeed works and I think I’m going to use it everywhere this issue occurs. The issues I see with this annotation are:

it’s more work
we need to summon a name just for “keeping the compiler happy”
more importantly what happens when you refactor your code? How is your IDE supposed to understand that if you refactor the execute extension method to executeNow, then the associated target name should be updated as well, to executeNowPostfix?

I understand that it’s about trade-offs, I think I need to get used to the new paradigm :-).

etorreborre · November 26, 2020, 2:19pm

Speaking of which, is this a bug?

/* Fails with
[error] 103 |      outer.showInt(n)
[error]     |      ^^^^^^^^^^^^^
[error]     |Ambiguous overload. The overloaded alternatives of method showInt in trait Example with types
[error]     | (n: Int): String
[error]     | (n: Int): String
[error]     |both match arguments ((n : Int))
*/
import annotation._

trait Example:
  outer: Example =>

  def showInt(n: Int): String =
    n.toString

  extension (n: Int):

    @targetName("intShow")
    def showInt: String =
      outer.showInt(n)

It looks like @targetName does not work in that case.