Extension methods in typeclasses are surprising

ysthakur · September 13, 2020, 9:34pm

Currently, the Dotty documentation shows this example of implementing semigroups and monoids.

trait SemiGroup[T]:
  extension (x: T) def combine (y: T): T

trait Monoid[T] extends SemiGroup[T]:
  def unit: T

And shows this example of using it:

def combineAll[T: Monoid](xs: List[T]): T =
    xs.foldLeft(summon[Monoid[T]].unit)(_.combine(_))

This seems a bit surprising to me - the extension method combine magically came into scope when there was a Monoid[T] in scope, but the method unit still has to be invoked explicitly on the Monoid instance.
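
To make the asymmetry concrete, here is a minimal self-contained sketch. The intMonoid instance is made up for illustration and is not part of the documentation example, and the `given ... with` syntax assumes a released Scala 3 compiler rather than the Dotty build that was current when this thread was written.

```scala
trait SemiGroup[T]:
  extension (x: T) def combine(y: T): T

trait Monoid[T] extends SemiGroup[T]:
  def unit: T

// Hypothetical instance, for illustration only.
given intMonoid: Monoid[Int] with
  extension (x: Int) def combine(y: Int): Int = x + y
  def unit: Int = 0

def combineAll[T: Monoid](xs: List[T]): T =
  // `combine` resolves through the Monoid[T] evidence without any import,
  // while `unit` has to be selected on the summoned instance.
  // Writing xs.foldLeft(unit)(...) here would fail: `unit` is not in scope.
  xs.foldLeft(summon[Monoid[T]].unit)(_.combine(_))

val total: Int = combineAll(List(1, 2, 3)) // 6
```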

It would make more sense if one had to do import summon[Monoid[T]]._ before being able to use combine that way. But that’s annoying, so it would be even nicer if one could tell the compiler that Monoid and SemiGroup are typeclasses, so that whenever an instance of Monoid is in scope, its members are in scope too, maybe with a typeclass annotation:

@typeclass
trait SemiGroup[T]:
  extension (x: T) def combine (y: T): T

@typeclass
trait Monoid[T] extends SemiGroup[T]:
  def unit: T

def combineAll[T: Monoid](xs: List[T]): T =
    xs.foldLeft(unit)(_.combine(_))

Despite perhaps complicating the language more, it would be more uniform - no special treatment for extensions, and of course, one would still be able to use summon[Monoid[T]].unit if there happened to be another unit method in scope or if there was some other ambiguity.
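
For comparison, the explicit-import variant mentioned above can be approximated by naming the instance in a using clause and importing its members. This is only a sketch: the instance name m and the intMonoid instance are assumptions made for the example, and the syntax (`import m.*`, `given ... with`) targets a released Scala 3 compiler.

```scala
trait SemiGroup[T]:
  extension (x: T) def combine(y: T): T

trait Monoid[T] extends SemiGroup[T]:
  def unit: T

// Hypothetical instance: Int under multiplication.
given intMonoid: Monoid[Int] with
  extension (x: Int) def combine(y: Int): Int = x * y
  def unit: Int = 1

def combineAll[T](xs: List[T])(using m: Monoid[T]): T =
  // Importing the members of the named instance brings both
  // `unit` and the `combine` extension into scope the same way.
  import m.*
  xs.foldLeft(unit)(_.combine(_))

val product: Int = combineAll(List(2, 3, 4)) // 24
```

With the import, unit and combine are treated uniformly, at the cost of one extra line per method that uses the typeclass.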

I’ve also opened a feature request issue on GitHub.

odersky · September 13, 2020, 10:06pm

But which unit in combineAll? You could have several typeclasses with a unit element. So this looks too ambiguous to me. It would be better to write combineAll like this:

def combineAll[T: Monoid](xs: List[T]): T =
  xs.foldLeft(Monoid.unit)(_.combine(_))

And that can be achieved by defining Monoid like this:

trait Monoid[T] extends SemiGroup[T]:
  def unit: T
object Monoid:
  def unit[T: Monoid] = summon[Monoid[T]].unit

lavrov · September 14, 2020, 7:52am

Glad I’m not alone in this. I agree that the combination of “implicit object + extension methods” feels magical. E.g. if I define my type class without an extension method, I do not get this nice boilerplate-free syntax:

trait Semigroup[T]:
   def combine(x: T, y: T): T

combine(1, 2) is no more ambiguous than 1 combine 2, but the former is not in scope because it is not an extension method.

I would rather treat extension methods as ordinary methods that merely allow a syntactically different invocation mechanism, so that they are brought into scope exactly the same way as all other definitions. As for type classes, I believe they should be addressed directly, maybe with some special syntax.
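
A small sketch of what this means in practice: with a plain (non-extension) combine, the call has to go through the instance, or through a hand-written forwarder. The intSemigroup instance and the top-level forwarder below are hypothetical names added for illustration, and the code assumes released Scala 3 syntax.

```scala
trait Semigroup[T]:
  def combine(x: T, y: T): T

// Hypothetical instance, for illustration only.
given intSemigroup: Semigroup[Int] with
  def combine(x: Int, y: Int): Int = x + y

// Without an extension method, the call goes through the instance.
def combineViaInstance[T: Semigroup](x: T, y: T): T =
  summon[Semigroup[T]].combine(x, y)

// Hypothetical forwarder: this is the boilerplate needed to get an
// unqualified combine(x, y) when the method is not an extension.
def combine[T](x: T, y: T)(using s: Semigroup[T]): T =
  s.combine(x, y)

val viaInstance: Int = combineViaInstance(1, 2) // 3
val viaForwarder: Int = combine(1, 2)           // 3
```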

morgen-peschke · September 14, 2020, 9:27pm

On that note, has anyone figured out how to implement something like 1.pure[List] as an extension method?

I tried a couple different variants, but nothing worked.

trait Pure[F[_]] {
  def pure[A](a: A): F[A]
  
  extension [A] (a: A) def pureOne: F[A] = pure(a)
}
object Pure {
  given Pure[List] {
    def pure[A](a: A): List[A] = a :: Nil
  }
  
  extension [A,F[_]: Pure] (a: A)  def pureTwo: F[A] = summon[Pure[F]].pure(a)
  
  final class PartiallyAppliedPureThree[A](val a: A) extends AnyVal {
    def apply[F[_]: Pure] = summon[Pure[F]].pure(a)
  }
    
  extension [A] (a: A) def pureThree: PartiallyAppliedPureThree[A] = new PartiallyAppliedPureThree[A](a)      
}

def trialOne() = {
  // Without the explicit import of givens, fails with:
  // "value pureOne is not a member of Int"
  import Pure.{given _}
  
  // Fails with:
  // value pureOne is not a member of Int.
  // An extension method was tried, but could not be fully constructed:
  //
  //     Pure.given_Pure_List.extension_pureOne[List](1)
  //println(1.pureOne[List])
  
  // Fails with:
  // Found:    (1 : Int)
  // Required: List
  //println(Pure.given_Pure_List.extension_pureOne[List](1))
}

def trialTwo() = {
  // Without explicit import of method, fails with:
  // value pureTwo is not a member of Int
  import Pure.pureTwo
  
  // Fails with:
  // value pureTwo is not a member of Int.
  // An extension method was tried, but could not be fully constructed:
  //
  //     Pure.extension_pureTwo[List](1)
  //println(1.pureTwo[List])
  
  // Works if called explicitly with explicit type parameters
  println(Pure.extension_pureTwo[Int,List](1))
}

def trialThree() = {
  // Without explicit import of method, fails with:
  // value pureThree is not a member of Int
  import Pure._
  
  // Fails with:
  // value pureThree is not a member of Int.
  // An extension method was tried, but could not be fully constructed:
  //
  //     Pure.extension_pureThree[List](1)
  //println(1.pureThree[List])
  
  // Fails with:
  // Found:    (1 : Int)
  // Required: List
  // println(Pure.extension_pureThree[List](1))
  
  // Works, if called explicitly with the type parameter second
  println(Pure.extension_pureThree(1)[List])
  // More explicit version of the preceding call
  println(Pure.extension_pureThree(1).apply[List])
  
  // Also works
  println(1.pureThree.apply[List])
}

@main
def run(): Unit = {
  trialOne()
  trialTwo()
  trialThree()
}
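
For readers on a released Scala 3 compiler, a sketch of one way to get the 1.pure[List] call shape: there, an extension method can declare its own type parameters after the receiver clause, so F can be supplied at the call site. I have not checked this against the Dotty build used in the trials above, and the top-level placement of the extension is just a choice made for this example.

```scala
trait Pure[F[_]]:
  def pure[A](a: A): F[A]

object Pure:
  // Found through the implicit scope of Pure[List], so no import is needed.
  given Pure[List] with
    def pure[A](a: A): List[A] = a :: Nil

// A is fixed by the receiver; F is the method's own type parameter
// and is given explicitly at the call site.
extension [A](a: A)
  def pure[F[_]](using p: Pure[F]): F[A] = p.pure(a)

val ints: List[Int] = 1.pure[List]          // List(1)
val strings: List[String] = "a".pure[List]  // List("a")
```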

tarsa · September 15, 2020, 4:56am

If we have this (the apply method is a common pattern in FP libraries, I guess):

trait SemiGroup[T]:
  extension (x: T) def combine (y: T): T
trait Monoid[T] extends SemiGroup[T]:
  def unit: T
object Monoid:
  def apply[T: Monoid] = summon[Monoid[T]]

Then we can do:

def combineAll[T: Monoid](xs: List[T]): T =
  xs.foldLeft(Monoid[T].unit)(_.combine(_))

which is both more concise and still unambiguous, so it can be refactored freely.
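
A runnable sketch of the apply-based summoner, assuming released Scala 3 syntax. The stringMonoid instance is hypothetical and only there so the example can be exercised.

```scala
trait SemiGroup[T]:
  extension (x: T) def combine(y: T): T

trait Monoid[T] extends SemiGroup[T]:
  def unit: T

object Monoid:
  // Summoner: Monoid[T] returns whichever instance is in the context.
  def apply[T](using m: Monoid[T]): Monoid[T] = m

// Hypothetical instance, for illustration only.
given stringMonoid: Monoid[String] with
  extension (x: String) def combine(y: String): String = x + y
  def unit: String = ""

def combineAll[T: Monoid](xs: List[T]): T =
  xs.foldLeft(Monoid[T].unit)(_.combine(_))

val joined: String = combineAll(List("a", "b", "c")) // "abc"
```

Because the type argument is written out, the call does not depend on what else named unit happens to be in scope.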

rgwilton · September 15, 2020, 8:07am

I don’t have the expertise, and hence could easily be wrong, but …

object Monoid:
  def apply[T: Monoid] = summon[Monoid[T]]

… looks to me like it could end up turning into unwanted boilerplate.

LPTK · September 15, 2020, 11:24am

By the way, some time ago I found a rather lightweight trick to avoid the apply-method boilerplate:

trait Monoid[T] {
  def (lhs: T) append (rhs: T): T
  extension (lhs: Monoid.type) def empty: T
}
object Monoid

Then this works:

def foo[A: Monoid](a: A) =
  a append Monoid.empty

However, this approach does not seem blessed, as its mention was removed from the Dotty documentation.

ysthakur · September 15, 2020, 3:23pm

You’re right, if there are multiple unit methods, one would still have to write Monoid.unit or summon[Monoid[T]].unit or something like that, but it would be handy to be able to write just unit when there isn’t such a conflict. Perhaps the compiler would be able to resolve it with a type annotation such as foldLeft(unit : T) (although resolving it based on the return type is a little too much, I guess).

morgen-peschke · September 15, 2020, 3:41pm

This is true; however, since we have the same boilerplate today, it’s not strictly worse than our current situation.

rgwilton · September 15, 2020, 4:37pm

I agree. But I perceive that one of the big benefits of Scala 3 is meant to be less boilerplate magic and easier-to-use constructs. So, if every typeclass is going to end up with something like this, then personally I would prefer that it be done automatically via something like a typeclass keyword or annotation (if that makes sense). If the keyword/annotation happens to create a companion object with an apply method under the covers, then that is fine with me.

rjolly · September 20, 2020, 8:26am

I would like to mention that combine is only “partially” imported, as can be seen when we throw an implicit conversion into the mix:

class A

object A:
  given Conversion[Int, A]:
    def apply(n: Int) = ???

given Monoid[A]:
  extension (x: A) def combine (y: A) = ???
  def unit = ???

val a = new A
1.combine(a)
^^^^^^^^^
value combine is not a member of Int, but could be made available as an extension method.

The following import might fix the problem:

  import given_Monoid_A.combine

And indeed if we add the import then it works as expected.

Katrix · September 20, 2020, 3:02pm

While I’m iffy on that point specifically, and agree that it can be surprising, I think it’s for the best. Can’t imagine compile performance would get better if something like the above worked.

Could be completely wrong though, and it wouldn’t affect anything at all.

Still feel iffy about having multiple layers of indirection though.

rcano · September 20, 2020, 4:20pm

That looks like a bug to me, because after the import is added, then it works. So it seems the search scope is being incorrectly set up.

rjolly · September 24, 2020, 3:07pm

odersky · September 25, 2020, 12:33pm

Basically, whenever you throw implicit conversions into the mix, you have a high chance of surprising results. So, it’s better to not do that. Over time, I’d like to try to get rid of implicit conversions. This means it is now much less appealing for me to work on corner cases where they hinder better type inference. And I doubt anybody else will have the stomach to venture into this super slippery terrain.

rjolly · September 28, 2020, 3:34pm

I have tried your workaround from “Can We Wean Scala Off Implicit Conversions?”:

“This still uses the concept of Conversion, but it’s no longer an implicit conversion. The conversion is applied explicitly wherever it is needed. The idea is that with the help of using clauses we can ‘push’ the applications of conversions from user code to a few critical points in the libraries.”

It works, at the expense of some kind of algebraic purity in the type class definition:

import scala.language.implicitConversions

given id[T] as Conversion[T, T] = identity

trait SemiGroup[T]:
  extension [U](x: U)(using c: Conversion[U, T]) def combine (y: T): T

trait Monoid[T] extends SemiGroup[T]:
  def unit: T

class A

object A:
  given Conversion[Int, A]:
    def apply(n: Int) = ???

given Monoid[A]:
  extension [U](x: U)(using c: Conversion[U, A]) def combine (y: A) = ???
  def unit = ???

val a = new A

a.combine(a) // ok
a.combine(1) // ok
1.combine(a) // ok

There is still an issue if we introduce a second type B:

class B

object B:
  given Conversion[Int, B]:
    def apply(n: Int) = ???

given Monoid[B]:
  extension [U](x: U)(using c: Conversion[U, B]) def combine (y: B) = ???
  def unit = ???

val b = new B

b.combine(b) // ok
b.combine(1) // ok
1.combine(b)
^
Found:    (1 : Int)
Required: ?{ combine: ? }
Note that implicit extension methods cannot be applied because they are ambiguous;
both object given_Monoid_A and object given_Monoid_B provide an extension method `combine` on (1 : Int)

If someone has an idea how to solve this…

Katrix · September 28, 2020, 5:18pm

Use a separate trait for each class you want to convert from instead of “polluting” the global space of conversions.

So something like this:

trait AConversion[T]:
  def apply(t: T): A

trait BConversion[T]:
  def apply(t: T): B

given Monoid[A]:
  extension [U](x: U)(using c: AConversion[U]) def combine (y: A) = ???
  def unit = ???

given Monoid[B]:
  extension [U](x: U)(using c: BConversion[U]) def combine (y: B) = ???
  def unit = ???

rjolly · September 29, 2020, 8:44am

It does not work, as we need to respect the method signature in SemiGroup:

trait SemiGroup[T]:
  extension [U](x: U)(using c: Conversion[U, T]) def combine (y: T): T

trait Monoid[T] extends SemiGroup[T]:
  def unit: T

class A

object A:
  abstract class Conversion[U] extends scala.Conversion[U, A]
  given Conversion[Int]:
    def apply(n: Int) = ???

given Monoid[A]:
  extension [U](x: U)(using c: A.Conversion[U]) def combine (y: A) = ???
  def unit = ???

given Monoid[A]:
      ^
object creation impossible, since def extension_combine: [U](x: U)(using c: Conversion[U, T])(y: T): T is not defined 
(Note that U does not match U)

ysthakur · September 29, 2020, 5:04pm

I suppose importing unit automatically might hurt compiler performance, but that would probably be mostly when there aren’t any unit methods in scope and the compiler has to hunt through every last typeclass instance (assuming there are several typeclasses defining it).

And anyways, having combine be accessible because of the presence of a given Monoid instance seems to have the same problem (although I don’t know anything about the compiler’s internals). IMO they both have the same disadvantages, and Dotty should have both or neither.

rjolly · September 30, 2020, 8:51am

There is an additional problem with automatic imports (or any imports for that matter) in case there are several with the same name. This causes ambiguity errors. My intuition is that it could produce overloaded definitions instead, but maybe I’m wrong. This is briefly discussed in https://github.com/lampepfl/dotty/issues/9882