Use Cases for Implicit Conversion in Bulk Extensions

I’ll continue the theme here with some feedback about the deprecation of implicit conversions, focused on my pet flavor of them: bulk extension.

In the pre-SIP thread, it is suggested that implicit conversions for bulk extension can be replaced with a construct like this:
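Roughly this shape (paraphrasing from memory; SomeOps stands in for whatever ops class is being exported):

extension [A](a: A)
  def someOps: SomeOps[A] = SomeOps(a)
  export someOps.*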

Unfortunately, I believe that will break a lot of code. Extensions as they currently exist break right-associativity*. So, every line of code that uses +: or ++: (just to mention the ones on ArrayOps) will be broken and need to have its operands flipped to compile again**.

(* For some definitions of “broken” and “by design”)
(** This one is true regardless of what you think about the intentionality of extensions behavior with right-associativity)
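For concreteness, here is the kind of everyday code that compiles today only because of the implicit conversion to ArrayOps, with the element or prefix on the left:

val arr = Array(2, 3)
val a = 1 +: arr            // prepend an element
val b = List(0, 1) ++: arr  // prepend a whole collection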

Of course, you could perhaps come up with a means to exclude the right-associative operators from the export above and then add extensions to the Array object like so:

extension [A] (a: A)
  def +: (arr: Array[A]) = ...

extension [A] (it: IterableOnce[A])
  def ++:(arr: Array[A]) = ...

extension [A] (prefix: Array[A])
  def ++:(arr: Array[A]) = ...

But then we’re right back to this issue:

Not to mention that you will then have to do the same for every type that currently benefits from implicit conversion to every other collection type in the standard library. And that’s before we even consider code outside the standard library making use of implicit conversion to types with right-associative operators on them.

It seems to me that we either need a very long lead time to wean the entire ecosystem off of right-associativity in every use case, or we will still need some mechanism for adding them to (er… extending(?)) types as they currently behave.

Having said all of this, I’m not particularly a fan of implicit classes / conversions. But given the behavior of extensions around this topic, I don’t currently see another option that can achieve the same goals in this regard.

1 Like

I’m not so sure I follow your claim

e.g. in Scala 3.3.1 here I define an alternative syntax for tuple prepend/append, and it behaves the same at the call site whether I enable the extension methods via implicit conversion or via bulk export:

class ConcatOps[T <: Tuple](ts: T):
  def %: [U](elem: U): U *: T = elem *: ts
  def :% [U](elem: U): Tuple.Append[T, U] = ts :* elem

object ImplicitConversions:
  given [T <: Tuple]: Conversion[T, ConcatOps[T]] = ConcatOps(_)

  def test: (Int, String, Boolean) =
    1 %: "abc" %: EmptyTuple :% true

object Bulk:
  extension [T <: Tuple](ts: T)
    def concatOps: ConcatOps[T] = ConcatOps(ts)
    export concatOps.*

  def test: (Int, String, Boolean) =
    1 %: "abc" %: EmptyTuple :% true
1 Like

Ok, so shockingly you are absolutely correct (not shocking that you are correct… shocking that this works). However, I do think this whole thing highlights the absurdity of how extensions handle right-associative methods:

scala> extension [T <: Tuple](ts: T)
   def concatOps: ConcatOps[T] = new ConcatOps(ts)
   export concatOps.*

def concatOps[T <: Tuple](ts: T): ConcatOps[T]
def :%[T <: Tuple](ts: T)[U](elem: U): Tuple.Append[T, U]
def %:[T <: Tuple](ts: T)[U](elem: U): U *: T

scala> 1 %: "abc" %: EmptyTuple :% true
val res0: (Int, String, Boolean) = (1,abc,true)

scala> extension [T <: Tuple](ts: T)
   def &: [U] (elem: U): U *: T = elem *: ts

def &:[T <: Tuple][U](elem: U)(ts: T): U *: T

scala> 1 &: "abc" &: EmptyTuple :% true
-- [E007] Type Mismatch Error: -------------------------------------------------
1 |1 &: "abc" &: EmptyTuple :% true
  |     ^^^^^
  |     Found:    ("abc" : String)
  |     Required: Tuple
  |
  | longer explanation available when compiling with `-explain`
1 error found

Notice this:

def %:[T <: Tuple](ts: T)[U](elem: U): U *: T //From export
//vs
def &:[T <: Tuple][U](elem: U)(ts: T): U *: T //From direct declaration

I think I’ve lost the thread again on how it is, exactly, that extensions are not broken with respect to right-associative operators.

So, you are absolutely correct that the specific syntax I highlighted and you demonstrated does work. But I’m not convinced that it isn’t a bug given the stated intent and behavior of the operator when directly declared in an extension.

The inconsistency and confusion around extension behavior seem relevant enough on their own as an objection to completely ridding ourselves of the tried-and-true mechanism of bulk extension.

Thank you for pointing out my mistake. I happily refine and clarify my objection as stated.

1 Like

So the compiler automatically flips the arguments of right-associative extension methods, and in fact it prints the correct way to use the method, e.g. it prints

def &:[T <: Tuple][U](elem: U)(ts: T): U *: T

which expects an element on the left, and a tuple on the right, which matches the type error.

Unfortunately, the REPL seems not to print the flipped arguments of right-associative methods when they are defined through the bulk export.

I don’t think the problem is with the REPL’s print out of the extension method.

The compiler flips operands at the call site during compilation of right-associative methods. Since it was somehow decided that this is “bad”, the implementation of extension proactively swaps them first at the definition site so that when they are flipped at the call site, it comes out “correct”.

As a result, the correct way to satisfy

def &:[T <: Tuple][U](elem: U)(ts: T): U *: T

is with a call that looks like this:

ts &: elem

which gets turned into something similar to elem.&:(ts) by the compiler. Or, I guess, in this case:

&:(elem)(ts)

So the tuple on the left and the element on the right at the call site (not element left and tuple right as you said… that’s done by the compiler and matches the REPL printed structure above).

Which is backwards. Just to simplify and clarify, here are two invocations of the two methods (exported vs direct):

scala> "a" %: EmptyTuple
val res0: String *: EmptyTuple.type = (a,)

scala> EmptyTuple &: "a"
val res1: String *: EmptyTuple.type = (a,)

Notice how “the COLon goes on the COLlection side” no longer applies to methods declared as extensions?
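For reference, the mnemonic as it still works for the ordinary (non-extension) collection methods:

1 +: List(2, 3)   // prepend: the colon faces the collection
List(1, 2) :+ 3   // append: the colon still faces the collection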

I can’t speak to how the compiler handles the export of class methods, or why the export version doesn’t introduce the broken behavior (as far as I’m concerned, anyway) while the direct declaration does. But either way, this is further evidence of inconsistency in extension behavior around this topic, and that is concerning.

On the plus(?) side, I suppose I’ve learned that I can get around the broken behavior by declaring my right-associative methods in a class and exporting them through the extension I want instead of attempting to declare them directly as extensions. I’m not sure this is a win, but I know more than I did, so that’s something I guess.

2 Likes

Have you read this page?
https://docs.scala-lang.org/scala3/reference/contextual/right-associative-extension-methods.html#

We tried to make it as clear as possible, given these are somewhat confusing, so if you had read it before, your feedback would be much appreciated.

Sorry for the inexcusably late response to your question. Yes, I’ve seen that page. I’m not 100% sure if you are referring to feedback on the documentation of right-associative operators in extensions generally or if you mean the declare-in-a-class-and-then-export workaround.

If the former, I outlined my documentation “concerns” in a lot more detail in this comment on my bug report. Truthfully, however, I don’t consider the documentation as being problematic in this case… I consider the implementation to be problematic with regard to right-associative extension methods.

If you meant the latter, I guess the only documentation feedback I have is that I’ve never seen exporting class members in an extension block referenced in any documentation that I can recall. Or really anywhere prior to this thread. FWIW, last I checked IntelliJ still considered this a syntax error. So I think it is not well known at all. If this is, indeed, the fix for the broken behavior of right-associative methods in extensions, I think that would be outstanding to document clearly until such time as the language can be fixed.

1 Like

No problem ^^

I was asking about the documentation. I’m happy it was clear on the desugaring, but it’s true that the export behavior should be better documented (I think I learned of it in a patch note or something?)

As for the implementation, I believe it to be “the right one”: the logical choice in a vacuum.
And for me this remains the case even when considering that most experienced Scala 2 developers whom I’ve told about this feature are similarly surprised.
We should see the swapping done by normal right-associative methods as weird, not the other way around!

Since we went from 2 to 3, could it have been possible to outright disable normal right-associative methods? I don’t know, but I believe this was a goal since the introduction of right-associative extension methods.

Can you help me understand this more? Why would this be true? It seems to me that reading L to R is normal enough, similarly R to L as in other natural languages. But I can’t think of a single context where we combine the two patterns into one and read phrases L to R but the overall sentence/paragraph R to L.

Combining R-Associativity with R-Receptivity makes so much intuitive sense to me that I am struggling to see how it could be considered “weird” and the R-A/L-R pattern considered “normal”.

I want to prepend something to a sequence:

thing +: seq

It’s structured like what is actually happening!

seq +: thing

I’m not sure I even know what this means.

And from an extension perspective, there is such overwhelming evidence that everyone thinks of it in terms of extending a type (including documentation authors of various sources!) not in terms of special function application syntax. And if the prevalent mental model is extending a type, it should behave like a method on the type, right?

Say I give you a library with a type Foo in it. In my library I’ve mixed extensions into the code base for various circumstances. Your IDE says that in your current context the following are available:

Foo.%:
Foo.&:

What is happening with them? Which thing goes on the left and which on the right? If one is defined in a class and one is defined in an extension, they are opposite! To say this is “correct” makes no sense to me at all!

I am very specifically not trying to say that you are wrong. I can’t assert or defend that claim objectively. But I am saying that while I’ve heard the assertion, I don’t understand it and I’ve never heard a satisfactory answer for why that assertion is true. I’ve only heard the more verbose equivalent of “working as designed”.

So, on one hand, I would very much like to truly understand the rationale. But also, if we can agree that the ambiguity that now exists is bad (even if we disagree on which version is right), can we at least also agree that resolving that ambiguity is pretty important and allowing the ambiguity to remain is a bug in the language design?

The plan is to swap operand order of R-A methods in Scala? OK, fine. Deprecate any special meaning for : (without ambiguously breaking it!) and introduce a new suffix operator (say… >>?) which serves as the new R-A, L-R operator and the old : R-A, R-R operator stays as-is until Scala 3.5 or whatever. Right? What am I missing in that thinking?

Color me very confused and desperately wanting to understand this aspect of my favorite programming language!

I want to start by making something clear: the one goal of right-associative methods (extension or not) is to be parsed in a different way, namely right-associatively:

// Normal infix method
l :+ x :+ y
// equivalent to
(l :+ x) :+ y

// Right Associative infix method
x +: y +: l
// equivalent to
x +: (y +: l)

This can be seen on the types: (Where +: is the method, not a type constructor.)

List[T] :+ T -> List[T]
T +: List[T] -> List[T]

And in general, a method looks the same at definition and use site:

class List[T]:
  def :+ (x: T)

l :+ x

In both, the list is on the left, and the element is on the right

But we couldn’t do that with right-associative methods, because most of the time they do not make sense if defined from the element; we would need every type to define +: for every collection!

class Int:
  def +: (l: List[Int])
  def +: (l: Seq[Int])
  ...
class String:
  def +: (l: List[String])
  def +: (l: Seq[String])
  ...
...

x +: l

Therefore, it was chosen to swap the parameters around, so we could define it from the collection side:

class List[T]:
  def +: (x: T)
// but used like
x +: l

But you’ll notice with extension methods, we can do either:

object List:
  extension [T](l: List[T])
    def +: (x: T)
  // or
  extension [T](x: T)
    def +: (l: List[T])

It was therefore decided we would have extension methods look like their application:

object List:
  extension [T](x: T)
    def +: (l: List[T])

x +: l

Both the definition and the call have the element on the left and the collection on the right, as it should be!
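And to be concrete, here is a compilable version of that sketch (MyList is a made-up stand-in, to avoid clashing with scala.List):

class MyList[T](val elems: List[T])

object MyList:
  extension [T](x: T)
    def +:(l: MyList[T]): MyList[T] = MyList(x :: l.elems)

val l2 = 1 +: MyList(List(2, 3))   // element on the left, collection on the right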

2 Likes

Oh, and as a consequence, you’ll notice we no longer need non-extension right-associative methods; they can be completely replaced by extension methods.

Hence why I think the former should be deprecated, as it’s essentially a legacy feature:
We had to do it the confusing way, but we no longer need to, so why keep doing it?

Of course, as Scala developers we tend to learn “+: means swap the arguments” instead of “+: means right-associative”, and it’s hard to unlearn that.

1 Like

Thank you for the clear explanation of your thinking. I suspect others have tried to give me similar snippets of reasoning but for whatever reason I didn’t grok the arguments as clearly as I do here in your post. So, I appreciate your walkthrough.

I have a few follow-ups if you don’t mind:

If we are talking about parenthetical grouping of terms, then we’re into math-ish order of operations stuff. Mathematically these two things are the same (and I think it the vast majority of programming languages too):

// '^' is just some binary operator, doesn't matter which
a = b ^ c
z = x ^ a

z1 = x ^ (b ^ c)

assert z == z1

So, at the end of the day, if you follow the rabbit trail far enough down in software, you get to a decision about receptivity: which operand receives the dispatch of the method call and operates on the other. So, while you say that the only goal is to specify right-associativity, I don’t think we can escape the fact that directional receptivity is an implicit decision that must be made when implementing such features. We can’t only worry about grouping.

I hear you saying that your intuitive preference is that all calls, regardless of associativity, should be left-biased in their receptivity. Clearly, that concept aligns nicely with ordered parameter lists in a functional context as well (left on left, right on right). I’m sympathetic to that line of thinking (100% consistency is always nice!), but I also think we already don’t have receptivity consistency in various other ways not related to associativity at all. Since it’s not strictly related, I won’t belabor this point beyond asserting it here.

Maybe I’m unique in this, but these don’t look the same to me at all. The definition has no left side. It is merely a specification of a message that can be dispatched to the type. That message supports 0 or more parameters. I don’t see how we can infer any particular call-site structure from a def like this apart from the specific rules of the language that tell us how to do so. It could also legitimately be used like this:

x.pipe(l.:+)  // with scala.util.chaining.* imported for .pipe

Now x-ish things are on the left and l-ish things are on the right! That’s a valid way to accomplish the same thing, violates no language rules, and happily corresponds to the definitions of x, l, List[T], etc. You might say, yes but there’s a .pipe in there that is L-Receptive. True, but the whole purpose of pipe is to allow swapping positions in a call sequence, so I think the overall point still stands. If you prefer:

val myx = x.pipe
import l.:+

myx(:+) //Look 'ma, no l!

Another valid structure that can’t be inferred from the def. I’m sure there are nearly infinitely more such examples.

I won’t belabor the “looks like def” point. However, these two concerns seem to me to be at odds with each other, and only weakly useful in justifying the breakage. Instead of a clear specification of operations against the type being provided on the type, now you’ve relegated them to alternative blocks specifying what types may act upon this type with a particular operation. Instead of “L can prepend any T” it’s now “any T allows L to use the T in a prepend operation”. Literally nothing else in the language works that way that I can think of. Maybe marker traits like CanEqual?
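To make that contrast concrete (L and its %: are hypothetical here):

// “L can prepend any T”: the operation is specified on the type it is about
class L[T]:
  def %:(x: T): L[T] = ???

// “Any T allows L to be used in a prepend”: the operation is specified on the operand
extension [T](x: T)
  def %:(l: L[T]): L[T] = ???

Both are invoked as x %: l, but they live in completely different places.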

In any case, even if we say “well, we’ll keep all the extension blocks in T’s companion”, we’ve still divided the specification of T’s behavior across multiple scopes. And not in pursuit of better abstractions, polymorphism or anything else. Just because placing it on T directly looks ugly (to some people, I guess).

1 Like

Confusing or not is subjective, but I’m not opposed to change as a matter of course. If there is a better way (I’m not convinced yet, but I’m willing to be), I’m fine with deprecating and moving to it. I’ll get used to it.

What I’m NOT ok with is that we didn’t deprecate anything. We broke it. Colon suffixed operators no longer work in a consistent manner that any user of any library can predict under any particular mental model. And if the resolution for that scenario is “well, we’ll just disallow them on classes”, that is going to break a MASSIVE amount of production code. So that seems unlikely to be the way to go any time soon.

If we need to change right-associativity behavior in Scala, fine. I’ll do my best to be helpful. But what we have is a regression that broke an existing feature of the language. And THAT is a problem, no matter what you think about how method application should work in whatever scenario.

1 Like

I’m happy my explanation was useful!

I totally agree, but it’s important to separate associativity from receptivity; we could totally have non-RA methods that are right-receptive.

But here’s the thing: I don’t think we should. I think all methods in Scala should be left-receptive, and the decision on RA extension methods follows that principle:

// We extend every type T with a method +: that prepends the element x to a list of T
extension [T](x: T)
  def +: (l: List[T])

// x calls method +: on l
x +: l

// even if the de-sugaring is
List.+:(l)(x)
// even if it was something like 
l.leftApply(x.rightUnary_+:)

(leftApply and rightUnary do not exist, and I don’t think they should)

I see what you mean; for me it does have a left side, namely the enclosing class.

Sure, but that still respects left-receptivity!
“x is piped into the prepend operation on l”

Finally, I really don’t get all this talk of “breakage”: extension methods did not exist before, so no backward compatibility was harmed in the making of this feature!
An unintuitive new feature, while potentially undesirable, is not a regression.

But they do!
You still put the element on the left and the collection on the right; what it gets desugared to does not matter!
You should never call a symbolic method as .+(), RA or not; there’s append, prepend, etc. for that purpose.

I don’t think there’s much more I can say on this subject: I laid out the historical reasons for why that decision was taken, and I gave my reasons for thinking it was the right choice. There’s not much more I can do ^^’

I’ve not really found this way of thinking of it particularly useful. c.m(a) and m(c)(a) have always been represented in my head as basically the same thing; the only question is whether you write the zeroth argument on the left and use . as your separator, or on the right and use () or ,.

Some languages like Rust and R drive this home more than others; Rust, because you have to always explicitly give the first argument, and R because it’s full of functions that you chain together into a processing pipeline (the return value becomes the first arg of the next one) with %>% (tidyverse) or |> (newer, built-in).

It is true that import c.m is more convenient and zero-overhead, as opposed to val f = m(c) _. And we empower c.m to do dynamic dispatch off the actual type of c, not the declared type, which we don’t do for m(c)(a) unless we have a language with multiple dispatch (like Julia).

But, anyway, other objections notwithstanding, I think the best way to deal with this one is to just embrace more deeply that c.m(a) is not fundamentally different from m(c)(a) even if it is facilitated by compilers/languages due to convenience.
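A minimal sketch of the equivalence I mean (names made up):

class C:
  def m(a: Int): Int = a + 1

def mFree(c: C)(a: Int): Int = c.m(a)   // the zeroth argument made explicit

val c = C()
assert(c.m(1) == mFree(c)(1))   // same call, two spellings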

Huh? Anywhere that you see a ~: b that worked before will still work. Any new cases will work, too.

scala> class S(val s: String):
     |   override def toString = s
     |   def \:(t: String) = S(s"$t<$s")
     |
// defined class S

scala> extension (t: String)
     |   def ~:(s: S) = S(s"$t<${s.s}")
     |
def ~:(s: S)(t: String): S

scala> val s = S("eel")
val s: S = eel

scala> "cod" \: s
val res0: S = cod<eel

scala> "cod" ~: s
val res1: S = cod<eel

Whether we use \: or ~:, we get the same semantics. Nothing’s broken.

If you get the types wrong, at least under 3.4, the compiler helps you out both ways:

scala> 'e' \: s
-- [E007] Type Mismatch Error: -------------------------------------------------
1 |'e' \: s
  |^^^
  |Found:    ('e' : Char)
  |Required: String
  |
  | longer explanation available when compiling with `-explain`
1 error found

scala> 'e' ~: s
-- [E007] Type Mismatch Error: -------------------------------------------------
1 |'e' ~: s
  |^^^
  |Found:    ('e' : Char)
  |Required: String
  |
  | longer explanation available when compiling with `-explain`
1 error found

The only thing that is broken is, upon seeing the weird \: or ~: operators, whether we need to look in the docs of the RHS or LHS to figure out what the heck is going on. (And the docs should give a usage example to remind us which case we’re in.)

If you’re using an editor with no support, that could be a bit of a pain (but it’s also a pain to look up any extension method). If you’re using a full-featured editor, it should be able to jump you to the right thing regardless.

The only problem is when you are writing the methods. Having gotten used to putting on your backwards-hat when writing def &: in classes, you now have to get used to not putting it on when writing extension methods. I don’t particularly like this. It’s an irritating inconsistency. It also has implications for what kinds of type inference will work in more difficult cases.

But I don’t think it’s correct to say that at use-site it’s not something that “any user of any library can predict”. It doesn’t break anything that works, and anything new that works will work stably in the same way (so you can reason from examples). It “only” imposes an annoying new burden, which is that you need to know whether it’s a real method or an extension method when reading docs if the docs don’t give you a usage example.

2 Likes

I’ll start by saying, thank you both for your engagement. I really appreciate it. These types of conversations are rare in industry in my experience, but I really love learning how other smart engineers think about problems and solutions. Helps make me better and that’s always fun. So thank you.

Far from falling short in any way, your responses and thoughtfulness have been immensely helpful.

This is a very fascinating assertion to me that I’ve never really considered. I’m not sure how I feel about it, but would love to think on it and maybe even have the opportunity to discuss it more. Languages such as those you mentioned, I’ve just mentally categorized as different paradigms rather than merely a different expression of the same paradigm. This warrants additional thought on my part. In any case, I’m happy to concede this point for the sake of the discussion.

Well put. I see from both of your responses that I was unclear to the point of confusion. Sorry about that. I didn’t mean “user” of a library in the sense of someone just accepting whatever auto-complete suggestions their IDE pulls from the library. Yes, I agree that if written properly at call site, the invocations look identical.

By “user” I guess I meant more than that. Maybe “power user”? I don’t know. Someone who takes a library (maybe not even 3rd party? Or maybe the std lib or any other?) and builds upon it for their own purposes. Things like considering what I need included in implicit scope and when. Where I should logically organize my extensions (er… additions? not necessarily extension) to the capabilities of the library to fill in gaps that are necessary for my particular use case. For example:

I use a library that provides a type T. I want to add an operation on T that uses a primitive value (say Char for the sake of argument) as the operand. But for whatever code-local or even aesthetic reasons, it needs to be right-associative. In what implicit scope do I place an extension ... def %: ...? For all kinds of reasons, many of which have nothing at all to do with right-associativity, my inclination would be to say that the operation should live in/on/near/adjacent to type T. Call it interface segregation or abstraction coherency or encapsulation or whatever term you like, but it seems obvious to me that an operator that is entirely about T’s, but which just happens to use a Char as input, should live near or adjacent to the definition of T. So placing it in the implicit scope related to T seems like the “right” choice. It just so happened that the prior behavior of right-associativity allowed for matching that expectation with implementation.

But now, that operator would live in implicit scope related to Char! But vanishingly few Chars need have anything whatsoever to do with T, and even fewer with %:. So I’m forced to put it in the “wrong” place and remember that I have to include the MyTChars scope whenever I want to use it. I know that’s not hard, but it is surprising and leaves me, as a developer in the code-base, forever scratching my head going “wait… was this on the type itself, or did I or a co-worker have to add this as an extension on the argument’s type?” Auto-code navigation is nice and all when it works. But when everything’s working I don’t really care to perform that navigation – who cares where it is, it’s working! I typically want to do it when I’m half-way through a line of code, the compiler is unhappy with my .scala file, and IDE features half-work because I’m refactoring a code base and there are 347 compiler errors in the project at the moment. I need to be able to just know where to find it. Full-text search is … painful in large code bases.
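To illustrate with the hypothetical T and MyTChars from above:

// The old, T-centric way: an implicit class kept with the rest of my T-related code
implicit class TPrependOps(t: T):
  def %:(c: Char): T = ???

// The new way: an extension on Char, which forces a Char-oriented scope
object MyTChars:
  extension (c: Char)
    def %:(t: T): T = ???

// Both are invoked identically: 'x' %: t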

I know that the std lib answer is that an extension (c: Char) would go into T’s companion and everything would be hunky dory in the world of implicit resolution for users of the type. Agreed. But as a user of the type, I don’t have the luxury of placing additional things into the companion of T. So… when someone comes behind me after I’ve managed to finally make it all work, it is impossible for any of them to predict where the code is and whether it is the new extension-style on-the-argument approach or the old, encapsulation-centric, T-based definition.

I hope that clarifies what I vastly oversimplified into the word “user” previously.

We are 100%, without any reservation, in total agreement on this point.

So do that!

We’ve already established that you don’t have control over either T or Char, so you’re going to have to import the stuff from somewhere anyway.

package somewhere

extension (t: T)
  def myFoo(t2: T) = ???

extension (c: Char)
  def %:(t: T) = ???

Now you import somewhere._ and you’re good.
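For instance (T and the extensions being the hypothetical ones above):

import somewhere.*

val t: T = ???
t.myFoo(t)   // the T-side extension
'x' %: t     // the Char-side, right-associative extension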

Note that this is not at all a problem unique to right-binding operators.

Suppose, for instance, you want to use + and - to operate on Java’s otherwise pretty usable java.time classes. Maybe * and / should be able to alter the size of Duration, too. You certainly don’t make a separate MyIntDurationOps! You just stick everything together in some scope, extending whatever you need to work, and import the whole thing at once.
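Sketching the idea (the java.time methods called here are the real ones; the grouping object is made up):

import java.time.{Duration, Instant}

object TimeOps:
  extension (i: Instant)
    def +(d: Duration): Instant = i.plus(d)
    def -(d: Duration): Instant = i.minus(d)
  extension (d: Duration)
    def *(n: Long): Duration = d.multipliedBy(n)
    def /(n: Long): Duration = d.dividedBy(n)

import TimeOps.*
val later  = Instant.now() + Duration.ofHours(2)
val halved = Duration.ofMinutes(30) / 2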

(And you use 3.4, so you get relaxedExtensionImports on by default, so extensions don’t all clobber each other.)

1 Like