Pre-SIP: Allow fully implicit conversions in Scala 3 with `into`

Motivation and Background

We have already had two long threads about implicit conversions in Scala 3.

At the time we identified a lot of issues with implicit conversions and discussed possible remedies, but we did not arrive at a solution that felt 100% convincing.

Here is the problem: Scala 2 implicits are going away. Not right now, but we should start planning to phase them out. The old style implicit def conversions will need to be replaced by instances of the Conversion class. Unlike old implicit def, Conversion requires a language import

import scala.language.implicitConversions

for the files in which implicit conversions are used (i.e. expected to be inserted implicitly, explicit conversions are fine).

If the import is missing, a feature warning is currently issued, and this will become an error in future versions of Scala 3. The motivation for this restriction is two-fold:

  • Code with hidden implicit conversions is hard to understand and might have correctness or performance issues that go undetected.
  • If we require explicit user opt-in for implicit conversions, we can significantly improve type inference by propagating expected type information more widely in those parts of the program where there is no opt-in.
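
To make the import requirement concrete, here is a minimal sketch of a Scala 3 conversion and the two ways of using it. The names UserId, describe, implicitUse and explicitUse are made up for illustration and do not come from any library.

```scala
import scala.language.implicitConversions // opt-in for implicitly inserted conversions in this file

// Hypothetical type, for illustration only.
final case class UserId(value: Int)

// A Scala 3 conversion is a given instance of scala.Conversion.
given Conversion[Int, UserId] = n => UserId(n)

def describe(id: UserId): String = s"user #${id.value}"

// Implicit use: the compiler inserts the conversion, so the language import above applies.
val implicitUse: String = describe(42)

// Explicit use: applying the conversion by hand does not depend on the language import.
val explicitUse: String = describe(summon[Conversion[Int, UserId]](42))
```

Only the first call depends on the language import. The second one spells out the conversion and so falls under "explicit conversions are fine".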

So, how much of a restriction will that be? We identified one broad use case where implicit conversions are sometimes very hard to replace. This is the case where an implicit conversion is used to adapt a method argument to its formal parameter type. An example from the standard library:

scala> val xs = List(0, 1)
scala> val ys = Array(2, 3)
scala> xs ++ ys
val res0: List[Int] = List(0, 1, 2, 3)

The input line xs ++ ys makes use of an implicit conversion from Array[Int] to IterableOnce[Int]. This conversion is defined in the standard library as an implicit def. Once the standard library is rewritten with Scala 3 conversions, this will require a language import at the use site, which is clearly unacceptable. It is theoretically possible to avoid the need for implicit conversions using method overloading or type classes, but

  • this would often lead to longer and more complicated code,
  • it would break binary backwards compatibility, and
  • it would not work for vararg parameters.

Previous proposals concentrated on some kind of modifier (in its latest incarnation it was called into) that would allow conversions on function arguments. There were some concerns that this would be too inflexible and that it would be better to make into be a type that can be used anywhere, including in type parameters and type aliases. That is what this pre-SIP proposes.

Proposal

I propose a new type constructor into[T] that is treated specially in the compiler to enable fully implicit conversions that do not require a language import. As a first example, here is a signature of a ++ method on List[A] that uses it:

def ++ (elems: into[IterableOnce[A]]): List[A]

The into wrapper on the type of elems means that implicit conversions can be applied to convert the actual argument to an IterableOnce value, and this without needing a language import.

into is defined as follows in the companion object of the scala.Conversion class:

opaque type into[T] >: T = T

Types of the form into[T] are treated specially during type checking. If the expected type of an expression is into[T] then an implicit conversion to that type can be inserted without the need for a language import.

Note: Unlike other types, into starts with a lower-case letter. This emphasizes the fact that into is treated specially by the compiler, by making into look more like a keyword than a regular type.

Example:

given Conversion[Array[Int], IterableOnce[Int]] = wrapIntArray
val xs: List[Int] = List(1)
val ys: Array[Int] = Array(2, 3)
xs ++ ys

This inserts the given conversion on the ys argument in xs ++ ys. It typechecks without a feature warning since the formal parameter of ++ is of type into[IterableOnce[Int]], which is also the expected type of ys.

into in Function Results

into allows conversions everywhere it appears as expected type, including in the results of function arguments. For instance, consider the new proposed signature of the flatMap method on List[A]:

def flatMap[B](f: A => into[IterableOnce[B]]): List[B]

This accepts all actual arguments f that, when applied to a value of type A, give a result that is convertible to type IterableOnce[B]. So the following would work:

scala> val xs = List(1, 2, 3)
scala> xs.flatMap(x => x.toString * x)
val res2: List[Char] = List(1, 2, 2, 3, 3, 3)

Here, the conversion from String to Iterable[Char] is applied on the results of flatMap’s function argument when it is applied to the elements of xs.

Vararg arguments

When applied to a vararg parameter, into allows a conversion on each argument value individually. For example, consider a method concatAll that concatenates a variable number of IterableOnce[Char] arguments, and also allows implicit conversions into IterableOnce[Char]:

def concatAll(xss: into[IterableOnce[Char]]*): List[Char] =
  xss.foldLeft(List[Char]())(_ ++ _)

Here, the call

concatAll(List('a'), "bc", Array('d', 'e'))

would apply two different implicit conversions: the conversion from String to Iterable[Char] gets applied to the second argument and the conversion from Array[Char] to Iterable[Char] gets applied to the third argument.
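
For comparison, here is a sketch of what already compiles today without the into wrapper. It relies on the String and Array conversions still being Scala 2 style implicit defs in the standard library, which do not need a language import at the use site. The name combined is only for illustration.

```scala
// Same signature as in the text, minus the proposed `into` wrapper.
def concatAll(xss: IterableOnce[Char]*): List[Char] =
  xss.foldLeft(List[Char]())(_ ++ _)

// Each argument is adapted individually: the String and the Array are
// converted by standard library implicit defs, the List is passed as is.
val combined: List[Char] = concatAll(List('a'), "bc", Array('d', 'e'))
```

Once those conversions become Conversion instances, this call would need the language import unless the parameter is declared as into[IterableOnce[Char]]*, as proposed above.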

Unwrapping into

Since into[T] is an opaque type, its run-time representation is just T. At compile time, the type into[T] is a known supertype of the type T. So if t: T, then

val x: into[T] = t

typechecks but

val y: T = x // error

is ill-typed. We can recover the underlying type T using the underlying extension method which is also defined in object Conversion:

import Conversion.underlying

val y: T = x.underlying // ok

However, the next section shows that unwrapping with .underlying is not needed for parameters, which is the most common use case. So explicit unwrapping should be quite rare.
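
The subtyping and unwrapping rules can be tried out with a user-land model. This is only a sketch: IntoModel and Into are hypothetical names, and the model has none of the compiler's special treatment, so it inserts no conversions.

```scala
// User-land model of the wrapper; `IntoModel` and `Into` are hypothetical names.
// Only subtyping and unwrapping are modeled, not the compiler's conversion insertion.
object IntoModel:
  opaque type Into[T] >: T = T
  extension [T](x: Into[T]) def underlying: T = x

import IntoModel.*

val wrapped: Into[Int] = 3 // ok: Int is a known subtype of Into[Int]
// val bad: Int = wrapped  // would not compile: Into[Int] is not known to be an Int
val unwrapped: Int = wrapped.underlying // ok: explicit unwrapping
```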

Dropping into for Parameters in Method Bodies

The typical use cases for into wrappers are for parameters. Here, they specify that the corresponding arguments can be converted to the formal parameter types. On the other hand, inside a method, a parameter type can be assumed to be of the underlying type since the conversion already took place when the enclosing method was called. This is reflected in the type system which erases into wrappers in the local types of parameters as they are seen in a method body. Here is an example:

def ++ (elems: into[IterableOnce[A]]): List[A] =
  val buf = ListBuffer[A]()
  for elem <- elems.iterator do // no `.underlying` needed here
    buf += elem
  buf.toList

Inside the ++ method, the elems parameter is of type IterableOnce[A], not into[IterableOnce[A]]. Hence, we can simply write elems.iterator to get at the iterator method of the IterableOnce class.

Specifically (meaning in spec-language): we erase all into wrappers in the local types of parameter types, on the top-level of these types as well as in all top-level covariant subparts. Here, a part S of a type T is top-level covariant if it is not enclosed in some type that appears in contra-variant or invariant position in T.

Into in Aliases

Since into is a regular type constructor, it can be used anywhere, including in type aliases and type parameters. This gives a lot of flexibility to enable implicit conversions for user-visible types. For instance, the Laminar framework defines a type Modifier that is commonly used as a parameter type of user-defined methods and that should support implicit conversions into it. Patterns like this can be supported by defining a type alias such as

type Modifier = into[ModifierClass]

The into-erasure for function parameters also works in aliased types. So a function defining parameters of Modifier type can use them internally as if they were from the underlying ModifierClass.

Implementation

A full implementation of the proposal is provided by this PR:

Alternatives

  1. Make into a modifier, as in the experimental scheme supported until now. The proposal here is a lot simpler since it makes use of the power of the type system instead of building up a parallel structure based on modifiers. It is also considerably more flexible than the previous scheme. One open question might be whether it’s too flexible.
  2. Also allow implicit conversions on literals without requiring an into in the target type. This would be even more flexible. I am not sure we need it, but we could add it later.
  3. Go back to allowing all implicit conversions without restrictions or language imports, or with just some exclusions (like: no implicit conversions where extension methods would also do the trick). I believe allowing all conversions would perpetuate the problems we had with them in Scala 2, and am not convinced that we can find a class of “harmless” conversions that are always unproblematic yet flexible enough for all use cases. Requiring explicit co-operation from libraries via into gives us more power to arrive at precisely tailored solutions.

There’s one variant on this idea that I mentioned in the earlier discussion:

Rather than having trait Foo0; type Foo = into[Foo0] as the way to make Foo implicit convertible, what about making implicit convertibility a modifier on the type e.g. into trait Foo? Then into case class Into[T](underlying: T) can just be a user-land helper built on top of the more fundamental construct.

Allowing into on the type definition would also simplify the migration process:

  1. The current proposal would require we change every signature everywhere to wrap the parameter type T in into[T], possibly causing weird issues with binary compatibility, overload resolution, type inference, and other things

  2. Allowing us to annotate the types themselves as into traits or into classes would leave all the signatures intact, and only require a single new modifier to be added on the trait/class definition

The problem with the current proposal is that it adds a whole lot of wrapping and unwrapping, which adds a lot of incidental complexity. What we really want to say is “this type is a valid implicit conversion target, this other type is not”, and allowing into to be a modifier on the type definition allows us to say that in the most concise way possible, while keeping the rest of the language unchanged

I don’t think that’s what we want? If a type is always a valid implicit conversion target, just make a typeclass, or use Conversion, or something?

But what to do about cases where we sometimes want a conversion (i.e. at the use site)? into[T] provides exactly the solution for that: we want T, and we won’t be shy about figuring out how to get it.

I guess the question is which pattern is more in need of facilitation. The pattern that is going away, save with Conversion, is that I don’t control Foo but I want to be able to convert Bar to it transparently.

These would have to be analyzed carefully of course.

  • Binary compatibility: into[T] erases to T, so the binary signature is not affected.

  • Overload resolution: The prototypical case where things change is

    class A extends B
    def foo(x: into[A]): Unit = ???
    def foo(x: B): Unit = ???
    foo(new A)
    

    Without into, the first foo is more specific than the second, so it will be chosen. With into the call is ambiguous. I don’t think it will be a problem, since a pattern like this looks weird to me. But if it does turn out to be a real problem we could tweak the overloading rules to ignore into. Likewise for implicit resolution.

In the case of the standard library that would not work. I certainly don’t want to annotate the trait IterableOnce as into! Also it will not work if the target type is not under your control. So I fear this will be too inflexible.

The question would be: why do you not want to annotate IterableOnce as into? If I define a method def foo(values: IterableOnce[Int]), does making it fail when I try to run foo(Array(1, 2, 3)) provide any value to the user? It seems to me that having Seq() ++ Array(1, 2, 3) work due to an implicit conversion, while foo(Array(1, 2, 3)) does not work because I forgot the into wrapper, would be arbitrary and confusing.

I think we should consider two-levels to this user choice:

  1. If you control the type, you should be able to mark it as an implicit conversion target in a concise and convenient manner. After all, as the owner of that type you know how it’s meant to be used. This can be done by allowing implicit conversions in the companion object (the most common place to find them, e.g. for the magnet-pattern) or allowing the type itself to be marked into

  2. If you do not control the target type, we should allow implicit conversions but in a more explicit/verbose/inconvenient manner. This could be via an into[] wrapper on the target type, adding a bit of friction but not prohibiting it

This dichotomy is similar to the orphaned typeclass problem, where most typeclasses are defined in the companion object, but uncommonly you find typeclasses defined elsewhere for good reason (e.g. integrating two independent libraries). We generally do not want to prohibit orphaned typeclasses, but it does make sense to e.g. require an explicit import whereas non-orphaned typeclasses get picked up automatically.

Similarly, it should be easy for the definer of a type (class, trait, etc.) to mark it as an implicit conversion target, and it should be possible for downstream code to use a type not marked as an implicit conversion target even if it is less easy than it is for the definer

TBH I think the problem with this proposal, and all the proposals before it, is that the process is exactly backwards. The way we find a path forward here is by

  1. Inventorying the usage of implicit conversions in the ecosystem, perhaps starting from scala-library and scala-toolkit (which are not even a lot of code!)

  2. Categorizing them into groups of common design patterns

  3. Deciding which patterns are valid and justified and which patterns are problematic.

  4. From there, it would then be possible to design a language feature grounded in what we want to achieve, which is to prohibit or inconvenience the problematic usage patterns, while allowing the valid ones.

  5. Design a migration path for the problematic usage patterns so they have something to migrate to, rather than just breaking those libraries forever

By starting first from an implementation and language spec changes, that gives us no insight whatsoever on the impact of the language change, which is what we actually care about. We end up just talking in circles in hypotheticals with nothing anchoring us to the reality of the ecosystem except vague hand-waving. And there is no way at all to judge whose hand-waving is more correct!

To solve that, we need to flip the process on its head, and start by studying the various libraries in the ecosystem to really understand what implicit conversions are currently being used for. That is the only way we can find a reasonable path forward that does not involve breaking the entire Scala ecosystem that relies on this feature

I completely agree. Putting the cart before the horse is not the right path forward, IMO.

And without getting into the specifics of the proposal, I would say that I require an equivalent from, when I have an AnyVal wrapper that I want to convert back to its underlying value wherever applicable. It won’t make sense to mark everything else as into for this use-case.

I think that would be too sweeping. In general, I would not know all the use cases of IterableOnce when I define the trait. There might well be uses where I do not want an implicit conversion (and get better type inference in return).

About the two ways to define things, one as a modifier on the type definition and the other as the wrapper described here: since you get the modifier-on-definition functionality simply by defining an additional type alias, the modifier idea seems to be largely redundant?

About a survey of libraries: I agree this would be a good idea. I believe we have already unearthed many examples in the previous discussion threads, where I explicitly invited people to come with use cases for implicit conversions. A more systematic investigation could be done by people who know some of these library stacks. If someone wants to step up to do a systematic survey this would be useful. Just be aware that the goal will not be “we will support all existing use cases of implicit defs” because that would mean we also inherit the mess they caused.

Regarding the proposal,
I think underlying should be named convert.

import Conversion.convert
trait Foo
object Foo:
  given Conversion[Int, Foo] = ???

def withInto(arg: into[Foo]) = ???
def withoutInto(arg: Foo) = ???

withInto(42) //works
withoutInto(42) //fails
withoutInto(42.convert) //works

Additionally, one use-case that this doesn’t solve is virtualization of if expressions/statements (overloading if with my own type):

trait MyBool
implicit def BooleanHack(from: MyBool): Boolean = ???
val b: MyBool = ???
if(b) {/*something*/} else {/*something else*/}

I have a compiler plugin that identifies BooleanHack ifs and converts them to method calls. I would be happy to abandon this mechanism in favor of a proper ability for if virtualization, similar to Virtualized Scala Reference · TiarkRompf/scala-virtualized Wiki · GitHub

I mean… Is it? I think I actually like the explicit import.

Usually, if I’m using Array (instead of ArraySeq or another collection) it’s because I’m writing performance sensitive code.

I’m not sure about the JVM, but at least in Scala Native I’ve had unexpected performance degradation due to those implicit conversions, so it’s kind of nice to have the compiler give a heads up.

Then it’s up to the user to:

  1. Rewrite the code to avoid the conversion, if they need the extra performance guarantees
  2. Rewrite the code to use ArraySeq, if they care about not having error-prone code
  3. Add the import, if they really want to use the Array with automatic boxing

Both 2 and 3 seem pretty straightforward.

And sure, this is not just about arrays, but in general adding the import doesn’t seem like much of a big deal, especially if the compiler is able to recommend it.

I’d also be in favor of something else, perhaps as ?
(and then we can rename isInstanceOf to is, so we have the two letter methods which are safe, and the long one which is not)

One huge difference between the type alias and the modifier is backwards compatibility. Having every library rename all of its trait Foos to trait Foo0 to add a new type Foo = into[Foo0] is a source and bincompat breaking change. Adding a new modifier into trait Foo is not a breaking change. Presumably preserving source and bincompat across the ecosystem - at least for “valid” usage patterns - is a big requirement here.

For example if previously we had

// library
class Foo
// downstream code
val foo = new Foo 
object Bar extends Foo

And the library was changed to accommodate the stricter requirements around implicit conversions

// library
class Foo0
type Foo = into[Foo0]

Then the user code would break, because new into[Foo0] is not a valid constructor, and extends into[Foo0] is not a valid class type for inheritance. into is opaque, so it doesn’t get resolved to the underlying class type

Similarly, any Java libraries calling method signatures containing Foo parameters will now find the new signatures instead containing Foo0 parameters, which breaks binary compatibility

On the other hand, if the library was changed to

// library
into class Foo

Then new Foo and extends Foo could continue to work, and any Java callsites can continue to work, even after adopting the stricter requirements around implicit conversions

Surely “not knowing all the use cases when you define the trait” is the whole point of open traits like IterableOnce? If you wanted to know all implementors ahead of time, you would make it sealed. The fact that you left it open means you intend there to be implementors you are not aware of, and any code written must be robust against that.

What are these concrete cases of IterableOnce where you don’t want to accept Array? We need to be concrete here to have something to discuss. If we can’t come up with any concrete examples, it seems reasonable to conclude maybe there are no such cases, which is the state of the discussion so far

Maybe we can come up with a different example than IterableOnce?

It is almost uniquely well-suited to simply being into[Iterator]. If we’d had that feature during the collection rewrite, I’d have argued for that.

Let’s pick something else like Seq? Array is coerced into Seq. On the one hand, this is lovely; you can Array("hi", "there ").map(_.trim) and it just works. It is also terrible because you can Array(1, 2).map(x => x*x) and wonder why you’re 30x slower than Java.

And suppose you have def findSecondIndex[A](s: Seq[A], a: A): Int = ???. You can simply findSecondIndex("troublingly", 'l'), except it’s about 100x slower than you expect. If it were s: into[Seq[A]] at least there’d be fair warning.
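
To see that the call really does typecheck today, here is a sketch with one possible body for findSecondIndex. The implementation and the name second are assumptions for illustration; only the signature comes from the example above.

```scala
// One possible body for the method above; the implementation is an assumption.
def findSecondIndex[A](s: Seq[A], a: A): Int =
  val first = s.indexOf(a)
  if first < 0 then -1 else s.indexOf(a, first + 1)

// The String argument is wrapped into a Seq[Char] by a standard library
// implicit def; nothing at the call site hints at the wrapping.
val second: Int = findSecondIndex("troublingly", 'l')
```

The sketch only shows that the conversion is inserted silently. It does not measure the slowdown.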

We’ve already made this bed, so we probably have to sleep in it, but these are big enough peas under the mattress so that we probably want to re-think whether we want to create more.

The random promotion of high-efficiency primitive to boxed low-efficiency operations is one of the more annoying features of Scala when one needs to compute something computationally intensive, not just keep track of tricky logic.

Yes that is entirely my point.

For this proposal to move forward, we need clear and convincing examples of the usage patterns we deem problematic and would like to guard against. Without such a problem statement, there’s literally nothing to discuss.

If we go forward without, we would go through a herculean effort changing the language and breaking everyone’s code, ending up with no idea whether we accomplished our goals because we never had concrete goals to begin with!

So, what is the exact problem we are trying to guard against here? That’s the first thing we need to figure out. Without such a convincing problem statement, there isn’t really any point in discussing the minutiae of spec changes and syntax implementation details.

But I just gave exactly two of those: the problematic conversion of Array[Primitive] and “hi i am a string” into surprisingly low-efficiency generic operations.

Note that you are protected against this in regular code because StringOps shields almost all methods on String from dispatching through Seq, and ArrayOps shields almost all methods on Array from dispatching through Seq and failing to return an Array because of the lack of a ClassTag. But in user-space, if you innocently write s: Seq[A], you unexpectedly unleash the ability to process Array and String with shockingly poor performance. At least with into[Seq[A]] you would be asking for it.

It is a very useful use case for us too. It allows us to implement 3VL (three-valued logic).

IMHO: Having the same logic in SQL and in the procedural language reduces the number of errors when people write many SQL queries.

I agree, IterableOnce should have been into[Iterator], that’s a great observation. So yes, IterableOnce might not be the best example.

But I reject the attitude that says “unless you know all library use cases intimately you have no right to make or discuss a proposal”. I have written the Scala compiler and large parts of the Scala standard library. I take my examples from there because I know them well and can judge them. Based on this, it’s my role to come up with a proposal and present it for discussion. You and other library authors are invited to give feedback and together we should evaluate the proposal. That’s how the discussion works. Stating that the proposal should not have been made because it comes from insufficient evidence is not helpful and feels a bit like gatekeeping. The point of the discussion here is to gather that evidence!

Getting back to into as a modifier. Its advantages are that it allows an easy retrofit of most existing code. As long as conversions are between types defined in your code, just add into to all possible target classes and traits and you are done. The downside is that you can again easily shoot yourself in the foot as @Ichoran has well demonstrated.

Binary compatibility is not really an issue. Let’s take Laminar’s Modifier as an example. A backwards binary compatible way to upgrade would introduce the alias

type ToModifier = into[Modifier]

and then all methods would use ToModifier instead of Modifier in their parameter types. Or they would use into[Modifier] directly. Either scheme would be binary compatible with the current codebase.

That said, I am not fundamentally opposed to also having into as a modifier. It would certainly alleviate all concerns about migration. It should come with a culture that emphasizes that having many into targets just in case someone needs a conversion is a code smell and that it’s better to use more specific into[...] types in arguments. As long as we can foster that culture I would be OK with it.

I think that has it backwards. The status quo is that all conversions will need language imports. If the SIP committee does not agree on a proposal, this is what will happen. So the proposal should identify large classes of patterns that are deemed generally unproblematic and investigate mechanisms to allow those. That’s why I made this proposal.

One might argue that this is just playing with words, but it really is not. The question is what happens if there is insufficient evidence to either support or contradict a hypothesis that a class of conversions is problematic.

Two use-cases which require conversion between F[T] and T:

  • automatic coloring (now deleted, but this was more social than a technical issue). [dotty-cps-async]
  • DSL when you build expressions over some Repr[T] and want to build with Repr[T] the same expressions as over T. (i.e.:
    collection[A].query(_.name == collection[String].queryOne(cond)) )
    Example, as I can remember - Quill.

still not solved.

I’m not sure that they need to be covered. (Maybe not: the first is not used now, and for the second there exists an alternative technique (HOAS embedding) with the same (or maybe slightly worse, because of arrows at the beginning) syntax.)