Pre-SIP: Allow fully implicit conversions in Scala 3 with `into`

Thank you all for your input. That was very helpful in improving the proposal. I believe the overall reception tended positive, so I went ahead and made a formal SIP proposal.

12 Likes

There’s one variant on this idea that I mentioned in the earlier discussion:

Rather than having trait Foo0; type Foo = into[Foo0] as the way to make Foo implicitly convertible, what about making implicit convertibility a modifier on the type, e.g. into trait Foo? Then into case class Into[T](underlying: T) can just be a user-land helper built on top of the more fundamental construct.
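
For concreteness, here is a sketch of the two encodings side by side. The modifier form is the hypothetical syntax proposed here, not current Scala:

// Encoding in the current proposal: an into-wrapped alias
trait Foo0
type Foo = into[Foo0]   // parameters typed Foo accept implicit conversions

// Proposed alternative: a modifier on the definition itself
into trait Bar          // parameters typed Bar accept implicit conversions

// With the modifier, the wrapper becomes a user-land helper:
into case class Into[T](underlying: T)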

Allowing into on the type definition would also simplify the migration process:

  1. The current proposal would require us to change every signature everywhere to wrap the parameter type T in into[T], possibly causing subtle issues with binary compatibility, overload resolution, type inference, and other things

  2. Allowing us to annotate the types themselves as into traits or into classes would leave all the signatures intact, and only require a single new modifier to be added on the trait/class definition

The problem with the current proposal is that it adds a whole lot of wrapping and unwrapping, and with it a lot of incidental complexity. What we really want to say is “this type is a valid implicit conversion target, this other type is not”, and allowing into to be a modifier on the type definition lets us say that in the most concise way possible, while keeping the rest of the language unchanged.

3 Likes

I don’t think that’s what we want? If a type is always a valid implicit conversion target, just make a typeclass, or use Conversion, or something?

But what to do about cases where we sometimes want a conversion (i.e. at the use site)? into[T] provides exactly the solution for that: we want T, and we won’t be shy about figuring out how to get it.

I guess the question is which pattern is more in need of facilitation. The pattern that is going away, save with Conversion, is that I don’t control Foo but I want to be able to convert Bar to it transparently.

1 Like

These would have to be analyzed carefully, of course.

  • Binary compatibility: into[T] erases to T, so the binary signature is not affected.

  • Overload resolution: The prototypical case where things change is

    class B
    class A extends B

    def foo(x: into[A]): Unit = ???
    def foo(x: B): Unit = ???

    foo(new A)
    

    Without into, the first foo is more specific than the second, so it will be chosen. With into, the call is ambiguous. I don’t think it will be a problem, since a pattern like this looks weird to me. But if it does turn out to be a real problem, we could tweak the overloading rules to ignore into. Likewise for implicit resolution.

In the case of the standard library that would not work. I certainly don’t want to annotate the trait IterableOnce as into! Also, it will not work if the target type is not under your control. So I fear this will be too inflexible.

1 Like

The question would be: why do you not want to annotate IterableOnce as into? If I define a method def foo(values: IterableOnce[Int]), does making it fail when I try to run foo(Array(1, 2, 3)) provide any value to the user? It seems to me that having Seq() ++ Array(1, 2, 3) work due to an implicit conversion, while foo(Array(1, 2, 3)) does not work because I forgot the into wrapper, would be arbitrary and confusing.
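
To make the asymmetry concrete, a sketch with current stdlib types, assuming the stdlib marks the parameter of ++ as a conversion target but the user code does not:

def foo(values: IterableOnce[Int]): Int = values.iterator.sum

Seq() ++ Array(1, 2, 3)  // works: the Array is implicitly converted
foo(Array(1, 2, 3))      // under the proposal: fails unless the parameter
                         // is declared as into[IterableOnce[Int]]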

I think we should consider two levels to this user choice:

  1. If you control the type, you should be able to mark it as an implicit conversion target in a concise and convenient manner. After all, as the owner of that type you know how it’s meant to be used. This can be done by allowing implicit conversions in the companion object (the most common place to find them, e.g. for the magnet-pattern) or allowing the type itself to be marked into

  2. If you do not control the target type, we should allow implicit conversions but in a more explicit/verbose/inconvenient manner. This could be via an into[] wrapper on the target type, adding a bit of friction but not prohibiting it (both levels are sketched below)
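
A sketch of the two levels; the definition-site modifier is the hypothetical syntax from above, and Widget, ForeignTarget, and render are illustrative names:

// Level 1: you own the type, so you opt in concisely at the definition
// site (or via a Conversion given in the companion object).
into trait Widget

// Level 2: you do not own the type, so each signature opts in
// explicitly, with a bit of friction.
trait ForeignTarget
def render(x: into[ForeignTarget]): Unit = ???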

This dichotomy is similar to the orphaned typeclass problem, where most typeclasses are defined in the companion object, but uncommonly you find typeclasses defined elsewhere for good reason (e.g. integrating two independent libraries). We generally do not want to prohibit orphaned typeclasses, but it does make sense to e.g. require an explicit import whereas non-orphaned typeclasses get picked up automatically.

Similarly, it should be easy for the definer of a type (class, trait, etc.) to mark it as an implicit conversion target, and it should be possible for downstream code to use a type not marked as an implicit conversion target, even if it is less easy than it is for the definer.

4 Likes

TBH I think the problem with this proposal, and all the proposals before it, is that the process is exactly backwards. The way we find a path forward here is by

  1. Inventorying the usage of implicit conversions in the ecosystem, perhaps starting from scala-library and scala-toolkit (which are not even a lot of code!)

  2. Categorizing them into groups of common design patterns

  3. Deciding which patterns are valid and justified and which patterns are problematic.

  4. From there, it would then be possible to design a language feature grounded in what we want to achieve, which is to prohibit or inconvenience the problematic usage patterns, while allowing the valid ones.

  5. Designing a migration path for the problematic usage patterns so they have something to migrate to, rather than just breaking those libraries forever

Starting from an implementation and language-spec changes gives us no insight whatsoever into the impact of the language change, which is what we actually care about. We end up just talking in circles about hypotheticals, with nothing anchoring us to the reality of the ecosystem except vague hand-waving. And there is no way at all to judge whose hand-waving is more correct!

To solve that, we need to flip the process on its head, and start by studying the various libraries in the ecosystem to really understand what implicit conversions are currently being used for. That is the only way we can find a reasonable path forward that does not involve breaking the entire Scala ecosystem that relies on this feature

11 Likes

I completely agree. Putting the cart before the horse is not the right path forward, IMO.

And without getting into the specifics of the proposal, I would say that I would require an equivalent from for the case where I have an AnyVal wrapper that I want to convert back to its underlying value wherever applicable. It won’t make sense to mark everything else as into for this use-case.
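
A sketch of that use-case; the from modifier is hypothetical, and UserId and lookup are illustrative names:

class UserId(val value: Long) extends AnyVal
object UserId:
  given Conversion[UserId, Long] = _.value

def lookup(id: Long): String = ???

// A from-style marker on UserId would let lookup(UserId(42)) convert back
// to Long at any use site, without having to mark Long (and every other
// possible target type) as into.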

2 Likes

I think that would be too sweeping. In general, I would not know all the use cases of IterableOnce when I define the trait. There might well be uses where I do not want an implicit conversion (and get better type inference in return).

About the two ways to define things, one as a modifier on the type definition, the other as the wrapper described: since you get the definition-modifier functionality simply by defining an additional type alias, the modifier idea seems to be largely redundant?

About a survey of libraries: I agree this would be a good idea. I believe we have already unearthed many examples in the previous discussion threads, where I explicitly invited people to come with use cases for implicit conversions. A more systematic investigation could be done by people who know some of these library stacks. If someone wants to step up to do a systematic survey, this would be useful. Just be aware that the goal will not be “we will support all existing use cases of implicit defs”, because that would mean we also inherit the mess they caused.

1 Like

Regarding the proposal,
I think underlying should be named convert.

import Conversion.convert

trait Foo
object Foo:
  given Conversion[Int, Foo] = ???

def withInto(arg: into[Foo]) = ???
def withoutInto(arg: Foo) = ???

withInto(42)             // works
withoutInto(42)          // fails
withoutInto(42.convert)  // works

Additionally, one use-case that this doesn’t solve is virtualization of if expressions/statements (overloading if with my own type):

trait MyBool
implicit def BooleanHack(from: MyBool): Boolean = ???
val b: MyBool = ???
if (b) { /* something */ } else { /* something else */ }

I have a compiler plugin that identifies BooleanHack ifs and converts them to method calls. I would be happy to abandon this mechanism in favor of a proper if-virtualization ability, similar to Virtualized Scala Reference · TiarkRompf/scala-virtualized Wiki · GitHub
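
For reference, a sketch of the virtualization idea from that wiki: the compiler would rewrite if expressions into overloadable method calls. This is not current Scala 3, and the signature below follows the scala-virtualized convention:

trait MyBool

// Under virtualization, `if (b) x else y` would be rewritten to
// __ifThenElse(b, x, y), which a DSL can overload for its own type:
def __ifThenElse[T](cond: MyBool, thenBranch: => T, elseBranch: => T): T = ???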

2 Likes

I mean… Is it? I think I actually like the explicit import.

Usually, if I’m using Array (instead of ArraySeq or another collection) it’s because I’m writing performance sensitive code.

I’m not sure about the JVM, but at least in Scala Native I’ve had unexpected performance degradation due to those implicit conversions, so it’s kind of nice to have the compiler give a heads up.

Then it’s up to the user to:

  1. Rewrite the code to avoid the conversion, if they need the extra performance guarantees
  2. Rewrite the code to use ArraySeq, if they care about not having error-prone code
  3. Add the import, if they really want to use the Array with automatic boxing

Both 2 and 3 seem pretty straightforward.

And sure, this is not just about arrays, but in general adding the import doesn’t seem like much of a big deal, especially if the compiler is able to recommend it.
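
For instance, option 3 above could look like this, assuming the use-site language import that the status quo would require:

import scala.language.implicitConversions

val xs: Seq[Int] = Array(1, 2, 3)  // conversion explicitly opted into; elements box on access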

5 Likes

I’d also be in favor of something else, perhaps as?
(And then we can rename isInstanceOf to is, so we have the two-letter methods which are safe, and the long ones which are not.)

1 Like

One huge difference between the type alias and the modifier is backwards compatibility. Having every library rename all of its trait Foos to trait Foo0 to add a new type Foo = into[Foo0] is a source and bincompat breaking change. Adding a new modifier into trait Foo is not a breaking change. Presumably preserving source and bincompat across the ecosystem - at least for “valid” usage patterns - is a big requirement here.

For example if previously we had

// library
class Foo
// downstream code
val foo = new Foo 
object Bar extends Foo

And the library was changed to accommodate the stricter requirements around implicit conversions

// library
class Foo0
type Foo = into[Foo0]

Then the user code would break, because new into[Foo0] is not a valid constructor invocation, and extends into[Foo0] is not a valid parent for inheritance. into is opaque, so it doesn’t get resolved to the underlying class type.

Similarly, any Java libraries calling method signatures containing Foo parameters will now find the new signatures instead containing Foo0 parameters, which breaks binary compatibility

On the other hand, if the library was changed to

// library
into class Foo

Then new Foo and extends Foo could continue to work, and any Java callsites can continue to work, even after adopting the stricter requirements around implicit conversions.

3 Likes

Surely “not knowing all the use cases when you define the trait” is the whole point of open traits like IterableOnce? If you wanted to know all implementors ahead of time, you would make it sealed. The fact that you left it open means you intend there to be implementors you are not aware of, and any code written must be robust against that.

What are these concrete cases of IterableOnce where you don’t want to accept Array? We need to be concrete here to have something to discuss. If we can’t come up with any concrete examples, it seems reasonable to conclude maybe there are no such cases, which is the state of the discussion so far.

3 Likes

Maybe we can come up with a different example than IterableOnce?

It is almost uniquely well-suited to simply being into[Iterator]. If we’d had that feature during the collection rewrite, I’d have argued for that.

Let’s pick something else, like Seq. Array is coerced into Seq. On the one hand, this is lovely; you can Array("hi", "there ").map(_.trim) and it just works. It is also terrible, because you can Array(1, 2).map(x => x*x) and wonder why you’re 30x slower than Java.

And suppose you have def findSecondIndex[A](s: Seq[A], a: A): Int = ???. You can simply findSecondIndex("troublingly", 'l'), except it’s about 100x slower than you expect. If it were s: into[Seq[A]] at least there’d be fair warning.
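
A self-contained sketch of that trap; the body of findSecondIndex is a hypothetical implementation, and both calls compile today:

def findSecondIndex[A](s: Seq[A], a: A): Int =
  val first = s.indexOf(a)
  if first < 0 then -1 else s.indexOf(a, first + 1)

findSecondIndex("troublingly", 'l')    // the String is wrapped as a Seq[Char]
findSecondIndex(Array(1, 2, 1, 2), 2)  // the Array is wrapped as a Seq[Int]; Ints box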

We’ve already made this bed, so we probably have to sleep in it, but these are big enough peas under the mattress that we probably want to rethink whether we want to create more.

The random promotion of high-efficiency primitive operations to boxed low-efficiency ones is one of the more annoying features of Scala when one needs to compute something computationally intensive, not just keep track of tricky logic.

5 Likes

Yes that is entirely my point.

For this proposal to move forward, we need clear and convincing examples of the usage patterns we deem problematic and would like to guard against. Without such a problem statement, there’s literally nothing to discuss.

If we went forward without one, we would go through a herculean effort, changing the language and breaking everyone’s code, and end up with no idea whether we accomplished our goals, because we never had concrete goals to begin with!

So, what is the exact problem we are trying to guard against here? That’s the first thing we need to figure out. Without such a convincing problem statement, there isn’t really any point in discussing the minutiae of spec changes and syntax implementation details.

1 Like

But I just gave exactly two of those: the problematic conversion of Array[Primitive] and “hi i am a string” into surprisingly low-efficiency generic operations.

Note that you are protected against this in regular code because StringOps shields almost all methods on String from dispatching through Seq, and ArrayOps shields almost all methods on Array from dispatching through Seq and failing to return an Array because of the lack of a ClassTag. But in user-space, if you innocently write s: Seq[A], you unexpectedly unleash the ability to process Array and String with shockingly poor performance. At least with into[Seq[A]] you would be asking for it.
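
A sketch of that shielding with current stdlib behavior; twice is an illustrative name:

val doubled = Array(1, 2, 3).map(_ * 2)  // ArrayOps.map: stays an unboxed Array[Int]

def twice[A](s: Seq[A]): Seq[A] = s.map(identity)
twice(Array(1, 2, 3))  // the Array is first wrapped into a boxed Seq[Int]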

2 Likes

It is a very useful use-case for us too. It allows us to implement 3VL (three-valued logic).

IMHO, having the same logic in SQL and in the procedural language reduces the number of errors when people write many SQL queries.

I agree, IterableOnce should have been into[Iterator], that’s a great observation. So yes, IterableOnce might not be the best example.

But I reject the attitude that says “unless you know all library use cases intimately you have no right to make or discuss a proposal”. I have written the Scala compiler and large parts of the Scala standard library. I take my examples from there because I know them well and can judge them. Based on this, it’s my role to come up with a proposal and present it for discussion. You and other library authors are invited to give feedback and together we should evaluate the proposal. That’s how the discussion works. Stating that the proposal should not have been made because it comes from insufficient evidence is not helpful and feels a bit like gatekeeping. The point of the discussion here is to gather that evidence!

Getting back to into as a modifier: its advantage is that it allows an easy retrofit of most existing code. As long as conversions are between types defined in your code, just add into to all possible target classes and traits and you are done. The downside is that you can again easily shoot yourself in the foot, as @Ichoran has amply demonstrated.

Binary compatibility is not really an issue. Let’s take Laminar’s Modifier as an example. A backwards binary compatible way to upgrade would introduce the alias

type ToModifier = into[Modifier]

and then all methods would use ToModifier instead of Modifier in their parameter types. Or they would use into[Modifier] directly. Either scheme would be binary compatible with the current codebase.
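
Spelled out, with div standing in for a Laminar-style method; since into[T] erases to T, the erased signature is unchanged:

trait Modifier
type ToModifier = into[Modifier]

def div(mods: ToModifier*): Unit = ???  // binary-compatible with def div(mods: Modifier*)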

That said, I am not fundamentally opposed to also having into as a modifier. It would certainly alleviate all concerns about migration. It should come with a culture that emphasizes that having many into targets just in case someone needs a conversion is a code smell, and that it’s better to use more specific into[...] types in arguments. As long as we can foster that culture, I would be OK with it.

5 Likes

I think that has it backwards. The status quo is that all conversions will need language imports. If the SIP committee does not agree on a proposal, this is what will happen. So the proposal should identify large classes of patterns that are deemed generally unproblematic and investigate mechanisms to allow those. That’s why I made this proposal.

One might argue that this is just playing with words, but it really is not. The question is what happens if there is insufficient evidence to either support or contradict a hypothesis that a class of conversions is problematic.

3 Likes

Two use-cases which require conversion between F[T] and T:

  • Automatic coloring (now deleted, but that was more a social than a technical issue). [dotty-cps-async]

  • DSLs where you build expressions over some Repr[T] and want to write the same expressions with Repr[T] as you would over T, e.g.
    collection[A].query(_.name == collection[String].queryOne(cond))
    An example, as far as I remember, is Quill.
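
A sketch of the second pattern; Repr, lift, and === are illustrative names, loosely after Quill:

trait Repr[T]:
  def ===(other: Repr[T]): Repr[Boolean] = ???

def lift[T](t: T): Repr[T] = ???
given [T]: Conversion[T, Repr[T]] = lift(_)

// Goal: write person.name === "Joe" where name is a Repr[String],
// relying on the implicit T => Repr[T] to lift the plain literal.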

These are still not solved.

I’m not sure that they need to be covered. (Maybe not: the first is not used any more, and for the second there exists an alternative technique, HOAS embedding, with the same syntax, or maybe slightly worse because of the arrows at the beginning.)