Proposal to drop Weak Conformance from the language

That setup seems scary to me because it would make all numeric types implicitly convert to all other numeric types, which is even less precise than today! In my opinion, we shouldn’t be memorializing implicit conversions like this in the language standard.

List.empty is a “literal” of a sort, but what it means is context-dependent. So is x => f(x.foo). These are already well-established and non-problematic in a language with type inference. You can make each one concrete with a type annotation.

Number literals should also be able to be made concrete: 151235: Int or 121.190f or 'p', for example.
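
For instance (nothing new here, just today’s ascriptions and suffixes; the value names are mine):

val xs = List.empty: List[Int]    // the annotation makes the "literal" concrete
val f = (s: String) => s.length   // the parameter type pins down the lambda
val n = 151235: Int               // an ascription fixes the number's type
val d = 121.190f                  // a suffix does the same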

2 Likes

This throws away the ability for the compiler to use statically known information about the range of the value to check for correctness.

This is not a good idea. It’s central to your proposal AFAICT, but it would be a showstopper for me.

1 Like

IMO LUB’ing to Any / AnyVal (when none of the subject types is itself Any or AnyVal) is the problem.

2 Likes

I was intrigued by the remark about Float literals. If we were to redesign more of the numeric hierarchy, it would make sense to treat 1.0 as a “polymorphic” Float literal the same way we treat 1 as a polymorphic Int literal. But right now it’s not:

scala> val x:Float = 1.0
<console>:11: error: type mismatch;
 found   : Double(1.0)
 required: Float
       val x:Float = 1.0

3 Likes

Yeah, 1.0 is a Double through-and-through right now. I’m just inviting you to redesign more of the numeric hierarchy :slight_smile:

Yes, but maybe not right now. There’s a lot on our plate already.

The reason bare number literals are ints and doubles by default, and not e.g. longs or floats, is “mere” tradition. Because of this, I’m unconvinced by arguments about intuitiveness to non-programmers (who are unaware of tradition) or to people coming from other languages with different behavior. The important thing IMO is to be consistent and unsurprising.

We want to avoid inferring Array[AnyVal] because it’s not a useful type: e.g. we can’t do .map(_ + 1) on it. There should be a type Number <: AnyVal we could infer (unlike the existing java.lang.Number which is for reference types), but that means redesigning the hierarchy.

I do think it would be more consistent to infer Array(1.0, 1.0f) as Array[Double] and not Array[AnyVal]. That would not require redesigning the hierarchy, only adding another rule that float literals are adapted to other types (in practice, only to Double) as necessary, just like the rule about adapting ints.
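
A minimal sketch of the two outcomes (the comments describe what each rule set would infer, not the output of any particular compiler version):

val a = Array(1.0, 1.0f)  // Array[AnyVal] if only Int literals are adapted;
                          // Array[Double] if Float literals are adapted too
val b = Array(1, 1L)      // Array[Long] either way, thanks to the Int-literal rule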

Thanks for the proposal, Séb.

First of all, I believe it would be an improvement worth pursuing, so I have no complaints about the proposal.

But I believe, furthermore, that it might be possible to remove the concept of weak conformance from the language completely and rely on the ordinary rules for least upper bounds. I will need some help from other people to enumerate the various use cases, and to find the (almost inevitable!) counterexamples I haven’t thought of.

Oron is along the right lines with his suggestion, but he should have used an intersection type, not a union: I would propose that every number literal is typed as the intersection of all of the numeric types that can accurately and precisely represent it.

For example,

  • 42 would be typed as Int & Long & Short & Byte & Float & Double, as it can be precisely represented by every one of these types.
  • 3.1415926 is a valid Double, but is not a valid Float (there’s rounding). Its type would be Double.
  • 2.718 could be reasonably interpreted as a Double and as a Float, so its type would be Double & Float
  • 123456789012345 (note the lack of L) could be precisely interpreted as both a Long and as a Double, so would have the type Long & Double.
  • 3.14159265358979323846 would become an error, because it cannot be precisely represented in any JVM number format.
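
Restating those bullets as plain Scala 3 type aliases (the alias names are just labels I made up for illustration):

type TypeOf42  = Int & Long & Short & Byte & Float & Double  // 42
type TypeOfPi7 = Double                                      // 3.1415926 (Float would round)
type TypeOfE   = Double & Float                              // 2.718
type TypeOfBig = Long & Double                               // 123456789012345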

Combining several such literal numbers in a List will calculate the LUB of their types, which amounts to intersecting just those types that appear in every literal’s intersection type.

Any numbers with an explicit suffix (L, I, S, B, F, D) would be explicitly typed to that one type. FWIW, I’d propose to remove the lower-case suffixes, particularly l.

Having these “complex” intersection types inferred could get very verbose, very quickly, if we were to see them appearing in error messages. So I would additionally propose that intersections of primitives be simplified, for display purposes only, by eliding any type whose presence in the intersection is implied by another member, given that:

  • every Byte can be represented as a Short
  • every Short can be represented as an Int
  • every Short can be represented as a Float
  • every Int can be represented as a Long
  • every Float can be represented as a Double, and
  • every Int can be represented as a Double

We could even define the following four special aliases which represent the intersection types that would commonly be inferred from literals.

type Byte' = Byte & Short & Int & Long & Float & Double
type Short' = Short & Int & Long & Float & Double
type Int' = Int & Long & Double
type Float' = Float & Double

For example,

  • List(1, 2, 3) would have the type List[Byte & Short & Int & Long & Float & Double] but would be displayed as List[Byte'].
  • List(3.14, 2.7) would have type List[Float & Double] but would be displayed as List[Float']
  • List(0, 32767) would be displayed as List[Short']
  • List(0, 32768) would be displayed as List[Int' & Float]

It is still desirable to always distinguish between Int and Int' (etc.), because invariance would (quite rightly) not consider an Ordering[Int] to be related to an Ordering[Int'].
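
To spell that out using the backtick encoding I mention further down (needsPrimed is just an illustrative name):

type `Int'` = Int & Long & Double

def needsPrimed(ord: Ordering[`Int'`]): Unit = ()

// needsPrimed(Ordering.Int)  // rejected: Ordering is invariant, so an
//                            // Ordering[Int] is unrelated to an Ordering[`Int'`]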

Some practical implications of this would be that:

  • the functionality of finding a suitable common supertype for a set of numeric literals would be very similar to that of weak conformance, but wouldn’t involve any special rules in the typer
  • an inferred primitive intersection would only be displayed with an explicit & for collections of integral numbers which are all larger than a Short but still precisely representable as a Float (and likewise for Longs precisely representable as Doubles); I think these are the more unusual cases
  • the familiar “flexibility” of literals to adapt to an expected type would be perpetuated beyond the literal’s use-site; we would be able to defer the choice on the number’s exact type until later.
  • overload resolution would sometimes need to force a decision when presented with an intersection type; Int should be chosen by preference
  • users would become aware of the differences between the primitive types and the primed types; I think this is a desirable outcome, rather than being surprised by dozens of counterexamples involving AnyVal.

Examples like the following (and others) would now “just work”. No explicit types are required, and no AnyVals get inferred:

val xs = List(2, 3)
val ys = 65536 :: xs
val zs = Set(0.0D) ++ ys

xs has type List[Byte']
ys has type List[Int']
zs has type Set[Double]

And most of this already works right now in Scala 3 (and in Scala 2, using with types), if you force the literals to the right types manually. We can define the type aliases (using backticks), and safely cast values to them. Erasure does the right thing. Putting different combinations of these cast primitives into Lists gives the inferred types I would expect, for the few tests I’ve done. Given that the machinery for intersection types and LUBs already exists, it might be possible to make the changes without any particularly drastic changes to the typer. It would make a very interesting test case to throw at the community build.
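
A minimal version of the kind of experiment I mean, in Scala 3 (the comments state what the rules above predict, not verbatim compiler output):

type `Int'` = Int & Long & Double

val a = 42.asInstanceOf[`Int'`]  // standing in for an intersection-typed literal 42
val b = 7.asInstanceOf[`Int'`]

val xs = List(a, b)    // should be inferred as List[`Int'`]
val ys = List(a, 1.5)  // the LUB with a plain Double should collapse to Double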

Obviously this is still very underspecified, and I’m hardly an expert in the finer details. It proposes a significant change to the compiler, going well beyond Séb’s idea (which should go ahead anyway). But I think it would go quite a bit further towards simplicity, and would, if it works out, address a lot of the criticism Paul Phillips levelled at Scala about five years ago. It would be nice to hear some feedback on it.

5 Likes

What about narrow (singleton) representations?
Currently (post SIP-23), def foo[T <: Int with Singleton](t : T) : T = t will accept foo(42). If we wish to expand literal representation further, then we need def foo[T <: Double with Singleton](t : T) : T = t (and others) to accept the literal 42 as well, but this, IMO, should output 42.0.

Furthermore, I’m rethinking my idea because it becomes problematic even for simple inline arithmetic.
What type will we be expecting for 42 + 1?
Which operation should we invoke? An Int + Int, a Double + Double, …? Additionally, what happens at overflow, for example 255 + 1 (a char is expected to go to 0, while an int continues on to 256)?

Maybe there is a way to somehow use this idea, but currently there are too many ambiguities pre/post arithmetic operations.

First, I really love the idea of this proposal. I just have one question: what is the type of val theAnswer = 42? Would that actually be Int & Long & Short & Byte & Float & Double? Could we get the REPL to say it is Byte'? Would we want to? I worry about the confusion that might be caused by simple declarations.

I think Weak Conformance is a special case of potential implicit conversion use, and therefore we can widen the criteria for when implicit conversions are applied in order to handle Weak Conformance and other cases.

Let’s consider the following example:

class Foo(val value : Int)
object Foo1 extends Foo(1)
object Foo2 extends Foo(2)

class Bar(val value : Int)
object Bar {
  implicit def toFoo(b : Bar) : Foo = new Foo(b.value) // a Bar can be viewed as a Foo
}
object Bar1 extends Bar(1)

val list = List(Foo1, Foo2, Bar1, new Foo(3))

Under the current definition of Scala’s type inference, list : List[Object], since Object is the common ancestor of all these types. If we expand the implicit conversion rule, we can expect the compiler to apply the conversion from Bar to Foo to get a lower common ancestor, Foo.


Proposal (sorry, I’m not good at language formulation):
Given n types T1...Tn with a common superclass T, implicit conversions can be applied to infer a lower common superclass S under the following conditions:
For every i in 1...n we have Ti <:< T and Ti !=:= T; there are implicit conversions that bring all the Ti to a common superclass S such that S <:< T and S !=:= T; and there is some Tj such that S =:= Tj.

If this proposal is applied, we can get rid of the Weak Conformance special case, and users can choose not to have it by excluding Predef’s implicit conversions.

Understood. But this is probably the last chance for at least another decade to make improvements of this kind. Changes to the fundamental behavior of literals and primitive types are usually very hard to make seamless, and after moving to Scala 3, there is going to be an incredible amount of pressure to not rock the boat for a good long while.

2 Likes

I like the idea in general. But, without thinking it through in detail, I guess this would have to mean that all numbers have to be boxed, and extra type tests and casts added to basic arithmetic operations.

I didn’t talk about erasure, but val theAnswer = 42 would be typed as Byte', hence Int & Long & Short & Byte & Float & Double; in the absence of any explicit Byte or Short type it would be erased in the bytecode to Int by default, or to Long if Int is not in the intersection. But TASTY would still encode that it’s a Byte' and not an Int. (To Java, it would look like an int, though.)

1 Like

No, there should be no need for boxing. The intersection types would erase to primitives in the bytecode.

1 Like

We considered the intersection type trick, but then decided it was out of scope. Intersection types and union types are rather costly for typechecking and type inference. Making every literal have a large intersection type is not going to make the compiler faster.

Besides, intersection types on the left of a subtype judgement also have the problem that they lead to incompleteness. Consider a subtype goal like:

A & B <: C

This is true if A <: C or B <: C. The or is bad news, since both of the two subgoals might constrain type variables. In the end one has to choose one or the other of the two subgoals to pursue, and stick to it, which means that some solutions to constraint solving might be lost. This is the essential problem behind set constraints, and subtyping constraints only make it worse.
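
A small illustration of where that or shows up, using an invariant class (Inv and pick are illustrative names, nothing from the compiler):

class Inv[X]

def pick[X](i: Inv[X]): Inv[X] = i

// For a value ab: Inv[Int] & Inv[String], the call pick(ab) produces the goal
//   Inv[Int] & Inv[String] <: Inv[X]
// which holds via the left conjunct (X := Int) or via the right one (X := String);
// the solver has to commit to one branch, and solutions that only the other
// branch would have permitted are lost.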

Bottom line: Intersection types are not a free lunch either.

2 Likes

For the intersection types to be really useful, we’d also need to add a common supertype Numeric <: AnyVal that defined common arithmetic operations. Otherwise you couldn’t write (list: List[Int']).map(_ + 1), and without that, how useful is a List[Int'] really? And Martin has said they don’t want to change the number hierarchy.

Actually, this already works out pretty well:

scala> type IntLike = Int & Long & Double
// defined alias type IntLike = Int & Long & Double

scala> val list = List[IntLike](42.asInstanceOf)
val list: List[IntLike] = List(42)

scala> list.map(_ + 1)                                                          
val res19: List[Int & Long & Double] = List(43)

scala> list.map(_ + 1.2)                                                       
val res20: List[Double] = List(43.2)

scala> list.map(_ + 2L)                                                    
val res21: List[Long & Double] = List(44)

If it’s been considered already, then I guess there’s no need to pursue it…

But FWIW, I wouldn’t have thought that the performance issue would be significant: Literals are not that common in code; the intersections are not that large (there’s a limit of six types). I would have thought a single implicit search would be orders of magnitude more work. On the other hand, it’s not clear to me yet how widely these intersection types would typically be perpetuated.

The incompleteness of unification sounds like more of a problem, but I’m not sure I’m understanding correctly when it would come into play: these are general problems with unification, but the intersections of AnyVal subtypes arising from literals would be non-overlapping and not parameterized. I suspect I’ve missed the point somewhere! For the type system to provide useful results, we would probably want to introduce a new arbitrary rule to select certain priorities (as I suggested for overload resolution).

Anyway, given @Jasper-M’s example (which surprised me, actually… I’m not convinced weak conformance isn’t being applied here, and I actually expected the intersection to collapse under application of an overloaded +), it might be a fun exercise for someone with more time than me to attempt to implement this and run some tests against it, on the basis that it probably won’t be merged, but could provide interesting evidence about whether it’s a better or worse situation than the status quo. If I had a couple of days free, I’d love to try it myself…

1 Like