Proposal to drop Weak Conformance from the language


#1

Hi Scala Community!

This thread is the SIP Committee’s request for comments on a proposal to remove Weak Conformance from the language. You can find all the details here.

Summary

In some situations, Scala uses a weak conformance relation when testing type compatibility or computing the least upper bound of a set of types. The principal motivation behind weak conformance was to make an expression like this have type List[Double]:

List(1.0, math.sqrt(3.0), 0, -3.3) // : List[Double]

It’s “obvious” that this should be a List[Double]. However, without some special provision, the least upper bound of the list’s element types (Double, Double, Int, Double) would be AnyVal, hence the list expression would be given type List[AnyVal].
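As a rough illustration of what is at stake (a sketch, not part of the proposal; the names widened and unwidened are made up), an explicit type argument can force the AnyVal element type so the two outcomes can be compared side by side:

```scala
// Element type is inferred: the Int literal 0 is adapted to Double.
val widened = List(1.0, math.sqrt(3.0), 0, -3.3)
val check: List[Double] = widened // compiles only if widened is a List[Double]

// Forcing AnyVal keeps every element at its own type, so 0 stays an Int.
val unwidened = List[AnyVal](1.0, math.sqrt(3.0), 0, -3.3)

println(widened(2))   // the element is a Double
println(unwidened(2)) // the element is an Int
```

With the AnyVal version, arithmetic on the elements is no longer available without a pattern match or a cast, which is why that inferred type is rarely what the author wanted.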

A less obvious example is the following one, which was also typed as a List[Double], using the weak conformance relation.

val n: Int = 3
val c: Char = 'X'
val d: Double = math.sqrt(3.0)
List(n, c, d) // used to be: List[Double], now: List[AnyVal]

Here, it is less clear why the type should be widened to List[Double]; a List[AnyVal] seems to be an equally valid – and more principled – choice.

Weak conformance applies to all “numeric” types (including Char), and independently of whether the expressions are literals or not. However, in hindsight, the only intended use case is for integer literals to be adapted to the type of the other expressions. Other types of numerics have an explicit type annotation embedded in their syntax (f, d, ., L or ' for Chars) which ensures that their author really meant them to have that specific type.

Therefore, we propose to drop the general notion of weak conformance, and instead keep only one rule: Int literals (only) are adapted to other numeric types if necessary. This rule yields the following results as examples:

inline val b = 33
def f(): Int = b + 1
Array(b, 33, 5.5)      : Array[Double] // b is an inline val
Array(f(), 33, 5.5)    : Array[AnyVal] // f() is not a constant
Array(5, 11L)          : Array[Long]
Array(5, 11L, 5.5)     : Array[AnyVal] // Long and Double found
Array(1.0f, 2)         : Array[Float]
Array(1.0f, 1234567890): Array[AnyVal] // loss of precision
Array(b, 33, 'a')      : Array[Char]
Array(5.toByte, 11)    : Array[Byte]
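
One way to read the “loss of precision” case is as a round-trip check. The sketch below is only an assumption about what that phrase means, not the compiler’s actual test, and fitsInFloat is a hypothetical helper name:

```scala
// Hypothetical helper, not part of the compiler: true when converting the
// Int to Float loses nothing. The comparison is done in Double, which can
// represent every Int exactly.
def fitsInFloat(n: Int): Boolean = n.toFloat.toDouble == n.toDouble

println(fitsInFloat(2))          // true: 2 is exactly representable as a Float
println(fitsInFloat(1234567890)) // false: the nearest Float is 1234567936
```

Under that reading, 2 can be adapted to Float safely while 1234567890 cannot, which matches the two Float rows above.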

Implications

The changes in weak conformance mostly change the inferred type of some expressions, typically from a precise numeric type such as Double to AnyVal. In most cases, the new inferred type will result in a type error soon after the given expression, which can be fixed by using explicit calls to .toDouble (or similar) on the subexpressions that the user wants to be converted.
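
A minimal sketch of that fix, reusing the n, c and d values from the summary (xs is an illustrative name only):

```scala
val n: Int = 3
val c: Char = 'X'
val d: Double = math.sqrt(3.0)

// Each non-Double element is converted explicitly, so the element type
// no longer depends on weak conformance.
val xs: List[Double] = List(n.toDouble, c.toDouble, d)
```

Written this way, the expression should have the same type before and after the change.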

In some cases, it is possible that the new code does not trigger a compile error, and might subtly change some behavior at run-time. For example, the following snippet:

def sameClass[T](xs: T*): Boolean = xs.tail.forall(_.getClass == xs.head.getClass)

val x: Int = 5
sameClass(x, 5.5)

will compile before and after the change, but will display true in Scala 2 and false in Scala 3.
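
To make the result independent of the inference rules, the type argument can be spelled out. This is a sketch only; asDouble and asAnyVal are illustrative names:

```scala
def sameClass[T](xs: T*): Boolean = xs.tail.forall(_.getClass == xs.head.getClass)

val x: Int = 5

// T is fixed to Double and x is converted explicitly:
// both elements are boxed as Doubles at run time.
val asDouble = sameClass[Double](x.toDouble, 5.5)

// T is fixed to AnyVal: x stays an Int, so the runtime classes differ.
val asAnyVal = sameClass[AnyVal](x, 5.5)

println(asDouble) // true
println(asAnyVal) // false
```

The first call corresponds to the old inferred type and the second to the new one, so choosing either explicitly removes the silent change in behavior.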

Opening this Proposal for discussion by the community to get a wider perspective and use cases.


#2

From the standpoint of an educator using Scala with novice programmers, I see nothing objectionable in this change. If you didn’t preserve the Int literals behavior I would have a problem, because I think that comes up in CS1 and without it, there would be an unnecessary burden created by explaining what is happening in that situation. Given this proposal, my guess is that this won’t come up in introductory courses. By the time students did something where it did come up, they would have sufficient background to understand an explanation of what was going on.


#3

looks good to me


#4

I personally have never really seen the point of even the less powerful adaptation being proposed here: it’s not that hard to write a d, f, or L sigil on the occasion that you want to make something other than an Int and it makes for much more predictable code.


#5

The literal adaptation is very important for simple user code. It would be very annoying to infer List[AnyVal] from List(1, 3.4)


#6

Why not just write it as List(1.0, 3.4).


#7

because people might not be programmers, and they might be live coding. i’d say don’t force your code style on others.

SinOsc.ar(freq = List(441, 485.1))

I don’t need to (want to) decide when I type 441 if this is going to be an integer or a float. I might even want to express that that’s an integer frequency, even if the API uses a float in the end.

And if you write literally any other language, from JavaScript to Python to Matlab/Octave, then it’s very annoying forcing that extra syntax.


#8

Because people might not be programmers…

I think that, by definition, anyone who is writing a Scala program is a programmer :wink:

I don’t need to (want to) decide when I type 441 if this is going to be an integer or a float.

I would argue that whether something ends up being an integer or a float has a tremendous impact on the outcome of running the program and can’t just be waved away with an “I don’t want to think about it”.

And if you write literally any other language, from JavaScript to Python to Matlab/Octave, then it’s very annoying forcing that extra syntax.

Those other languages are all dynamically typed and, in fact, don’t perform coercion in the location we are talking about:

JS, Matlab, and Octave all create floats for all simple numeric literals, regardless of context and regardless of whether you use a decimal point. In JS, you can’t even make a primitive number that isn’t a float.

Python is dynamically typed so you don’t lose the ability to perform math ops on the elements without casting but it will actually make a list with an integer and a float in it when you write [1, 2.5], just as Scala would if it didn’t coerce integer literals.


#9

Especially given the reasoning about Int literals, I would expect these two to behave the same:

Array(b, 33, 5.5)      : Array[Double] // b is an inline val
Array(f(), 33, 5.5)    : Array[AnyVal] // f() is not a constant

I have a b. I don’t have an Int literal. (By the way, you could have written def b: 33 = 33 and the result would be the same. The fact that it’s an inline val is a tangent.)
This feels more like an implementation detail where the compiler can’t see the difference between a real literal and a variable or method with a literal type.


#10

It’s irrelevant whether the language is statically or dynamically typed, as long as it’s typed. And no, not everybody that uses a language is a programmer. Arguably Python’s user base is mainly scientists and engineers.


#11

Edit: This was a reply to a part of a post that has since been edited out and is thus no longer relevant.


#12

The non-programmers will have a hard time understanding why the following two Lists will give different results:

List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10.0)
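
For instance (a sketch, assuming the lists are later divided element-wise; ints and mixed are made-up names), a single trailing .0 changes every result:

```scala
val ints  = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
val mixed = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10.0)

println(ints.map(_ / 2).take(3))  // Int division: List(0, 1, 1)
println(mixed.map(_ / 2).take(3)) // Double division: List(0.5, 1.0, 1.5)
```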


#13

I disagree


#14

I vote against this part. Literals should not behave differently than values or methods by the principle of least confusion.

Btw. do we have a precedent of modifiers (val/var/def/literal) affecting type inference/type checking?


#15

Whatever; statically typed counter example:

// c#
var foo = new[] { 1, 1.2 };
Console.WriteLine(foo); // System.Double[]

#16

I don’t see why. Having Ints go to Doubles is an immensely useful time-saver. It’s inferring a List[AnyVal] that seems totally unhelpful to me. When is that ever useful?


#17

Perhaps a wild thought, but maybe we can take a different approach to literals in the compiler, by using union types:

32 /*as a literal*/ : '\u0020' | 32 | 32.0f | 32.0
32.0  /*as a literal*/ : 32.0f | 32.0

This way we don’t contradict any Scala type inference, and yet maintain what the user expects.
I personally never liked the semantics that hardcoded the type of a literal, where one can expect it to mean several types as demonstrated in the case of 32.

In cases of precision loss, the compiler should generate a warning unless flagged otherwise.


#18

I overall think this is a good change. However, singling out Int literals alone feels weird to me. If List(1, 2L) is a List[Long] and List(1, 2: Byte) is a List[Byte], then why isn’t List(1.0, 2.5f) a List[Float]?

This is, for example, how Rust does it; vec![1.0, 2f32] is a Vec<f32>, not a Vec<f64> or a compile-time error.

Furthermore, 895789127817 seems like a perfectly reasonable number in the context where it can be inferred to be a Long or Double. Forcing literals to be Int seems weirdly specific.

The mental model I’d use is that numbers like 94 are of type IntLike[T <: Byte | Short | Char | Int | Long | Float | Double] and that numbers like 971.153 are of type FloatLike[T <: Float | Double], and that the unions are refined via inference both to restrict it to valid ranges, and based on other information about what is going on.


#19

I think my issue is that (in my worldview) having the meaning of a “literal” be context-dependent makes it be not a literal. In my opinion, (my concept of) true literals should be available for all platform-primitive types, else writing carefully type-controlled code is much more difficult. I actually don’t so much have an issue with there being a “magic” syntax that figures out what type it should be from context as I do with the fact that there isn’t a syntax that is not subject to contextual interpretation. Soronpo’s suggestion is very interesting because it does at least take advantage of platform features to make the meaning of the literal not be context-dependent, though I note that the 32 on the right-hand side of the colon actually makes the definition cyclic (and further illustrates why there should be a my-definition literal syntax for primitives).


#20

Here’s how I would do it:

(1) All integer literals are Long and all floating point literals are Double

(2) All numeric primitive types have a common super-type Number

(3) There are implicit conversions from Number to any numeric primitive type

(4) There are implicit conversions from C[Number] to C[N] for any standard collection C and any numeric type N if N is a covariant parameter of C

So, List(1, 2.0) would be List[Number] and there is an implicit conversion to List[Double]