Yeah, `1.0` is a `Double` through-and-through right now. I'm just inviting you to redesign more of the numeric hierarchy.

Yes, but maybe not right now. There's a lot on our plate already.
The reason bare number literals are ints and doubles by default, and not e.g. longs or floats, is “mere” tradition. Because of this, I’m unconvinced by arguments about intuitiveness to non-programmers (who are unaware of tradition) or to people coming from other languages with different behavior. The important thing IMO is to be consistent and unsurprising.
We want to avoid inferring `Array[AnyVal]` because it's not a useful type: e.g. we can't do `.map(_ + 1)` on it. There should be a type `Number <: AnyVal` we could infer (unlike the existing `java.lang.Number`, which is for reference types), but that means redesigning the hierarchy.
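To illustrate the point, here is a minimal sketch (the element type is written explicitly to stand in for what inference would produce; exact error wording varies by compiler version):

```scala
// An Array[AnyVal] exists, but supports almost nothing useful:
val xs = Array[AnyVal](1, 1.0f)
// xs.map(_ + 1)   // would not compile: `+` is not a member of AnyVal
println(xs.length) // we can hold the values, but not do arithmetic on them
```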
I do think it would be more consistent to infer `Array(1.0, 1.0f)` as `Array[Double]` and not `Array[AnyVal]`. That would not require redesigning the hierarchy, only adding another rule that float literals are adapted to other types (in practice, only to `Double`) as necessary, just like the rule about adapting ints.
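For comparison, a small sketch of the existing int-literal adaptation that the proposed rule would mirror (standard Scala behavior):

```scala
// An Int literal already adapts to the expected floating-point type:
val d: Double = 42   // becomes 42.0
val f: Float  = 42   // becomes 42.0f
// The proposed rule would similarly adapt a Float literal such as 1.0f
// to Double where needed, so Array(1.0, 1.0f) could infer Array[Double].
```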
Thanks for the proposal, Séb.
First of all, I believe it would be an improvement worth pursuing, so I have no complaints about the proposal.
But I believe, furthermore, that it might be possible to remove the concept of weak conformance from the language completely, and to rely on the common rules of least upper bounds, but I will need some help from other people to enumerate the various use-cases and find the (almost inevitable!) counterexamples I haven't thought of.
Oron is along the right lines with his suggestion, but he should have used an intersection type, not a union: I would propose that every number literal is typed as the intersection of all of the numeric types that can accurately and precisely represent it.
For example:

- `42` would be typed as `Int & Long & Short & Byte & Float & Double`, as it can be precisely represented by every one of these types.
- `3.1415926` is a valid `Double`, but is not a valid `Float` (there's rounding). Its type would be `Double`.
- `2.718` could be reasonably interpreted as a `Double` and as a `Float`, so its type would be `Double & Float`.
- `123456789012345` (note the lack of `L`) could be precisely interpreted as both a `Long` and as a `Double`, so would have the type `Long & Double`.
- `3.14159265358979323846` would become an error, because it cannot be precisely represented in any JVM number format.

Combining several such literal numbers in a `List` will calculate the LUB of their types, which is equivalent to taking the intersection of the subset of types which appear in the intersection types of every literal.
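The LUB-as-intersection rule can be sketched with plain sets (illustrative only; the string names simply stand in for the primitive types):

```scala
// Each literal's proposed type, modelled as the set of primitives that
// can represent it exactly:
val lit42  = Set("Byte", "Short", "Int", "Long", "Float", "Double") // 42
val litBig = Set("Long", "Double")                                  // 123456789012345
val litPi  = Set("Double")                                          // 3.1415926

// Combining the literals in a List intersects the candidate sets,
// so List(42, 123456789012345, 3.1415926) would be a List[Double]:
val elemType = lit42 & litBig & litPi
println(elemType) // Set(Double)
```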
Any numbers with an explicit suffix (`L`, `I`, `S`, `B`, `F`, `D`) would be explicitly typed to that one type. FWIW, I'd propose to remove the lower-case suffixes, particularly `l`.
Having these “complex” intersection types inferred could get very verbose, very quickly, if we were to see them appearing in error messages. So I would additionally propose that intersections of primitives be simplified for printing purposes only to elide any type which could be inferred to be included in the intersection.
- `Byte` can be represented as a `Short`
- `Short` can be represented as an `Int`
- `Short` can be represented as a `Float`
- `Int` can be represented as a `Long`
- `Float` can be represented as a `Double`, and
- `Int` can be represented as a `Double`
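A sketch of the display simplification, encoding the representability relations above as a map (an illustrative model only, not compiler code):

```scala
// Which type(s) each primitive can be losslessly represented as (one step):
val widens: Map[String, Set[String]] = Map(
  "Byte"   -> Set("Short"),
  "Short"  -> Set("Int", "Float"),
  "Int"    -> Set("Long", "Double"),
  "Float"  -> Set("Double"),
  "Long"   -> Set.empty,
  "Double" -> Set.empty
)

// All types transitively implied by t:
def implied(t: String): Set[String] = {
  val next = widens(t)
  next ++ next.flatMap(implied)
}

// For printing, elide every member implied by another member:
def display(tpe: Set[String]): Set[String] =
  tpe.filterNot(t => (tpe - t).exists(u => implied(u)(t)))

println(display(Set("Byte", "Short", "Int", "Long", "Float", "Double"))) // Set(Byte)
println(display(Set("Int", "Long", "Float", "Double")))                  // Set(Int, Float)
```

The second call corresponds to the `Int' & Float` display used in the examples below.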
We could even define the following four special aliases which represent the intersection types that would commonly be inferred from literals.
```scala
type Byte'  = Byte & Short & Int & Long & Float & Double
type Short' = Short & Int & Long & Float & Double
type Int'   = Int & Long & Double
type Float' = Float & Double
```
For example:

- `List(1, 2, 3)` would have the type `List[Byte & Short & Int & Long & Float & Double]` but would be displayed as `List[Byte']`.
- `List(3.14, 2.7)` would have type `List[Float & Double]` but would be displayed as `List[Float']`.
- `List(0, 32767)` would be displayed as `List[Short']`.
- `List(0, 32768)` would be displayed as `List[Int' & Float]`.
It is still desirable to always distinguish between `Int` and `Int'` (etc.) because invariance would (quite rightly) not consider an `Ordering[Int]` to be related to an `Ordering[Int']`.
Some practical implications of this would be that:

- explicit types (or `&`) would only be needed for those integral numbers which are all larger than `Short`s but all still representable precisely as a `Float` (and likewise for `Long`s representable precisely as `Double`s); I think these are the more unusual cases
- for overload resolution, `Int` should be chosen primarily
- no more `AnyVal`s would be inferred for numeric expressions.

Examples like this, and others, would now "just work". No explicit types are required, and no `AnyVal`s get inferred:
```scala
val xs = List(2, 3)
val ys = 65536 :: xs
val zs = Set(0.0D) ++ ys
```
- `xs` has type `List[Byte']`
- `ys` has type `List[Int']`
- `zs` has type `Set[Double]`
And most of this already works right now in Scala 3 (and using `with` types in Scala 2), if you force the literals to the right types manually. We can define the type aliases (using backticks), and safely cast values to them. Erasure does the right thing. Putting different combinations of these cast primitives into `List`s infers the answers I would expect for the few tests I've done. Given that the machinery for intersection types and LUBs already exists, it might be possible to implement this without any particularly drastic changes to the typer. It would make a very interesting test case to throw it at the community build.

Obviously this is still very underspecified, and I'm hardly an expert in the finer details. It proposes a significant change to the compiler, much beyond Séb's idea (which should go ahead anyway). But I think it would go quite a bit further towards simplicity, and would, if it works out, address a lot of the criticism Paul Phillips gave Scala about five years ago. It would be nice to hear some feedback on it.
What about narrow (singleton) representations?

Currently (post SIP-23), `def foo[T <: Int with Singleton](t: T): T = t` will accept `foo(42)`. If we wish to further expand on literal representation, then we need `def foo[T <: Double with Singleton](t: T): T = t` (and others) to accept the literal `42` as well, but this, IMO, should output `42.0`.
Furthermore, I'm rethinking my idea, because it becomes problematic even for simple inline arithmetic.

What type should we expect for `42 + 1`? Which operation should we invoke: an `Int + Int`, a `Double + Double`, …? Additionally, what happens at overflow, for example `255 + 1` (a char is expected to go to 0, while an int continues on to 256)?

Maybe there is a way to somehow use this idea, but currently there are too many ambiguities pre/post arithmetic operations.
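The overflow concern is easy to demonstrate today by forcing a representation (ordinary Scala; the explicit conversions stand in for the compiler's hypothetical choice of type):

```scala
val asInt  = 255 + 1           // stays 256 as an Int
val asByte = (255 + 1).toByte  // wraps around at 8 bits
val asChar = (255 + 1).toChar  // code point 256: no wrap within 16 bits
println(asInt)  // 256
println(asByte) // 0
```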
First, I really love the idea of this proposal. I just have one question: what is the type of `val theAnswer = 42`? Would that actually be `Int & Long & Short & Byte & Float & Double`? Could we get the REPL to say it is `Byte'`? Would we want to? I worry about the confusion that might be caused by simple declarations.
I think Weak Conformance is a special case of potential implicit-conversion use, and therefore we can widen the criteria for where implicit conversions are applied, to handle Weak Conformance and other cases.
Let’s consider the following example:
```scala
class Foo(val value: Int)
object Foo1 extends Foo(1)
object Foo2 extends Foo(2)

class Bar(val value: Int)
object Bar {
  implicit def toFoo(b: Bar): Foo = new Foo(b.value)
}
object Bar1 extends Bar(1)

val list = List(Foo1, Foo2, Bar1, new Foo(3))
```
By the current definition of Scala's type inference, `list: List[Object]`, since `Object` is the common ancestor of all these types. If we expand the implicit conversion rule, we can expect the compiler to apply the conversion from `Bar` to `Foo` to get a lower common ancestor, `Foo`.
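A sketch of the status quo this proposal wants to improve on: today the conversion fires only when the expected type is written out (the class definitions repeat the example above so the snippet is self-contained):

```scala
import scala.language.implicitConversions

class Foo(val value: Int)
object Foo1 extends Foo(1)
object Foo2 extends Foo(2)

class Bar(val value: Int)
object Bar {
  implicit def toFoo(b: Bar): Foo = new Foo(b.value)
}
object Bar1 extends Bar(1)

// With an explicit expected type, the implicit conversion applies to Bar1:
val foos: List[Foo] = List(Foo1, Foo2, Bar1, new Foo(3))

// Without it, inference falls back to the common ancestor (AnyRef):
val mixed = List(Foo1, Foo2, Bar1, new Foo(3))
```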
Proposal (sorry, I'm not good at language formulation):

Given `n` types `T1...Tn` with common superclass `T`, implicit conversions can be applied to infer a lower superclass `S` under the following conditions: for all `i = (1...n)` we have `Ti <:< T` and `Ti !=:= T`, and there are implicit conversions that bring all `Ti` to a common superclass `S`, so that `S <:< T` and `S !=:= T`, and there is a `Tj` such that `S =:= Tj`.
If this proposal is applied, we can get rid of the `Weak Conformance` special case, and the user can choose not to have it by excluding `Predef`'s implicit conversions.
Understood. But this is probably the last chance for at least another decade to make improvements of this kind. Changes to the fundamental behavior of literals and primitive types are usually very hard to make seamless, and after moving to Scala 3, there is going to be an incredible amount of pressure to not rock the boat for a good long while.
I like the idea in general. But, without thinking it through in detail, I guess this would have to mean that all numbers have to be boxed, and extra type tests and casts added to basic arithmetic operations.
I didn't talk about erasure, but `val theAnswer = 42` would be typed as `Byte'`, hence `Int & Long & Short & Byte & Float & Double`, but in the absence of any explicit `Byte` or `Short` type, would be erased in the bytecode to `Int` by default, or `Long` if `Int` is not in the intersection. But TASTY would still encode that it's a `Byte'` and not an `Int`. (To Java, it would look like an `int`, though.)
No, there should be no need for boxing. The intersection types would erase to primitives in the bytecode.
We considered the intersection type trick, but then decided it was out of scope. Intersection types and union types are rather costly for typechecking and type inference. Making every literal have a large intersection type is not going to make the compiler faster.
Besides, intersection types on the left of a subtype judgement also have the problem that they lead to incompleteness. Consider a subtype goal like:

`A & B <: C`

This is true if `A <: C` or `B <: C`. The "or" is bad news, since both of the two subgoals might constrain type variables. In the end one has to choose one or the other of the two subgoals to pursue, and stick to it, which means that some solutions to constraint solving might be lost. This is the essential problem behind set constraints, and subtyping constraints only make it worse.
Bottom line: Intersection types are not a free lunch either.
For the intersection types to be really useful, we'd also need to add a common supertype `Numeric <: AnyVal` that defined common arithmetic operations. Otherwise you couldn't write `(list: List[Int']).map(_ + 1)`, and without that, how useful is a `List[Int']` really? And Martin has said they don't want to change the number hierarchy.
Actually, this already works out pretty well:

```scala
scala> type IntLike = Int & Long & Double
// defined alias type IntLike = Int & Long & Double

scala> val list = List[IntLike](42.asInstanceOf)
val list: List[IntLike] = List(42)

scala> list.map(_ + 1)
val res19: List[Int & Long & Double] = List(43)

scala> list.map(_ + 1.2)
val res20: List[Double] = List(43.2)

scala> list.map(_ + 2L)
val res21: List[Long & Double] = List(44)
```
If it's been considered already, then I guess there's no need to pursue it…

But FWIW, I wouldn't have thought that the performance issue would be significant: literals are not that common in code, and the intersections are not that large (there's a limit of six types). I would have thought a single implicit search would be orders of magnitude more work. On the other hand, it's not clear to me yet how widely these intersection types would typically be perpetuated.

The incompleteness of unification sounds like more of a problem, but I'm not sure I'm understanding correctly when it would come into play: these are general problems with unification, but the intersections of `AnyVal` subtypes arising from literals would be non-overlapping and not parameterized. I suspect I've missed the point somewhere! For the type system to provide useful results, we would probably want to introduce a new arbitrary rule to select certain priorities (as I suggested for overload resolution).

Anyway, given @Jasper-M's example (which surprised me, actually… I'm not convinced weak conformance isn't being applied here, and I actually expected the intersection to collapse under application of an overloaded `+`), it might be a fun exercise for someone with more time than me to implement it and run some tests, on the basis that it probably won't be merged, but could provide interesting evidence about whether it's a better or worse situation than the status quo. If I had a couple of days free, I'd love to try it myself…
It doesn't really work: I can't make the list contain doubles, only ints:

```scala
scala> val list = List[Int & Double](1.asInstanceOf, 1.1.asInstanceOf)
val list: List[Int & Double] = List(1, 1)

scala> list.last
val res1: Int & Double = 1
```

ETA: which makes sense, because the type `Int & Double` doesn't allow non-integer values. So I don't see how this is useful, but maybe I'm missing your point.
In @Jasper-M's example, the `+` isn't really overloaded. The type `Int & Double` is a subtype of `Int`, so Dotty coerces the values to `Int`s and binds the `+` to `Int.+`. Illustrated by the exception here:

```scala
scala> val list = List[AnyVal](1, 1.1).asInstanceOf[List[Int & Double]] // Illegal cast
val list: List[Int & Double] = List(1, 1.1)

scala> list.map(_ + 1)
java.lang.ClassCastException: java.lang.Double cannot be cast to java.lang.Integer
```
I'm not saying that this is working perfectly already. But apparently, once you correctly set up all the types, Dotty is already able to handle this (as far as the type system is concerned). And `(list: List[Int']).map(_ + 1)` does work.

The way some intersections get erased might need some tweaking, indeed. But the casts in your example are unsound in this new scheme as well. As you said, `1.1` is not a valid `Int & Double` in any scenario. So it's expected that this blows up.
@propensive Here's a problematic example:

`Int & Double <: X & Y`

where `X` and `Y` are type variables. You can solve this by taking `X = Int` and `Y = Double`, *or* the other way around. The two solutions could have drastically different consequences downstream.
> But apparently once you correctly set up all types, dotty is already able to handle this (as far as the type system is concerned). And `(list: List[Int']).map(_ + 1)` does work.

`map(_ + 1)` binds to some one concrete type's `+` method. It doesn't work if the `List[Int']` contains values of more than one concrete type. That's the important bit, and it's missing. For similar reasons, you can't actually construct such a list holding values of different concrete types without explicit casts, and if you construct it using `asInstanceOf` you won't be able to apply any methods that don't exist on `Any`.
@propensive proposed that:

> every number literal is typed as the intersection of all of the numeric types that can accurately and precisely represent it.

But it's not that way today, so `Int & Double` is an empty type. I'm afraid I don't understand what the relevant value/behavior is that you think already exists and didn't in Scala 2.x, beyond type intersections per se. Can you please explain?
Thanks, @odersky. I suppose it would be equally correct (more generally correct?) to choose `X = Int | Double` and `Y = Int | Double`, but I can easily believe that that could be the start of further problems downstream…