By coincidence, today I was looking at Java bugs related to overloading. One report had a bad cast, like yours, and the value was passed to a method overloaded on Object and String. In Java 7, type inference inferred Object and the Object-taking overload was picked; in Java 8, it could infer String and picked the other overload, which failed. I think the lesson is that these “under-the-hood” casts, which seem to be just getting in your face, are actually doing useful type system work.
Ah, right, I forgot about those. In that case, I guess Unit must remain a legal argument for type parameters.
Makes me wonder how often T => Unit will return BoxedUnit when the author intended it to return nothing at all, and how that affects performance?
I’m assuming that the following will create a Seq with 100 references to BoxedUnit, right?
(1 to 100).map(println)
Yep. Note that `BoxedUnit` is a singleton read from a Java static field, so it doesn’t itself increase memory usage (that’d be the data structure itself).
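A quick way to check the singleton claim is to compare references. This is only a sketch: `UnitSeqDemo` and `allSameSingleton` are made-up names, and `println` is replaced by a function returning `()` to keep the output quiet.

```scala
import scala.runtime.BoxedUnit

object UnitSeqDemo {
  // Same shape as (1 to 100).map(println), but without the printing.
  val units: Seq[Unit] = (1 to 100).map(_ => ())

  // View the elements as Any so we look at their boxed representation.
  val boxed: Seq[Any] = units

  // True if every element is a reference to the one BoxedUnit.UNIT instance.
  def allSameSingleton: Boolean =
    boxed.forall(u => u.asInstanceOf[AnyRef] eq BoxedUnit.UNIT)
}
```

If `allSameSingleton` holds, the sequence stores 100 references to one object, so the cost is the collection itself and not the `Unit` values.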
Right. And it’s also true that whenever we assign a `Unit` to a value, the actual type will be `BoxedUnit`. More than that, the actual value assigned will be a reference to the same singleton object:
def doSomething(): Unit = { new Object }
val u1 = () // Debugger: {BoxedUnit@793}
val u2 = doSomething() // Debugger: {BoxedUnit@793}
val u3 = (new Object).asInstanceOf[Unit] // Debugger: {BoxedUnit@793}
decompiled Java code:
public void doSomething() { new Object(); }
BoxedUnit u1 = BoxedUnit.UNIT;
doSomething(); BoxedUnit u2 = BoxedUnit.UNIT;
new Object(); BoxedUnit u3 = BoxedUnit.UNIT;
Note that we don’t see a cast to `BoxedUnit` in any of those cases.
A different situation arises when we pass `Unit` as a type argument:
def execute[R](): R = (new Object).asInstanceOf[R]
val o1: Unit = execute[Object]()
val o2: Unit = execute[String]()
val o3: Unit = execute[Unit]() // class java.lang.Object cannot be cast to class scala.runtime.BoxedUnit
decompiled Java code:
public <R> R execute() { return (R)new Object(); }
execute(); BoxedUnit o1 = BoxedUnit.UNIT;
execute(); BoxedUnit o2 = BoxedUnit.UNIT;
BoxedUnit o3 = (BoxedUnit)execute();
`o3` is the only case where Scala doesn’t throw away the object but tries to cast it to `BoxedUnit`.
`Function0`, `Function1` and `Function2` are specialized for `Unit` in their result type.
Same here:
class MyFunc[R] extends Function0[R] {
def apply(): R = (new Object).asInstanceOf[R]
}
val myFunc2 = new MyFunc[Unit]
val r2: Unit = myFunc2() // java.lang.ClassCastException: class java.lang.Object cannot be cast to class scala.runtime.BoxedUnit
val myFunc1 = new MyFunc[AnyVal]
val r1: Unit = myFunc1() // SUCCESS!
decompiled Java code:
public class MyFunc<R> extends Object implements Function0<R> {
public R apply() { return (R)new Object(); }
}
MyFunc myFunc2 = new MyFunc();
BoxedUnit r2 = (BoxedUnit)myFunc2.apply();
MyFunc myFunc1 = new MyFunc();
myFunc1.apply(); BoxedUnit r1 = BoxedUnit.UNIT;
Related:
Welcome to Scala 2.13.0 (OpenJDK 64-Bit Server VM, Java 1.8.0_222).
Type in expressions for evaluation. Or try :help.
> 1.asInstanceOf[Double]
res0: Double = 1.0
> def convert[A, B](x: A): B = x.asInstanceOf[B]
convert: [A, B](x: A)B
> convert[Int, Double](1)
java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.Double
at scala.runtime.BoxesRunTime.unboxToDouble(BoxesRunTime.java:113)
... 28 elided
There is clearly an inconsistency in how asInstanceOf works. Sometimes it converts the value to a value of another type, and sometimes it preserves the value and merely ascribes a new type to it.
I suppose if the target is a parameter, the only possible thing to do is to preserve the value. For the sake of consistency, it should always do that.
No, this is not what `Unit` means in the spec. `Unit` means you expect/request/require that the method returns `()`, the singleton value of the `Unit` type. Since it in fact returns an `Object`, which is not an instance of `Unit`, the cast `.asInstanceOf[R]` is invalid, and from that point on the compiler is free to throw a `ClassCastException`, either immediately or later, or not at all.
There is no other way to interpret the specification of the language. The compiler is not doing anything wrong.
What’s incorrect is the original definition of `execute`. It should say that it returns an `Object`. It shouldn’t lie by pretending it can return any `R`. This holds whether the code is written in Scala or in Java. This is the root of the problem. The behavior of the compiler with `Unit`/`BoxedUnit` has nothing to do with it.
That’s an entirely different problem, for which I completely agree that the compiler is wrong. It has nothing to do with whether `x` is a parameter or not. This behavior happens when the static type of `x` is known to be a primitive type, and the type in the brackets is also a primitive type. In that case, for some reason, it compiles it as a coercion (like `.toDouble`) instead of a cast.
There is nothing in the spec supporting this behavior. Writing this code should warn that it doesn’t do the right thing, and eventually become a compile error.
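To make the contrast concrete, here is a small sketch that puts the explicit conversion, the statically typed cast and the generic cast side by side. `CoercionDemo` is a made-up name, and `convert` just mirrors the helper from the REPL session earlier in the thread.

```scala
import scala.util.Try

object CoercionDemo {
  // Same shape as the `convert` helper shown in the REPL session.
  def convert[A, B](x: A): B = x.asInstanceOf[B]

  // Explicit numeric conversion: the intent is unambiguous.
  val explicit: Double = 1.toDouble

  // Both static types are primitive, so the compiler emits a coercion.
  val coerced: Double = 1.asInstanceOf[Double]

  // Inside `convert` the type B is erased, so a boxed Integer comes back;
  // unboxing it as a Double at the use site is what fails.
  val generic: Try[Double] = Try {
    val d: Double = convert[Int, Double](1)
    d
  }
}
```

Writing `.toDouble` when a conversion is intended sidesteps the question of what `asInstanceOf` does between primitive types.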
Would you like to submit a PR?
Then why is the following cast valid?
val u: Unit = (new Object).asInstanceOf[Unit] // u: Unit = ()
Do you have reason to believe that it is? It looks to me like a compiler artifact that it happens to work…
I’ve never submitted a PR for the compiler before, but it sounds exciting and I might consider it if there is some guidance. Are there some instructions? Thanks!
This is what @sjrd just described:
Except that the compiler actually always tries to do this when the type between brackets is a primitive type. But `Unit` is the only primitive type for which the coercion works when the type of `x` is not a primitive type.
The difference with `def execute[R](): R = (new Object).asInstanceOf[R]` is that `R` is not statically known to be a primitive type.
The difference with `execute[Unit]()` is that as far as Scala-the-language is concerned there is no cast here. In the bytecode a cast gets inserted but that’s to satisfy the JVM.
You get the same thing with other primitive types:
scala> def foo[A]: A = { val i: Int = 42; i.asInstanceOf[A] }
foo: [A]=> A
scala> foo[Double]
java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.Double
at scala.runtime.BoxesRunTime.unboxToDouble(BoxesRunTime.java:113)
... 36 elided
scala> val i: Int = 42; i.asInstanceOf[Double]
i: Int = 42
res3: Double = 42.0
The README of the scala/scala repository on GitHub, along with some pages it points to, is actually quite complete. It should get you through the basics. In particular Redirecting… is quite good!
Thanks, I’ll check it out!
I have a slightly different understanding, or, if you like, a different misunderstanding.
Here is the ticket for folks who want `asInstanceOf[Unit]` to fail.
Here is Lukas’s improvement to boxing I mentioned before. It includes special casing `x.asInstanceOf[Unit]` so that `null` is handled correctly.
But I think it’s OK not to throw; maybe there should be a lint for this case, something like “dubious cast.” Because usually `asInstanceOf` means you have more information than the compiler about the safety of the cast.
This thread began with “why is there too much unboxing of Unit”, but there is too little unboxing:
scala> def fromNull[A]: A = null.asInstanceOf[A]
fromNull: [A]=> A
scala> null.asInstanceOf[Unit] == fromNull[Unit]
^
warning: comparing values of types Unit and Unit using `==` will always yield true
res0: Boolean = false
scala> println(fromNull[Unit])
null
That is, it needs `box(unbox(x))`, where `unbox` could fail, as it ought to do here:
scala> def f[A]: A = new Object().asInstanceOf[A]
f: [A]=> A
scala> () == f[Unit]
^
warning: comparing values of types Unit and Unit using `==` will always yield true
res0: Boolean = false
scala> 42 == f[Int]
java.lang.ClassCastException: class java.lang.Object cannot be cast to class java.lang.Integer (java.lang.Object and java.lang.Integer are in module java.base of loader 'bootstrap')
at scala.runtime.BoxesRunTime.unboxToInt(BoxesRunTime.java:100)
... 28 elided
When you do a comparison, the `BoxesRunTime.equals(Object, Object)Boolean` method is called. It accepts parameters of type `Object`, which is `Any` in Scala. In that case, there is no need to downcast the actual values and the behaviour is equivalent to:
def f[A]: A = new Object().asInstanceOf[A]
val a: Any = f[Unit] // a: Any = java.lang.Object@4f45ed22
The behaviour is different for numbers, characters and null. If we compare numbers, then `BoxesRunTime` will perform an actual cast.
Looking at all these corner cases I can see that `Unit` has a special place in the type hierarchy with regard to its conversion rules. Some of them are aimed at satisfying the JVM and others at making developers’ lives easier.
I don’t see any point in casting a value to `BoxedUnit` when the expected type is `Unit`. For me, it’s one of the omitted cases where the value discarding mechanism should apply.
I see your opinion is unchanged from your original post.
Unit is interesting only because it is so boring. Moreover, boxed unit is also boring. The utility of the behavior you decry is that it exposes your bug. There is some interest in avoiding needless paired calls to box/unbox the Unit value.
The other behavior you haven’t addressed is your example with `Nothing`. It’s obvious the result of type `Nothing` must throw. Your example throws `ClassCastException` because of the bug, but if it did not unbox the value, you wouldn’t see the usual `NPE` as thrown by `null.asInstanceOf[scala.runtime.Nothing$]`.
The value discard conversion only happens if a conversion is required. Since `Nothing` and `Unit` already conform to `Unit`, there can be no discard.
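A minimal sketch of that rule, with made-up names (`DiscardDemo`, `compute`, `discarded`, `passedThrough`); the comments paraphrase the rule stated above and are not compiler output.

```scala
object DiscardDemo {
  def compute(): Int = 42

  // Int does not conform to Unit, so value discarding applies here:
  // the Int result is dropped and () is returned instead.
  def discarded(): Unit = compute()

  // The expression already has type Unit, so it conforms as-is
  // and there is nothing to discard.
  def passedThrough(): Unit = discarded()
}
```

The generic `execute[Unit]()` case from earlier in the thread falls into the second category as far as the type checker is concerned, which is why no discard is applied to it.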
In a related ticket, someone suggests that `Unit` should not be special in this regard, but that any `Singleton` type should incur a discard.
I think a lot of confusion stems from the fact that type erasure is not exactly intuitive. Decompiling Scala code to Java code won’t help IMO. First, consider this Scala code:
object Scala {
class Something
class Something1 extends Something
class Something2 extends Something
def cast[A <: Something](param: Any): A =
param.asInstanceOf[A]
def main(args: Array[String]): Unit = {
val x = cast[Something2](arg)
println(x)
}
// val arg = new Object
// val arg = new Something1
}
You need to uncomment one of the lines that define `arg`. If you uncomment `val arg = new Object`, a ClassCastException will be thrown inside `def cast`. If you uncomment `val arg = new Something1`, a ClassCastException will be thrown outside of `def cast`, in the line `val x = cast[Something2](arg)`.
Why does the cast fail in different places if there is only one cast in the source code? Because type erasure causes some, but not always all, checks to be moved from the definition site to the use site. What can be checked inside an erased method is checked there; what can’t is moved to the use site. In this case `param.asInstanceOf[A]` inside `def cast` is erased to `param.asInstanceOf[Something]`, as `Something` is the only known upper bound of `A` during compilation of `def cast`. A more precise cast to `Something2` is thus moved to the line `val x = cast[Something2](arg)`, as that’s the line where the type `A` of the value returned from `def cast` is statically known.
`javap -p -c` confirms my explanation:
Compiled from "Scala.scala"
public final class temp.Scala$ {
public static temp.Scala$ MODULE$;
private final java.lang.Object arg;
public static {};
Code:
0: new #2 // class temp/Scala$
3: invokespecial #22 // Method "<init>":()V
6: return
public <A extends temp.Scala$Something> A cast(java.lang.Object);
Code:
0: aload_1
// first cast to `Something`
1: checkcast #7 // class temp/Scala$Something
4: areturn
public void main(java.lang.String[]);
Code:
0: aload_0
1: aload_0
2: invokevirtual #33 // Method arg:()Ljava/lang/Object;
5: invokevirtual #35 // Method cast:(Ljava/lang/Object;)Ltemp/Scala$Something;
// second, more precise cast to `Something2`
8: checkcast #12 // class temp/Scala$Something2
11: astore_2
12: getstatic #40 // Field scala/Predef$.MODULE$:Lscala/Predef$;
15: aload_2
16: invokevirtual #44 // Method scala/Predef$.println:(Ljava/lang/Object;)V
19: return
public java.lang.Object arg();
Code:
0: aload_0
1: getfield #49 // Field arg:Ljava/lang/Object;
4: areturn
private temp.Scala$();
Code:
0: aload_0
1: invokespecial #50 // Method java/lang/Object."<init>":()V
4: aload_0
5: putstatic #52 // Field MODULE$:Ltemp/Scala$;
8: aload_0
9: new #4 // class java/lang/Object
12: dup
13: invokespecial #50 // Method java/lang/Object."<init>":()V
16: putfield #49 // Field arg:Ljava/lang/Object;
19: return
}
In `def f[A]: A = new Object().asInstanceOf[A]` there’s no upper bound on `A`, so no casts are done inside `def f`; instead they are moved to the use sites. If there are many use sites of method `f`, the compiler can decide to handle the cast differently at each one.
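To see the two failure sites without reading bytecode, one can look at the top stack frame of the exception. This is a minimal sketch that assumes the erasure behaviour described above; `ErasureSiteDemo` and `failureSite` are hypothetical names, and the class hierarchy is copied from the earlier example.

```scala
object ErasureSiteDemo {
  class Something
  class Something1 extends Something
  class Something2 extends Something

  // Erased to a cast to the upper bound `Something`.
  def cast[A <: Something](param: Any): A = param.asInstanceOf[A]

  // Returns the name of the method on top of the stack when the cast fails.
  def failureSite(arg: Any): String =
    try {
      // The more precise cast to Something2 is expected to happen here.
      val x: Something2 = cast[Something2](arg)
      s"no failure: $x"
    } catch {
      case e: ClassCastException =>
        e.getStackTrace.headOption.map(_.getMethodName).getOrElse("unknown")
    }
}
```

With the 2.13 compiler discussed here, I would expect `failureSite(new Object)` to report `cast` and `failureSite(new ErasureSiteDemo.Something1)` to report `failureSite`, matching the two `checkcast` instructions in the bytecode listing.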
Changing value discarding semantics would surprisingly change the semantics of some valid (?) programs. The following code prints `null`:
object Scala {
def execute[R](param: Any): R =
param.asInstanceOf[R]
def main(args: Array[String]): Unit = {
val x = execute[Unit](null)
println(x)
}
}
What is more important then:
- `anything.asInstanceOf[Unit] == ()` for any type of `anything`, or
- `null.asInstanceOf[A] == null` for any type `A`?
These requirements seem to be contradictory.