Pre SIP: Replace non-sensical @unchecked annotations

That’s precisely the kind of hair splitting I was referring to. Users do not care whether the type error they get is a “type mismatch message”, a “refutable pattern message” or an “exhaustiveness message”. It’s all the same to them. Previously it was not, since only type errors were errors and the others were silent or maybe sometimes warnings. But if we want to make them errors, we need to address this point. And we should not do it by designing little isolated solutions. That’s not Scala’s philosophy.

There’s another reason, which is partly how I came to this. We are looking into pluggable refinement type systems. That is you could have a more fine-grained type system where e.g. Pos is different from Int or NonEmptyList is different from List. Technically, Pos is a subtype of Int and NonEmptyList is a subtype of List. The idea would be that we run plugins or custom-phases after the type checker to verify that an Int value is really a Pos, or that a list is non-empty. For instance by running some kind of solver. We can do some of this in the type system today by clever encodings but it gets very clunky very quickly so I am not a fan of these encodings.

Now, technically, we can represent types like Pos with annotatons. Scala 3 has the concept of a RefiningAnnotation that helps here. But the down casts like Int --> Pos cannot be expressed in a general system. My idea was that we could have both

checkedCast   // checked at runtime
provenCast    // proven to be safe at compile time

That would open the door to more refined type checking without blowing the user’s complexity budget. provenCast for now is not on the table, but checkedCast is. As I argued, we need it immediately since we want to tighten the pattern checking rules.

I thoroughly disagree with this. First of all, I don’t think most users think of casting as a “type mismatch silencer.” Second, it’s most certainly not hairsplitting. The only thing that should exist as a single thing whose purpose is to silence multiple unrelated kinds of errors is something like @nowarn, because that’s on the abstraction level of controlling errors per se. I would never assume a “method … overrides nothing” error should be fixed in the same manner as a “recursive value needs type” error. They are just very different errors with different causes. Type mismatch errors usually mean you’re using the wrong variable or you need to add a conversion or you’re passing the parameters in the wrong order or whatever. Most of the time you shouldn’t just go for the “off” button. And exhaustivity errors generally mean you should add a case or fix an existing one. It’s true that you can’t do that with val patterns though, although I consider that to be a relatively advanced feature so I’m not as worried about it being “all the same to them,” because I think most people using them would understand the difference between exhaustivity issues and type mismatches already.

Anyway I think the main thing is that the error message should contain instructions on how to fix it. And if it does then I don’t see what we gain by making certain casts in certain places be treated specially by nearby patterns.

I totally agree with you about that. And I think the best way to do that is by starting with a clear problem statement. I understand that killing more birds with one stone is desirable, so we don’t have to think about each problem in total isolation. But please let’s disentangle what the problems are from predetermined solutions, so we can explore the solution space more thoroughly.

Coming back to the the thing about Scala unifying things and not having little isolated solutions, that’s a superb thing about Scala as long as it unifies things that are at their heart the same all along. For instance, making (lazy) val/var/def syntax symmetrical (other than def supporting parameters) is brilliant, because they’re all just a way of binding a name to an expression, different only in evaluation semantics and support for reassignment. Using objects instead of statics is brilliant, because implementation details aside, at its heart that’s what statics really were all along: a way of saying the class itself also acts as its own object. Unifying the syntax for creating a local scope with the syntax for methods with more than one statement is brilliant, because we have a simpler language model: methods’ right hand side is are always a single expression, but block expressions are a general kind of expression.

But making checkedCast do two unrelated things is not unifying. There’s no single underlying concept that it represents in an unsurprising way. It’s not unifying, it’s hijacking.

Again, I think we’ll be able to come up with a better solution if we start with a clear framing of the problem in its own terms, and then proceed to enumerate possible kinds of solutions and their tradeoffs, and only then get into concrete proposals.

2 Likes

Perhaps it would be better to have a separate thread about refinement types, and about how to fix the problems with casting currently. It should explore even things that aren’t on the table to actually implement now, so that in the long run we have consistent naming conventions. Then, based on what the conclusion is, it would be possible to see if it can help solve the problem in this thread too. But I think it would be good to see how the community approaches a problem from scratch, rather than just reacting to a concrete proposal. Bring them into the whole thought process, so to speak.

We are looking into pluggable refinement type systems.

I think this generalization and future enhancement of the capability of the type system sounds very promising!

My main itch is that the names you propose seems less obvious from the intent+behavior perspective than what you wrote in the comment:

checkedCast   // checked at runtime
provenCast    // proven to be safe at compile time

I have seen many times that learners struggle with their mental model on what is actually happening at runtime and compiletime respectively, so I’d like the intent+behavior to be obvious in the names, such as (or something shorter but still related to if it is runtime or not):

castCheckedAtRuntime   // checked at runtime
castProvenAtCompileTime    // proven to be safe at compile time

(But if they are a bit long it might be ok, comparing to the rationale behind isInstanceOf and asInstanceOf also being long as the clunky-ness sticks out for a reason: it should be noted by the reader… But on the other hand I usually like shorter than longer names if they are still easy to understand by just reading them and not looking at the docs, so I’m unsure what’s best)

I think the expression like val x = y.unsafe looks like something with effect. The expression x.unsafe should be essentially the same as x, but it looks like this expression has a side effect. For me this is the difference between unsafe and other function-like things. An explicit keyword should be the best solution for this case, I think.

Having read (most) of the thread, I am under the impression that whether or not the replacement of @unchecked should be a method is the heart of the discussion. Until this question is answered, the name of the method, keyword, annotation, or <insert other construct> that will implement the feature is mostly a distraction at this point.

I think that the only way out of the conundrum is to better identify and/or state the specific problem that this thread aimed to address and carefully evaluate the options we have to solve this specific problem. If the issue is big enough to warrant a language feature, I think it would be best to consider it in isolation before trying to unify it with some existing (or possible future) concepts. Perhaps it could be helpful to restate @nafg’s attempt:

FWIW, this description convinces me that there’s a problem and that it may be worth solving. I’ll leave this judgement to people with more Scala lines under their belt.

One clarification I might ask is whether we’re talking about a backward compatibility issue due to the fact that the language has moved to more conservative pattern refutation, or about opt-in stronger static guarantees with using yet-to-be-defined feature. If it’s the former, my two cents coming from a language with very conservative pattern refutation (Swift) is that asking users to write “catch-all” patterns is not that bad.

Here are the two strongest arguments I’ve parsed in favor of methods. Apologies if I missed others:

Here is the strongest argument Ive parsed against:

I don’t have enough Scala expertise to form a strong opinion about the best solution or come up with another one. What I can say, however, is that I very much agree with the fact that a method-looking feature does not seem to be the right tool.

I find the idea that method-like features should behave like methods quite compelling in terms of explainability and learnability. If a method is in general not able to influence the behavior of the type checker (outside of inferring its arguments and return value), then abusing method call syntax is probably a confusing choice, regardless of conciseness or convenience.

Abusing syntax makes the language harder to learn and understand. I’d say that it’s because humans like predictable patterns, but I’m no HCI expert. That being said, I completely disagree that users do not care about the kind of error messages they get. I certainly don’t and I certainly would be confused to get an error I generally associated with concept A while using the syntax of concept B.

I am sympathetic to the discoverability argument but at the same time I wonder how important it is for a feature that should be discouraged (I assume) and that already gets a pretty nice fix-it in the error diagnostic.

Now it’s easy to criticize a proposal without offering a solution. So if I was pressed for one, I would say that it feels to me that if we want to customize the behavior of pattern matching, then it is the construction that does the matching that should be adorned, not the values on which it operates.

7 Likes

I like this level of explicitness, if not more. I really think the unsafe matching is an anti-pattern almost always. If you tighten up exhaustiveness and refutability checking in a helpful way, then the complaints should be heeded; and if the tightening just exposes how the type system is actually unable to provide actionable information then one should reconsider whether that tightening is wise.

However, I completely agree with the complaints that xs.unsafe does not itself, as described, seem to fit well into how the language works. The unsafeness, or request for a runtime type check, or relief from overly pedantic refutability, does not seem to be reflected at all in the type system.

That’s honestly pretty weird. Almost always, if you want to communicate behavior to some outer scope, and it looks like regular language behavior, you do so with a type.

But here, xs.checkedCast has exactly the same type as xs. It’s just xs.type.

And because there’s no type, there’s no intuition for what might work.

val xs: List[Any] = null
val y :: ys = xs.checkedCast.tap(println) match
  case null => "hi" :: Nil
  case z => z

I mean…does this work? How do we know? Did the tap mess it up? Does the z => z case preserve the checkedcastedness? Did it work on both the match and the val at the same time, or only one?

In contrast, if the method actually changes the type then at least we have the idea that different types are witnesses to different allowable behavior.

For instance, if the compiler synthetically did something with an opaque type wrapper (akin to the explicit call in my linked example, but probably with TypeTest to be safer), it would be far less surprising: Scastie - An interactive playground for Scala.

And then, with a big awkward wrapper type effectively discouraging you from doing this, you could pass the capability around, which is usually considered to be more scalable.

7 Likes

In a lot of cases, this is what is done by ascription, right ?

val x: 4 = 4
val y: Int | T = x: Int // proves at compile time that 4 <:< Int

For vals at least it makes a lot of sense to continue like this:

val x: Int = ???
val (y: Pos) = (x * x): @provenCast // proves at compile time that x * x has type Pos
// or even
val (y: Pos) = (x * x): ProvenCast

(And both annotations and types have documentation, just like methods !)

It also somewhat works for pattern matching:

z match
  case (h: T) :: Nil : @provenCast =>

You’ll note it’s not clear what would need to be proven, that z is of type List[Any], List[T], something like NonEmptyList[T], or SingletonList[T] ?
Right now it’s clear (the input type to the unapply method), but with refinement types, the line is much blurrier !

This syntax has the upside that we can mix proven and checked casts freely, but the downside that we need to repeat the “casting keyword” on every case
If this is deemed too much to bear, we can then move it to the match, as I had proposed previously:

z match @checkedCast
  case (h: T) :: Nil =>

Which does not really work with the magic type approach:

z match: CheckedCast
  case (h: T) :: Nil =>

z match[CheckedCast] // Fun fact: some intermediate representations are printed like this
  case (h: T) :: Nil =>

Or on the scrutinee as below, but I think it is conceptually weird, as we’re not really changing the scrutinee, we’re changing what the match means:

z @checkedCast match
  case (h: T) :: Nil =>

(And there is also some precedence issues (?), but honestly I’d rather add a special case in the parser for it rather than use the method syntax for it)

With the magic type:

(z: CheckedCast) match
  case (h: T) :: Nil =>
// or with some tweaks to the syntax:
z: CheckedCast match
  case (h: T) :: Nil =>

The latter is not ideal, but technically unambiguous since match types are forbidden in ascriptions

Note:
I don’t know if I like the idea of a magic type, but we can make it work
We would of course only allow it in ascriptions, which can be even tested during parsing
And if I’m not mistaken, there are already cases where an ascription can change the runtime behavior of a program

I forgot to quote @alvae which makes this point much better than me:

2 Likes

That just means statis. We cannot tighten the rules and have no useful alternative to defer checks to runtime or an external prover. I am not particularly keen on tightening the rules (others are, though). But I will object to any effort to do so if we don’t have a good alternative to @unchecked in place.

We don’t customize pattern matching. We are asserting that a value fits its expected type, just like asSeenFrom. Nobody objects that being a method (in Scala, and lots of other languages), even though in C or Java it is special syntax.

What does asSeenFrom does? Is it a type cast?

If so then nobody objects because there’s a way to at least conceptualize (if not actually implement) the behavior of a cast as an abstraction taking an argument of A and producing a value of B. The objection is not about the cast but about the fact that x.foo match case bar => y does not have the same static semantics as x match case bar => y. To me, that is the heart of the problem.

Sorry, typo, I meant asInstanceOf. That’s my fingers typing… :smile:

I claim that’s the wrong way to see it. A type cast does not do or transform anything. It simply asserts that a value is of the given type at runtime. And asInstanceOf’s type argument can even be inferred (albeit with pitfalls). What I am saying over and over again is that checkedCast is just the same, only without the pitfalls of asInstanceOf.

2 Likes

For most intents and purposes, the type checker doesn’t have to care about what a function does at the call site. Whether or not it is actually no-op has no impact on the static semantics. That is the heart of my point and what I mean by “method-like features should behave like methods”.

I’m more interested about the static semantics here because this is the level at which I and the compiler can have a conversation about the well-typedness of my program.

More formally, you can for example write a single typing rule

Γ ⊢ e1 : t -> u   Γ ⊢ e2 : t
----------------------------- (app)
Γ ⊢ e1 e2 : u

and have your type checker be happy to “apply” any function you want as long as you convince it that it has type t -> u. In a language like Scala, that simply means providing a signature that the checker can trust. Whether or not the function performs magic is irrelevant. Likewise, I can use this one single rule to think about each and every application of my program, regardless of the spelling of e1.

IIUC, using method syntax does change the behavior of the type checker because it has to trust that your patterns will be checked at runtime if and only if the subject of the match has a particular syntax. Otherwise, it would check them at compile time. Going back to my pseudo-formal example, that would mean introducing another application rule that applies if and only if e1 is spelled unchecked.

“most intents and purposes” means no target typing. But target typing is a fact in Scala and many other languages. Implicit conversions would not work without it and half of type inference would not work without it either. The only difference between standard target typing and checkedCast is that we now also assume that the target type gives information about refutable patterns and inexhaustive matches. That’s an easy and natural generalization step in a situation where we want to convert these faults from runtime errors or warnings to static errors. Rejecting this analogy and elevating it to a fundamental difference is what I am continuously calling “hair-splitting”.

1 Like

So conceptually if I understand, what is happening is this transformation:

starting from

def xs: List[Any] = ???
val y :: ys = xs.checkedCast

xs.checkedCast match
  case y :: ys => ...

is “desugared” to

def xs: List[Any] = ???

val xs0: ::[Any] = xs match
  case xs0: ::[Any] => xs0
  case _ => ???

val y :: ys = xs0

val xs1: ::[Any] = xs match
  case xs1: ::[Any] => xs1
  case _ => ???

xs1 match
  case y :: ys => ...

and this desugaring is dependent on expected type.

so therefore

def myChecked[T](t: T): t.type = t.checkedCast

is “desugared” to

def myChecked[T](t: T): t.type = t match
  case t0: t.type => t0
  case _ => ???

which is a no-op and therefore checkedCast behaves like a method, can we move on :slight_smile:

the last conceptual desugaring to consider is when we have disjoint patterns, what is the meaning of the following:

val foo: Matchable = ...

foo.checkedCast match
  case str: String => str + " is matched"
  case i: Int => i * 23

well we do have union types so, massaging a bit there is this “desugaring”

val foo: Matchable = ...

val foo0: Int | String = foo match
  case foo0: String => foo0
  case foo0: Int => foo0
  case _ => ???

foo0 match
  case str: String => str + " is matched"
  case i: Int => i * 23

I guess the spec might not need to change to explicitly mention these rules, (because the desugaring doesn’t actually occur) but hopefully these examples show that the behaviour isn’t surprising

That’s not the only issue. With target typing, you can do

  • identity(foo.uncheckedCast) match
  • Seq(foo, bar).map(_.uncheckedCast: TargetType).map{ ...partial function }
  • val x: TargetType = foo.uncheckedCast; x match
  • foo.uncheckedCast[TargetType] match
  • val x = foo.uncheckedCast: TargetType; x match

People do such refactors all the time to work around target typing issues, and I’ve done several at work in the past week since target-type/implicit inference changed between 2.12 and 2.13, and you can always explicitly add the type ascription and/or explicitly specify the type parameter. This is not some edge case rules lawyering, but a very standard SOP for how you can refactor Scala expressions in the presence of target typing

IMO, if we can encode .checkedCast’s behavior properly in the type system, so that the above things work with some kind of notation for TargetType, I would believe it’s “just target typing” and is an elegant solution. This would be similar to how implicitly or identity are not special forms, but normal userland methods making use of generic Scala features. Those are great, and if we can solve more fundamental problems with similar userland methods, all the better

We already have PartialFunction which serves this purpose for lambdas, with some special language support, letting people write their own .collect methods that take incomplete match blocks. It has proven super useful. A special type that checkedCast returns with similar language support would resolve almost all the problems discussed by me, @nafg, @sjrd, and others

If we cannot encode .checkedCasts behavior such that it can be abstracted over or re-defined in userland, that looks a lot more like a special form than target typing to me. A special modifier that applies to certain keywords like val or match, rather than a target typed method returning a value of a well-defined type. And special forms deserve special syntax

People get confused by “method calls that are actually special syntax” all the time.

  • There was just another post on the #sbt discord a few hours ago about .value not being refactorable the way they expect. Some baseline level of confusion has been a constant ever since SBT 1.0 (“why does my task evaluate even when referenced in the non-taken branch of the if statement”) and Mill inherited the design and suffers similarly.

  • Scala-Async suffered from fundamental issues like “why can’t i put my await in a for loop?”, things that look like they should work but don’t since async/await are not really methods

  • Morgan Stanley’s discussed at Scaladays how their optimus platform that does similar transforms has constant issues with people writing code the “wrong way” and causing bottlenecks, because people assume syntax that looks like a normal method call is a normal method call, when it isn’t and obeys a different set of refactorability rules

Confusion around special forms that look too much like methods is not a hypothetical problem. In this case maybe the issues will be less extreme, but the fundamental problem of “this method call isnt really a method call” remains. In those other cases there’s a huge upside in code clarity that makes the downside perhaps justifiable, saving a whole bunch of zips and maps and flatMaps. But in the case of foo.uncheckedCast, the syntactic improvement over (foo: @unchecked) seems marginal enough that it doesn’t look worth it to me. I’ve heard a lot of complaints about Scala and the syntax for : @unchecked, while awkward, isn’t really a huge deal

4 Likes

What I was referring to wasn’t asInstanceOfs weird target-typing (mis?)behavior, but it’s ability to be wrapped in a forwarder/wrapper function that does the same thing, demonstrating that despite it’s magic implementation, it’s type signature is perfectly normal:

@ def foo[T](x: Any): T = x.asInstanceOf[T] 
defined function foo

@ foo[CharSequence]("hello") 
res3: CharSequence = "hello"

@ foo[CharSequence](123) 
java.lang.ClassCastException: class java.lang.Integer cannot be cast to class java.lang.CharSequence

This is not the case with .checkedCast. There is no way you can define foo to make foo(x) match behave the same way as x.checkedCast match, even though we can trivially define foo[T](x) to behave the same way as x.asInstanceOf[T]

The one quirk is that asInstanceOf treats the lack of an explicit type parameter poorly. I’m actually not sure why x.asInstanceOf without the explicit type parameter [T] defaults to Nothing even in the presence of target-typing, while a normal method (foo below) does not:

@ def foo[T](x: Any) = x.asInstanceOf[T] 
defined function foo

@ val x: CharSequence = foo("hello") 
x: CharSequence = "hello"

@ val x: CharSequence = "hello".asInstanceOf 
java.lang.ClassCastException: class java.lang.String cannot be cast to class scala.runtime.Nothing

Shouldn’t it be target-typed to T? Clearly normal methods would target-type correctly, as shown by foo above. This is something that has annoyed me for years writing Scala, there are plenty of times where I want to cast something with a good target type but am forced to tediously spell out the entire target type just so asInstanceOf doesn’t blow up.

If anything, this is an argument that we should have fewer methods that behave specially, not more. I have never found a scenario where asInstanceOf’s “special” type-inference/target-typing behavior was beneficial, but regularly find scenarios where it’s a hinderance and I would prefer it to behave like exactly other methods

7 Likes

So here’s a proposed tweak that keeps @odersky’s .checkedCast proposal mostly in place, but aims to make it more “well behaved” as a Scala “method” (names are all arbitrary and can be discussed later):


  1. checkedCast is defined in the standard library as below (it could also be an extension method or static method)
trait Any{ 
  def checkedCast: CheckedCast[this.type] = this.asInstanceOf[CheckedCast[this.type]] 
}
  1. scala.CheckedCast[T] is a “magic” opaque type alias defined in the standard lib
opaque type CheckedCast[T] = T
  1. scala.CheckedCast[T] is handled specially by the compiler. It has no members, and the only thing you can do to it is match or it or assign it to things. match or assigning a CheckedCast[T] treats it as a T, but disables the exhaustivity checking (this is the necessary compiler magic)

AFAICT, this should keep most of @odersky’s examples working, but also make all the cases that @sjrd and I brought up work


{ val y: x.type = x; y.checkedCast } match { ... }

This will work because we will no longer be looking syntactically at a trailing .checkedCast method, and would instead have the y.checkedCast return CheckCast[x.type] which would propagate out of the block and thus reach the match

val y: x.type = x.checkedCast
y match { ... }

This would now work because instead of val y: x.type, we would now have val y: CheckedCast[x.type], which would then reach the match

def f(y: CheckedCast[Foo]) = y match ...

f(x.uncheckedCast)

This would now work because CheckedCast[T] is a proper type, and we can have values of that type, parameters of that type, and it allows the x.uncheckedCast’s CheckedCast[Foo] propagate into f as the parameter y, to then reach the match

def foo[T](x: T) = x.checkedCast

A forwarder/alias foo can now be defined, with the type being foo[T](x: T): UncheckedCast[T]. You can now do foo(x) match and the .uncheckedCast inside the foo method will propagate out out of the foo to reach the match outside of it

identity(foo.checkedCast) match

This might work, because identity would infer the type parameter UncheckedCast[foo.type], which would propagate and then reach the match

Seq(foo, bar).map(_.checkedCast: TargetType).map{ ...partial function }

This would now work, where TargetType is written CheckedCast[T]. It would propagate into the parameter of map’s lambda, and allow the partial function passed to map to be partial

val x: TargetType = foo.checkedCast; x match

This would now work. TargetType would be spelled CheckedCast[foo.type]

foo.checkedCast[TargetType] match

This would work, as checkedCast is a normal method. It would not take a type parameter, so it would just be foo.checkedCast match

val x = foo.checkedCast: TargetType; x match

This would work, with TargetType being spelled CheckedCast[foo.type]


Overall, I think this would be a strict improvement over the original proposal. We still have some magic, but the magic is much more well defined:

  • A single generic opaque type alias which is handled specially during match and =.

  • No more fragile pattern matching on the exact shape of the syntax tree

  • .checkedCast becomes a normal method defined entirely in user-land that behaves identically to all other methods in all regards.

  • Even the type CheckedCast[T] behaves exactly like any other type, and can be written out in type ascriptions, inferred, used in generics (e.g. Seq[CheckedCast[T]]), etc. up until it reaches a match or = at which point the magic kicks in in a very localized fashion

I’m still not convinced that .checkedCast should look like a method. In all the examples given, it looks like it should be a keyword modifier on val or match. However, if we do assume that it should look like a method, these tweaks would ensure that it behaves like a method for most intents and purposes, except for a tiny bit of compiler magic to disable exhaustivity checking in match and val when the expression is of type CheckedCast[T]

8 Likes

As I see it, checkedCast will allow abstraction, as long as the expected type has enough information, to make this regular the change to spec would be to allow match to impart some constraints to type inference (to enable the tunnelling)