Synthesize constructor for opaque types

There is a massive difference here: most bad typed patterns give unchecked warnings, and even when they don’t, someone who tries to use an Integer as a String at least gets a ClassCastException. Neither safety net exists for opaque types, since they have the same erasure as their underlying type.

What’s more, type tests, which were previously always sound, are now leaky, because we can’t know, without resorting to the closed-world assumption, that someone won’t come along and define an opaque type alias we don’t know about.

For this reason, one should never think of opaques as equivalent to Haskell’s newtype. If you can’t afford someone accidentally but successfully casting your opaque type to its underlying representation, you should use something else.
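
For illustration, here is what such an accidental conversion looks like in practice (a small sketch; the UserId name is made up):

object ids {
  opaque type UserId = String
  object UserId {
    def apply(s: String): UserId = s
  }
}

import ids._

// The typed pattern "succeeds" silently: at runtime a UserId is just a String,
// so there is no ClassCastException and no unchecked warning to stop you.
val leaked: String = (UserId("alice"): Any) match {
  case s: String => s
}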

But yes, Scala’s pattern matching is a leaky abstraction. I hope TypeTest will improve things, but phasing out unsound pattern matching from the language is not a small project.

5 Likes

To add to what @mbloms said:

That’s exactly what asInstanceOf is as of today. It completely violates the soundness of any and all type-system features that go beyond the JVM’s own very limited runtime type system.

Path-dependent types, refinements, singleton types, intersections, etc. are all unsound in the presence of asInstanceOf. This is understood and does not nullify the usefulness of the type system. You have similar unsound escape hatches in most practical languages, including OCaml, Haskell, and even Idris.

On the other hand, well-typed and warning-free pattern matching is not supposed to be an unsound escape hatch.

Not an unfounded point of view! :slight_smile:

Whenever we talk about pattern matching in Scala, we’re implicitly also talking about isInstanceOf, which is supposed to correspond precisely with the type-filtering part of pattern matching semantics (something that fails to compile or warns with one should also fail to compile or warn using the other). With parametric Top, isInstanceOf would simply not be allowed on values of Top type.

4 Likes

I think the right way to go about leaky abstraction problems is to look into a parametric top type. Once we have Top, opaque types should have it as their default bound. This will rule out pattern matching, equality, hashCode, and toString on opaque types.
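
To make the leak concrete, today all of Any’s universal members still see straight through an opaque type (a small illustration; Token is a made-up name):

object lib {
  opaque type Token = String
  object Token {
    def apply(s: String): Token = s
  }
}

import lib._

val t = Token("abc123")
val s  = t.toString                       // "abc123": the representation leaks
val eq = t == "abc123"                    // true under universal equality
val h  = t.hashCode == "abc123".hashCode  // true

Under the proposed Top bound, none of these calls would type-check.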

Here are some things to sort out:

  • This is a rather complex change, so it will probably have to come after 3.0. But since it does not affect binary compatibility, it could come soon after. However, there will then be a window of vulnerability where code can use universal methods on opaque types. This code will break when we change to parametric Top. Is that OK?

  • Logically, the same change should be done for abstract types and type parameters. I.e. type X would be type X <: Top, and [X] would be [X <: Top]. But this would probably break too much code. So we might have to live with the existing default Any and demand explicit opt-in for the Top bound.

11 Likes

I’m really looking forward to this! Would it make sense to have a transition period where using non-parametric methods on unannotated abstract types raises a warning? This way, people will have time to move to <: Any bounds or proper type classes before the warning becomes an error.

2 Likes

Would it make sense to have a transition period where using non-parametric methods on unannotated abstract types raises a warning?

Yes, maybe. We could make the default upper bound type an alias of Any and rig the type-checker to issue a warning if an Any method is called on this one. But we’d have to evaluate how annoying this would be in practice.

4 Likes

I think there are many use cases that are hard to represent using the current type hierarchy, and changing it would be a very good opportunity to try and address as much of that as possible, while staying as compatible as possible with current code. It seems to me to be quite a big feat so it would probably be smart not to rush it.

That said, while having the full power of newtype in opaque types would be extremely useful, especially for IArray and similar use cases, opaque types are already valuable despite these weaknesses.

For example, many of the use cases in Haskell export the constructor/destructor. This is used all the time to make it possible to safely define multiple given instances. A notable example of this is Monoid:
https://hackage.haskell.org/package/base-4.14.0.0/docs/Data-Monoid.html

This use case fits perfectly with opaque types, and isn’t at all diminished by morally dubious pattern matching:
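
Here is a rough sketch of that style with opaque types (the Monoid trait and the Sum/Product wrappers below are illustrative stand-ins for Data.Monoid’s newtypes, not library code):

object monoids {
  trait Monoid[A] {
    def empty: A
    def combine(x: A, y: A): A
  }

  // Two opaque wrappers over Int, so that two Monoid instances can coexist
  // safely, in the spirit of Haskell's Sum and Product newtypes.
  opaque type Sum = Int
  object Sum {
    def apply(i: Int): Sum = i
    extension (s: Sum) def value: Int = s
    given Monoid[Sum] = new Monoid[Sum] {
      def empty: Sum = 0
      def combine(x: Sum, y: Sum): Sum = x + y
    }
  }

  opaque type Product = Int
  object Product {
    def apply(i: Int): Product = i
    extension (p: Product) def value: Int = p
    given Monoid[Product] = new Monoid[Product] {
      def empty: Product = 1
      def combine(x: Product, y: Product): Product = x * y
    }
  }
}

Callers pick the semantics by choosing the wrapper, just as with Haskell’s Sum and Product.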

I think the problem of code downcasting from Any breaking in the future could be solved quite cleanly:

  • Even with the addition of Top as a supertype of Any, opaque types with an upper bound that is a subtype of Any will still have the problems described above. One solution could be to disallow downcasts to opaque types already now. That way, opaque types can still be converted into the underlying type via Any, but not the other way around. Then, when the opaque type is changed to have Top as its upper bound, it will no longer be possible to cast it to Any.
  • If in the future someone wants morally dubious type conversions to be possible, they can use Any as the upper bound explicitly.

Not allowing downcasting to an opaque type also fits quite nicely with the fact that usually if I define an

opaque type MySpecialBox <: Box = Box

I probably don’t want someone to downcast from any Box to MySpecialBox. If on the other hand I have:

opaque type Boxish <: AnyRef = Box

I really see no way to prevent someone from turning something Boxish into a Box. A Top type doesn’t help at all. The only way to prevent that would be by restricting typed patterns quite radically, which would be an interesting move, but probably a bit too extreme. Actually @odersky, do you have an idea of how hard it would be to require a TypeTest instance for all pattern matching?

TL;DR:

  • Opaques are great, they will be even better with Top
  • Considering adding any kind of sugar like case opaque type should wait until we have Top
  • Downcasting to opaque types should be illegal (unless using the-method-which-should-not-be-named)
  • Let users worry about the other direction until we have Top

I agree that pattern matching against an opaque type pattern should give at least an unchecked warning: https://github.com/lampepfl/dotty/pull/10664

7 Likes

Would it be possible to make it illegal? People who want to do a type coercion could always use asInstanceOf. Maybe it wouldn’t be so bad to have some precedent for illegal type testing? People defining opaque types could always define TypeTest instances if they want type tests to be possible.
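
For reference, such an opt-in is already expressible with scala.reflect.TypeTest; a sketch (the Name opaque type here is made up):

import scala.reflect.TypeTest

object lib {
  opaque type Name = String
  object Name {
    def apply(s: String): Name = s

    // Explicitly opt back in to runtime type tests for Name by checking the
    // underlying representation. With this given in scope at the use site,
    // a pattern like `case n: Name =>` is checked through it instead of
    // producing an unchecked warning.
    given TypeTest[Any, Name] = new TypeTest[Any, Name] {
      def unapply(x: Any): Option[x.type & Name] =
        if (x.isInstanceOf[String]) Some(x.asInstanceOf[x.type & Name]) else None
    }
  }
}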

Ignoring unchecked warnings is already unsafe, so I don’t think that going further for one special case is useful. Perhaps there’s a debate to be had on whether unchecked warnings should be errors by default, but that’s straying off topic.

2 Likes

Yeah, that’s fascinating. I’ve had plenty of experiences where avoiding boxing made a significant perf improvement (very rigorously validated by JMH), and I’ve also had experiences where I’d done a bunch of work that was undoubtedly a theoretical improvement, only to see the results either not change or sometimes even get worse. Part of me thinks domain vs generic is a heuristic for determining whether boxing will make a difference, but I wouldn’t put much faith in that, even myself. The JVM still perplexes me to this day. Anyway…

I did not close any PR or issue, or shut down any discussion, just offered my opinion. Others are free to disagree.

100%, and I want to emphasise that I haven’t seen you personally close a PR or shut down a discussion. As far as I can see, no one at all has shut down any discussion on this topic yet. To clarify, the reason I mentioned that at all is that it’s a pattern I’ve seen a few times: a discussion concludes but, rightly or wrongly, a majority (or maybe just a very large group) remains unconvinced, and the documentation and/or the community fail to sway new users who come across the same common use case. In those situations, PRs and discussions do end up getting shut down because, in the eyes of maintainers, the ship has sailed, the debate has been had, and there’s no value in repeating it. To me this issue is (was?) starting to look like it would go down that path, which is why I want to highlight it now, so that even if we don’t modify the implementation, we beef up the documentation to address those common use cases as transparently as we can, even if a significant number of people disagree that opaque types are relevant to those use cases.

4 Likes

Fair enough. I think it’s safe to say that I’ve already derailed this topic to the limit. :sweat_smile:

Before I get a grip on myself and stop spamming I just want to say thanks to @odersky and @smarter for taking the time to address this despite everything on your plate with the Scala 3 launch! As a newcomer on this forum, that feels great! :tada:

3 Likes

Some new info gained from experiments is here: https://github.com/lampepfl/dotty/issues/10662#issuecomment-739852480

5 Likes

This is very interesting. Am I correct in assuming this deals with edge cases where the newtype-like opaque type usage is unsafe? (I’m missing the tl;dr)

Yes, that’s right. I’ll try to summarize:

EDIT: To clarify, the solution makes it safe to emulate Haskell’s newtype using opaque type, but only if no upper type bound is exposed, just as in Haskell, which has no subtype polymorphism at all, only parametric polymorphism.

Summary

Problem

The semantics of opaque type are not the same as newtype in Haskell; opaque type is actually much more general. The main reason opaque type doesn’t encapsulate in the same way newtype does is that Scala allows you to pattern match on literally Anything, so you can expose the underlying representation (intentionally or by mistake) using pattern matching.

Quoted example (minimized)

For this reason, users who follow the pattern of defining a constructor/unapply method will experience some gotcha moments when opaque types don’t behave the way case classes would.*

Solution

This is solved in two steps:

  1. Disallow opaque types in typed patterns (like the one above).
  2. Restrict Any so that only subtypes of a new trait Matchable can be used as the scrutinee in pattern matching.

(1) prevents conversions like String -> Name by warning on patterns like:

(str: String) match {case n: Name => n}

Because Name can’t appear in a typed pattern in the case clause anymore.

(2) prevents conversions like Name -> String by warning on patterns like:

(n: Name) match {case str: String => str}

Because Name isn’t a subtype of Matchable so it can’t be pattern matched on at all.

Demo: (minimized)
scala> object n :
     |   opaque type Name = String
     |   object Name :
     |     def apply(str: String): Name = str
     |     def unapply(n: Name): Some[String] = Some(n)
// defined object n

scala> import n._

scala> "hi there" match {case n: Name => n}
1 |"hi there" match {case n: Name => n}
  |                       ^^^^^^^
  |                     the type test for n.Name cannot be checked at runtime
val res0: String & n.Name = hi there

scala> Name("Franz Kafka") match {case str: String => str}
1 |Name("Franz Kafka") match {case str: String => str}
  |                                     ^^^^^^
  |                      pattern selector should be an instance of Matchable,
  |                      but it has unmatchable type n.Name instead
val res1: n.Name & String = Franz Kafka

See this for more:

Note that this isn’t in master yet, and warnings won’t be turned on in 3.0.

*This is also why case classes should still be encouraged and preferred when their semantics are needed! Some kind of sugar like case opaque type could maybe be added in the future, when the semantics of opaque type are better understood in practice and the feature has matured more. Preferably, these unsafe patterns should become compiler errors rather than just warnings before such sugar is added.

1 Like

Matchable isn’t enough to make all usages of opaque types safe, as I mentioned in Add `Matchable` trait - #3 by smarter: the problem is that users are free to define a visible upper bound for their opaque type which is itself a subtype of Matchable.

2 Likes

Yes, though it is enough to emulate Haskell’s newtype (unless I’m mistaken?), since that doesn’t support type bounds anyway.

I have the feeling that this discussion, like the previous ones on the topic, has failed to conclude one major question – What is the main motivation behind opaques?

The original SIP explains the motivation as so:

Authors often introduce type aliases to differentiate many values that share a very common type (e.g. String, Int, Double, Boolean, etc.). In some cases, these authors may believe that using type aliases such as Id and Password means that if they later mix these values up, the compiler will catch their error. However, since type aliases are replaced by their underlying type (e.g. String), these values are considered interchangeable (i.e. type aliases are not appropriate for differentiating various String values).

The current design of opaque types does not achieve that in the slightest, IMHO, and for one simple reason – opaques do not provide access to the methods of the underlying type (like type aliases do).

If better performing value-classes/wrappers is what people want, then there’s nothing wrong with the current design, but please let’s at least have the motivation for opaques cleared up and updated.

1 Like

It’s rather boilerplate-heavy, since (at least as I understand it) macros can’t currently produce definitions and you have to re-wrap the type after each call, but you can use opaque types in a way that provides access to the methods of the underlying type:

object types {
  object Name {
    opaque type Type <: String = String
    def apply(s: String): Type = s
  }
  type Name = Name.Type
  
  object Email {
    opaque type Type <: String = String
    def apply(s: String): Type = s
  }
  type Email = Email.Type
}
object using {
  import types.{Name, Email}

  @main
  def test(): Unit = {
    def foo(str: String): Unit = {
      println(str)
    }
    def bar(n: Name, e: Email): Unit = {
      println(s"$n @ $e")
    }

    val name = Name("JDoe")
    val email = Email("jdoe@example.com")

    // Without the `<: String` upper bound on Type, these calls would be
    // rejected with errors like:
    //   Found:    (name : types.Name.Type)
    //   Required: String

    // With the bound, both opaque types can be used wherever a String is expected:
    foo(name)
    foo(email)

    bar(name, email)

    println(name.toLowerCase())
    
    //bar(name.toLowerCase(), email)
    //  Found:    String
    //  Required: types.Name
    
    bar(Name(name.toLowerCase()), email)
    
    //bar(email, name)
    //  Found:    (email : types.Email.Type)
    //  Required: types.Name
    //  Found:    (name : types.Name.Type)
    //  Required: types.Email
  }
}

You can extract the boilerplate into a trait:

  trait OpaqueOf[A] {
    opaque type Type <: A = A
    def apply(s: A): Type = s
  }
  object Name extends OpaqueOf[String]
  type Name = Name.Type
  object Email extends OpaqueOf[String]
  type Email = Email.Type
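
For instance, a validating constructor could be layered on top of this pattern (a rough sketch; ValidatedOpaqueOf, from, and the String error type are made up for illustration):

  trait ValidatedOpaqueOf[A] {
    opaque type Type <: A = A
    protected def validate(a: A): Either[String, A]
    // Inside the trait the opaque alias is transparent, so the validated A
    // can be returned as Type directly.
    def from(a: A): Either[String, Type] = validate(a)
  }

  object Email extends ValidatedOpaqueOf[String] {
    protected def validate(s: String): Either[String, String] =
      if (s.contains("@")) Right(s) else Left(s"not an email: $s")
  }
  type Email = Email.Type

  // Email.from("jdoe@example.com")  // Right(...)
  // Email.from("oops")              // Left(not an email: oops)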

And, as the sketch above suggests, it can be extended further, e.g. with a validating function to go from A to the opaque type. I think the boilerplate isn’t the biggest issue here, though. The criticism, as I understand it, is that you can’t do this:

object types {
  object Duration {
    opaque type Type <: Double = Double
    def apply(n: Double): Type = n
  }
  type Duration = Duration.Type
}

object using {
  import types.Duration

  @main
  def test(): Unit = {
    val d1 = Duration(5.0)
    val d2 = Duration(2.0)
    val d3: Duration = d1 + d2  // should compile but doesn't
    println(s"d1: $d1 d2: $d2 total: $d3")

    val d4 = 10.0 + d1 // should not compile but does
  }
}

My understanding is that this is less than ideal because it will force primitives to box. I’d normally use this for stuff that’s already objects, so it’s less of a problem for me.

With a minor tweak, it does work. The missing step is that the result needs to be re-wrapped, because the result of _: Duration + _: Duration is a Double. That makes sense, because an operation on the value may make it no longer valid; for example, there’s no way to guarantee that (_: Email).take(5) is still a valid Email:

object types {
  object Duration {
    opaque type Type <: Double = Double
    def apply(n: Double): Type = n
  }
  type Duration = Duration.Type
}

object using {
  import types.Duration

  @main
  def test(): Unit = {
    val d1 = Duration(5.0)
    val d2 = Duration(2.0)
    val d3: Duration = Duration(d1 + d2)
    println(s"d1: $d1 d2: $d2 total: $d3")

    val d4: Double = 10.0 + d1
    val d5: Duration = Duration(10.0d + d1)
  }
}

The d4 line compiles because d1 is known to be a subtype of Double, which is usually fine for this sort of wrapper. Usually what you’d want to avoid is mixing the different wrapper types, or passing a plain Double where you want to be sure that it’s a valid Duration.

Edit: sorry for being scatter-brained, here’s the Scastie link

1 Like