Synthesize constructor for opaque types

morgen-peschke · November 17, 2020, 9:13pm

There’s almost definitely a cleaner way of unwrapping it. The intention was to sanity check the feasibility of using a union with a path-dependent type to implement a zero-allocation wrapper that’s got nicer semantics than opaque types.

Opaque types are a bit awkward for this, as that’s apparently not really what they’re meant for. So if this type of encoding can work, then it’s possible we can refine that approach to get to a zero-allocation wrapper that can be defined in a couple of lines and has the sort of quality of life stuff you see in existing newtype implementations.

smarter · November 17, 2020, 9:25pm

I still don’t get it, you can do the same with an opaque type as far as I can tell:

trait NewType[Wrapped] {
  opaque type Type = Wrapped
  
  def apply(w: Wrapped): Type = w
  
  extension (t: Type) {
    def unwrap: Wrapped = t
  }
}

morgen-peschke · November 17, 2020, 10:36pm

Yep, that looks promising.

By attempting a direct translation of the existing solution, I missed a way to use opaque types to build a replacement (rather than treating opaque types as a direct replacement).

It looks like all the imports/scopes work intuitively, though there are probably additional opportunities to explore ways of keeping primitives from boxing.

object definitions {
  trait NewType[Wrapped] {
    opaque type Type = Wrapped

    def apply(w: Wrapped): Type = w

    extension (t: Type) {
      def unwrap: Wrapped = t
    }
  }
}
object types {
  object Name extends definitions.NewType[String]
  type Name = Name.Type

  object Email extends definitions.NewType[String]
  type Email = Email.Type
}
object using {
  import types.{Name, Email}

  @main
  def test (): Unit = {  
    def foo(str: String): Unit = {
      println(str)
    }
    def bar(n: Name, e: Email): Unit = {
      println(s"$n @ $e")
    }

    val name = Name("JDoe")
    val email = Email("[email protected]")
    
    //foo(name)
    //  Found:    (name : types.Name.Type)
    //  Required: String

    //foo(email)
    //  Found:    (email : types.Email.Type)
    //  Required: String

    foo(name.unwrap)
    foo(email.unwrap)

    bar(name, email)

    //bar(email, name)
    //  Found:    (email : types.Email.Type)
    //  Required: types.Name
    //  Found:    (name : types.Name.Type)
    //  Required: types.Email
  }
}

fanf · November 18, 2020, 9:02am

As a business user, not a library one, this is my use case, and the one for which I thought opaque types would help:

All my business objects are identified by a UUID (i.e. a string).
I have (tens? hundreds of?) millions of these IDs in hundreds of different business objects.
They are used in a lot of Maps, Sets, etc.
They all have a trivial “debug/display string” (the underlying string; note that I would LOVE to be able to forbid toString on them at compile time) and a trivial constructor (from the string).

My primary goal is to have a compiler that helps, so that I never have to wonder whether I mixed up a “GroupId” with a “RuleId”, can refactor safely, etc. For example, I want to be able to test them for equality and be told when I am not comparing the same kind of ID (that is always an error, most likely because I am refactoring and changed a method parameter or similar).
That help MUST be as easy to use and as natural in Scala code as possible, BUT explicit. I never ever want a string to be magically used as a RuleId, for example. Likewise, I never ever want to print a RuleId directly, always its debug string.
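One way the equality requirement above could be approached in Scala 3 is multiversal equality. This is only a sketch: the `ids` and `EqualityDemo` objects and the sample values are made up for illustration, and it assumes the `strictEquality` language import is acceptable for the code base.

```scala
import scala.language.strictEquality
import scala.compiletime.testing.typeChecks

object ids:
  opaque type RuleId = String
  object RuleId:
    def apply(value: String): RuleId = value
    // Only RuleId-to-RuleId comparisons are declared legal.
    given CanEqual[RuleId, RuleId] = CanEqual.derived

  opaque type GroupId = String
  object GroupId:
    def apply(value: String): GroupId = value
    // Only GroupId-to-GroupId comparisons are declared legal.
    given CanEqual[GroupId, GroupId] = CanEqual.derived

object EqualityDemo:
  import ids.*

  val rule: RuleId = RuleId("r-1")
  val group: GroupId = GroupId("g-1")

  // Same kind of ID: compiles, and compares the underlying strings at runtime.
  val sameKind: Boolean = rule == RuleId("r-1")

  // Different kinds of ID: there is no CanEqual[RuleId, GroupId],
  // so the compiler rejects the comparison; typeChecks reports that as false.
  val mixedCompiles: Boolean = typeChecks("rule == group")
```

Outside `ids` the two types are unrelated to `String` and to each other, so a refactoring that swaps a `RuleId` parameter for a `GroupId` surfaces as a compile error at every comparison site.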

Just to be very very clear: defining such objects must be as easy and boilerplate-free as possible, otherwise they are not used as much as they should be. This is an integral part of the first goal. Typically, I often define algorithm-local identifiers to differentiate between the steps of a computation. [EDIT: instantiating these objects must be trivial, i.e. case-class-like and no more, but defining them for the simple case must be simple too]

So until now I have used case classes for that, and in some cases value classes, but their constraints are rarely worth it (top-level definition, etc.):

final case class RuleId(value: String)

A second goal is to minimize as much as possible the runtime cost of a model that is only there to help developers at compile time. The fact that a RuleId is more than a string is useless at runtime. Actually, I would like to be able to explore alternative runtime representations without changing the compile-time API (which of course includes whatever serialization/debug representation is needed). For example, I would love to know whether a string encapsulated in a class is much more costly than, say, an array of two longs under real load. But today the cost of running that test is high.

Finally, some IDs are composed of two or more UUIDs, and I have a lot of other similar objects composed from a few simple types, like stats objects (tens of longs, each long having its own semantics). I want a consistent recipe for dealing with all these cases, so that a new dev on the project (or an old dev coming back to a similar problem a couple of months later) can just copy/paste an existing example without having to worry about too much ceremony, scalac implementation details, or other surprises.

For these three points, I would LOVE to have a consistent, boilerplate-light way to deal with that case (without having to wait for Valhalla).
I will explore whether opaque types are a good fit for that.

LPTK · November 18, 2020, 9:06am

This is totally unsound, regardless of how erasure is encoded.

@ toName("Frodo"): NameTag
java.lang.ClassCastException: java.base/java.lang.String cannot be cast to ammonite.$sess.cmd0$NameTag
  ammonite.$sess.cmd5$.<clinit>(cmd5.sc:1)

I also don’t know what you’re trying to show. As @smarter says, the same can be done (soundly) with opaque types today. I think the main problem raised by the community is that this is already too much boilerplate.

FelixHargreaves · November 18, 2020, 5:05pm

This!

eyalroth · November 19, 2020, 2:54pm

That is exactly the major motivation mentioned in the original SIP:

Authors often introduce type aliases to differentiate many values that share a very common type (e.g. String, Int, Double, Boolean, etc.). In some cases, these authors may believe that using type aliases such as Id and Password means that if they later mix these values up, the compiler will catch their error. However, since type aliases are replaced by their underlying type (e.g. String), these values are considered interchangeable (i.e. type aliases are not appropriate for differentiating various String values).
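To make the quoted point concrete, here is a small sketch contrasting plain aliases with opaque aliases. The object names, the `loginAlias`/`loginOpaque` helpers and the sample values are hypothetical and not taken from the SIP.

```scala
import scala.compiletime.testing.typeChecks

object aliases:
  // Plain aliases: both are just String to the compiler.
  type Id = String
  type Password = String

object opaques:
  // Opaque aliases: distinct types outside this object.
  opaque type Id = String
  opaque type Password = String
  def Id(s: String): Id = s
  def Password(s: String): Password = s

object AliasDemo:
  def loginAlias(id: aliases.Id, pw: aliases.Password): String = s"$id/$pw"
  def loginOpaque(id: opaques.Id, pw: opaques.Password): String = s"$id/$pw"

  val id: aliases.Id = "jdoe"
  val pw: aliases.Password = "hunter2"

  // Arguments swapped by mistake: this compiles, because both aliases dealias to String.
  val swapped: String = loginAlias(pw, id)

  // The same mistake with opaque types is rejected by the type checker.
  val opaqueSwapCompiles: Boolean =
    typeChecks("""loginOpaque(opaques.Password("hunter2"), opaques.Id("jdoe"))""")
```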

However, the SIP mentions that – up until now – this was possible to implement via value classes (AnyVal), but that the problem with them is that they do not have sufficient performance.

In short, the SIP tries to “kill two birds with one stone”, i.e., non-interchangeable types and better value classes (which have other uses as well).

Similar discussions seem to have arisen in the TypeScript community:

@RyanCavanaugh had a great analogy about this where 3 kids are asking their parents for a pet. One wants a dog, one wants a cat, one wants a fish. They ask their parents “when are we getting a pet!?” Clearly they all agree they want a pet, but each wants a different pet!

(quoted from this open issue about a similar feature for TS)

So yeah, I’m in favor of separating these features. One should target non-interchangeable type aliases (like UUID), and the other should target better-performing value classes.

FelixHargreaves · November 20, 2020, 3:03pm

I agree; however, I foresee arguments against this due to feature bloat, which leads me back to annotation macros. If we don’t want it as a language feature, we should allow easier language extensions. In general it feels like extending things that are not expressions is difficult / impossible in the current state of Scala 3 (please correct me if I’m wrong).

SethTisue · November 30, 2020, 3:39pm

latest example of opaque types being perceived by the community primarily as a value-class replacement: Dean Wampler’s https://medium.com/scala-3/opaque-type-aliases-and-open-classes-13076a6c07e4

FelixHargreaves · December 2, 2020, 1:36pm

It’s pretty clear that this was one of the key features in the SIP, so it’s natural that people will expect this.

mbloms · December 2, 2020, 2:30pm

I would like to add that, to me, the main use case for opaque types is when you want one type to represent something else. If you want to expose that any string can be used as a username, this is easily done in the current implementation:

object o {
  opaque type Username >: String = String
}
val uname: o.Username = "user"

If you want to expose that usernames are strings, then that is also easy:

object o {
  opaque type Username <: String = String
  def Username(str: String): Username = if str.contains(' ') then ??? else str
}
val str: String = o.Username("user")

There are many ways to implement other wanted behavior which would be much harder to do in other languages.

The current implementation is very general and that is desirable.

eyalroth · December 2, 2020, 5:10pm

mbloms:

If you want to expose that usernames are strings, then that is also easy:

object o {
  opaque type Username <: String = String
  def Username(str: String): Username = if str.contains(' ') then ??? else str
}
val str: String = o.Username("user")

I believe that this is not the desired functionality. Yes it is desired to have String’s methods on a Username variable, but there shouldn’t be an easy/implicit cast to String:

somekeyword type Password = String

val password: Password = "god123"

val strPassword: String = password // compilation error
password.toLowerCase() // what's the return type? (String or Password?)
password + "x" // should be allowed? what's the return type?

mbloms · December 2, 2020, 5:24pm

I think you have it backwards. If you write

opaque type Password <: String = String

then this does not compile:

val password: Password = "god123"

because Password is a subtype of String, but the opposite is not true.

All of this compiles fine, and since you use the Password as a String, the return type is String in all examples:

eyalroth:

val strPassword: String = password // compilation error
password.toLowerCase() // what's the return type? (String or Password?)
password + "x" // should be allowed? what's the return type?

In particular, the first line is not an error.
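A sketch that checks these claims, for anyone who wants to try it. The `secrets` and `BoundDemo` objects and the sample password are made up; `typeChecks` from `scala.compiletime.testing` is used so the rejected lines can sit next to the accepted ones.

```scala
import scala.compiletime.testing.typeChecks

object secrets:
  // The upper bound makes Password a subtype of String outside this object.
  opaque type Password <: String = String
  def Password(s: String): Password = s

object BoundDemo:
  import secrets.Password

  val password: Password = Password("God123")

  // Password can be used wherever a String is expected.
  val asString: String = password
  // String methods are available and return plain String.
  val lowered: String = password.toLowerCase()
  val appended: String = password + "x"

  // A raw String is not a Password, so this does not type-check.
  val rawStringAccepted: Boolean = typeChecks("""val p: Password = "god123" """)
  // The result of a String method is a String, not a Password.
  val loweredIsPassword: Boolean = typeChecks("val p: Password = password.toLowerCase()")
```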

eyalroth · December 2, 2020, 5:31pm

Well, yeah, that’s exactly my point – the current opaque implementation does not provide the desired functionality of non-interchangeable types, which is stated as one of the major motivations for the feature.

That’s kind of the whole point of this thread.

mbloms · December 2, 2020, 5:40pm

I don’t understand what you mean now. What is missing? If you don’t want to expose a subtyping relation between eg. Username and String then you just skip the bounds:

opaque type Username = String
def Username(str: String): Username = if <valid username> then str else ???

Now you can’t use a Username as a String or use a String as a Username.

FelixHargreaves · December 2, 2020, 6:00pm

If the validation fits here, it should probably return an Either[Error,Username], but I wouldn’t expect any apply method to do that. The call site syntax would be very confusing:

val myUsername = Username("JohnDoe") // I would expect the type to be `Username` here, not `Either[Error, Username]`
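A sketch of what keeping validation out of `apply` might look like. The `from` and `unsafe` names, the `accounts` object and the use of `String` as the error type are assumptions made for illustration; the rule itself is just the “no spaces” check from mbloms’s example.

```scala
object accounts:
  opaque type Username = String

  object Username:
    // Validating constructor: the possibility of failure is visible in the return type.
    def from(str: String): Either[String, Username] =
      if str.contains(' ') then Left(s"invalid username: '$str'")
      else Right(str)

    // Non-validating constructor for values that are already trusted.
    def unsafe(str: String): Username = str

  extension (u: Username)
    // Explicit way back to the underlying String.
    def value: String = u

object ValidationDemo:
  import accounts.*

  val ok: Either[String, Username] = Username.from("JohnDoe")
  val bad: Either[String, Username] = Username.from("John Doe")
```

With named constructors the call site says which behaviour it gets, so `Username.from("JohnDoe")` returning an `Either` is not a surprise.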

I think the problem arises because there’s no way to mitigate the boilerplate, especially in the many cases where I don’t want to do any validation, but am using the opaque type only for tagging.
I foresee many many lines of code similar to this:

opaque type Username = String
def Username(str: String): Username = str

opaque type Token = String
def Token(str: String): Token = str

opaque type UserId = UUID
def UserId(uuid: UUID): UserId = uuid

...

Add to this extension methods to make the reverse transformation for each opaque type.

I would consider this good practice compared to using the underlying types directly; however, I think people will grow tired of this boilerplate and start using the underlying types directly.

mbloms · December 2, 2020, 6:12pm

Well yes, you could do that with a normal method instead of a simple apply if that fits better, which is also my whole point here. The current implementation does not favor one use case over another, and that’s a good thing.

I really don’t see how this is that big of a deal. It’s readable and I’ve had to write much worse boilerplate than this. It also really should be easy to automate this if you want to.

eyalroth · December 2, 2020, 6:29pm

But then I don’t get the methods of String for Username. For the same reason I don’t use AnyVals to provide me with non-interchangeable types. It’s not because they’re not performant enough (I rarely care for this level of performance), but because they hide all the methods of the underlying type (just like opaque).

FelixHargreaves · December 2, 2020, 6:39pm

How? I would turn to macros for this, but it doesn’t seem like that’s possible. Generating code from the outside feels like a non-solution to me.

kavedaa · December 2, 2020, 6:59pm

There are 3 variants of opacity: full opacity, and opacity in one direction or the other:

object A:

  opaque type Id1 = String
  def Id1(s: String) = s: Id1
  extension (id1: Id1):
    def s1: String = id1

  opaque type Id2 <: String = String
  def Id2(s: String) = s: Id2

  opaque type Id3 >: String = String
  extension (id3: Id3):
    def s3: String = id3

end A

For the first variant you need an explicit “conversion” in both directions; for the other two, in only one direction.

Usage:

val a1: A.Id1 = A.Id1("foo")
val s1: String = a1.s1
val l1 = a1.s1.length

val a2: A.Id2 = A.Id2("foo")
val s2: String = a2
val l2 = a2.length

val a3: A.Id3 = "foo"
val s3: String = a3.s3
val l3 = a3.s3.length

When you want to distinguish between e.g. User.Id and Invoice.Id (which to me is the most useful application of the feature), variant 2 would be most suitable. Here you can use the opaque type as the underlying type, but not vice versa. The methods on the underlying type are also directly available.

I agree that it would be useful to optionally synthesize both the “constructor” and the extension method.
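Until something like that is synthesized, variant 2 can be combined with the NewType trait smarter posted earlier in the thread. This is only a sketch: the `Subtype` trait name and the `newtypes`, `model` and `SubtypeDemo` objects are hypothetical.

```scala
import scala.compiletime.testing.typeChecks

object newtypes:
  // Like NewType above, but with an upper bound, so the wrapper is usable
  // as the underlying type and no unwrap extension is needed.
  trait Subtype[Wrapped]:
    opaque type Type <: Wrapped = Wrapped
    def apply(w: Wrapped): Type = w

object model:
  object UserId extends newtypes.Subtype[String]
  type UserId = UserId.Type

  object InvoiceId extends newtypes.Subtype[String]
  type InvoiceId = InvoiceId.Type

object SubtypeDemo:
  import model.*

  val userId: UserId = UserId("u-42")

  // Methods of the underlying type are directly available.
  val length: Int = userId.length
  // The wrapper can be used as a String.
  val raw: String = userId

  // Mixing up the two kinds of ID is rejected.
  val mixUpCompiles: Boolean = typeChecks("val i: InvoiceId = userId")
  // A raw String is not accepted as a UserId.
  val rawAccepted: Boolean = typeChecks("""val u: UserId = "u-42" """)
```

Each new ID type then costs two lines, at the price of exposing every `String` method on it.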