Synthesize constructor for opaque types

Good to know. Do you know if the encoding used by https://github.com/rudogma/scala-supertagged has the same issues?

It seems to be relying on similar tricks with intersection types and casting so it’s likely also problematic.

I’m not familiar with what the generated bytecode looks like, do you think it would be possible to implement some of these techniques using union types rather than intersection types?

This is obviously extremely rough, and I imagine it’d probably box primitives, but it looks like it might be workable, depending on what the compiler generates:

trait NewType[Wrapped] {
  trait Wrapper
  type Type = Wrapped | this.Wrapper
  
  def apply(w: Wrapped): Type = w
  
  extension (t: Type) {
    def unwrap: Wrapped = t.asInstanceOf[Wrapped]
  }
}

object Name extends NewType[String]
type Name = Name.Type
             
object Email extends NewType[String]
type Email = Email.Type

@main
def test (): Unit = {  
  def foo(str: String): Unit = {
    println(str)
  }
  def bar(n: Name, e: Email): Unit = {
    println(s"$n @ $e")
  }
  
  val name = Name("JDoe")
  val email = Email("[email protected]")
  //foo(name)
  //  Found:    (name : Name.Type)
  //  Required: String
  
  //foo(email)
  //  Found:    (email : Email.Type)
  //  Required: String
  
  foo(name.unwrap)
  foo(email.unwrap)
  
  bar(name, email)
  
  //bar(email, name)
  //  Found:    (email : Email.Type)
  //  Required: Name
  //  Found:    (name : Name.Type)
  //  Required: Email
}

I don’t recommend doing anything that involves asInstanceOf as it’s really hard to guarantee that it’ll always work correctly. I also don’t understand what the code you wrote is trying to achieve, what is the purpose of the union type here? What is this supposed to do compared to using an opaque type?


There’s almost definitely a cleaner way of unwrapping it. The intention was to sanity check the feasibility of using a union with a path-dependent type to implement a zero-allocation wrapper that’s got nicer semantics than opaque types.

Opaque types are a bit awkward for this, as that’s apparently not really what they’re meant for. So if this type of encoding can work, then it’s possible we can refine that approach to get to a zero-allocation wrapper that can be defined in a couple of lines and has the sort of quality of life stuff you see in existing newtype implementations.

I still don’t get it, you can do the same with an opaque type as far as I can tell:

trait NewType[Wrapped] {
  opaque type Type = Wrapped
  
  def apply(w: Wrapped): Type = w
  
  extension (t: Type) {
    def unwrap: Wrapped = t
  }
}

Yep, that looks promising.

Attempting a direct translation of the existing solution missed a way to use opaque types to build a replacement (rather than opaque types being a direct replacement) :man_facepalming:

It looks like all the imports/scopes work intuitively, though there are probably additional opportunities for exploring ways to keep primitives from boxing.

object definitions {
  trait NewType[Wrapped] {
    opaque type Type = Wrapped

    def apply(w: Wrapped): Type = w

    extension (t: Type) {
      def unwrap: Wrapped = t
    }
  }
}
object types {
  object Name extends definitions.NewType[String]
  type Name = Name.Type

  object Email extends definitions.NewType[String]
  type Email = Email.Type
}
object using {
  import types.{Name, Email}

  @main
  def test (): Unit = {  
    def foo(str: String): Unit = {
      println(str)
    }
    def bar(n: Name, e: Email): Unit = {
      println(s"$n @ $e")
    }

    val name = Name("JDoe")
    val email = Email("[email protected]")
    
    //foo(name)
    //  Found:    (name : types.Name.Type)
    //  Required: String

    //foo(email)
    //  Found:    (email : types.Email.Type)
    //  Required: String

    foo(name.unwrap)
    foo(email.unwrap)

    bar(name, email)

    //bar(email, name)
    //  Found:    (email : types.Email.Type)
    //  Required: types.Name
    //  Found:    (name : types.Name.Type)
    //  Required: types.Email
  }
}

As a business user, not a library author, this is my use case, and the one for which I thought opaque types would help:

  • all my business objects are identified by a UUID (i.e. a string)
  • I have (tens? hundreds?) of millions of these IDs in hundreds of different business objects.
  • they are used in a lot of Maps, Sets, etc.
  • they all have a trivial “debug/display string” (the underlying string - note that I would LOVE to be able to forbid toString on them at compile time) and a trivial constructor (from the string).

My primary goal is to have a compiler that helps, so I never ever want to have to wonder whether I mixed a “GroupId” with a “RuleId”, and I want to be able to refactor safely, etc. For example, I want to be able to test for equality on them and know when I am not comparing the same kind of IDs (it’s always an error, most likely because I’m refactoring and changed a method parameter or whatever).
That help MUST be as easy to use and natural in Scala code as possible, BUT explicit. I never ever want a string to be magically used as a RuleId, for example. Likewise, I never ever want to print a RuleId directly, always its debug string.

Just to be very very clear: defining such objects must be as easy and boilerplate-free as possible, otherwise they are not used as much as they should be. This is an integral part of the first goal. Typically, I often define algorithm-local identifiers to differentiate between different steps of the computation. [EDIT: instantiating these objects must be trivial, i.e. case-class-like and no more, but defining them for the simple case must be simple too]

Until now, I have used case classes for that, and in some cases value classes, but their constraints (top-level definition, etc.) are rarely worth it.

final case class RuleId(value: String) 

A second goal is to minimize, as much as possible, the runtime cost of a model that exists only to help developers at compile time. The fact that a RuleId is more than a string is useless at runtime. Actually, I would like to be able to explore alternative runtime representations without changing the compile-time API (which of course includes whatever serialization/debug representation is needed). For example, I would love to know whether a string encapsulated in a class is much more costly than, say, an array of two longs under real load. But today the cost of running that test is high.
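
To make that second goal concrete, here is a minimal sketch of how an opaque type might hide the runtime representation behind a fixed compile-time API. The names `ids`, `RuleId.apply` and `debugString` are hypothetical, and `java.util.UUID` (two longs internally) merely stands in for "some alternative representation":

```scala
import java.util.UUID

object ids {
  // Assumption for illustration: callers only ever see RuleId, RuleId(...) and debugString.
  // Swapping UUID for String (or another representation) only touches this object.
  opaque type RuleId = UUID

  object RuleId {
    // Trivial constructor from the string form of the ID.
    def apply(s: String): RuleId = UUID.fromString(s)
  }

  // The explicit "debug/display string" accessor.
  extension (id: RuleId) def debugString: String = id.toString
}

val rule = ids.RuleId("123e4567-e89b-12d3-a456-426614174000")
val shown: String = rule.debugString
```

Outside `ids`, a `RuleId` is neither a `String` nor a `UUID` as far as the type checker is concerned, so call sites should not need to change if the alias on the right-hand side does.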

Finally, some IDs are composed of 2 or more UUIDs, and I have a lot of other similar objects composed of a few simple types, like stats objects (tens of longs, each long having its own semantics). I want a consistent recipe for dealing with all these cases, so that a new dev on the project (or an old dev coming back to a similar problem a couple of months later) can just copy/paste an existing example without having to worry about too much ceremony, scalac implementation details, or other surprising things.

For these three points, I would LOVE to have a consistent, boilerplate-light way to deal with that case (without having to wait for Valhalla).
I will explore whether opaque types are a good fit for that.


This is totally unsound, regardless of how erasure is encoded.

@ toName("Frodo"): NameTag
java.lang.ClassCastException: java.base/java.lang.String cannot be cast to ammonite.$sess.cmd0$NameTag
  ammonite.$sess.cmd5$.<clinit>(cmd5.sc:1)

I also don’t know what you’re trying to show. As @smarter says, the same can be done (soundly) with opaque types today. I think the main problem raised by the community is that this is already too much boilerplate.


This!

That is exactly the major motivation mentioned in the original SIP:

Authors often introduce type aliases to differentiate many values that share a very common type (e.g. String , Int , Double , Boolean , etc.). In some cases, these authors may believe that using type aliases such as Id and Password means that if they later mix these values up, the compiler will catch their error. However, since type aliases are replaced by their underlying type (e.g. String ), these values are considered interchangeable (i.e. type aliases are not appropriate for differentiating various String values).
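
A small sketch of the pitfall the SIP describes (the `login` function and the values are made up for illustration): with plain type aliases, swapping the arguments still compiles.

```scala
type Id = String
type Password = String

// Both parameters are really just String as far as the compiler is concerned.
def login(id: Id, password: Password): String = s"$id:$password"

val id: Id = "alice"
val pw: Password = "secret"

// Arguments are swapped, yet this type-checks because Id and Password are interchangeable.
val swapped = login(pw, id)
```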

However, the SIP mentions that – up until now – this could be implemented via value classes (AnyVal), but that the problem with them is that they do not perform well enough.

In short, the SIP tries to “kill two birds with one stone”, i.e., non-interchangeable types and better value classes (which have other uses as well).

Similar discussions seem to have arisen in the TypeScript community:

@RyanCavanaugh had a great analogy about this where 3 kids are asking their parents for a pet. One wants a dog, one wants a cat, one wants a fish. They ask their parents “when are we getting a pet!?” Clearly they all agree they want a pet, but each wants a different pet!

(quoted from this open issue about a similar feature for TS)

So yeah, I’m in favor of separating these features. One should target non-interchangeable type aliases (like UUID), and the other should target better-performing value classes.


I agree, however, I foresee arguments against this due to feature bloat. Which leads me back to annotation macros. If we don’t want it as a language feature, we should allow easier language extensions. In general it feels like extending things that are not expressions is difficult / impossible in the current state of Scala 3 (please correct me if I’m wrong)


The latest example of opaque types being perceived by the community primarily as a value-class replacement: Dean Wampler’s https://medium.com/scala-3/opaque-type-aliases-and-open-classes-13076a6c07e4


It’s pretty clear that this was one of the key features in the SIP, so it’s natural that people will expect this.

I would like to add that I find the main use case for opaque types is when you want to have one type represent something else. If you want to expose that any string can be used as a username, this is easily done in the current implementation:

object o {
  opaque type Username >: String = String
}
val uname: o.Username = "user"

If you want to expose that usernames are strings, then that is also easy:

object o {
  opaque type Username <: String = String
  def Username(str: String): Username = if str.contains(' ') then ??? else str
}
val str: String = o.Username("user")

There are many ways to implement other wanted behavior which would be much harder to do in other languages.

The current implementation is very general and that is desirable.


I believe that this is not the desired functionality. Yes it is desired to have String’s methods on a Username variable, but there shouldn’t be an easy/implicit cast to String:

somekeyword type Password = String

val password: Password = "god123"

val strPassword: String = password // compilation error
password.toLowerCase() // what's the return type? (String or Password?)
password + "x" // should be allowed? what's the return type?

I think you have it backwards. If you write

opaque type Password <: String = String

then this does not compile:

Because Password is a subtype of String, but the opposite is not true.

All of this compiles fine, and since you use the Password as a String, the return type is String in all examples:

In particular, the first line is not an error.
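
For reference, a minimal sketch of both cases, reusing the `Password` lines from the earlier post. The `passwords` wrapper object and the `Password(...)` constructor are assumptions added only so that the snippet is self-contained:

```scala
object passwords {
  // Upper bound only: every Password is a String, but not every String is a Password.
  opaque type Password <: String = String
  def Password(s: String): Password = s
}
import passwords.Password

val password: Password = Password("god123")

// Does not compile outside `passwords`, since String is not a subtype of Password:
// val bad: Password = "god123"

// These all compile, and each result is a plain String:
val strPassword: String = password
val lower: String = password.toLowerCase()
val plus: String = password + "x"
```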


Well, yeah, that’s exactly my point – the current opaque implementation does not provide the desired functionality of non-interchangeable types, which is stated as one of the major motivations for the feature.

That’s kind of the whole point of this thread.

I don’t understand what you mean now. What is missing? If you don’t want to expose a subtyping relation between eg. Username and String then you just skip the bounds:

opaque type Username = String
def Username(str: String): Username = if <invalid username> then ??? else str

Now you can’t use a Username as a String or use a String as a Username

If the validation fits here, it should probably return an Either[Error,Username], but I wouldn’t expect any apply method to do that. The call site syntax would be very confusing:

val myUsername = Username("JohnDoe") // I would expect the type to be `Username` here, not `Either[Error, Username]`
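
One way to keep the call site unsurprising might be to give the validating constructor its own name instead of `apply`. A rough sketch, where `users`, `parse`, `value` and the whitespace rule are all illustrative assumptions rather than anything proposed in this thread:

```scala
object users {
  opaque type Username = String

  object Username {
    // Named smart constructor: the Either in the result type is visible at the call site.
    def parse(str: String): Either[String, Username] =
      if str.contains(' ') then Left(s"invalid username: '$str'")
      else Right(str)
  }

  // Explicit way back to the underlying String.
  extension (u: Username) def value: String = u
}

val ok = users.Username.parse("JohnDoe")
val rejected = users.Username.parse("John Doe")
```

Here `Username.parse("JohnDoe")` reads as something that can fail, while a plain `Username(...)` could be reserved for the non-validating case.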

I think the problem arises because there’s no way to mitigate the boilerplate, especially in the many cases where I don’t want to do any validation, but am using the opaque type only for tagging.
I foresee many many lines of code similar to this:

opaque type Username = String
def Username(str: String): Username = str

opaque type Token = String
def Token(str: String): Token = str

opaque type UserId = UUID
def UserId(uuid: UUID): UserId = uuid

...

Add to this extension methods to make the reverse transformation for each opaque type.
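
Spelled out for two of the types above, the full per-type pattern might look roughly like this (the enclosing `model` object and the accessor names `asString` / `asUUID` are assumptions for the sake of a self-contained sketch):

```scala
import java.util.UUID

object model {
  opaque type Username = String
  def Username(str: String): Username = str
  // Reverse transformation, needed once per opaque type.
  extension (u: Username) def asString: String = u

  opaque type UserId = UUID
  def UserId(uuid: UUID): UserId = uuid
  extension (id: UserId) def asUUID: UUID = id
}

val rawId = UUID.randomUUID()
val userName = model.Username("jdoe")
val userId = model.UserId(rawId)
```

That is three lines per tag type before any validation or other behavior is added.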

I would consider this good practice compared to using the underlying types directly; however, I think people will grow tired of this boilerplate and go back to using the underlying types directly.
