Synthesize constructor for opaque types

LPTK · November 12, 2020, 8:44am

As I have mentioned before, the only real advantage of opaque types, which cannot be achieved with a library solution, is that they have the same erasure as their underlying type (so they won’t box primitives).

The rest of the opaque type design is a failure IMHO, as it still requires too much boilerplate, and does not integrate very well in the rest of the language. I think a class-like syntax would have been nicer.

FelixHargreaves · November 13, 2020, 10:22am

I tend to agree. Tbh, I care much more about boilerplate than performance in 99% of cases.

odersky · November 15, 2020, 6:18pm

To discuss opaque types, it’s important to understand what they are. Opaque types are abstract types with a convenient way to define them. Here’s a typical example of how to set up an abstract type.
For concreteness, I picked a functional queue abstraction.

class Elem
trait QueueSignature:
  type Queue
  def empty: Queue
  def append(q: Queue, e: Elem): Queue
  def pop(q: Queue): Option[(Elem, Queue)]
val QueueModule: QueueSignature =
  object QueueImpl extends QueueSignature:
    type Queue = (List[Elem], List[Elem])
    def empty = (Nil, Nil)
    def append(q: Queue, e: Elem): Queue = (q._1, e :: q._2)
    def pop(q: Queue): Option[(Elem, Queue)] = q match
      case (Nil, Nil) => None
      case (x :: xs, ys) => Some((x, (xs, ys)))
      case (Nil, ys) => pop((ys.reverse, Nil))
  QueueImpl

An abstract type such as Queue is a type member of some signature. Its concrete implementation is a type alias in a structure that implements that signature. I have picked the SML/OCaml terminology since that’s where this stuff comes from.

The idea of an abstract type is that it provides true encapsulation: You can interact with values of abstract types only by means of the functions that come with it. It’s a very powerful construct, but it’s also quite heavyweight. In particular the distinction between QueueSignature, QueueModule and QueueImpl can look like overkill if there’s only one implementation of the Queue type.

Opaque types optimize for this case. They give you exactly(*) the same properties as abstract types, but without the container boilerplate. Here is the definition of functional queues using an opaque type:

object queues:
  opaque type Queue = (List[Elem], List[Elem])
  def empty: Queue = (Nil, Nil)
  def append(q: Queue, e: Elem): Queue = (q._1, e :: q._2)
  def pop(q: Queue): Option[(Elem, Queue)] = q match
    case (Nil, Nil) => None
    case (x :: xs, ys) => Some((x, (xs, ys)))
    case (Nil, ys) => pop((ys.reverse, Nil))

As with abstract types, the important aspect of opaque types is that they naturally support true encapsulation: Everything one can do with an abstract type has to be explicitly defined with it.
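To make the encapsulation concrete, here is a minimal sketch of client code written against the opaque queue. The block restates the definition so that it is self-contained, with `empty` given an explicit `Queue` result type so that clients can pass it back to `append`. The helper `drain` is hypothetical and only there for illustration.

```scala
class Elem

object queues:
  opaque type Queue = (List[Elem], List[Elem])
  def empty: Queue = (Nil, Nil)
  def append(q: Queue, e: Elem): Queue = (q._1, e :: q._2)
  def pop(q: Queue): Option[(Elem, Queue)] = q match
    case (Nil, Nil) => None
    case (x :: xs, ys) => Some((x, (xs, ys)))
    case (Nil, ys) => pop((ys.reverse, Nil))

import queues.*

// Hypothetical helper: pops until the queue is empty, returning elements in FIFO order.
def drain(q: Queue): List[Elem] = pop(q) match
  case Some((e, rest)) => e :: drain(rest)
  case None => Nil

// Outside `queues`, Queue is not known to be a tuple:
// val bad: Queue = (Nil, Nil)     // does not compile
// val front = queues.empty._1     // does not compile
```

Client code can only go through `empty`, `append` and `pop`; the tuple representation stays private to `queues`.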

Newtype in Haskell is different. It gives you a fresh type with conversions to and from another type. That just gives you a name; no encapsulation is achieved. You can achieve encapsulation by hiding the conversion functions, but that requires additional effort. See Lexi-Lambda’s excellent blog about this difference. https://lexi-lambda.github.io/

I think it’s best not to dilute the conceptual purity of the abstract type model with automatically generated conversions. If you need conversions, you should explicitly define them, just like any other function over an abstract type.
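As a rough sketch of what explicitly defined conversions could look like for a simple wrapper, assuming a `Name` type over `String` (the names `names`, `Name.apply` and `unwrap` are illustrative choices, not anything the language generates):

```scala
object names:
  opaque type Name = String

  object Name:
    // Explicit constructor: the only way to turn a String into a Name from outside.
    def apply(s: String): Name = s

  extension (n: Name)
    // Explicit read accessor back to the underlying String.
    def unwrap: String = n

import names.*

val frodo: Name = Name("Frodo")
// val bad1: String = frodo    // does not compile: Name is not a String outside `names`
// val bad2: Name = "Frodo"    // does not compile: there is no automatic conversion
val s: String = frodo.unwrap
```

Both directions are ordinary functions over the abstract type, so either one can be dropped, renamed or given validation without touching the type itself.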

(*) Plus, they usually give you a more efficient implementation since the backend “knows” what the implementation type of an opaque type is.

jducoeur · November 15, 2020, 8:24pm

Hmm. I’m kind of worried about this prioritization. This use case is not what I mainly want to use opaque for – I’d guess that 95% of my usage is going to be simply about wrapping an existing type with a thickly-walled more-specific type, replacing the current unreliable usage of AnyVal. (Heck, I’d guess that 50% of my usage is going to be nothing but providing strongly-typed versions of String.)

Automatic conversion is undesirable for my use case, but explicit conversion is 100% normal, and usually highly desirable. That really ought to be easy, as requested by the original poster.

So to put it simply: the “conceptual purity” here is off-base, IMO – the use case you are optimizing for is not the use case that has led many of us to advocate for opaque since Erik originally proposed it, lo these many months ago…

som-snytt · November 16, 2020, 6:46am

So an opaque type alias is basically just a type alias, except opaque.

I wrote an opaque type once, as a learning exercise, and contributed it, but I see it was changed to a class. It was encapsulating a StringBuilder, and the StringBuilder fell out of favor. Anyway, it served its advertised purpose.

There are one or two other opaque types in the compiler code base. I wonder if object opaques will join object implicits and object util in the pantheon of first names that sprang to mind. I tend to not remember the name but I can picture their face.

I can understand from the SIP how folks might feel abandoned at the end of the garden path, or Borges’s bifurcating paths. But I appreciate the power-to-weight ratio of the existing feature. Someone has probably already asked whether we could drop the opaque and make opacity the default. Then you could use export to expose its underlying structure. Also allow opaque type declarations.

To recap, I’d like export this.{foo => bar} for aliased “targetName” and export this.mytype to make my (opaque) type alias transparent to the world. I forgot to start with, “Dear Santa,…”

FelixHargreaves · November 16, 2020, 7:05am

This is pretty accurate.

FelixHargreaves · November 16, 2020, 7:22am

I’m afraid this will mean fewer people will actually use the feature. We could definitely implement the same pattern over and over, but in the end people will use String, Int, etc. directly just to avoid the boilerplate.
Alternatively, people would look to macros, but since there are no annotation-based macros here, I don’t see how it could even be implemented. This leaves us in a bad place for how we want to write our code.
I believe it’s also a matter of perspective: If you write “business logic” code, you are often dealing with Strings for IDs, names, etc. and you will end up wanting to make these opaque, whereas for library code you might have fewer and more select opaque types for your API.
Probably, application developers are less vocal about their needs wrt. language design :-/

som-snytt · November 16, 2020, 8:06am

opaque type def Name(s: String) = String(s)
opaque type def Name(s: String) = s   // as ascribed

And you get to pun typedef.

bishabosha · November 16, 2020, 10:15am

For these wrapper types, do you protect how they are constructed, or are they completely public?

FelixHargreaves · November 16, 2020, 11:41am

In the vast majority of cases they are only used for tagging. Things like tokens, IDs, hashes, etc. we often don’t validate. In the cases where we do, I don’t see any problem in creating the companion object by hand.
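For the validated case, a hand-written companion might look roughly like the sketch below. The `UserId` type and its validation rule (non-empty, no whitespace) are made up for illustration.

```scala
object ids:
  opaque type UserId = String

  object UserId:
    // Hypothetical validation rule, chosen only for illustration.
    def from(raw: String): Either[String, UserId] =
      if raw.nonEmpty && !raw.exists(_.isWhitespace) then Right(raw)
      else Left(s"invalid user id: '$raw'")

  extension (id: UserId)
    // Explicit accessor back to the underlying String.
    def unwrap: String = id

import ids.*

val ok = UserId.from("u-42")     // a Right wrapping the id
val rejected = UserId.from("")   // a Left with an error message
```

Because `from` is the only way in, every `UserId` seen outside `ids` has passed the check.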
To be honest, I would be fine with using https://github.com/estatico/scala-newtype but it’s using annotation macros from Scala 2, so I guess it’s not compatible with Scala 3 (and can’t be implemented with the current macros solution).

morgen-peschke · November 16, 2020, 5:18pm

To clarify, does this mean that they are not intended as a replacement for AnyVal wrappers? That’s how I’ve seen opaque types discussed, and from this explanation it sounds kind of like it’s sort of a bonus that they can be used to create a zero allocation wrapper (which would explain the relatively poor ergonomics for this use case).

To be honest, I’ve never seen anyone talking about using the type of encoding demoed by QueueSignature. Is this just something I’ve somehow missed?

amsayk · November 16, 2020, 5:51pm

I totally see where @odersky is coming from. So yes, adding default constructors and a read method would be tainting opaque types.

But it’s true that the majority of use cases for opaque types will be type-safe unboxed aliases for String, Long, etc.

So then, what is the solution?

I think the solution proposed by @Jasper-M and @jdegoes is the best. Having case opaque type generate the default constructor and read method would give us the best of both.

smarter · November 16, 2020, 6:29pm

If you feel the need for something like case opaque, then use a case class and be done with it. Classes are a well understood concept, and we do not need to invent a new thing that emulates them poorly and has surprising semantics: people are already surprised that opaque types aren’t opaque when pattern matching on them in a generic context, and the more we make them look like classes, the more likely people are to get confused by the subtle differences between the two concepts.
The cost of allocating a class isn’t worth worrying about unless your allocation rate is extremely high or you have some drastic latency requirements. As a rule of thumb, I’d say that if you’re doing pure FP in Scala and are fine with the performance costs associated with that compared with doing everything in an imperative way, then you probably won’t notice the difference between using a class or an opaque type.
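A small sketch of the pattern-matching surprise mentioned above, assuming a `Name` type over `String` (the names `tags`, `name` and `describe` are illustrative):

```scala
object tags:
  opaque type Name = String
  def name(s: String): Name = s

// A generic context: the argument is only known as Any when it is matched.
def describe(x: Any): String = x match
  case s: String => s"a String: $s"
  case _ => "something else"

// At runtime a Name is just a String, so the first case matches.
val result = describe(tags.name("Frodo"))
```

A case class wrapper would fall through to the second case here, which is one of the subtle differences between the two concepts.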

morgen-peschke · November 16, 2020, 11:46pm

A good example would be working with a bunch of String fields in Spark. You’re going to churn through a lot of extra allocations if you have to use case classes to get a bit of help from the compiler keeping them all straight.

While using semantic types to help avoid bugs is common in FP, it’s hardly restricted to that paradigm. I’m not sure it’s valid to assume that if someone wants a simple way to wrap a type, then performance isn’t a consideration for them.

curoli · November 17, 2020, 1:48pm

How about:

Welcome to Scala 2.13.3 (OpenJDK 64-Bit Server VM, Java 11.0.9.1).
Type in expressions for evaluation. Or try :help.

> trait NameTag
trait NameTag

> type Name = String with NameTag
type Name

> def toName(string: String): Name = string.asInstanceOf[Name]
def toName(string: String): Name

> def printName(name: Name): Unit = println(name)
def printName(name: Name): Unit

> printName("Frodo")
            ^
error: type mismatch;
 found   : String("Frodo")
 required: Name
    (which expands to)  String with NameTag

> printName(toName("Frodo"))
Frodo

smarter · November 17, 2020, 2:23pm

This encoding relies on the exact way that Scala 2 erases intersection types: you want String with NameTag to erase to String, not NameTag. There is no specification for how this is done, and I can tell you from experience that it depends on compiler implementation details, so it cannot be relied upon, and Dotty will probably use a different encoding.

morgen-peschke · November 17, 2020, 6:01pm

Good to know. Do you know if the encoding used by https://github.com/rudogma/scala-supertagged has the same issues?

smarter · November 17, 2020, 6:32pm

It seems to be relying on similar tricks with intersection types and casting so it’s likely also problematic.

morgen-peschke · November 17, 2020, 8:26pm

I’m not familiar with what the generated bytecode looks like. Do you think it would be possible to implement some of these techniques using union types rather than intersection types?

This is obviously extremely rough, and I imagine it’d probably box primitives, but it looks like it might be workable, depending on what the compiler generates:

trait NewType[Wrapped] {
  trait Wrapper
  type Type = Wrapped | this.Wrapper
  
  def apply(w: Wrapped): Type = w
  
  extension (t: Type) {
    def unwrap: Wrapped = t.asInstanceOf[Wrapped]
  }
}

object Name extends NewType[String]
type Name = Name.Type
             
object Email extends NewType[String]
type Email = Email.Type

@main
def test (): Unit = {  
  def foo(str: String): Unit = {
    println(str)
  }
  def bar(n: Name, e: Email): Unit = {
    println(s"$n @ $e")
  }
  
  val name = Name("JDoe")
  val email = Email("[email protected]")
  //foo(name)
  //  Found:    (name : Name.Type)
  //  Required: String
  
  //foo(email)
  //  Found:    (email : Email.Type)
  //  Required: String
  
  foo(name.unwrap)
  foo(email.unwrap)
  
  bar(name, email)
  
  //bar(email, name)
  //  Found:    (email : Email.Type)
  //  Required: Name
  //  Found:    (name : Name.Type)
  //  Required: Email
}

smarter · November 17, 2020, 8:53pm

I don’t recommend doing anything that involves asInstanceOf, as it’s really hard to guarantee that it’ll always work correctly. I also don’t understand what the code you wrote is trying to achieve: what is the purpose of the union type here? What is this supposed to do compared to using an opaque type?