SIP - Sealed Types

I wonder, have you thought about expressing these sealed types as a “type enum”?

Something along the lines of:

type enum Num <: Int:
  case Zero <: 0.type:
    def isDefinedAt(n: Int) = (n == 0)
  opaque case Pos:
    def isDefinedAt(n: Int) = (n > 0)
  opaque case Neg:
    def isDefinedAt(n: Int) = (n < 0)

The connection for me is that (value) enums are sealed and so are the proposed type enums, and they also exhaustively list their value/type inhabitants. The isDefinedAt method is the required artifact for validating values before they are admitted/converted to the type. The above seems to be all that should be required from the user to support the desugaring to the underlying values/types/extractor machinery.
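To make the idea concrete, here is a rough sketch of what such a desugaring could produce if written by hand today. The names (Num, Pos, Neg) mirror the example above, but the overall shape — opaque types plus extractor objects — is my assumption, not part of the proposal:

```scala
// Hypothetical hand-desugaring of the `type enum Num` example.
// Zero is handled by the literal 0 pattern at the use site.
object Num:
  opaque type Pos <: Int = Int
  opaque type Neg <: Int = Int

  object Pos:
    // Admits an Int as a Pos only when isDefinedAt-style check passes.
    def unapply(n: Int): Option[Pos] =
      if n > 0 then Some(n) else None

  object Neg:
    def unapply(n: Int): Option[Neg] =
      if n < 0 then Some(n) else None

import Num.*

def describe(n: Int): String = n match
  case 0      => "zero"
  case Pos(p) => s"positive: $p"
  case Neg(m) => s"negative: $m"
  case _      => "unreachable" // the compiler cannot prove coverage
```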

I hope you give this idea some thought.

5 Likes

But do they? The compiler can’t be sure of that. With match syntax we already have a concept of exhaustivity of the input. Also the order of checking is obvious, while with separate cases you might want the checks to follow definition order rather than, for example, alphabetical order or some hash order.

What I mean is that enums list (as in “enumerate”) their allowed values; and type enums list their allowed types. It is on the programmer to define these allowed inhabitants.

The idea/model I’m proposing is based on partial functions from values to types (rather than total functions).

I would argue that:

  1. Exhaustivity in the input is not desirable. Even though the underlying type is Int, you may only be interested in modeling Pos and Zero for your domain; Neg may not be relevant. By expressing the subtype relation (Num <: Int), the definition is already explicitly saying that Num is a subset of Int. The Num type doesn’t (and shouldn’t, IMO) have to cover all values of Int.

  2. Strict evaluation order across the cases within the definition is not desirable either. I think that evaluation order should be directed by the use site rather than the definition. And for evaluation efficiency, you may want to match against only a subset of the sealed types (and relegate the others to the fall-through case).
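The partial-function model can be sketched with plain PartialFunction values, where only Pos and Zero are modeled and Neg is deliberately absent (all names here are illustrative, not part of any proposal):

```scala
// Each case is a partial function from values to a classification.
// Neg is intentionally not modeled: the domain only cares about Pos/Zero.
val zero: PartialFunction[Int, String] = { case 0 => "Zero" }
val pos:  PartialFunction[Int, String] = { case n if n > 0 => "Pos" }

// The use site picks the order of checks and supplies its own
// fall-through, instead of the definition imposing one:
def classify(n: Int): String =
  (pos orElse zero).applyOrElse(n, (_: Int) => "other")
```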

1 Like

Hmmm, upon further thought, I think the misunderstanding really is about this partial/total function distinction. I’d like to find a way to have type enums that can partially cover the range of values of the underlying type, while also being able to express that they exhaustively cover it (which seems to be the central objective of your proposal).

I think the enum-style expression is more natural, and so maybe what we need is a modifier that expresses the exhaustiveness claim.

There’s no requirement that the match is exhaustive. It’s just that with match syntax we already have a system in place for determining whether it’s exhaustive or not, which here determines whether the sealed type covers a subset of the values or all values of the underlying type. Have a look at the Peano example, which is exactly your non-negative Int case.

Have a look at the Alternative Design.

I’m not sure what you mean, but I’ll explain better what I mean:

type enum Name <: String:
  opaque case Regular:
    def isDefinedAt(s: String) = s.forall(Character.isLetter)
  case Foo <: "Foo".type:
    def isDefinedAt(s: String) = s == "Foo"

Will “Foo” correctly map to Foo? I think the following makes it more obvious that it won’t:

sealed type Name = String {
  case s if s.forall(Character.isLetter) => type Regular
  case "Foo"                             => val Foo
}
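With today’s extractors, the order-dependence can be demonstrated concretely. Regular and Foo below are hand-written stand-ins for the desugared cases (the names and shapes are illustrative):

```scala
object Regular:
  def unapply(s: String): Boolean = s.forall(Character.isLetter)

object Foo:
  def unapply(s: String): Boolean = s == "Foo"

def classifyName(s: String): String = s match
  case Regular() => "Regular" // tried first, so it claims "Foo" too
  case Foo()     => "Foo"     // never reached for "Foo"
  case _         => "other"
```

Since "Foo" consists only of letters, the Regular case fires first and "Foo" never maps to Foo — exactly the ambiguity the single-match syntax makes visible.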
2 Likes

@dwijnand Glad we’re on the same page about not having to exhaustively cover the range of the underlying type.

I do like the alternative design with “@complete”, although I agree that using an annotation seems bolted-on rather than part of the language.

At the risk of throwing out yet-another-half-baked-idea, it would be interesting to explore how to express completeness using givens, such as given ExhaustiveMatch[Int, Neg | Zero | Pos]. The given instance would provide evidence of types that, when used in conjunction, are exhaustive against a scrutinee. This could effectively reify completeness and make it a more versatile component of the language.
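The given-based idea could be sketched as follows. To be clear, the ExhaustiveMatch trait, its meaning, and the opaque type encodings are all hypothetical — nothing here is an existing API:

```scala
object Model:
  opaque type Neg  <: Int = Int
  opaque type Zero <: Int = Int
  opaque type Pos  <: Int = Int

  // Hypothetical marker: evidence that the types in B, used together,
  // exhaustively cover the scrutinee type A.
  trait ExhaustiveMatch[A, B]

  // Programmer-supplied claim: Neg | Zero | Pos jointly cover Int.
  given ExhaustiveMatch[Int, Neg | Zero | Pos] =
    new ExhaustiveMatch[Int, Neg | Zero | Pos] {}
```

A compiler supporting this could then consult the implicit scope when checking a match for exhaustiveness, making the completeness claim a first-class, reusable value.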

I understand and I agree with your point about the clarity of a single match expression capturing all cases – at the same time I think defining the extractors individually is what Scala programmers are used to and they should be aware of the accompanying responsibility and risks.

Moreover, one extra benefit of separate extractors is the flexibility to have overlapping cases. As an example, this would allow the programmer to define a type for One while keeping Pos. One reason might be that it’s convenient to special-case One in some algorithm, while in other cases Pos is more convenient.
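A small sketch of such overlapping cases with today’s extractors (One and Pos here are illustrative hand-written stand-ins):

```scala
object One:
  def unapply(n: Int): Boolean = n == 1

object Pos:
  def unapply(n: Int): Boolean = n > 0 // overlaps with One on purpose

def fastPath(n: Int): String = n match
  case One() => "one"      // special-cased first at this use site
  case Pos() => "positive" // still covers 1 at other use sites
  case _     => "rest"
```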

This also happens to be an illustration where the flexibility of customizing completeness (e.g. using the given ExhaustiveMatch above) would come in handy.

2 Likes

My main concern with this proposal is that, by its very design, it requires O(n²) type tests to occur in a typical match that uses it.

When I write

(n: Int) match
  case 0      =>
  case x: Pos =>
  case x: Neg =>

it will translate to

if summon[TypeTest[Int, Zero]](n) then
  ...
else if summon[TypeTest[Int, Pos]](n) then
  ...
else if summon[TypeTest[Int, Neg]](n) then
  ...
else
  throw new MatchError(n)

But that, in turn, will become

if (n: Num).ordinal == 0 then
  ...
else if (n: Num).ordinal == 1 then
  ...
else if (n: Num).ordinal == 2 then
  ...
else
  ...

Now, ordinal is a def that performs the real type tests, and returns 0, 1, or 2. When you consider the case where it returns 2, this ordinal method will be called 3 times, and each time, it will perform 3 tests before figuring out that it needs to return 2.
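The blow-up can be made visible by instrumenting the tests. This is a self-contained simulation of the ordinal-based dispatch described above, not the actual generated code:

```scala
// Count every underlying type test performed.
var tests = 0
def isZero(n: Int) = { tests += 1; n == 0 }
def isPos(n: Int)  = { tests += 1; n > 0 }
def isNeg(n: Int)  = { tests += 1; n < 0 }

// ordinal re-runs the type tests in order on every call.
def ordinal(n: Int): Int =
  if isZero(n) then 0
  else if isPos(n) then 1
  else if isNeg(n) then 2
  else -1

// The translated match calls ordinal once per case tried.
def matchNum(n: Int): String =
  if ordinal(n) == 0 then "zero"
  else if ordinal(n) == 1 then "pos"
  else if ordinal(n) == 2 then "neg"
  else "none"
```

Running `matchNum(-5)` hits the last case: ordinal is called 3 times and performs 3 tests each time, so `tests` ends up at 9 rather than the 3 a direct if/else chain would need.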

In general, for a sealed type with n cases, reaching the ith case uses O(i²) type tests, so a typical pattern match will perform O(n²) type tests.

I don’t think this is a viable design. We need to find a way to have O(n) type tests for a typical match of a sealed type.


Other concerns:

  • The design introduces Num, but as a user I never care about Num. This is a fictional type.
  • It also introduces ordinal, which again, as a user, I don’t want to see, and it should not be part of the public API.
  • There are often cases where one of the cases should be reusable across several decompositions. For example, a Seq can be decomposed into Empty + As :+ B, or into Empty + A +: Bs. With the given proposal, I would need two different Empty values, it seems.
10 Likes

That’s very true, good point. But it allows reuse of the existing TypeTest machinery. Would it not be reasonable to be able to optimise that after the initial translation? I’m not sure how much of this still exists in dotc, but in nsc pattern matching goes through an optimisation transformation that reuses the same tests (MatchOptimisation).

Like the anonymous case suggestion, I’m not against making the name optional, if the usage looks reasonable. But the name isn’t always uninteresting, e.g. Permission. I guess it’s never interesting for non-opaque sealed types, like Num.

I was taking a leaf out of enum, but for a long time I’d named it $discriminator. I made a point of putting it on the sealed type rather than the underlying type, for safety (e.g. Permission vs Int). But I made the opaque part optional last minute, so now even if it’s on Num it’s on Int too. :-1:

Yeah, which might lead to some Polish-notation-style type names… But it’s not like we take issue with Nil and None being distinct classes, despite being isomorphic… Feels like an enhancement idea for a “phase 2”, to me.

2 Likes

You’d have to inline the call to TypeTest[Int, Pos].isInstance and the call to ordinal behind it before the pattern matcher even sees the underlying duplicate type tests. And even then, they would not be adjacent to each other, and would typically be accompanied by guards. That is a lot to ask of the match optimizer. I don’t think it is reasonable to expect that to happen in a remotely reliable way, if at all.

Here it’s not only a question of being isomorphic. It’s a question of being literally the same type. So I don’t think the argument about Nil vs None applies.

Also, the current design, by construction, does not allow for a smooth extension in a “phase 2” that would allow this. There’s no way you’ll be able to define twice the same value in two different sealed type definitions. So we have to think this through now; either to say that we need it or to say that it’s not necessary. But delaying the question is not an option, IMO.

As we define those TypeTests, we know how they’re implemented, so my thinking is they would collapse into each other, into a tableswitch.

Fair enough, I’ll have a think about reusing types.