Pre-SIP: sealed enumerating allowed sub-types

 Hello,

This is a proposal to allow an alternative version of sealed that enumerates allowed sub-types instead of requiring them to be in the same compilation unit. Basically:

sealed (A, B, C, D) trait T { ...} 

means that T is a sealed trait and can only be extended by A, B, C and D, and these need not be in the same compilation unit as T.

Motivation:

It often happens (at least to me) that there is a large hierarchy of types (e.g. some sort of math expressions, or syntax elements of some language) that is best expressed as a tree with sealed traits as branches and case classes as leaves, but currently, all of these would have to go into the same file, which would then become uncomfortably large. This proposal allows to split the code for such a hierarchy into multiple files.

Best, Oliver

9 Likes

Java 11 introduced NestMates: JEP 181: Nest-Based Access Control which support your proposal on JVM level (i.e. no extra bridge methods needed), though Iā€™m not sure if itā€™s fully compatible (what about traits, multiple inheritance, etc?).

2 Likes

My need and I presume others is not that only a limited sub set of traits can inherit directly from the super trait, but that all final sub classes all inherit from at least one of that subset of traits. All weā€™re trying to ensure surly is that when we match against all of a specific set of traits that every instantiation will be matched?

Can not this need be met within Dotty using self types and type unions?

trait My
{ self: MyA | MyB | MyC =>
}

MyA extends My
MyB extends My
MyC extends My

then you can have traits that diverge along an orthogonal dimension of distinction say

trait My1 extends My
trait My2 extends My
trait My3 extends My

But then their descendants must be of the form

MyA1 extends MyA with My1
MyA2 extends MyA with My2
MyB1 extends MyB with My1

You could even have a double restraint such as

trait My
{ self: (MyA | MyB | MyC) & (My1 | My2 | My) =>
}

Allowing a second way of safely folding over the trait / class.

3 Likes

The main reason for sealed types is exhaustivity checks by the compiler. Do you get them with self types?

1 Like

I find @RichTypeā€™s idea quite elegant, as itā€™s more powerful than sealed but can be encoded entirely with existing features. It would be a nice simplification if it could be used to represent sealed-ness internally to the compiler.

In principle, self-types are not visible from the outside. But I think it would make sense to make an exception for pattern matching exhaustiveness checking.

We just have to consider the type of a scrutinee scrut of type My as one of type scrut.type & (MyA | MyB | MyC) for the purpose of pattern matching. Preliminary tests seem encouraging: [scastie]

trait My { this: MyA | MyB | MyC =>
  def self: (MyA | MyB | MyC) = this
}
class MyA extends My
class MyB extends My
class MyC extends My

def foo(my: My): Unit = my.self match {
  case _: MyA =>
  case _: MyB =>
  case _: MyC =>
  // ^ comment one case and you get a warning:
  //      match may not be exhaustive.
  //      It would fail on pattern case: _: MyC
}

The definition of self should actually be:

  def self: this.type & (MyA | MyB | MyC) = this

But making it as above, we get a spurious exhaustiveness warning from Dotty ("It would fail on pattern case: my", even if we use patterns of the form _: (MyA & my.type)). Perhaps @liufengyun or @AleksanderBG can tell us why?

So in my example MyA, MyB and MyC all inherit from My, but this is not actually necessary, nor do you even need to own the traits / classes, just as long as they can be mixed in with the final classes.

If this was successful at some point sealed could be deprecated and removed from the language. The question of finality could then considered completely separately.

Some widening is missing, should be easy to fix.

BTW, this idea seems to work nicely with This type Introduce `This` type to replace self types Ā· Issue #7374 Ā· lampepfl/dotty Ā· GitHub

That looks intriguing, but OTOH wonā€™t that lead us to Boolean satisfiability problem - Wikipedia inside an exhaustivity checker? I mean I donā€™t know what the theoretical bound of computational complexity of exhaustiveness checking is currently, but replacing a simple enumeration of subclasses with hierarchical union and intersection types seems risky.

1 Like

Yes, with a sealed that enumerates types, it is very clear what the consequences are, with self types, itā€™s less clear.

With enumerating sealed, itā€™s clear the direct sub types can only be the ones that are enumerated.

With self types, it appears to me I can introduce as many direct sub types as I want and the self type constraints only need to be satisfied in the concrete classes down the hierarchy, or am I missing something? If in the end the constraint is not satisfied, it is less clear where the problem is.

So, I do not consider this a substitute.

2 Likes

Suppose, p_i is encoded as x_i = T, Ā¬p_i encoded as x_i = F.

(p1 āˆØ Ā¬p3) āˆ§ (p2 āˆØ p3) can be transformed to the following code C:

   sealed trait O
   object T extends O
   object F extends O
   (x1, x2, x3) match {
    case (F, _, T) =>
    case (_, F, F) =>
  }

C unexhaustive <==> (x1 = T āˆØ x3 = F) āˆ§ (x2 = T āˆØ x3 = T) satisfiable <==> (p1 āˆØ Ā¬p3) āˆ§ (p2 āˆØ p3) satisfiable

Personally I find the biggest limitation of sealed is having to put all the direct descendants in the same file. Iā€™ve found when I do want to seal something, thereā€™s often more than one thing I want to seal, so sealing just totally messes my file and package code organisation, so I often abandon sealing in favour of keeping the file organisation I want.

So I would like this restriction removed. It would also be nice to remove the restriction on class and companion object being in the same file.

In terms of being able to use self types for match exhaustivity, just being able to use a simple

self: TraitA | TraitB | TraitC =>

would be advantageous. There are often situations where some typing capability gives great advantage, but little more is gained by full Turing completeness. Iā€™ve often wondered whether a defined limited safe subset of Scala could be defined, that would be useful in a number of situations.

Yes, that is by design. It is more powerful than sealed as it allows different mixin strategies. But that is not a problem for what sealed is normally used for: pattern matching exhaustiveness checking.

We could imagine custom errors to handle the specific case of self-type problems when the self type is a union. In any case, Iā€™m not proposing to replace sealed at the language level; just to enable this more general feature.

It is well-known that in the general case, deciding subtyping is at least NP-complete, if not undecidable. Things that come into play include: OOP hierarchies with generics and variance; union and intersection types; etc. This is not introducing any additional complexity to the compiler, which can already reason about the pattern.

I would rather state you need to check if:

{(F,_,T) | _ e {F,T}} u {(_,F,F) | _ e {F,T}} ==
 {(x(1),x(2),x(3)) | x(i) e {F,T}, i e {1,2,3})

which isnā€™t the case as evidenced by (T,T,_)

Is it required that the identifiers enlisted exists by the proposed syntax?
If yes we have a cycling problem. The sealed entity points to its extending entities and vice versa, if you write one of them first the IDE temporarily marks this as an error until the entity pointed to is written, too.

I question the need to spread out a sealed hierarchy over several files. Presumably, if this is a closed sum, we want to see all alternatives in one place. This is a problem only if each alternative is a large piece of code itself. But there are ways to avoid that and to keep each alternative small. For instance, each alternative could be a trait or abstract class that is implemented in a separate file, like this:

sealed trait ADT
abstract case class A(...) extends ADT
abstract case class B(...) extends ADT
...
object A { def apply(...) = Aimpl(...) }
object B { def apply(...) = Bimpl(...) }
...

and then, in separate files:

class Aimpl(...) extends A(...) {
  // lots of implementation code
}

This shows also two under-appreciated patterns:

  • It can be useful to extend a case class
  • sealed should not be transitive.
1 Like

No offense, but I donā€™t think Iā€™ve seen a single non-martin-odersky person who thinks the given example is a good idea :slight_smile: Most people recoil from it in horror as it goes against their hard-earned mental model of how they use Scala and how they see it being used in the wild. Itā€™s certainly un-appreciated, but I donā€™t see why reason why itā€™s under-appreciated

9 Likes

A less scary alternative might be

sealed trait ADT
case class A(...) extends ADT with AImpl
case class B(...) extends ADT with BImpl

Other file:

private[foo] trait AImpl { this: A => 
  // lots of implementation code
}
7 Likes

@Jasper-M yeah thatā€™s how I would expect to see it done

Agreed, this is nicer! In any case, it means that there are ways to keep a sealed class hierarchy short enough to fit in one source file.

Since ā€œcompilation unitā€ is loosely construed, you could imagine taking the contents of a directory my-unit.scala as a compilation unit organized in separate files. The contents would not be merely concatenated, but compiled as sibling contexts.

Some clarification would be necessary, such as that the scope of an import can extend to the end of the compilation unit, but that would mean the end of the file in this case.

This would also allow companions in separate files in a multi-file compilation unit.

Even the closest of companions sometimes prefer to live apart.

1 Like

To be clear, this is not something that exists, but something you are proposing, right? What would trigger a bunch of files to be considered to be one compilation unit?

I feel that kind of proposal would go into one of these directions:

(1) package-level sealed

sealed[mypackage] trait A

(2) Include files. Like header files in C/C++. This would open up completely new options for inlining and specialization. Biggest question would be whether the same file being included more than once would be treated special.