Proposal for Multi-level Enums

#1

Dotty has added a really fabulous way of declaring Algebraic Data Types (ADTs) in the form of the new enum syntax

e.g.

enum MyEnum {
  case A
  case B(i: Int, j: Int)
}

However these ADT are limited in that they can only be completely flat, while in general it is actually quite useful oftentimes to nest ADTs, and the new enum syntax seems like a great opportunity to provide support for those! I would propose the following syntax

enum MyEnum {
  case A
  case B(i: Int)
  case enum C {
    case D(s: String)
    case E
    case enum F {
      case G
    }
    case enum H {
      case I, J, K
    }
  }
}

But then the question arises: What type should be inferred for nested enum values, such as MyEnum.A, MyEnum.B.apply, and MyEnum.C.D.apply(s: String)?

I think a simple approach would be: Always infer the most specific enclosing enum. So each of the following expressions should have these types inferred

MyEnum.A // MyEnum
MyEnum.B(1) // MyEnum
MyEnum.C.D("") // MyEnum.C
MyEnum.C.E // MyEnum.C
MyEnum.C.F.G // MyEnum.C.F
MyEnum.C.H.I // MyEnum.C.H
MyEnum.C.H.J // MyEnum.C.H
MyEnum.C.H.K // MyEnum.C.H

Below I will include a collection of nice usecases of this feature, to demonstrate that it is not just for giggles:

  1. Paul Philips’ SizeInfo talked about here
enum SizeInfo {
  case Bounded(bound: Int)
  case enum Atomic {
    case Infinite
    case Precise(n: Int)
  }
}
  1. A Hierarchy of Reals/Rationals/Integers/Naturals
enum Real(computeDigits: Int => BigDecimal) {
  case enum Rational(numerator: Int, denominator: Int) 
      extends Real(_ => BigDecimal(numerator) / denominator) {
    case enum Integer(n: Int) extends Rational(n, 1) {
      case Natural(n: Int) extends Integer(n)
      case Negative(n: Int) extends Integer(n)
    }
    case Fraction(numerator: Int, denominator: Int) extends Rational(numerator, denominator)
  }
  case Irrational(computeDigits: Int => BigDecimal) extends Real(computeDigits)
}
  1. An generalization of Either to include “Inclusive OR”
enum AndOr[+A, +B] {
  case Both[+A, +B](left: A, right: B) extends AndOr[A, B]
  case enum Either[+A, +B] extends AndOr[A, B] {
    case Left[+A, +B](value: A) extends Either[A, B]
    case Right[+A, +B](value: B) extends Either[A, B]
  }
}
  1. Grouping JSON values into primitives and non primitives
enum JsValue {
  case Obj(fields: Map[String, JsValue])
  case Arr(elems: ArraySeq[JsValue])
  case enum Primitive {
    case Str(str: String)
    case Num(bigDecimal: BigDecimal)
    case JsNull
    case enum Bool(boolean: Boolean) { 
      case True extends Bool(true), 
      case False extends Bool(false) }
  }
}
24 Likes
#2

This looks really great to me. Subclasses of ADTs can be very useful. Would be great to have a nice syntax.

3 Likes
#3

I used this kind of nesting fairly often when I have complicated error ADTs. A ReadError can be DecodeError, a ParseError or an IOError. A DecodeError can be a TypeError or a NotFoundError. Etc…

2 Likes
#4

I second this proposal. Multi-level enums can be really useful for encoding more precise invariants into ADTs, without having to resort to nesting (which adds clutter and extra allocations).

4 Likes
#5

Good idea. This should work for inference since nested parameterized enum nodes aren’t case classes like their siblings. We want to avoid this type inference problem.

That said, is the current implementation of enums necessarily flat? Since the top-level enum is sealed rather than final, couldn’t we could already do something like the following?

enum JsValue { // desugars to a sealed abstract class
  case Obj(fields: Map[String, JsValue])
  case Arr(elems: ArraySeq[JsValue])
}
enum Primitive extends JsValue {
  case Str(str: String)
  case Num(bigDecimal: BigDecimal)
  case JsNull
}
enum Bool(boolean: Boolean) extends Primitive { 
  case True extends Bool(true), 
  case False extends Bool(false)
}

(Granted, the nested version is an improvement that’s easier to read and keeps all the types contained under the top level enum, and enumTag might not be built to handle this gracefully. If we did this, enum should probably also imply final)

1 Like
#6

I like this and think it would be a great addition. A lot my own own ADTs are naturally hierarchical, exactly like the examples given, such as this one from my Metascala JVM

sealed trait Type
object Type{
  sealed trait Ref extends Type
  object Ref{
    case class Cls extends Ref
    case class Arr extends Ref
  }
  sealed trait Prim extends Type
  object Prim{
    case object Z extends Prim
    case object B extends Prim
    case object C extends Prim
    case object S extends Prim
    case object I extends Prim
    case object J extends Prim
    case object F extends Prim
    case object D extends Prim
  }
}

This lets things have meaningful names like Type.Prim.I, while avoiding polluting the global (or even Type.*) namespace with dozens of prefixed TypePrimI names.

9 Likes
#7

I tried to refrain myself from doing a “me too” post… But really, not doing what is proposed here would be huge missed in uniformity… I would really love to have that.

5 Likes
#8

I’d honestly be surprised if nesting has never been considered – it just seems so natural and intuitive!

1 Like
#9

Oh but it has on at least one occasion: https://gitter.im/lampepfl/dotty?at=5cd583d2f251e60ffa5a990f

1 Like
#10

@smarter @nicolasstucki @odersky any thoughts? The proposal seems to be universally well received

3 Likes
#11

My take is that we should ship what we have now in 3.0, and we can revisit this in a later version. We can extend flat enums to multilevel enums in a backward compatible way in the future, so we should give ourselves more time to weigh this, and above all to really feel the limits of flat enums, before committing to more complexity.

4 Likes
#12

The problem with that strategy is that instead of the (preferred) “old way” and “new way” to encode ADTs you now have the “old way OR the only way to do hierarchical ADTs” and “the new, limited way”.

Is there no time to implement and test out hierarchical ADTs?

Are there no hierarchical ADTs in dotc?

6 Likes
#13

We need to accept that languages should continually evolve, and figure out how to solve problems that brings[1], rather than keep fooling ourselves that we’re finally in possession of the Final, Complete, and Perfect design. Every bit of software that ships and we start using changes our perspective on things in ways that we can’t predict.

Sorry for being melodramatic, or if I’m over-generalizing, but you get the point…

[1] One example of a problem is “we can’t wait so long.” The solution is not to evolve in fewer iterations, but to figure out how to make each iteration much shorter.

5 Likes
#14

I agree wholeheartedly with that point. There are hierarchical ADTs in my code base, and that basically means I cannot (consistently) rely on enum.

That being said, we can’t port every single fancy behaviour clever devs used to encode through sealed trait hierarchies in enum. Sometimes, it’s fine to say “you can do that, but by hand”. I just don’t think hierarchical ADTs are the right cutoff, but I use them pretty often, so I might be biased.

Also, last time I checked-a few weeks ago- hierarchical ADTs compiled, but just didn’t behave as you’d expect. If we’re going to forbid them, let them at least be a compilation failure.

2 Likes