Proposal for Multi-level Enums

#1

Dotty has added a really fabulous way of declaring Algebraic Data Types (ADTs) in the form of the new enum syntax

e.g.

enum MyEnum {
  case A
  case B(i: Int, j: Int)
}

However these ADT are limited in that they can only be completely flat, while in general it is actually quite useful oftentimes to nest ADTs, and the new enum syntax seems like a great opportunity to provide support for those! I would propose the following syntax

enum MyEnum {
  case A
  case B(i: Int)
  case enum C {
    case D(s: String)
    case E
    case enum F {
      case G
    }
    case enum H {
      case I, J, K
    }
  }
}

But then the question arises: What type should be inferred for nested enum values, such as MyEnum.A, MyEnum.B.apply, and MyEnum.C.D.apply(s: String)?

I think a simple approach would be: Always infer the most specific enclosing enum. So each of the following expressions should have these types inferred

MyEnum.A // MyEnum
MyEnum.B(1) // MyEnum
MyEnum.C.D("") // MyEnum.C
MyEnum.C.E // MyEnum.C
MyEnum.C.F.G // MyEnum.C.F
MyEnum.C.H.I // MyEnum.C.H
MyEnum.C.H.J // MyEnum.C.H
MyEnum.C.H.K // MyEnum.C.H

Below I will include a collection of nice usecases of this feature, to demonstrate that it is not just for giggles:

  1. Paul Philips’ SizeInfo talked about here
enum SizeInfo {
  case Bounded(bound: Int)
  case enum Atomic {
    case Infinite
    case Precise(n: Int)
  }
}
  1. A Hierarchy of Reals/Rationals/Integers/Naturals
enum Real(computeDigits: Int => BigDecimal) {
  case enum Rational(numerator: Int, denominator: Int) 
      extends Real(_ => BigDecimal(numerator) / denominator) {
    case enum Integer(n: Int) extends Rational(n, 1) {
      case Natural(n: Int) extends Integer(n)
      case Negative(n: Int) extends Integer(n)
    }
    case Fraction(numerator: Int, denominator: Int) extends Rational(numerator, denominator)
  }
  case Irrational(computeDigits: Int => BigDecimal) extends Real(computeDigits)
}
  1. An generalization of Either to include “Inclusive OR”
enum AndOr[+A, +B] {
  case Both[+A, +B](left: A, right: B) extends AndOr[A, B]
  case enum Either[+A, +B] extends AndOr[A, B] {
    case Left[+A, +B](value: A) extends Either[A, B]
    case Right[+A, +B](value: B) extends Either[A, B]
  }
}
  1. Grouping JSON values into primitives and non primitives
enum JsValue {
  case Obj(fields: Map[String, JsValue])
  case Arr(elems: ArraySeq[JsValue])
  case enum Primitive {
    case Str(str: String)
    case Num(bigDecimal: BigDecimal)
    case JsNull
    case enum Bool(boolean: Boolean) { 
      case True extends Bool(true), 
      case False extends Bool(false) }
  }
}
21 Likes
#2

This looks really great to me. Subclasses of ADTs can be very useful. Would be great to have a nice syntax.

2 Likes
#3

I used this kind of nesting fairly often when I have complicated error ADTs. A ReadError can be DecodeError, a ParseError or an IOError. A DecodeError can be a TypeError or a NotFoundError. Etc…

1 Like
#4

I second this proposal. Multi-level enums can be really useful for encoding more precise invariants into ADTs, without having to resort to nesting (which adds clutter and extra allocations).

2 Likes
#5

Good idea. This should work for inference since nested parameterized enum nodes aren’t case classes like their siblings. We want to avoid this type inference problem.

That said, is the current implementation of enums necessarily flat? Since the top-level enum is sealed rather than final, couldn’t we could already do something like the following?

enum JsValue { // desugars to a sealed abstract class
  case Obj(fields: Map[String, JsValue])
  case Arr(elems: ArraySeq[JsValue])
}
enum Primitive extends JsValue {
  case Str(str: String)
  case Num(bigDecimal: BigDecimal)
  case JsNull
}
enum Bool(boolean: Boolean) extends Primitive { 
  case True extends Bool(true), 
  case False extends Bool(false)
}

(Granted, the nested version is an improvement that’s easier to read and keeps all the types contained under the top level enum, and enumTag might not be built to handle this gracefully. If we did this, enum should probably also imply final)

#6

I like this and think it would be a great addition. A lot my own own ADTs are naturally hierarchical, exactly like the examples given, such as this one from my Metascala JVM

sealed trait Type
object Type{
  sealed trait Ref extends Type
  object Ref{
    case class Cls extends Ref
    case class Arr extends Ref
  }
  sealed trait Prim extends Type
  object Prim{
    case object Z extends Prim
    case object B extends Prim
    case object C extends Prim
    case object S extends Prim
    case object I extends Prim
    case object J extends Prim
    case object F extends Prim
    case object D extends Prim
  }
}

This lets things have meaningful names like Type.Prim.I, while avoiding polluting the global (or even Type.*) namespace with dozens of prefixed TypePrimI names.

7 Likes
#7

I tried to refrain myself from doing a “me too” post… But really, not doing what is proposed here would be huge missed in uniformity… I would really love to have that.

4 Likes
#8

I’d honestly be surprised if nesting has never been considered – it just seems so natural and intuitive!

1 Like
#9

Oh but it has on at least one occasion: https://gitter.im/lampepfl/dotty?at=5cd583d2f251e60ffa5a990f

1 Like
#10

@smarter @nicolasstucki @odersky any thoughts? The proposal seems to be universally well received

1 Like
#11

My take is that we should ship what we have now in 3.0, and we can revisit this in a later version. We can extend flat enums to multilevel enums in a backward compatible way in the future, so we should give ourselves more time to weigh this, and above all to really feel the limits of flat enums, before committing to more complexity.

2 Likes