Proposal for Enumerations in Scala

I think with enum, we’re conflating too many things here.

  1. Enumerations – declaring some finite, flat, plain data and assigning a natural number to each piece, like enums in Java, C, protobuf, … This is where methods such as .value, .values, .valuesToEntriesMap, etc. make sense. Basically this should replicate what enumeratum does. It would be great if all of these inherited from java.lang.Enum automatically (I’m not sure it’s possible, though). We should use the keyword enum for this:
enum Fruit(sugarContent: Double) {
  case Apple(0.5)
  case Orange(1.25)
}
Fruit.Apple.value == 0
Fruit.Orange.sugarContent == 1.25
Fruit.valuesToEntriesMap(0).sugarContent == 0.5
Fruit.values == List(Fruit.Apple, Fruit.Orange)
Fruit.Apple: Fruit
// Fruit could also implement Eq and Ord out of the box? (based on `.value`)
  2. Tagged unions – ADTs, as we know from Haskell/F#/ML/… Can be recursive, but these are still data, so “nesting” doesn’t make much sense here, nor do .values or java.lang.Enum. We should use the keyword union for this.
union Expr {
  case Zero
  case Val(v: Int)
  case Sum(l: Expr, r: Expr)
}
Expr.Val(42): Expr // the fact that `Val` is implemented with a `class` `Val` is a detail that should be _hidden_
  3. Any complicated, including nested, sealed hierarchies. We already have everything we need for them in Scala (sealed, trait, class), no need for any other special keywords.
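
For the first item (plain enumerations), here is a rough sketch of how the Fruit example can be approximated with the enum syntax of Scala 3 as I understand it. This is not the syntax proposed above: the cases need an explicit `extends Fruit(...)`, `sugarContent` is declared as a `val` so it can be read from outside, and `ordinal`, `values`, `fromOrdinal` and `valueOf` stand in for the proposed `.value`, `.values` and `.valuesToEntriesMap`. The `FruitDemo` object is only a hypothetical holder for the example values.

```scala
// Sketch only: Scala 3 enum syntax, not the syntax proposed in the post.
enum Fruit(val sugarContent: Double) {
  case Apple  extends Fruit(0.5)
  case Orange extends Fruit(1.25)
}

// Hypothetical helper object that just collects the lookups for illustration.
object FruitDemo {
  // `ordinal` plays the role of the proposed `.value`
  val appleIndex: Int = Fruit.Apple.ordinal
  // `values` returns an Array, so convert it before comparing with a List
  val all: List[Fruit] = Fruit.values.toList
  // lookup by position (roughly `.valuesToEntriesMap(0)`) and by name
  val first: Fruit = Fruit.fromOrdinal(0)
  val orange: Fruit = Fruit.valueOf("Orange")
}
```

Whether such an enum can also inherit from java.lang.Enum automatically is left open here, as in the post.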

I like how Scala 3 tries to codify common idioms (those would be enum and union); simple things should be easy :tada: On the other hand, I don’t see a reason to complicate the (much rarer) complex things, like nested hierarchies. Those are already possible with Scala 2 tools, like sealed, trait, class.

@odersky would you agree that it’s worthwhile separating the union concept from enum?

1 Like

I support your idea of separation between the features (and also realize now my earlier mistake of not understanding that ADTs are union types), but I’m not convinced that such a “small” scenario merits a new syntax.

As you said, we already have sealed, class, trait, object; why then do we need a custom union syntax? Are those really all that different?

union Expr {
  case Zero
  case Val(v: Int)
  case Sum(l: Expr, r: Expr)
}

sealed trait Expr
object Expr {
  case object Zero extends Expr
  case class Val(v: Int) extends Expr
  case class Sum(l: Expr, r: Expr) extends Expr
}

If anything, the more nesting such a hierarchy has, the more boilerplate is required, and the greater the need for a simpler and more concise syntax.

I’m not sure this would really work, because you would still need to derive proper (de-)serializers for the case classes. I’m not even sure it would be simple to design a serialization process that would pick up the valueOf method for “simple cases” but would construct proper class instances for the other cases of the same ADT.

1 Like

Not at all! I am a strong proponent of keeping the two together. As far as I know, every language that supports ADTs also supports enums as a special case of ADTs. An enum is simply an ADT where all cases are simple. The philosophy of the Scala language is to be a unifier, instead of an amalgamation of many different features. I have come to realise that if you ask committees or the general public, the vote always goes towards more differentiated features, which in the end invariably leads to feature creep. So, I take it upon myself to strongly resist this tendency :wink:

One possible design is to stay pure and simply not have any enums at all, since they are not strictly necessary. That’s what Scala 2 did, and we could continue with it. On the other hand, I have the impression that the reduction of boilerplate is worth it. But then it should be one concept, not two or three different ones.
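
To illustrate the “one concept” view, here is a small sketch, assuming Scala 3 enum syntax, in which the same keyword declares both a plain enumeration and a recursive ADT. `Color` and `eval` are names made up for this example; `Expr` is the type used earlier in the thread.

```scala
// A plain enumeration: all cases are simple values.
enum Color {
  case Red, Green, Blue
}

// A recursive ADT declared with the same construct.
enum Expr {
  case Zero
  case Val(v: Int)
  case Sum(l: Expr, r: Expr)
}

// Hypothetical evaluator, written by pattern matching on the cases.
def eval(e: Expr): Int = e match {
  case Expr.Zero      => 0
  case Expr.Val(v)    => v
  case Expr.Sum(l, r) => eval(l) + eval(r)
}
```

Note that `values` is only generated when every case is a simple value, so in this sketch `Color.values` exists while `Expr.values` does not.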

6 Likes

I don’t know about pure-data ADTs, but “java” enums are very much missed in Scala. There is the enumeratum library that somewhat provides their utility, but it seems to rely on macros so I’m not sure it’ll be ported to Scala 3.

It seems to me we have two concepts that are similar at their core, but heavily differ in their usage and needs of syntax sugar; enums need values / valueOf; ADTs need defs on nested types and multi-level hierarchies.

Trying to combine two different syntax sugars into one, just because they share a conceptual core, is not a good decision imho.

2 Likes

That may actually be a good thing. When I define ADTs I always end up stuffing them with methods because it’s the easy thing to do, but then I dislike the result, because with all the method pollution it is no longer easy to see the structure of the ADT.

I think the better approach (though a bit cumbersome) is to outsource the methods into external traits, which actually also works with the enum syntax:

enum Json {
  case Bool(value: Boolean)    extends Json with BoolImpl
  case Array(items: Seq[Json]) extends Json with ArrayImpl
  def foo: Int
}
private trait ArrayImpl { self: Json.Array =>
  def foo = items.size
  def bar = foo // this method is defined only for Array
}
private trait BoolImpl { self: Json.Bool =>
  def foo = if (value) 1 else 0
}

@main def m = {
  val j = new Json.Array(Seq(Json.Bool(true)))
  assert(j.foo == j.bar)
}

Though again, it’s a little too much boilerplate, especially since it forces specifying the full extends clauses of the ADT cases.

2 Likes

Or maybe just using the good old syntax?

sealed trait Json {
  def foo: Int
}

object Json {
  case class Bool(value: Boolean) extends Json {
    def foo = if (value) 1 else 0
  }
  case class Array(items: Seq[Json]) extends Json {
    def foo = items.size
    def bar = foo
  }
}

Seems a lot cleaner to me.

Clean/pollution is in the eye of the beholder, it seems. :smile:

1 Like

Perhaps, but then is it worth spending time on a new syntax-sugar feature that looks pretty much like before and does not reduce boilerplate?

But I think it must be reiterated. An enum construct with severe limitations and a very low ceiling on what it can do:

  1. can’t have subhierarchies
  2. can’t declare methods for branches (without a workaround)
  3. can’t inherit new traits in branches
  4. can’t declare implicits for branches
  5. all while types of the branches are reachable through pattern matching and must be for GADTs to work – making the argument that .apply widens irrelevant, meaning that all the above concerns are very relevant as a programmer will observe the subtypes of branches daily

Is just un-Scala! It’s a construct that does not scale with usage and does not help contain complexity; instead it gives up at a certain point of complexity and forces a retreat to a low-level construct. This really goes against the principle of scaling with the codebase, which the other language constructs do really well. You could argue that case class is also too limited and doesn’t scale, but I don’t think enums will have the success of case classes. When, e.g., nearly all the sealed hierarchies in my libraries are multi-level, it’s not worth using a different syntax for the minority of them that are simplistic.

EDIT: LPTK’s post clarifies some of the capabilities of enums. Still, making workarounds for new features before they’re even out is too much; telling newcomers “just make a private trait if you want to add methods to an enum branch”, and having to remember that yourself, is hardly practical.

2 Likes

Nice workaround, but it’s a workaround for a feature that isn’t even released yet! I’d rather have a new release of the language make the ‘book of hacks & workarounds’ thinner and have widely used capabilities available in a straightforward manner, not add even more weirdness to the language.

2 Likes

Wait until he has at least 500 lines of methods in each case. Then we’ll see if he still thinks it looks a lot cleaner :grin:

Separation of concerns - Wikipedia :slight_smile:

1 Like

For me personally the cleanest way to deal with this is pattern matching and not the typical inheritance polymorphism:

enum Json {
  case Bool(value: Boolean)
  case Array(items: Seq[Json])
 
  def foo: Int = this match {
    case Bool(value)  => ???
    case Array(items) => ???
  }
  }
}

I find this quite pleasant, and it saves quite a few keystrokes compared to the status quo. IIRC this is also the reason methods are dropped from the cases, since pattern matching is quite natural in combination with ADTs. It remains subjective, of course, whether this is really easier to read (IMO it is).
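
For comparison with the trait-based version earlier in the thread, here is the same sketch with the `???` placeholders filled in. The bodies are borrowed from the `BoolImpl`/`ArrayImpl` example (`1`/`0` for a boolean, the item count for an array); Scala 3 enum syntax is assumed.

```scala
enum Json {
  case Bool(value: Boolean)
  case Array(items: Seq[Json])

  // One whole-enum method, implemented by matching on the case
  // instead of overriding it in each branch.
  def foo: Int = this match {
    case Bool(value)  => if (value) 1 else 0
    case Array(items) => items.size
  }
}
```

This covers whole-enum methods only; a method that exists on a single case (like `bar` in the earlier example) is not expressible this way.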

Most importantly (for me), constructors will return the type of the ADT and not the specialized branch, which prevents quirks and also saves on boilerplate on smart constructors:

// instead of having type Some[Int]
def some[A](value: A): Option[A] = Some(value)

3 Likes

I’ve used this a couple times in my “try out Dotty” project, and it works really, really well for the simple case of an enumerable set of values.

For creating an ADT or GADT, my guess is this will quickly become a “cute trick” and be relegated to obscurity, similar to how you can define a class such that every instance is an extractor, but almost nobody actually does (Regex notwithstanding, as most people seem to treat that as compiler magic, even though it’s not).

1 Like

That’s interesting, since so far everybody else I talked to is strongly in favor of dropping this feature! So it would be good to see arguments why you think it’s important to have.

AFAIK the canonical (or at least what seems to be the most common) example is what’s inferred if you do something like this:

(_: List[Foo]).foldLeft(Foo.Empty)(_ combine _)

Most times you’d want this to return Foo, but the compiler complains that combine returns a Foo and it expects a Foo.Empty.type.

Granted, if you ever actually need something to return Foo.Empty.type, then getting back down there from Foo is a royal pain, and upcasting from Foo.Empty.type is pretty easy (even if it does require some annoying boilerplate).

I’m in favor of returning the more precise type, as it’s easier to work around when the default is wrong, but I can certainly empathize with the annoyance of having to manually create a smart constructor. A better solution for me would be to return the precise type and synthesize smart constructors and an .upcast method to make it easier to get to the type the compiler needs.
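
A minimal sketch of the situation, using a made-up `Foo` with a classic sealed-trait encoding (`Leaf`, `empty`, `combineAll` and `combineAllAscribed` are hypothetical names). The failing call is left as a comment; the two working variants are the hand-written smart constructor and a type ascription.

```scala
sealed trait Foo {
  def combine(that: Foo): Foo
}

object Foo {
  case object Empty extends Foo {
    def combine(that: Foo): Foo = that
  }
  final case class Leaf(n: Int) extends Foo {
    def combine(that: Foo): Foo = that match {
      case Empty   => this
      case Leaf(m) => Leaf(n + m)
    }
  }
  // Hand-written smart constructor: same value, but typed as Foo.
  def empty: Foo = Empty
}

// xs.foldLeft(Foo.Empty)(_.combine(_))
//   is typically rejected: the accumulator type is inferred as
//   Foo.Empty.type, while combine returns a Foo.

// Variant 1: go through the smart constructor.
def combineAll(xs: List[Foo]): Foo =
  xs.foldLeft(Foo.empty)(_.combine(_))

// Variant 2: upcast at the call site with a type ascription.
def combineAllAscribed(xs: List[Foo]): Foo =
  xs.foldLeft(Foo.Empty: Foo)(_.combine(_))
```

Both variants only move the upcast around; neither removes the boilerplate that the post complains about.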

1 Like

That particular type inference problem might be solvable: When faced with the problem of instantiating a type variable X with constraint C <: X where C is a case of enum E, we could in some situations instantiate X to E instead of to C. Similar automatic widenings happen for singleton types and union types already.

5 Likes

The problem at hand was to add methods to individual cases, not methods on the whole enum. This cannot be done with pattern matching. I also added a whole-enum method foo in my example just to illustrate that you can also do it.

1 Like

It comes up all over the place with type inference! I much prefer the constructors to not reveal the branch of the ADT, if we’re going to have separate syntax for enums.

For instance var a = Some(x); while (p) { a = Option(foo) } doesn’t work in Scala 2.

So I’m in @bmeesters’s camp on this one. :+1: for constructors returning the type of the ADT!

(Except if the branches have their own type you do need a way to get a thing of that type. If you have both an autogenerated apply method and a constructor, the constructor can be exact and the apply return the ADT, for instance.)
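
Here is a sketch of how that split looks with Scala 3 enums as I understand them: applying a case widens to the enum type, while `new` keeps the exact case type. `Opt`, `demo` and `exact` are hypothetical names chosen to avoid clashing with the standard `Option`.

```scala
enum Opt[+T] {
  case Some(x: T)
  case None
}

def demo(limit: Int): Opt[Int] = {
  // Assumption: `Opt.Some(0)` is inferred as Opt[Int], not Opt.Some[Int],
  // so the reassignments below type-check.
  var a = Opt.Some(0)
  var i = 0
  while (i < limit) {
    a = if (i % 2 == 0) Opt.None else Opt.Some(i)
    i += 1
  }
  a
}

// When the branch type itself is needed, `new` gives the exact type.
val exact: Opt.Some[Int] = new Opt.Some(1)
```

With a plain Scala 2 style case class hierarchy, the `var a = Some(x)` line would pin `a` to the branch type, which is the problem described above.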

3 Likes