Proposal for Enumerations in Scala

kai · February 7, 2020, 3:02pm

Then use a different keyword for ADTs

Or a modifier, like enum Alphabet vs. enum class Option[+T]

eyalroth · February 7, 2020, 3:27pm

@jsuereth I would suggest adding this link to the original post, as it adds vital information regarding the proposal; for instance, my question regarding the return value of values and valueOf for ADTs.

Me neither, unless I’m using a deeply nested hierarchy, in which case I would like syntactic sugar that allows nested ADTs, maybe even something like this (based on @dwijnand's example):

sealed trait Reference {
 case trait Build {
   case class BuildRef(build: URI) with ResolvedReference
   case object ThisBuild
 }
 case trait Project {
   case class ProjectRef(build: URI, project: String) with ResolvedReference
   ...
 }
}

Which is not an enumeration.

Enumerated types, by their conventional encyclopedia definition, are a set of fixed and tagged values. Yes, one of their characteristics is being able to “switch” over their values and let the compiler warn when some are missed, but they also exhibit the capability to iterate over all of the fixed values (being a set and all).

For example, note how the enumeratum library – which aims to introduce idiomatic enums into Scala – has the function findValues, which provides this fundamental capability (accessing all of the enum values).

This capability is not fundamental to ADTs, but rather an ad-hoc ability provided only for that “special case of enums”; hence, ADTs are not a generalization of enumerated types.

But these methods are an integral part of enums. The new syntax should provide them with as little cost as possible. Do we want developers to be required to understand type classes (not a trivial topic at all) to use enums? The whole point is to simplify the use of enums and reduce boilerplate.
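For reference, here is a minimal sketch of the two methods under discussion, assuming the enum syntax that later shipped in Scala 3. `Color` is a hypothetical example type, not something from the proposal.

```scala
// Hypothetical example enum; assumes the enum syntax as shipped in Scala 3.
enum Color {
  case Red, Green, Blue
}

// `values` returns every case, in declaration order.
val allColors: Array[Color] = Color.values

// `valueOf` looks a case up by name and throws
// IllegalArgumentException when the name is unknown.
val green: Color = Color.valueOf("Green")
```

Neither call needs a type class or an import at the use site, which is the "as little cost as possible" property argued for above.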

eyalroth · February 7, 2020, 3:42pm

My thoughts exactly.

And perhaps implement / encode them with opaques? (I’m not sure it’ll work)

Perhaps we don’t even need a new keyword:

sealed trait Option[+T] {
  case Some(x: T)
  case None
}

This might be useful for enums as well, but I agree that this is probably much more important for ADTs – without this feature I don’t see much value in a new syntax for ADTs.

jsuereth · February 8, 2020, 4:58pm

Thanks for posting this! I like this proposal for multi-level enums. As stated in the thread, is this something you think would be needed immediately, or something that could be added in later?

joshlemer · February 9, 2020, 2:07pm

Hey Josh, I don’t think it absolutely must be added immediately, no, but probably some work should be done to ensure they can indeed be added later (i.e. we aren’t backing into a corner syntactically or semantically that would prevent their later addition).

dwijnand · February 9, 2020, 6:14pm

It would probably be interesting to see how many class hierarchies in the ecosystem can and how many cannot be converted to enums. (I just have this feeling that multi-level hierarchies are common, and the best proof that enums can be extended to multi-level enums is by implementing it.)

julienrf · February 10, 2020, 8:59am

My (short) experience so far is that using the enum syntax to define ADTs does not work very well. As soon as you add methods to the enum it feels really weird to have them defined at the same level as enum constructors:

enum Json {
  def decode[A](using decoder: Decoder[A]): Option[A] = ...
  case Bool(value: Boolean)
  case Array(items: Seq[Json])
  ...
}

In this example, the method decode is a member of the Json type, although the constructors Bool and Array are members of the Json value. This means that if you have a value json of type Json on hand, you can call json.decode but not json.Bool. Conversely, you can write Json.Bool but not Json.decode.

In addition to this slightly confusing situation, I found that the enum syntax was too limited for ADTs. Not only are multi-level enums not supported, but enum constructors can’t define methods:

enum Foo {
  case Bar {
    def something = () // NOT SUPPORTED
  }
}

Last but not least, the values and valueOf methods make no sense on ADTs, as was previously said in this thread.

For these reasons, I found that it makes little sense to start defining an ADT with the enum syntax: I start with a sealed trait directly, so that I don’t have to convert my code to this style once I hit one of the aforementioned limitations.

But, should we work on addressing these limitations (as suggested in the multi-levels enums proposal), or should we restrict the scope of enums to effectively enumerated values? I lean towards the second option.

About nested-enums: keep in mind that in the SIP proposal constructing an enum value has the type of the enum, not the type of its constructor (as opposed to the way the current case classes work). I.e., constructing Some("foo") would have type Option[String], not Some[String], if Option was defined as an enum. One of the motivations for having multi-levels ADTs is to be able to distinguish one particular subtype of the top-level type. So, I am not sure that being able to use the enum syntax to define such ADTs would be enough.
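The typing rule described above can be sketched as follows, assuming the behaviour of enums as later shipped in Scala 3. `Opt`, `Present` and `Absent` are hypothetical names chosen to avoid clashing with the standard library.

```scala
// Hypothetical Option-like enum; assumes Scala 3 enum semantics.
enum Opt[+T] {
  case Present(x: T)
  case Absent
}

// Applying the case constructor: with no more specific expected type,
// the inferred type is the enum type Opt[String].
val widened = Opt.Present("foo")
val asEnumType: Opt[String] = widened

// Using `new` keeps the precise case type, so the field `x` is accessible.
val precise: Opt.Present[String] = new Opt.Present("foo")
val payload: String = precise.x
```

So the case type still exists and can be named, but it has to be asked for explicitly, which is the friction anticipated here for multi-level ADTs.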

odersky · February 10, 2020, 9:38am

I believe values and valueOf do make sense if ADTs also define simple cases. They are necessary for (de-)serializing such values, for instance. Without values and valueOf, one would have to generate a complete object with its own JVM class for each simple value in an ADT. By contrast, the aim of the current design is to avoid generating lots of code for simple enum values, no matter whether these values are part of a simple enumeration or a general ADT.

About adding methods to enum cases: I believe that should not be supported. The idea of an ADT is that it’s data! If one wants a more OO approach where methods go with subclasses, then indeed one should use a sealed trait with subclasses.

kai · February 10, 2020, 9:50am

And the idea of Scala is combining OO with FP; an FP-only construct that removes access to the OO toolset is not a good fit for that purpose.

eyalroth · February 10, 2020, 9:54am

I had a similar feeling when trying to convert one of my ADTs (that has methods) to an inner-class-like syntax. I posted about it the other day on the multi-level enum thread. I could perhaps imagine a syntax like the following, which has nothing to do with enums:

sealed trait Json {
  def decode[A](using decoder: Decoder[A]): Option[A] = ...
  sealed {
    case class Bool(value: Boolean) { ... }
    case class Array(items: Seq[Json])
    case object Null
  }
}

We can play around with the keywords, but I think that two things are essential here:

Using case class and case object. This would (a) make it clearer that one is actually a class and the other is an object, and (b) allow for other class-def modifiers – final, private, etc – to be integrated seamlessly.
Nest / indent the nested ADTs under a certain keyword (not necessarily sealed).

I suspect that this would lead to developers using data-only ADT syntax with extension methods (like here) instead of OOP classes. Not to say that this is a bad thing, but just a consideration to be aware of.
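The "data-only ADT plus extension methods" style mentioned above might look like the following minimal sketch, assuming the enum and extension syntax that later shipped in Scala 3. The cases and the `leafCount` method are hypothetical and only loosely follow the Json example in this thread.

```scala
// Data only: the enum body lists cases and nothing else.
enum Json {
  case Bool(value: Boolean)
  case Arr(items: Seq[Json])
  case JNull
}

// Behaviour lives outside the data definition, as an extension method.
extension (json: Json) {
  // Counts the non-array values contained in a Json tree.
  def leafCount: Int = json match {
    case Json.Bool(_)    => 1
    case Json.JNull      => 1
    case Json.Arr(items) => items.map(_.leafCount).sum
  }
}
```

The structure of the ADT stays visible at a glance, at the cost of one pattern match per method instead of one override per case.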

rgwilton · February 10, 2020, 9:57am

Are there any performance considerations for enums? E.g. I would like it if a match statement on enum values collapsed down to a tableswitch or lookupswitch (e.g. as per the @switch annotation).

Would this already work with the current Scala 3 Enum definition, and is this in scope as a consideration of the definition?

odersky · February 10, 2020, 10:01am

I don’t think you need to tell me that. But the motivation for enums was that sometimes all we want is data, and then we should not have to jump through all the hoops of the more general class hierarchy syntax.

Jasper-M · February 10, 2020, 10:12am

I’m not sure why though. With the current proposal you can turn those extension methods into instance methods by just copy pasting them into the enum (and s/response/this/). So I’m not sure why someone would go with the extension methods instead.

eyalroth · February 10, 2020, 10:28am

I was referring to the situation where enum does not allow instance methods. If they are allowed, then by all means there is no need for extension methods, but then the code looks a bit messy (as @julienrf pointed out).

sideeffffect · February 10, 2020, 10:34am

I think with enum, we’re conflating too many things here.

Enumerations – declaring some finite, flat, plain data and assigning a natural number to each piece. Like enums in Java, C, protobuf, … This is where .value, .values, .valuesToEntriesMap, etc. methods make sense. Basically this should replicate what enumeratum does. It would be great if all of these inherited from java.lang.Enum automatically (I’m not sure it’s possible, though). We should use the keyword enum for this:

enum Fruit(sugarContent: Double) {
  case Apple(0.5)
  case Orange(1.25)
}
Fruit.Apple.value == 0
Fruit.Orange.sugarContent == 1.25
Fruit.valuesToEntriesMap(0).sugarContent == 0.5
Fruit.values == List(Fruit.Apple, Fruit.Orange)
Fruit.Apple: Fruit
// Fruit could also implement Eq and Ord out of the box? (based on `.value`)
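The snippet above is a wish list rather than working code. As a rough approximation only, and assuming the enum syntax that later shipped in Scala 3, most of it can be expressed with `ordinal` and `values` standing in for the hypothetical `.value` and `.valuesToEntriesMap`:

```scala
// Sketch: parameterized enum with singleton cases, Scala 3 syntax.
enum Fruit(val sugarContent: Double) {
  case Apple  extends Fruit(0.5)
  case Orange extends Fruit(1.25)
}

// `ordinal` plays the role of the hypothetical `.value`.
val appleIndex: Int = Fruit.Apple.ordinal

// Indexing into `values` stands in for the hypothetical `valuesToEntriesMap`.
val firstFruit: Fruit = Fruit.values(0)
```

Note that `values` is an `Array` here, so comparing it with a `List` requires a conversion such as `Fruit.values.toList`.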

Tagged unions – ADTs, as we know from Haskell/F#/ML/… Can be recursive, but these are still data, so “nesting” doesn’t make much sense here, nor do .values or java.lang.Enum. We should use the keyword union for this.

union Expr {
  case Zero
  case Val(v: Int)
  case Sum(l: Expr, r: Expr)
}
Expr.Val(42): Expr // the fact, that `Val` is implemented with a `class` `Val` is a detail, that should be _hidden_

Any complicated, including nested, sealed hierarchies. We already have everything we need for them in Scala (sealed, trait, class), no need for any other special keywords.

I like how Scala 3 tries to codify common idioms (those would be enum and union); simple things should be easy. On the other hand, I don’t see a reason to complicate the (much rarer) complex things, like nested hierarchies. Those are already possible with Scala 2 tools, like sealed, trait, class.

@odersky would you agree that it’s worthwhile separating the union concept from enum?

eyalroth · February 10, 2020, 10:50am

I support your idea of separation between the features (and also realize now my earlier mistake of not understanding that ADTs are union types), but I’m not convinced that such a “small” scenario merits a new syntax.

As you said, we already have sealed, class, trait, object; why then do we need a custom union syntax? Are those really all that different?

union Expr {
  case Zero
  case Val(v: Int)
  case Sum(l: Expr, r: Expr)
}

sealed trait Expr
object Expr {
  object Zero extends Expr
  case class Val(v: Int) extends Expr
  case class Sum(l: Expr, r: Expr) extends Expr
}

If anything, the more nesting such a hierarchy has, the more boilerplate is required, and the greater the need for a simpler and more concise syntax.

julienrf · February 10, 2020, 10:53am

I’m not sure this would really work because you would anyway need to derive proper (de-)serializers for the case classes. I’m not even sure it would be simple to design a serialization process that would pick up the valueOf method for “simple cases” but would construct proper class instances for the other cases of the same ADT.

odersky · February 10, 2020, 10:55am

Not at all! I am a strong proponent of keeping the two together. As far as I know, every language that supports ADTs also supports enums as a special case of ADTs. An enum is simply an ADT where all cases are simple. The philosophy of the Scala language is to be a unifier, instead of an amalgamation of many different features. I have come to realise that if you ask committees or the general public, the vote always goes towards more differentiated features, which in the end invariably leads to feature creep. So, I take it upon myself to strongly resist this tendency.

One possible design is to stay pure and simply not have any enums at all, since they are not strictly necessary. That’s what Scala 2 did, and we could continue with it. On the other hand, I have the impression that the reduction of boilerplate is worth it. But then it should be one concept, not two or three different ones.

eyalroth · February 10, 2020, 11:04am

I don’t know about pure-data ADTs, but “java” enums are very much missed in Scala. There is the enumeratum library that somewhat provides their utility, but it seems to rely on macros so I’m not sure it’ll be ported to Scala 3.

It seems to me we have two concepts that are similar at their core, but heavily differ in their usage and needs of syntax sugar; enums need values / valueOf; ADTs need defs on nested types and multi-level hierarchies.

Trying to combine two different syntax sugars into one, just because they share a conceptual core, is not a good decision imho.

LPTK · February 10, 2020, 11:53am

That may actually be a good thing. When I define ADTs I always end up stuffing them with methods because it’s the easy thing to do, but then I dislike the result, because it is no longer easy to see the structure of the ADT, with all the method pollution.

I think the better approach (though a bit cumbersome) is to outsource the methods into external traits, which actually also works with the enum syntax:

enum Json {
  case Bool(value: Boolean)    extends Json with BoolImpl
  case Array(items: Seq[Json]) extends Json with ArrayImpl
  def foo: Int
}
private trait ArrayImpl { self: Json.Array =>
  def foo = items.size
  def bar = foo // this method is defined only for Array
}
private trait BoolImpl { self: Json.Bool =>
  def foo = if (value) 1 else 0
}

@main def m = {
  val j = new Json.Array(Seq(Json.Bool(true)))
  assert(j.foo == j.bar)
}

Though again, it’s a little too much boilerplate, especially since it forces specifying the full extends clauses of the ADT cases.