Proposal for Enumerations in Scala

Ah well, thanks for looking :man_shrugging:

I’m a bit torn on the .apply method, and most of my concerns are also related to the exposed type of the values in general, rather than being specifically limited to the .apply method.

On the one hand, it can be really, really, nice to be able to do list.foldLeft(Some(foo))(merge(_, _)) and have it understand you expect the output type to be Option[Foo]. Not having this is mitigated somewhat by helpers like .some and .none, and it’s a familiar problem, so it’s not exactly a deal-breaker.

On the other hand, because none of the subtypes are exposed, it limits what you can do in terms of differentiating typeclass dispatch (e.g. having a different behavior on a leaf than a branch in a GADT). Granted, this doesn’t come up often, and the by-name implicit resolution takes care of many of the cases where I found this most handy, but it can be a bit of a surprise the first time you run into this limitation.
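For readers following along, the foldLeft point can be sketched as below. This is only an illustration, assuming a Scala 3 compiler in which constructor applications of enum cases are widened to the enum type; MyOption, MySome, MyNone, merge and mergeStd are hypothetical names, not part of the proposal.

```scala
// Hypothetical enum, used only to illustrate the widening discussed above.
enum MyOption[+T] {
  case MySome(value: T)
  case MyNone
}

object FoldDemo {
  import MyOption.*

  // Hypothetical merge function over the enum type.
  def merge(acc: MyOption[Int], x: Int): MyOption[Int] = acc match {
    case MySome(v) => MySome(v + x)
    case MyNone    => MyNone
  }

  // MySome(0) is inferred as MyOption[Int], so the fold's result type is the enum type.
  val viaEnum: MyOption[Int] = List(1, 2, 3).foldLeft(MySome(0))(merge(_, _))

  // With the standard library, Some(0) is inferred as Some[Int], so
  // foldLeft(Some(0))(mergeStd(_, _)) does not type-check; the seed has to be
  // widened by hand, for example with Option(0).
  def mergeStd(acc: Option[Int], x: Int): Option[Int] = acc.map(_ + x)
  val viaStd: Option[Int] = List(1, 2, 3).foldLeft(Option(0))(mergeStd(_, _))
}
```

The two folds compute the same thing; the difference is only in how much help the seed value needs to get the expected result type.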

Right, my hope is that we can eventually enable this sort of thing by changing type inference to widen the type of enum cases, the same way 1 is widened to Int by inference but can still be explicitly typed as 1.
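A small sketch of the analogy, assuming a Scala 3 compiler that widens enum case applications unless a more specific type is expected. Box, Full and Vacant are hypothetical names chosen for illustration.

```scala
// Hypothetical enum used to mirror the literal-type analogy.
enum Box[+T] {
  case Full(value: T)
  case Vacant
}

object WideningDemo {
  val widened = 1           // inferred as Int
  val literal: 1 = 1        // explicitly typed with the literal type 1
  val asInt: Int = literal  // the precise type can always be upcast

  val box = Box.Full(42)                  // inferred as Box[Int]
  val full: Box.Full[Int] = Box.Full(42)  // precise type requested explicitly
  val back: Box[Int] = full               // upcast to the enum type
}
```

Because full is statically a Box.Full[Int], its value field is accessible without a pattern match, whereas box needs a match first.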

Granted, this doesn’t come up often, and the by-name implicit resolution takes care of many of the cases where I found this most handy, but it can be a bit of a surprise the first time you run into this limitation.

Yes, I’ve had to explain the current behavior to many people who were confused by it, hence the desire to change it.

If it ends up that nested enums get support (and I really hope they do), would you expect it to widen to the nearest enclosing enum, or the broadest?

Good question, I’d say nearest but I guess we’d have to look at a bunch of usecases to make a decision.

I agree that nearest is probably a good default, principally because it’s easier to upcast than downcast if the default is wrong for a particular situation.

Here was my proposal for multi-level enums https://contributors.scala-lang.org/t/proposal-for-multi-level-enums

I find it very strange how the presence of a single class case changes the whole meaning of an enum from an Enumeration to an ADT. Having methods appear and disappear by adding type parameters or a normal parameter list to a case feels really gross.

How are we supposed to explain this to people?

  1. Alice and Bob are pair programming and Bob sees: enum Alphabet {
  2. “Let’s take a geez at the Alphabet enumeration…” - Bob
  3. They scroll down a long list of singleton cases:
    case A,
    case B,
    case C

    case Z
    case Custom(c: Char)
  4. “That’s not the Alphabet enumeration, that is the Alphabet ADT.” - Alice
  5. “But it said enum.” - Bob
  6. “Haha! Oh Bob, you’re so silly.” - Alice

It seems to me that Enumerations and ADTs really are not the same thing, and the “enum” abstraction is leaky at best. Looking at the desugaring rules, the rules are almost completely different depending on whether it is an Enumeration or an ADT.

Do we have to use the same enum keyword for both of these?

Jamming them all under one syntax means that we can combine all motivations into a single feature and increase the chance of it getting through, but to me it doesn’t make sense.

How about using enum for Enumerations and picking something else for ADTs?

Personally I have no problem with using sealed trait for ADTs. Purely speculative, but I would guess that most people don’t even care whether something is called an ADT or not. All that matters is that it has some generic interface, that I can pattern match on the concrete types, and that the compiler yells at me when I miss one. Both trait and sealed can easily be taught to someone without ever having to mention ADTs.
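As a minimal sketch of that sealed trait style (Shape, Circle, Rect, Empty and area are hypothetical names used only for illustration):

```scala
// A sealed hierarchy: all subtypes must live in this file,
// so the compiler knows the full set of cases.
sealed trait Shape
final case class Circle(radius: Double) extends Shape
final case class Rect(width: Double, height: Double) extends Shape
case object Empty extends Shape

object ShapeDemo {
  // Dropping any one of these cases makes the compiler
  // warn that the match may not be exhaustive.
  def area(s: Shape): Double = s match {
    case Circle(r)  => math.Pi * r * r
    case Rect(w, h) => w * h
    case Empty      => 0.0
  }
}
```

Nothing here requires the term ADT; sealed and trait carry the whole explanation.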

I don’t know if having a new keyword for ADTs would be worth it just for motivations #2 and #4. I don’t have much experience with the latter.

An ADT is a generalization of an enum. So the name might be wrong, but I think the concept itself is not leaky at all. I don’t mind using sealed trait, though it is nice that this takes away some boilerplate for something I use on a daily basis. And most importantly, it automatically uses the generic type of the ADT, which IMO is the best choice for working with sealed hierarchies.

Suggesting that this feature is so ridiculous and indefensible that Bob is silly for calling the thing an enum instead of an ADT feels really gross.

It also doesn’t matter all that much whether Bob calls it an enum or an ADT, and Alice’s dismissal of Bob’s statement feels really gross.

That the desugaring is completely different is IMO overstated; it looks pretty similar to me. In addition, requiring the keyword to change from enum to adt or something similar when adding a non-enum case seems like a lot of complexity for no gain.

That (as you say, and as I agree with) people don’t care whether something is called an ADT or not is exactly why we shouldn’t require a keyword separate from enum IMO.

It might be cleaner not to have values and valueOf instance methods, but have some kind of Enumeration[A] typeclass which provides those methods instead. So an Enumeration[Color] instance would exist, but no Enumeration[Option]. But I’m not sure whether java compatibility requires those instance methods to be there.
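A rough sketch of what such a typeclass might look like in Scala 3. Everything here is an assumption for illustration: the trait is called Enumerated rather than Enumeration to avoid confusion with scala.Enumeration, valueOf returns an Option rather than throwing, and the instance simply delegates to the compiler-generated values array.

```scala
// Hypothetical typeclass exposing the "enumerable" operations.
trait Enumerated[A] {
  def values: List[A]
  // Look a value up by its case name; None if no case matches.
  def valueOf(name: String): Option[A] = values.find(_.toString == name)
}

enum Color {
  case Red, Green, Blue
}

object Enumerated {
  // Instance for Color, placed in the typeclass companion so it is found implicitly.
  given colorEnumerated: Enumerated[Color] = new Enumerated[Color] {
    def values: List[Color] = Color.values.toList
  }

  // Summoner: Enumerated[Color] resolves the instance above.
  def apply[A](using e: Enumerated[A]): Enumerated[A] = e
}
```

With this encoding there is simply no instance for a type like Option, so asking for its values is a compile-time error rather than a method that silently returns a partial list.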

Am I right in thinking that valueOf and values only work for a subset of the items that can be defined as part of an enum?

If so, then I think that combining these into a single user-visible feature could easily end up being confusing for developers using the language, and plausibly it will cause bugs (e.g. when someone adds a parameterized case clause to an existing regular enum, and then finds out that it doesn’t exist in “values”).

Hence I wonder whether it might be useful to define a basic enum (and perhaps call it an enum and maybe make it always extend java.lang.Enum).

Then what is currently called “enum” in Dotty 0.22 could perhaps be given a separate keyword, perhaps something like “choice” (name borrowed from the YANG data modelling language).

valueOf will work on case A but not on case B(i: Int). values will return A but not some instantiation of B: https://scastie.scala-lang.org/S0kMbiL2S4KFItKSevuYDQ

And once you add a type parameter both values and valueOf disappear. IMO those methods only make sense for your AlphabetEnum. And it is a bit weird that they appear when you add a single value to CustomADT and then disappear again when you add a type parameter.

As a data point from another programming language with ADTs, Haskell says:

Can’t make a derived instance of ‘Enum AlphabetADT’:
‘AlphabetADT’ must be an enumeration type
(an enumeration consists of one or more nullary, non-GADT constructors)

I can’t yet see the argument for having values and valueOf on AlphabetADT but not having them on the same type once it has a type parameter. Providing them seems equally useful (i.e. a bit) in both cases.

I thought functional programming was about avoiding side effects :wink:, yet adding/removing case statements to an enum has some surprising side effects.

My instinct is that this will end up being confusing to end users of the language - it feels like it is making what should be a simple feature quite a bit more complex.

What is the simpler proposal you had in mind?

My proposal would be to split these into two separate features:

Use the enum keyword to define enums, as per https://dotty.epfl.ch/docs/reference/enums/enums.html, and probably always make them extend java.lang.Enum (if that works). These should always have valueOf and values defined. Perhaps that also means that they cannot have a type parameter, but that would also be okay.

Then use a different keyword for ADTs (https://dotty.epfl.ch/docs/reference/enums/adts.html)

E.g.

choice Option[+T] {
  case Some(x: T)
  case None
}

These would not implement java.lang.Enum, and would not have valueOf or values defined. I would also suggest limiting the “hierarchical enums” to the ADTs syntax only.

Even though this means that the language ends up with two features rather than one, I think that it ends up with less overall complexity, and the semantics and restrictions of each feature are hopefully easier to define and understand.
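For the first half of this proposal, a minimal sketch of the "simple enum" behaviour, assuming a Scala 3 compiler and an enum with only singleton cases (Weekday and SimpleEnumDemo are hypothetical names). It does not extend java.lang.Enum, and it says nothing about how mixed or parameterized enums behave, which is the contested part of the thread.

```scala
// An enum with only singleton cases: values and valueOf are available.
enum Weekday {
  case Mon, Tue, Wed
}

object SimpleEnumDemo {
  // All cases, in declaration order.
  val all: List[Weekday] = Weekday.values.toList

  // Lookup by case name.
  val parsed: Weekday = Weekday.valueOf("Tue")

  // Each case has a zero-based position.
  val positions: List[Int] = all.map(_.ordinal)

  // An unknown name makes valueOf throw, so it is wrapped in Try here.
  val missing: Boolean = scala.util.Try(Weekday.valueOf("Fri")).isFailure
}
```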

Then use a different keyword for ADTs

Or a modifier, like enum Alphabet vs. enum class Option[+T]

@jsuereth I would suggest adding this link to the original post, as it adds vital information regarding the proposal; for instance, my question regarding the return value of values and valueOf for ADTs.

Me neither, unless I’m using a deeply nested hierarchy, in which case I would like syntactic sugar that allows nested ADTs, maybe even something like this (based on @dwijnand’s example):

sealed trait Reference {
 case trait Build {
   case class BuildRef(build: URI) with ResolvedReference
   case object ThisBuild
 }
 case trait Project {
   case class ProjectRef(build: URI, project: String) with ResolvedReference
   ...
 }
}

Which is not an enumeration.

Enumerated types, by their conventional encyclopedia definition, are a set of fixed and tagged values. Yes, one of their characteristics is being able to “switch” over their values and let the compiler warn when some are missed, but they also exhibit the capability to iterate over all of the fixed values (being a set and all).

For example, note how the enumeratum library – which aims to introduce idiomatic enums into Scala – has the function findValues, which provides this fundamental capability (accessing all of the enum values).

This capability is not fundamental to ADTs, but rather an ad-hoc ability provided only for that “special case of enums”; hence, ADTs are not a generalization of enumerated types.

But these methods are an integral part of enums. The new syntax should provide them with as little cost as possible. Do we want developers to be required to understand type classes (not a trivial topic at all) to use enums? The whole point is to simplify the use of enums and reduce boilerplate.

My thoughts exactly.

And perhaps implement / encode them with opaques? (I’m not sure it’ll work)

Perhaps we don’t even need a new keyword:

sealed trait Option[+T] {
  case Some(x: T)
  case None
}

This might be useful for enums as well, but I agree that this is probably much more important for ADTs – without this feature I don’t see much value in a new syntax for ADTs.