Proposal for Enumerations in Scala

jsuereth · February 5, 2020, 6:23pm

The last bullet point in the linked github issue specifically calls out the .apply method concerns.

Specifically:

Decide if .apply on enum companion gives precise type or not.
- Additionally, if so, decide if simple enum constants become case object, not val.

See #7 on the specification for translation: https://dotty.epfl.ch/docs/reference/enums/desugarEnums.html

morgen-peschke · February 5, 2020, 6:26pm

Fair point, but not quite what I was looking for. I was hoping to get a reference to something like the linked issues for the other bullet points, to get some context on the existing discussion before putting in my 2 cents.

jsuereth · February 5, 2020, 6:36pm

… I was hoping to get a reference to something like the linked issues for the other bullet points …

Sorry, I wasn’t able to find anything more than what was linked. I believe there has been in-person discussion, and would love to hear your $.02 on it.

morgen-peschke · February 5, 2020, 7:33pm

Ah well, thanks for looking

I’m a bit torn on the .apply method, and most of my concerns are also related to the exposed type of the values in general, rather than being specifically limited to the .apply method.

On the one hand, it can be really, really, nice to be able to do list.foldLeft(Some(foo))(merge(_, _)) and have it understand you expect the output type to be Option[Foo]. Not having this is mitigated somewhat by helpers like .some and .none, and it’s a familiar problem, so it’s not exactly a deal-breaker.
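A minimal sketch of the familiar version of this problem, using the standard library's Option. The merge helper and the Int element type are assumptions made up for illustration; the post does not define them.

```scala
// Hypothetical helper: fold the next element into a running total.
def merge(acc: Option[Int], next: Int): Option[Int] =
  acc.map(_ + next)

val list = List(1, 2, 3)

// With Some(0) as the seed, the accumulator type is inferred as Some[Int],
// so a merge that returns Option[Int] is typically rejected by the compiler:
// list.foldLeft(Some(0))(merge(_, _))

// Ascribing the seed, or building it with Option.apply, gives the wider type.
val viaAscription = list.foldLeft(Some(0): Option[Int])(merge(_, _))
val viaApply = list.foldLeft(Option(0))(merge(_, _))
```

If enum case constructors widen to the enum type, the commented-out form would be the one that works without any ascription.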

On the other hand, because none of the subtypes are exposed, it limits what you can do in terms of differentiating typeclass dispatch (e.g. having a different behavior on a leaf than a branch in a GADT). Granted, this doesn’t come up often, and the by-name implicit resolution takes care of many of the cases where I found this most handy, but it can be a bit of a surprise the first time you run into this limitation.

smarter · February 5, 2020, 7:57pm

Right, my hope is that we can eventually enable this sort of thing by changing type inference to widen the type of enum cases, the same way 1 will be widened to Int by inference but can still be explicitly typed as 1.
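The literal analogy can be sketched as follows, assuming Scala 2.13 or later, where literal singleton types can be written explicitly. The value names are made up for illustration.

```scala
// Without an annotation, the literal's type is widened to Int.
val inferred = 1

// With an explicit annotation, the singleton literal type 1 is kept.
val precise: 1 = 1

// A value of the precise type can always be used where an Int is expected.
val widenedAgain: Int = precise
```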

Granted, this doesn’t come up often, and the by-name implicit resolution takes care of many of the cases where I found this most handy, but it can be a bit of a surprise the first time you run into this limitation.

Yes, I’ve had to explain the current behavior to many people who were confused by it, hence the desire to change it.

morgen-peschke · February 5, 2020, 8:10pm

If it ends up that nested enums get support (and I really hope they do), would you expect it to widen to the nearest enclosing enum, or the broadest?

smarter · February 5, 2020, 8:23pm

Good question, I’d say nearest, but I guess we’d have to look at a bunch of use cases to make a decision.

morgen-peschke · February 5, 2020, 9:48pm

I agree that nearest is probably a good default, principally because it’s easier to upcast than downcast if the default is wrong for a particular situation.

joshlemer · February 6, 2020, 10:36pm

Here was my proposal for multi-level enums: https://contributors.scala-lang.org/t/proposal-for-multi-level-enums

steinybot · February 7, 2020, 4:53am

I find it very strange how the presence of a single class case changes the whole meaning of an enum from an Enumeration to an ADT. Having methods appear and disappear by adding type parameters or a normal parameter list to a case feels really gross.

How are we supposed to explain this to people?

Alice and Bob are pair programming and Bob sees:
enum Alphabet {
“Let’s take a geez at the Alphabet enumeration…” - Bob
They scroll down a long list of singleton cases:
  case A
  case B
  case C
  …
  case Z
  case Custom(c: Char)
}
“That’s not the Alphabet enumeration, that is the Alphabet ADT.” - Alice
“But it said enum.” - Bob
“Haha! Oh Bob, you’re so silly.” - Alice

It seems to me that Enumerations and ADTs really are not the same thing and the “enum” abstraction is leaky at best. Looking at the desugaring rules, there are almost completely different rules depending on whether it is an Enumeration or an ADT.

Do we have to use the same enum keyword for both of these?

Jamming them all under one syntax means that we can combine all motivations into a single feature and increase the chance of it getting through but to me it doesn’t make sense.

How about using enum for Enumerations and picking something else for ADTs?

Personally I have no problem with using sealed trait for ADTs. Purely speculative, but I would guess that most people don’t even care whether something is called an ADT or not. All that matters is that it has some generic interface and that I can pattern match on the concrete types and get the compiler to yell at me when I miss one. Both trait and sealed can easily be taught to someone without ever having to mention ADT.

I don’t know if having a new keyword for ADTs would be worth it just for motivations #2 and #4. I don’t have much experience with the latter.
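As a rough illustration of the sealed trait encoding described here, this sketch reuses the Alphabet example from earlier in this post, cut down to two letters. The render function is a made-up name.

```scala
sealed trait Alphabet
case object A extends Alphabet
case object B extends Alphabet
final case class Custom(c: Char) extends Alphabet

// Pattern match on the concrete types behind the common interface.
def render(letter: Alphabet): String = letter match {
  case A         => "A"
  case B         => "B"
  case Custom(c) => c.toString
  // Because Alphabet is sealed, dropping one of these cases makes the
  // compiler report the match as non-exhaustive.
}
```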

bmeesters · February 7, 2020, 7:57am

An ADT is a generalization of an enum. So the name might be wrong, but I think the concept itself is not leaky at all. I don’t mind using sealed trait, though it is nice that this takes away some boilerplate for something I use on a daily basis. And most importantly, it automatically uses the generic type of the ADT, which IMO is the best choice for working with sealed hierarchies.

martijnhoekstra · February 7, 2020, 9:09am

Suggesting that this feature is so ridiculous and indefensible that Bob is silly for calling the thing an enum instead of an ADT feels really gross.

It also doesn’t matter all that much whether Bob calls it an enum or an ADT, and Alice’s dismissal of Bob’s statement it’s an ADT feels really gross.

That the desugaring is completely different is IMO overstated; it looks pretty similar to me. In addition, requiring a change of keyword from enum to adt (or something similar) when adding a non-enum case also seems like a lot of complexity for no gain.

That (as you say, and as I agree with) people don’t care whether something is called an ADT or not is exactly why we shouldn’t require a keyword separate from enum IMO.

Jasper-M · February 7, 2020, 9:50am

It might be cleaner not to have values and valueOf instance methods, but have some kind of Enumeration[A] typeclass which provides those methods instead. So an Enumeration[Color] instance would exist, but no Enumeration[Option]. But I’m not sure whether java compatibility requires those instance methods to be there.
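A minimal sketch of what such a typeclass could look like in Scala 3 syntax. FiniteEnum is a made-up name standing in for the Enumeration[A] idea (chosen so it does not clash with scala.Enumeration), and the lookup returning an Option is also an assumption rather than part of the proposal.

```scala
enum Color { case Red, Green, Blue }

// Hypothetical typeclass: evidence that A has a finite, listable set of values.
trait FiniteEnum[A] {
  def values: Seq[A]
  def valueOf(name: String): Option[A]
}

object FiniteEnum {
  // An instance exists for Color, built from the companion's generated values.
  given colorEnum: FiniteEnum[Color] = new FiniteEnum[Color] {
    def values: Seq[Color] = Color.values.toSeq
    def valueOf(name: String): Option[Color] = values.find(_.toString == name)
  }
  // No instance is provided for Option, so allValues[Option[Int]] does not compile.
}

// Generic code asks for the evidence instead of relying on companion methods.
def allValues[A](using e: FiniteEnum[A]): Seq[A] = e.values
```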

rgwilton · February 7, 2020, 9:59am

Am I right in thinking that valueOf and values only work for a subset of the items that can be defined as part of an enum?

If so, then I think that combining these into a single user-visible feature could easily end up being confusing for developers using the language, and plausibly it will cause bugs (e.g. when someone adds a parameterized case clause to an existing regular enum, and then finds out that it doesn’t exist in “values”).

Hence I also wonder whether it might also be useful to define a basic enum (and perhaps call it an enum and maybe make it always extend java.lang.Enum).

Then what is currently called “enum” in Dotty 0.22 could perhaps be given a separate keyword, perhaps something like “choice” (name borrowed from the YANG data modelling language).
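For reference, a sketch of a basic enum that opts in to java.lang.Enum, using the extends form documented for Scala 3. Level and its cases are made-up names, and the exact behaviour in Dotty 0.22 may differ.

```scala
// A simple enum that is also a java.lang.Enum, so Java code can treat it as one.
enum Level extends java.lang.Enum[Level] { case Low, Medium, High }

// The companion lists every case, in declaration order.
val all: Array[Level] = Level.values

// Look a case up by its name.
val parsed: Level = Level.valueOf("Medium")

// compareTo is inherited from java.lang.Enum and follows declaration order.
val order: Int = Level.High.compareTo(Level.Low)
```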

martijnhoekstra · February 7, 2020, 10:10am

valueOf will work on case A but not on case B(i: Int). values will return A but not some instantiation of B: https://scastie.scala-lang.org/S0kMbiL2S4KFItKSevuYDQ

Jasper-M · February 7, 2020, 10:26am

And once you add a type parameter both values and valueOf disappear. IMO those methods only make sense for your AlphabetEnum. And it is a bit weird that they appear when you add a single value to CustomADT and then disappear again when you add a type parameter.

As a data point from another programming language with ADTs, Haskell says:

Can’t make a derived instance of ‘Enum AlphabetADT’:
‘AlphabetADT’ must be an enumeration type
(an enumeration consists of one or more nullary, non-GADT constructors)

martijnhoekstra · February 7, 2020, 10:38am

I can’t yet see the argument for having values and valueOf on AlphabetADT but not having them on the same enum with a type parameter. Providing them seems equally useful (i.e. a bit) in both cases.

rgwilton · February 7, 2020, 10:38am

I thought functional programming was about avoiding side effects, yet adding/removing case statements to an enum has some surprising side effects.

My instinct is that this will end up being confusing to end users of the language - it feels like it is making what should be a simple feature quite a bit more complex.

martijnhoekstra · February 7, 2020, 10:40am

What is the simpler proposal you had in mind?

rgwilton · February 7, 2020, 10:51am

My proposal would be to split these into two separate features:

Use the enum keyword to define enums, as per https://dotty.epfl.ch/docs/reference/enums/enums.html, and probably always make them extend java.lang.Enum (if that works). These should always have valueOf and values defined. Perhaps that also means that they cannot have a type parameter, but that would also be okay.

Then use a different keyword for ADTs (https://dotty.epfl.ch/docs/reference/enums/adts.html)

E.g.

choice Option[+T] {
  case Some(x: T)
  case None
}

These would not implement java.lang.Enum, and would not have valueOf or values defined. I would also suggest limiting the “hierarchical enums” to the ADTs syntax only.
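For comparison, a sketch of the same shape written with the existing enum keyword, following the ADT page linked above. Maybe, Just, Empty and getOrElse are illustrative names, chosen to avoid clashing with the standard Option.

```scala
// A parameterized ADT: one class case and one singleton case.
enum Maybe[+T] {
  case Just(x: T)
  case Empty
}

// Pattern matching is checked for exhaustiveness, as with a sealed trait.
def getOrElse[T](m: Maybe[T], default: T): T = m match {
  case Maybe.Just(x) => x
  case Maybe.Empty   => default
}

val present = getOrElse(Maybe.Just(1), 0)
val absent = getOrElse(Maybe.Empty, 0)
```

Under the split proposed here, only the keyword would change for a definition like this; the point of the split is that nothing about it would promise valueOf or values.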

Even though this means that the language ends up with two features rather than one, I think that it ends up with less overall complexity, and the semantics and restrictions of each feature are hopefully easier to define and understand.