Why is the values method on enum an Array? Could it be an immutable ArraySeq?

bjornregnell · October 11, 2024, 1:59pm

Why does the values method in the companion of every enum give an Array and not an immutable ArraySeq?

Welcome to Scala 3.5.1 (21.0.4, Java OpenJDK 64-Bit Server VM).
Type in expressions for evaluation. Or try :help.

scala> enum Color { case Black, Red }
// defined class Color

scala> Color.values
val res0: Array[Color] = Array(Black, Red)

If it was an immutable.ArraySeq then it wouldn’t need to be a defensive copy as it could just wrap the underlying Array…
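A minimal sketch of what the defensive copy means in practice, for a Scala 3 REPL or worksheet. The helper names (freshCopyEachCall, mutationStaysLocal, colors) are made up for illustration; ArraySeq.unsafeWrapArray is the standard-library call that wraps an array without copying it:

```scala
import scala.collection.immutable.ArraySeq

enum Color { case Black, Red }

// Every call to values hands out a new array instance.
def freshCopyEachCall: Boolean = Color.values ne Color.values

// Mutating one copy does not affect what later calls return.
def mutationStaysLocal: Boolean = {
  val copy = Color.values
  copy(0) = Color.Red
  Color.values(0) == Color.Black
}

// Wrapping the array once gives an immutable view with no further copying.
val colors: ArraySeq[Color] = ArraySeq.unsafeWrapArray(Color.values)
```

Here the wrap still pays for one copy (the call to values); the point of the question is that a compiler-generated ArraySeq could skip even that.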

bjornregnell · October 11, 2024, 2:10pm

(And I wouldn’t get “Why” questions from students that I cannot answer, even though it’s good to show students every now and then that professors can’t answer everything.)

vasilmkd · October 11, 2024, 2:36pm

Java interop. All Java enums have a values() method that returns an array.

bishabosha · October 11, 2024, 2:52pm

I wonder if we could migrate away from this, by making values only visible to Java, and having a more idiomatic API on the Scala side - and possibly keep opt-in visibility of values?

bjornregnell · October 11, 2024, 3:04pm

(Or) perhaps (also) provide a new toSeq method that gives an ArraySeq

Jasper-M · October 11, 2024, 3:14pm

I guess you could make it return an IArray and that would be binary compatible with Java.

bjornregnell · October 11, 2024, 4:30pm

That would indeed be an improvement but also IArrays “don’t behave” (i.e. no structural equality etc., unlike a normal collection such as ArraySeq).

Perhaps the best thing would be to change values to return an IArray and add a toSeq method that returns an immutable.ArraySeq to get the best of both worlds.
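To make the “don’t behave” point concrete, here is a small sketch comparing ==. The val names are only illustrative:

```scala
import scala.collection.immutable.ArraySeq

// Arrays compare by reference with ==, and an IArray is an array at runtime.
val arraysEqual: Boolean  = Array(1, 2) == Array(1, 2)
val iarraysEqual: Boolean = IArray(1, 2) == IArray(1, 2)

// ArraySeq is a regular collection and compares element by element.
val seqsEqual: Boolean = ArraySeq(1, 2) == ArraySeq(1, 2)
```

Only seqsEqual is true, which is the kind of behaviour a learner expects from a collection.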

bishabosha · October 11, 2024, 5:02pm

it’d still have to be copied on each call, which isn’t ideal

MateuszKowalewski · October 11, 2024, 6:53pm

Who uses Scala enums from Java?

Actually, who uses any Scala sugar from Java?

Calling any more advanced Scala construct (especially where Scala uses heavily sugared encodings) from Java does not work properly anyway and is at best a gigantic PITA. That’s why all big Scala libs / frameworks with Java support have a dedicated Java API that manually wraps Scala constructs into proper Java constructs so you don’t need to know Scala compiler internals to call some code. So imho the “interop Scala from Java” is mostly irrelevant and should not be an argument for anything.

Scala enums should have an idiomatic Scala API. @bjornregnell and @bishabosha are spot on here!

That said, I think it’s not good to have two methods with the same purpose. That’s very confusing and just more to explain. Especially if one method is only there because of some irrelevant “Java compatibility”… (Also one would need support from lint tools to mark use of the “Java compatibility” method in Scala code as a smell. So I think having a “Java compatibility” method available by default just creates a stream of continuous busy work down the road.)

In case the “Java compatibility” method is strictly needed for some reason I’m not aware of, it should at least materialize only when a Scala enum extends Java’s enum (which is the hack to provide enum compatibility on the bytecode level). Whether that generated compatibility method should then be visible from Scala, I’m not sure. (My gut feeling would be to hide it like other synthetic methods, but I didn’t think too long about the pros and cons of doing so.)
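For readers who have not seen it, this is roughly what “a Scala enum extends Java’s enum” looks like today. It is only a sketch of the existing opt-in; the helper names are invented for the example:

```scala
// Opting in to Java enum compatibility by extending java.lang.Enum.
enum Color extends java.lang.Enum[Color] { case Black, Red }

// Methods inherited from java.lang.Enum become available.
def comparesLikeJava: Boolean = Color.Black.compareTo(Color.Red) < 0
def javaName: String = Color.Red.name
```

The suggestion above is that the array-returning values could be tied to this opt-in instead of being generated for every enum.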

Also I’m not sure why the “ValueSet” of enums should be anything “array like”. It’s by its very nature a set. So it should be in fact a set; or something close to a set, like a tuple.

I think tuples would be even better than a set as they’re better suited for generic & type-level programming. OTOH tuples can have duplicate values, but enums can’t have duplicate cases. So tuples aren’t the optimal model. Also, it would be nice to have set-like methods like union, intersection, diff, etc. on “EnumValueSets”. So “EnumValueSet” shouldn’t be a naked tuple? Or do tuples have, or will get set-like methods? Or maybe “EnumValueSet” should be a real thing in fact?

dwalend · October 11, 2024, 7:24pm

Scala enums should have an idiomatic Scala API.

Yes please!

When I am writing OO-style code in Scala and want something that works like an enum I tend to build case objects that all implement the same sealed trait, then pack them into an immutable lazy val Set. That pattern seems to give me a lot of Scala goodness: match/case, somewhere to hang interesting intrinsic behavior, and something I can iterate, map, filter, extend, and whatnot.

What I don’t like about it is I have to type it all in every time; it’s sugarless.
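A rough sketch of that hand-written pattern, for comparison with enum. The names (hex, all, describe) are placeholders, not anything from a library:

```scala
sealed trait Color:
  def hex: String // somewhere to hang intrinsic behavior

object Color:
  case object Black extends Color { val hex = "#000000" }
  case object Red extends Color { val hex = "#ff0000" }

  // The complete set of cases, typed in by hand.
  lazy val all: Set[Color] = Set(Black, Red)

// Exhaustive matching works because the trait is sealed.
def describe(c: Color): String = c match
  case Color.Black => "black"
  case Color.Red   => "red"
```

Everything in object Color has to be kept in sync manually, which is the sugarless part.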

Or maybe “EnumValueSet” should be a real thing in fact?

I like the idea of having something that specific. It can extend whatever Set- and ordered-ness deemed best.

Could “EnumValueSet[The Enum’s Trait]” be so specific that it is always the complete Set of the enum?

Could we have access to the enum for type-level programming? (I want to use it for types for units of measure.)

We do need something for Java enums. However, that seems like something that belongs off in a standard java compatibility library. Maybe a standard asJavaEnumWorkAlike method on EnumValueSet ?

MateuszKowalewski · October 11, 2024, 9:08pm

I’m not sure “EnumValueSet” should be taken verbatim.

It’s for now just a placeholder for some concept that needs further definition.

API-wise I think I would prefer some compiler generated object member (which likely implements that “EnumValueSet” type) on an enum’s companion object. So the API would look more like $SomeEnumType.ValueSet (for the “EnumValueSet” value, and as it’s an object, $SomeEnumType.ValueSet.type for the type) instead of being a type parametric constructor. But maybe having an exposed stand-alone constructor has some benefits?

Despite that, the main question is still: What is this “EnumValueSet”?

It could be completely opaque (and maybe even “pure virtual” in the sense that it’s just a compile time abstraction), and for example just offer some asTuple and asSet methods. (Which would then be kind of magic, I guess.)

It could also be a dedicated new type of Set, with all the collection API. Maybe then with some features from tuples as well?

Maybe it could be also something else I didn’t think of, like something in between the two mentioned possibilities.

I’m not sure what’s most feasible from the technical standpoint anyway. Also the mapping to Java isn’t clear. Maybe the compiler could generate somehow two things at once: In case the Scala enum extends Java’s Enum a hidden “values()” method, and in any case for regular use in Scala that .ValueSet object member on the enum.

While proofreading this, I noticed: maybe this “EnumValueSet” is not needed at all.

What if I could use some compiler generated $SomeEnumType.toSet or $SomeEnumType.toTuple methods to get directly the enum values as the specified data structure?

Such an API looks very Scala-ish, returns nice and idiomatic data structures specific to a use-case, and it would avoid creating a new data type at all.

I can’t come up with some example where one wants to look at an “EnumValueSet” as a Set[$SomeEnumType] and a (EnumValue1, EnumValue2, <etc>) tuple at the same time. So maybe having just the two mentioned methods would be sufficient.
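Written by hand today, the two conversions might look something like the following. This is only an approximation of the idea: toSet and toTuple are the hypothetical names from this post, defined manually here rather than generated by the compiler:

```scala
enum Color:
  case Black, Red

object Color:
  // Rich collection API over the enum values.
  lazy val toSet: Set[Color] = values.toSet

  // Precise per-element type information via singleton types.
  lazy val toTuple: (Black.type, Red.type) = (Black, Red)
```

The tuple keeps the exact type of each case, while the set only knows Color.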

But then you would have at least two methods to get the values of an enum… Each with a very specific use-case in mind (rich collection API vs. accessing very precise type information). Even though this is likely simpler to implement, it looks more involved when it comes to teaching / learning. Kind of back to square one. (At least you would get proper data types, instead of a mutable Array…)

Still I’m not sure about the value of a dedicated set-like data type which is able to carry exact type information about each of its elements. Is something like that useful outside of the “EnumValueSet” use-case? As I said, I couldn’t come up with any examples for the use of such a data type. So maybe such a construct is overblown, and having two specific conversion methods for enums is pragmatic enough.

bjornregnell · October 12, 2024, 10:11am

I think the main use case of values() is iterating over simple enum values in order of their ordinal number so an immutable ArraySeq[Color] would be OK to cover that use case using a “nice and normal” immutable collection from the stdlib.

odersky · October 13, 2024, 7:25pm

We want to minimize the required set of the Scala library that needs to be there to support Scala, the language. An ArraySeq contains lots of operations and lots of superclasses. An Array’s definition is tiny by comparison. But an IArray would be better since we definitely do not want to mutate the values. The problem is that when enums were designed IArray did not exist yet.

mberndt · October 13, 2024, 8:08pm

Cool, are we going to switch varargs from Seq to Array too?

odersky · October 13, 2024, 8:43pm

Mapping varargs to Seq is a precedent, but not necessarily a good one. In retrospect these should also have been IArrays. This would have simplified so many things, in particular with respect to Java interop and also for performance. Anyway, making a questionable choice once is not a good reason for making it again.

bjornregnell · October 14, 2024, 3:45pm

From a learner and teacher perspective I am pretty glad that varargs is a “normal” immutable.Seq that behaves, i.e. has structural equality, doesn’t require a class tag, has a nice toString, etc. If we would change varargs to an IArray that would create a strange exception from a learner’s perspective IMHO.
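A tiny sketch of the varargs behaviour meant here (the function names are invented for the example):

```scala
// Inside the method, a varargs parameter is an ordinary immutable Seq.
def asSeq(xs: String*): Seq[String] = xs

// So two argument lists with the same elements compare equal with ==.
def sameArgs(xs: Int*)(ys: Int*): Boolean = xs == ys
```

For instance, sameArgs(1, 2, 3)(1, 2, 3) is true, which would not hold with reference-compared arrays.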

And I also think the enum values method would be less surprising to a learner if it was a “normal” immutable Seq. And I, as a teacher, currently need to spend time on this exception, which in turn spends students’ grit that could be better spent elsewhere.

And from a performance perspective, I guess an IArray must still be a defensive copy as it could leak as an Array from Java and then it would be a bug if it exposed the underlying structure, if I understood this correctly?
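As far as I understand it, the leak could be simulated from Scala with a cast, since an IArray is a plain array at runtime. This sketch (leakDemo is a made-up name) stands in for what Java code receiving the array could do:

```scala
// Returns the first element before and after a mutation through the cast.
def leakDemo(): (Int, Int) =
  val frozen: IArray[Int] = IArray(1, 2, 3)
  val before = frozen(0)

  // Java would simply see an int[]; the cast plays that role here.
  val leaked = frozen.asInstanceOf[Array[Int]]
  leaked(0) = 99

  (before, frozen(0)) // (1, 99): the "immutable" array changed
```

So if values handed out the underlying array typed as IArray, any Java caller could corrupt it for everyone.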

You could argue that a user could always memoize it like this:

scala> enum Color:
     |   case Red, Black
     | object Color:
     |   val toSeq = values.toSeq
     | 
// defined class Color
// defined object Color
                                                                                
scala> Color.toSeq
val res0: Seq[Color] = ArraySeq(Red, Black)

but that is boilerplate and doubles the number of lines needed for simple enums.

spamegg1 · October 14, 2024, 6:47pm

Agreed, although from a different point of view, you can see it as a teachable moment: (in my best teacher impression / voice) “All languages have flaws. All languages have weird edge cases, exceptions, irregularities and gotchas like this! Watch out for them!”

Similar to how the .split method returns an annoying Array (then .toString won’t let us see it), or how .sliding returns an Iterator that can only be used once… I actually use this deliberately as a debugging exercise.
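Both gotchas fit in a few lines, roughly like this (the val names are just for the exercise):

```scala
// split gives an Array, whose toString is not readable; mkString helps.
val parts: Array[String] = "a,b,c".split(",")
val readable: String = parts.mkString("Array(", ", ", ")")

// sliding gives an Iterator, which is used up after one traversal.
val windows: Iterator[List[Int]] = List(1, 2, 3, 4).sliding(2)
val firstPass: List[List[Int]] = windows.toList
val exhausted: Boolean = !windows.hasNext // true after the first pass
```

Asking students why a second traversal of windows yields nothing makes for a short debugging task.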

I think it’s nice to fearlessly show the flaws in all languages (and briefly explain the reasons, whether historical, technical, or just bad design) so that students have a more flexible mentality; they will be less likely to fall into language / paradigm camps and wars, or worship some and hate others. But maybe that’s a more long-term goal…

MateuszKowalewski · October 15, 2024, 12:38am

I think that the idea to “minimize the language” is a very good guiding principle.

But as with all good ideas, one should not overdo it.

Things should be simple. But not simpler…

The language needs at its lowest level for sure some proper collection type(s)!

Dumbing everything down to Java level is counterproductive in my opinion. What value does Scala have then, when it’s at its core just Java with some syntax sugar?

The idea to just ignore flaws and gotchas (and even add some on top here and there from time to time) also doesn’t look right imho. That’s the way of PHP…

All the mentioned things should get fixed. A little bit more perfectionism, please.

I still think that things should be modeled correctly first and foremost. Because correct and coherent models make things easy to understand.

If all you have are exceptions on top of exceptions, and implementation details leaking into user-space, that’s very hard to teach, understand, and memorize. Scala has imho already way too many of such things. (A few were mentioned by @spamegg1). So the goal should be to reduce the amount of such things with time. Not to double down on such flaws.

BTW: Are tuples and sets also not part of “core Scala”? But if they are, why not change the mentioned things to that? (I would not insist, but if Seq is too “heavyweight” let’s see whether other alternatives besides the IArray fit better).

MateuszKowalewski · October 15, 2024, 12:51am

To go a little bit on a tangent: I like the idea to modularize the std. lib.

Rust has for example a two part std. lib with a core module, and the rest that sits on top.

I like the idea to define a core module for the Scala std. lib, and module(s) on top of that.

That would allow much smaller binaries if one does something a la “static linking”. The current Scala std. lib is “gigantic”; but you need to pull it in completely every time. This bloats stand alone applications quite substantially. Would be cool to solve this issue. (Java managed to modularize their std. lib… Just saying.)

bjornregnell · October 17, 2024, 9:23am

Well, yes, we should try to use quirks to our advantage and, if the student is ready for it, expand the view on language design trade-offs such as historical prevalence and backward compatibility goals etc.

But, students are different. And learning professional software engineering is a great undertaking that needs to start with many small steps. I often witness students that have a threshold and wear down their grit on things that are not really giving them value in their current context of learning and their current knowledge gaps and conceptual murkiness.

The regularity of the standard library and its integration with the language constructs (for-yield vs map etc) and the (mostly successful) hiding of Java quirks is a major reason that Scala is such an excellent beginner language. That is why I am bringing up this irregularity of enum values not behaving as expected.

But there is of course a cost of e.g. adding lazy val toSeq = values.toSeq to every companion of a simple enum; but I think it might be worth it, esp. as there is both a performance argument and a regularity argument. And these hold even if values is changed to return an IArray instead of an Array.