Why is the values method on enum an Array? Could it be an immutable ArraySeq?

Why does the values method in the companion of every enum give an Array and not an immutable ArraySeq?

Welcome to Scala 3.5.1 (21.0.4, Java OpenJDK 64-Bit Server VM).
Type in expressions for evaluation. Or try :help.

scala> enum Color { case Black, Red }
// defined class Color

scala> Color.values
val res0: Array[Color] = Array(Black, Red)

If it was an immutable.ArraySeq then it wouldn’t need to be a defensive copy as it could just wrap the underlying Array…
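To make the defensive-copy point concrete, here is a minimal sketch (not from the original post). It assumes the behavior observed with current Scala 3 compilers, where every call to `values` hands out a fresh array; the names `first`, `second`, `mutateCopy` and `wrapped` are made up for illustration.

```scala
import scala.collection.immutable.ArraySeq

enum Color { case Black, Red }

// Two calls hand out two distinct array instances.
val first: Array[Color] = Color.values
val second: Array[Color] = Color.values
val distinctArrays: Boolean = !(first eq second)

// Mutating one copy does not affect what a later call returns.
def mutateCopy(): Color =
  val copy = Color.values
  copy(0) = Color.Red
  Color.values(0)

// An immutable wrapper around one array: created once, no copy per access.
val wrapped: ArraySeq[Color] = ArraySeq.unsafeWrapArray(Color.values)
```

With an immutable wrapper like `wrapped`, callers cannot mutate the backing array through the collection API, which is why a copy per call would no longer be needed.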

(And I wouldn’t get “Why?” questions from students that I cannot answer (even though it’s good to show students every now and then that professors can’t answer everything :slight_smile: ))

Java interop. All Java enums have a values() method that returns an array.

I wonder if we could migrate away from this by making values visible only to Java and having a more idiomatic API on the Scala side, possibly keeping opt-in visibility of values?

(Or) perhaps (also) provide a new toSeq method that gives an ArraySeq

I guess you could make it return an IArray and that would be binary compatible with Java.

That would indeed be an improvement, but IArrays also “don’t behave” (i.e. no structural equality etc., unlike a normal collection such as ArraySeq).

Perhaps the best thing would be to change values to return an IArray and add a toSeq method that returns an immutable.ArraySeq to get the best of both worlds.
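The “don’t behave” remark can be illustrated with a small sketch. It relies on an IArray being a plain array at runtime, so `==` compares references, while ArraySeq compares elements. The value names are made up for this example.

```scala
import scala.collection.immutable.ArraySeq

val ia1 = IArray(1, 2, 3)
val ia2 = IArray(1, 2, 3)
val seq1 = ArraySeq(1, 2, 3)
val seq2 = ArraySeq(1, 2, 3)

// Same elements, but two different array instances: reference equality says false.
val iarraysEqual: Boolean = ia1 == ia2

// ArraySeq is a regular collection with structural equality: true.
val seqsEqual: Boolean = seq1 == seq2
```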

It’d still have to be copied on each call, which isn’t ideal.

Who uses Scala enums from Java?

Actually, who uses any Scala sugar from Java? :joy:

Calling any more advanced Scala construct (especially where Scala uses heavy sugared encodings) from Java does not work properly anyway and is at best a gigantic PITA. That’s why all big Scala libs / frameworks with Java support have a dedicated Java API that manually wraps Scala constructs into proper Java constructs so you don’t need to know Scala compiler internals to call some code. So imho the “interop Scala from Java” is mostly irrelevant and should not be an argument for anything.

Scala enums should have an idiomatic Scala API. @bjornregnell and @bishabosha are spot on here!

That said, I think it’s not good to have two methods with the same purpose. That’s very confusing and just more to explain. Especially if one method is only there because of some irrelevant “Java compatibility”… (Also, one would need support from lint tools to mark use of the “Java compatibility” method in Scala code as a smell. So I think having a “Java compatibility” method available by default just creates a stream of continuous busy work down the road).

In case the “Java compatibility” method is strictly needed for some reason I’m not aware of, it should at least materialize only when a Scala enum extends Java’s Enum (which is the hack to provide enum compatibility on the bytecode level). Whether that generated compatibility method should then be visible from Scala, I’m not sure. (My gut feeling would be to hide it like other synthetic methods, but I didn’t think too long about the pros and cons of doing so).
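For readers who have not seen the opt-in form mentioned above, a minimal sketch of a Scala 3 enum that extends `java.lang.Enum` follows. The value names are illustrative only.

```scala
// Opting in to Java enum compatibility by extending java.lang.Enum.
enum Color extends java.lang.Enum[Color] { case Black, Red }

// Members inherited from java.lang.Enum become available.
val redName: String = Color.Red.name
val order: Int = Color.Red.compareTo(Color.Black)
val isJavaEnum: Boolean = Color.Red.isInstanceOf[java.lang.Enum[?]]
```

The proposal in the post would tie the array-returning `values` to this opt-in form only; that part is an idea from the discussion, not current compiler behavior.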

Also I’m not sure why the “ValueSet” of enums should be anything “array like”. It’s by its very nature a set. So it should be in fact a set; or something close to a set, like a tuple.

I think tuples would be even better than a set as they’re better suited for generic & type-level programming. OTOH tuples can have duplicate values, but enums can’t have duplicate cases. So tuples aren’t the optimal model. Also, it would be nice to have set-like methods like union, intersection, diff, etc. on “EnumValueSets”. So “EnumValueSet” shouldn’t be a naked tuple? Or do tuples have, or will get set-like methods? Or maybe “EnumValueSet” should be a real thing in fact?

Scala enums should have an idiomatic Scala API.

Yes please!

When I am writing OO-style code in Scala and want something that works like an enum I tend to build case objects that all implement the same sealed trait, then pack them into an immutable lazy val Set. That pattern seems to give me a lot of Scala goodness: match/case, somewhere to hang interesting intrinsic behavior, and something I can iterate, map, filter, extend, and whatnot.

What I don’t like about it is I have to type it all in every time; it’s sugarless.
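The hand-written pattern described above might look roughly like this. It is a sketch only; the `hex` member, `all` and `describe` are invented here to stand in for the “interesting intrinsic behavior”.

```scala
sealed trait Color:
  def hex: String

object Color:
  case object Black extends Color { val hex = "#000000" }
  case object Red extends Color { val hex = "#ff0000" }

  // The complete set of cases, maintained by hand.
  lazy val all: Set[Color] = Set(Black, Red)

// Matching works as usual on the sealed hierarchy.
def describe(c: Color): String = c match
  case Color.Black => "black"
  case Color.Red   => "red"

// The Set gives the full collection API: map, filter, and so on.
val hexCodes: Set[String] = Color.all.map(_.hex)
```

The boilerplate cost is visible: every new case object must also be added to `all` manually.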

Or maybe “EnumValueSet” should be a real thing in fact?

I like the idea of having something that specific. It can extend whatever Set- and ordered-ness deemed best.

Could “EnumValueSet[The Enum’s Trait]” be so specific that it is always the complete Set of the enum?

Could we have access to the enum for type-level programming? (I want to use it for types for units of measure.)
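On the type-level question: some information is already reachable through `scala.deriving.Mirror`, which exposes the cases of an enum as tuple types. A hedged sketch follows; the helper name `caseNames` is made up, and this only shows reading the case labels, not a full units-of-measure design.

```scala
import scala.compiletime.constValueTuple
import scala.deriving.Mirror

enum Color { case Black, Red }

// Reads the case names from the compiler-provided Mirror at compile time.
inline def caseNames[E](using m: Mirror.SumOf[E]): List[String] =
  constValueTuple[m.MirroredElemLabels].productIterator.map(_.toString).toList

val names: List[String] = caseNames[Color]
```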

We do need something for Java enums. However, that seems like something that belongs off in a standard java compatibility library. Maybe a standard asJavaEnumWorkAlike method on EnumValueSet ?

I’m not sure “EnumValueSet” should be taken verbatim.

It’s for now just a placeholder for some concept that needs further definition.

API-wise I think I would prefer some compiler generated object member (which likely implements that “EnumValueSet” type) on an enum’s companion object. So the API would look more like $SomeEnumType.ValueSet (for the “EnumValueSet” value, and as it’s an object, $SomeEnumType.ValueSet.type for the type) instead of being a type parametric constructor. But maybe having an exposed stand-alone constructor has some benefits?

Despite that, the main question is still: What is this “EnumValueSet”?

It could be completely opaque (and maybe even “pure virtual” in the sense that it’s just a compile-time abstraction), and for example just offer some asTuple and asSet methods. (Which would then be kind of magic, I guess).

It could also be a dedicated new type of Set, with all the collection API. Maybe then with some features from tuples as well?

Maybe it could be also something else I didn’t think of, like something in between the two mentioned possibilities.

I’m not sure what’s most feasible from the technical standpoint anyway. Also the mapping to Java isn’t clear. Maybe the compiler could generate somehow two things at once: In case the Scala enum extends Java’s Enum a hidden “values()” method, and in any case for regular use in Scala that .ValueSet object member on the enum.


While proofreading this, I noticed: Maybe this “EnumValueSet” is actually not needed at all.

What if I could use some compiler generated $SomeEnumType.toSet or $SomeEnumType.toTuple methods to get directly the enum values as the specified data structure?

Such an API looks very Scala-ish, returns nice and idiomatic data structures specific to a use case, and would avoid creating a new data type at all.

I can’t come up with some example where one wants to look at an “EnumValueSet” as a Set[$SomeEnumType] and a (EnumValue1, EnumValue2, <etc>) tuple at the same time. So maybe having just the two mentioned methods would be sufficient.
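What such methods could look like can be approximated by hand today. In this sketch, `toSet` and `toTuple` are hypothetical members written manually in the companion; the compiler does not generate them.

```scala
enum Color:
  case Black, Red

object Color:
  // Hypothetical, hand-written members; not generated by the compiler.
  lazy val toSet: Set[Color] = values.toSet
  lazy val toTuple: (Black.type, Red.type) = (Black, Red)

// Set view: the rich collection API, including set operations.
val withoutRed: Set[Color] = Color.toSet - Color.Red

// Tuple view: each position keeps the precise singleton type of its case.
val firstCase: Color.Black.type = Color.toTuple._1
```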

But then you would have at least two methods to get the values of an enum… Each with a very specific use case in mind (rich collection API vs. accessing very precise type information). Even though it is likely simpler to implement, this looks more involved when it comes to teaching / learning. Kind of back to square one. (At least you would get proper data types, instead of a mutable Array…)

Still, I’m not sure about the value of a dedicated set-like data type which is able to carry exact type information about each of its elements. Is something like that useful outside of the “EnumValueSet” use case? As I said, I couldn’t come up with any examples for the use of such a data type. So maybe such a construct is overblown, and having two specific conversion methods for enums is pragmatic enough.

I think the main use case of values() is iterating over simple enum values in order of their ordinal number so an immutable ArraySeq[Color] would be OK to cover that use case using a “nice and normal” immutable collection from the stdlib.
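A small sketch of that use case, assuming the array is wrapped once into an `ArraySeq` (the names `ordered`, `ordinals` and `roundTrip` are illustrative):

```scala
import scala.collection.immutable.ArraySeq

enum Color:
  case Black, Red

// Wrap the array once; elements come in declaration (ordinal) order.
val ordered: ArraySeq[Color] = ArraySeq.unsafeWrapArray(Color.values)

// Iterate in ordinal order.
val ordinals: ArraySeq[Int] = ordered.map(_.ordinal)

// fromOrdinal maps each ordinal back to its case.
val roundTrip: ArraySeq[Color] = ordinals.map(Color.fromOrdinal)
```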

We want to minimize the part of the Scala library that needs to be there to support Scala, the language. An ArraySeq contains lots of operations and lots of superclasses. An Array’s definition is tiny by comparison. But an IArray would be better, since we definitely do not want to mutate the values. The problem is that when enums were designed, IArray did not exist yet.

Cool, are we going to switch varargs from Seq to Array too?

Mapping varargs to Seq is a precedent, but not necessarily a good one. In retrospect these should also have been IArrays. This would have simplified so many things, in particular with respect to Java interop and also for performance. Anyway, making a questionable choice once is not a good reason for making it again.

From a learner and teacher perspective I am pretty glad that varargs is a “normal” immutable.Seq that behaves, i.e. has structural equality, doesn’t require a class tag, has a nice toString, etc. If we changed varargs to an IArray, that would create a strange exception from a learner’s perspective IMHO.
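A quick sketch of the difference a learner sees (the function `collect` and the value names are made up for this illustration):

```scala
// Inside the method, a varargs parameter is an ordinary immutable Seq.
def collect(xs: Int*): Seq[Int] = xs

// Structural equality on the Seq: true.
val sameSeqs: Boolean = collect(1, 2, 3) == collect(1, 2, 3)

// Reference equality on arrays: false, despite equal elements.
val sameArrays: Boolean = Array(1, 2, 3) == Array(1, 2, 3)

// The Seq prints its elements; the array prints a JVM type tag and hash.
val seqShown: String = collect(1, 2, 3).toString
val arrayShown: String = Array(1, 2, 3).toString
```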

And I also think the enum values method would be less surprising to a learner if it was a “normal” immutable Seq. And I, as a teacher, currently need to spend time on this exception, which in turn spends students’ grit that could be better spent elsewhere.

And from a performance perspective, I guess an IArray must still be a defensive copy as it could leak as an Array from Java and then it would be a bug if it exposed the underlying structure, if I understood this correctly?
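The concern can be sketched in plain Scala without involving Java: an IArray is only a compile-time view, so anyone holding the underlying Array can still mutate it. This example uses `IArray.unsafeFromArray` to share the backing array; `leakDemo` is a made-up name.

```scala
def leakDemo(): String =
  val backing: Array[String] = Array("Black", "Red")
  // No copy is made: the IArray is the same object viewed as immutable.
  val frozen: IArray[String] = IArray.unsafeFromArray(backing)
  // Whoever still holds the mutable reference (for example Java code) can write to it.
  backing(0) = "Green"
  frozen(0)
```

Here `leakDemo()` returns the mutated element, which is the kind of leak a defensive copy protects against.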

You could argue that a user could always memoize it like this:

scala> enum Color:
     |   case Red, Black
     | object Color:
     |   val toSeq = values.toSeq
     | 
// defined class Color
// defined object Color
                                                                                
scala> Color.toSeq
val res0: Seq[Color] = ArraySeq(Red, Black)

but that is boilerplate and doubles the number of lines needed for simple enums.

Agreed, although from a different point of view, you can see it as a teachable moment: (in my best teacher impression / voice) “All languages have flaws. All languages have weird edge cases, exceptions, irregularities and gotchas like this! Watch out for them!” :laughing:

Similar to how the .split method returns an annoying Array (then .toString won’t let us see it), or how .sliding returns an Iterator that can only be used once… I actually use this deliberately as a debugging exercise.
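Both gotchas fit in a few lines. This is a sketch for illustration; note that reusing an iterator after it has been consumed is not something to rely on, though in practice the second pass comes back empty.

```scala
// split returns an Array, whose toString shows a JVM type tag and hash.
val parts: Array[String] = "a,b,c".split(",")
val opaqueText: String = parts.toString

// mkString (or converting to a Seq) makes the contents visible.
val readable: String = parts.mkString("Array(", ", ", ")")

// sliding returns an Iterator, which is consumed by the first traversal.
val windows: Iterator[List[Int]] = List(1, 2, 3, 4).sliding(2)
val firstPass: List[List[Int]] = windows.toList
val secondPass: List[List[Int]] = windows.toList
```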

I think it’s nice to fearlessly show the flaws in all languages (and briefly explain the reasons, whether historical, technical, or just bad design) so that students have a more flexible mentality; they will be less likely to fall into language / paradigm camps and wars, or worship some and hate others. But maybe that’s a more long-term goal… :smiley:

I think that the idea to “minimize the language” is a very good guiding principle.

But as with all good ideas, one should not overdo it.

Things should be simple. But not simpler…

The language needs at its lowest level for sure some proper collection type(s)!

Dumbing everything down to Java level is counterproductive in my opinion. What value does Scala have then, if at its core it’s just Java with some syntactic sugar?

The idea to just ignore flaws and gotchas (and even add some on top here and there from time to time) also doesn’t look right imho. That’s the way of PHP…

All the mentioned things should get fixed. A little bit more perfectionism, please. :smile:

I still think that things should be modeled correctly first and foremost. Because correct and coherent models make things easy to understand.

If all you have are exceptions on top of exceptions, and implementation details leaking into user-space, that’s very hard to teach, understand, and memorize. Scala has imho already way too many such things. (A few were mentioned by @spamegg1). So the goal should be to reduce the number of such things over time. Not to double down on such flaws.

BTW: Are tuples and sets also not part of “core Scala”? But if they are, why not change the mentioned things to that? (I would not insist, but if Seq is too “heavyweight” let’s see whether other alternatives besides the IArray fit better).

To go a little bit on a tangent: I like the idea to modularize the std. lib.

Rust has for example a two part std. lib with a core module, and the rest that sits on top.

I like the idea to define a core module for the Scala std. lib, and module(s) on top of that.

That would allow making much smaller binaries if one does something a la “static linking”. The current Scala std. lib is “gigantic”, but you need to pull it in completely every time. This bloats standalone applications quite substantially. Would be cool to solve this issue. (Java managed to modularize their std. lib… Just saying :wink:).