Could ADTs extend Product with Serializable in Dotty?

Scala 2

ADTs are commonly expressed in Scala 2 as:

sealed trait Foo
object Foo {
  case class Bar(i: Int) extends Foo
  case class Baz(b: Boolean) extends Foo
}

The problem with this approach is that we sometimes get wonky type inference:

scala> List(Foo.Bar(1), Foo.Baz(false))
res0: List[Product with Serializable with Foo] = List(Bar(1), Baz(false))

As a result, a common pattern is to write ADTs as:

sealed trait Foo extends Product with Serializable
object Foo {
  case class Bar(i: Int) extends Foo
  case class Baz(b: Boolean) extends Foo
}

The type inference issue then goes away:

scala> List(Foo.Bar(1), Foo.Baz(false))
res1: List[Foo] = List(Bar(1), Baz(false))

Dotty

In Dotty, we have the delightful enum syntax for ADTs, which makes a lot of the verbosity go away:

enum Foo {
  case Bar(i: Int)
  case Baz(b: Boolean)
}

And, due to smart data constructors, it’s almost as if the type inference issue had gone away:

scala> List(Foo.Bar(1), Foo.Baz(false))
val res2: List[Foo] = List(Bar(1), Baz(false))

Almost, though:

scala> List(new Foo.Bar(1), new Foo.Baz(false))
val res3: List[Foo & Product & Serializable] = List(Bar(1), Baz(false))

While a clear improvement, the problem is still here - better hidden, but still here. One might argue that no one should call constructors directly, but the point is that anybody could.

Possible fix

If my understanding of enum is correct, Dotty would need to flag all enums that have at least 2 class cases as extending Product with Serializable.

Simple and value cases are encoded as val and are of the enum’s root type - they do not extend Product or Serializable and are not part of the problem.

If the enum has a single class case, there is no scenario I can think of in which Product or Serializable will be inferred.

Maybe the rule "any enum that contains 2 class cases or more will extend Product with Serializable" is too strange, though. It could be generalised - all enums, or all enums that have class cases, or…

There is potential for breakage with such a rule: what if the enum defines simple cases that extend types that are not serializable, for example?

Three things that sound reasonable to me:

  1. Do nothing, and call it good. After all, List(new Foo.Bar(1), new Foo.Baz(false)) does really create a List[Foo & Product & Serializable], and while Product & Serializable is a cumbersome and possibly confusing type (maybe an alias can relieve some of that pain?) that does little for most people, it does correctly represent the types in the list.
  2. Hide the constructors. Make them private, and make the apply methods the only way to create new instances. These are typed to the parent type, since they are just different constructors of the same type, not different types.
  3. Push up the ancestors, and make any enum automatically extend the lub of its cases.
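Option 2 can be approximated by hand in the Scala 2 encoding. The following is only a sketch of the idea, not what the compiler generates for enums: the constructors are restricted to the companion, and the smart constructors bar and baz (hypothetical names, not part of any proposal in this thread) are typed to the parent.

```scala
sealed trait Foo

object Foo {
  // The constructors are only callable from inside Foo, so
  // `new Foo.Bar(1)` is rejected everywhere else.
  // Depending on the compiler version, the synthesized apply and copy
  // may still be public; this sketch does not try to hide them.
  final case class Bar private[Foo] (i: Int) extends Foo
  final case class Baz private[Foo] (b: Boolean) extends Foo

  // Hypothetical smart constructors, typed to the parent type.
  def bar(i: Int): Foo = new Bar(i)
  def baz(b: Boolean): Foo = new Baz(b)
}

object HiddenConstructorDemo {
  // Both elements are statically of type Foo, so there is nothing
  // for Product or Serializable to leak into.
  val foos = List(Foo.bar(1), Foo.baz(false))
}
```

The trade-off is the one raised later in the thread: with only these entry points, callers can no longer obtain a value statically typed as Foo.Bar.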

In fact, enums already follow that rule.

enum Foo {
  case Bar(i: Int)
  case Baz(b: Boolean)
}

Here, the type of Bar(1) is Foo, not Bar.

No, I mean really hide the actual constructor: make it private and uncallable with new.

Ah, that’d be a cleverer solution than what I suggested. The only downside I can think of is that it would then become impossible to create values of type Foo.Baz.

Let’s imagine Option were to be declared as an enum. It’d be impossible to create values of type Some[A], which can be useful - typically when writing custom extractors, where the compiler takes that to mean it’s an irrefutable extractor.
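To illustrate why a value of type Some[A] matters for extractors, here is a minimal sketch. Halves is a made-up extractor, not something from the standard library; the only point is that its unapply is declared to return Some rather than Option, which is what lets the compiler treat the pattern as irrefutable.

```scala
object Halves {
  // Declared to return Some[...] rather than Option[...]:
  // the match can never fail, and the type says so.
  def unapply(s: String): Some[(String, String)] =
    Some(s.splitAt(s.length / 2))
}

object ExtractorDemo {
  // A pattern val is accepted here without a refutability
  // complaint because the extractor always succeeds.
  val Halves(left, right) = "abcd"
}
```

If the precise type Some[A] could not be named or constructed, unapply would have to be typed as Option and the pattern would be considered refutable.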

You can still create a value of that type with new. I.e.

   Bar(1): Foo
   new Bar(1): Bar

It’s true that this could open up again the problem of unwanted types in lubs. So the proposal "Allow traits to be transparently or 'invisibly' mixed in" could also be relevant in that context.

This doesn’t seem like an issue to me. The constructors are there already to provide values of type Foo. And the new constructors are still available if you need to do something other than that.

This is fine for some things like Option, but in Rust where enums’ branches do not get their own type I find it awkward in cases like Result (i.e. Either in Scala) where you do actually want to talk for a while about one branch.

So I’d rather leave them accessible (and typed).

@Ichoran wrote: "So I’d rather leave them accessible (and typed)."

I agree with you, but it’d mean that if the wonky type inference is considered to be a problem, there aren’t that many solutions:

  • invisible traits (which I feel would be a major feature and out of scope here)
  • enum extending Product with Serializable, which might result in some breakage and would be a hack, fixing the symptom rather than the problem

Mind you, I’m all for fixing the symptom, I’m tired of extending Product and Serializable in all my Scala 2 ADTs…

I think we should

  • deprecate and remove Product
  • have the compiler not extend any unspecified base traits
  • have users extend Serializable manually if they need to. It’s such a small minority of users who need this functionality anyway

Product and its synthesis in case classes is extremely useful in serialization. What’s your statistical basis of “such a small minority of users”?

It’s the classic Java-centric Serializable trait that is questionable here, and I think it’s a good point. Over the years I’ve been using Scala, the community appears to have moved pretty sharply against that – it’s explicitly an anti-pattern for Akka, for example.

So while I’m less sure about Product, it does seem worth examining the automatic addition of Serializable. It feels to me like a decision we would probably not make today if we were inventing case classes afresh, and I wonder if it makes sense to remove it now…

My “statistical basis”? Gosh. Look, I’ll share the reasons I believe only a minority of Scala users care about serialisation (let alone on all of their case classes). Firstly, purely anecdotal: I’ve worked on lots of Scala projects and talked to lots of Scala users, and in my experience fewer than 5% have used Java serialisation with Scala. Secondly, one could take the top 50 OSS Scala projects and see how many of them would be affected if case classes didn’t automatically extend Product with Serializable; I’d be so surprised I think I’d fall into a coma for a few days if it turned out that the majority of those projects broke.

OK. Just saying: productPrefix and productIterator are extremely useful to have on case classes, whether used in serialization (and nowhere have I written that it is Java serialization) or otherwise.
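As a small illustration of that point, the sketch below renders any case class generically using only productPrefix and productIterator. User and render are made-up names for the example; real serialization libraries do considerably more than this.

```scala
final case class User(name: String, age: Int)

object ProductDemo {
  // Works for any Product, without knowing the concrete case class:
  // the prefix is the class name, the iterator yields the fields in order.
  def render(p: Product): String =
    p.productIterator.mkString(p.productPrefix + "(", ", ", ")")
}
```

For example, ProductDemo.render(User("Ada", 36)) yields "User(Ada, 36)".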

Right, but I don’t think that Product depends on Serializable in any way. They were both added long, long ago, and that probably made sense at the time. And I suspect that Product still makes sense – that’s much more generally useful.

But I’m dubious about Serializable: I suspect it’s adding relatively little value, it’s a mild hassle, and it slightly biases people towards using a tool (Java Serialization) that they probably should be avoiding. Note that @japgolly’s assertion above, about the “small minority of users”, was just about Serializable, and I think he’s probably correct about at least that part…

Anti-pattern or not, Java serialization was for many years the default in Akka, and in some legacy Scala microservices I’m developing at work there is a lot of Akka remoting, with messages sent directly over the wire to another Java process (i.e. to another microservice). We often run into serialization problems (i.e. something unserializable was accidentally sent to a remote actor). We haven’t investigated such problems deeply because the problem is relatively rare, the project was taken over from another team (so we don’t have full knowledge of the design decisions) and we’re rewriting it anyway (piece by piece).

Nowadays direct usage of Akka remoting is discouraged outside of Akka clusters, but years ago it wasn’t discouraged at all and IIRC designing for location transparency was heavily promoted, probably leading users to the wrong conclusion that we should use Akka remoting casually.

For various reasons our projects that use Akka remoting the most are stuck on Scala 2.10, which means Akka 2.3.x at best. I’m not sure whether that Akka version has an easy-to-use alternative to Java serialization.

Concluding:
Java serialization was IMO heavily used in the Scala world in the past because of Akka, and that made extending Serializable by default (for case classes and case objects) pretty sensible. Now the situation is different.

PS:
Scala developers might not realize that they were using Java serialization. The proper question to ask is whether they used Akka remoting in its default setting (as the default serialization is Java serialization).
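For readers who have never touched it directly, this is roughly what the implicit Serializable on case classes buys. The sketch uses only the JDK's object streams; Message and roundTrip are hypothetical names, and nothing here is Akka's actual code path.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

// No explicit `extends Serializable`: the compiler adds it for case classes.
final case class Message(id: Int, body: String)

object JavaSerializationDemo {
  // Writes the value to an in-memory buffer and reads it back.
  def roundTrip[A <: java.io.Serializable](value: A): A = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    try out.writeObject(value) finally out.close()

    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    try in.readObject().asInstanceOf[A] finally in.close()
  }
}
```

A case class holding a non-serializable field would still compile against roundTrip and only fail at runtime, which is the kind of accidental failure described above.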

Akka. Spark.

Just tiny projects nobody uses. /s

(Using more efficient serialization is recommended, but the fallback is Serializable and is often used, especially during prototyping and development.)

Akka is moving to a different default serialization.

Seriously, it looks to me like Java serialization is becoming steadily more vestigial, and it’s worth thinking seriously about whether it should still be baked into the language itself. I’m not saying forbid it – I’m saying that it may be time to stop treating it as The One True Way, which we’re essentially doing now…

What is going on with Heather’s Spores SIP, is there enough energy to get it baked into Dotty?

I believe at some point we definitely want to pick this up again. We already have our plates overfull for 3.0, but after that there should be room for it. Somebody would have to step forward and do it.