Could ADTs extend Product with Serializable in Dotty?

tarsa · May 11, 2019, 2:42pm

It seems quite the opposite. Switching the default in Akka away from Java serialization is not yet delivered. AFAIR Java serialization is also not removed from Akka 2.6, so some people who migrate to Akka 2.6 might choose it to ease migration. Therefore it’s hard to conclude that Java serialization is vestigial in the Scala world.

japgolly · May 12, 2019, 1:30am

Is it the right trade-off for a language feature is the question I’m asking.

What is gained by Scala itself automatically extending Serializable to types? It seems to me the only gain is that users relying on Java serialisation don’t need to type extends Serializable themselves.

What is the cost of Scala itself automatically extending Serializable to types?

It can cause type inference problems, especially in invariant contexts – I’ve experienced this myself and seen this cause confusion, especially for juniors.
Classes that users might not actually want to be serialisable become serialisable without their control/consent/knowledge. Akka users might want more control to declare which objects are for transport vs not.
It’s implicit “magic” hidden from users and learners that doesn’t help Scala’s reputation as a language criticised for too much implicit magic (even though it’s not literally an implicit in the keyword sense). IIRC it’s the source of a few Scala puzzlers.
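A minimal sketch of the invariant-context problem from the first point, assuming a hypothetical Color ADT and a hypothetical describe function (neither is from this thread). The explicit ascription is what keeps it compiling on Scala 2:

```scala
// Hypothetical ADT, used only for illustration.
sealed trait Color
case object Red extends Color
case object Green extends Color

object InferenceDemo {
  // Set is invariant in its element type.
  def describe(colors: Set[Color]): String =
    colors.map(_.toString).toList.sorted.mkString(", ")

  // Without the ascription, Scala 2 compilers typically infer
  // Set[Product with Serializable with Color] for this value, and
  // describe(colors) is then rejected because Set is invariant.
  val colors: Set[Color] = Set(Red, Green)

  def main(args: Array[String]): Unit =
    println(describe(colors)) // prints "Green, Red"
}
```

With a covariant collection such as List the un-annotated version still compiles, which is part of why the problem tends to surprise people only later.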

So is it the right trade-off for a language feature? For me, it doesn’t seem so. I don’t want Akka users to go anywhere, I still want them to be able to do their thing, and I think manually adding extends Serializable is a very reasonable request that would allow us to simplify Scala and avoid the costs of this feature, especially the type inference problems, which I’ve seen confuse and frustrate people.

Secondly, does anyone have any reliable stats or survey results that could help us understand roughly what percentage of Scala users are using Akka or Spark? That might help us weigh the arguments with less bias.

mdedetrich · May 12, 2019, 9:53am

Honestly, Serializable should be removed and instead we should be using the Encoder/Decoder pattern (i.e. what Circe uses), where you use a typeclass-like system to optionally define how to serialize a certain type.

The Scala compiler can provide an automated way of doing this for case classes; however, it makes zero sense to make everything extend Serializable.
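A rough sketch of the opt-in Encoder idea, assuming a hypothetical Encoder typeclass that produces a String (Circe's real Encoder produces Json and has a different API). The User type and toWire helper are also made up for illustration:

```scala
// Hypothetical typeclass, not Circe's actual API.
trait Encoder[A] {
  def encode(a: A): String
}

object Encoder {
  def apply[A](implicit e: Encoder[A]): Encoder[A] = e

  implicit val intEncoder: Encoder[Int] = new Encoder[Int] {
    def encode(a: Int): String = a.toString
  }
  // No escaping of special characters; enough for a sketch only.
  implicit val stringEncoder: Encoder[String] = new Encoder[String] {
    def encode(a: String): String = "\"" + a + "\""
  }
}

final case class User(name: String, age: Int)

object User {
  // Opt-in: only types with an instance in scope can be encoded.
  implicit val userEncoder: Encoder[User] = new Encoder[User] {
    def encode(u: User): String =
      "{\"name\":" + Encoder[String].encode(u.name) +
        ",\"age\":" + Encoder[Int].encode(u.age) + "}"
  }
}

object EncoderDemo {
  def toWire[A](a: A)(implicit e: Encoder[A]): String = e.encode(a)

  def main(args: Array[String]): Unit =
    println(toWire(User("Ada", 36))) // prints {"name":"Ada","age":36}
}
```

Calling toWire on a type with no Encoder instance is a compile-time error, which is the property Serializable cannot offer.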

tarsa · May 12, 2019, 10:17am

Even if the majority agrees on what to do with Serializable, there’s still Product, which AFAIR is used by ScalaTest to produce output that can e.g. be copy-pasted into Scala code.

Maybe instead of removing Product and Serializable, add a construct that adds them automatically? I propose case trait X - it will mean you can subclass X only by case class, case object or another case trait. case trait X would expand to trait X extends Product with Serializable.
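For comparison, here is roughly what the proposed expansion looks like when written by hand today. Shape, Circle and Origin are hypothetical names, and the "only case classes may subclass" restriction cannot be expressed in current Scala, so it is not shown:

```scala
// Hand-written equivalent of the proposed `case trait Shape`.
trait Shape extends Product with Serializable
final case class Circle(radius: Double) extends Shape
case object Origin extends Shape

object CaseTraitDemo {
  // Product and Serializable are already part of Shape, so the
  // inferred element type is plain Shape.
  val shapes = List(Circle(1.0), Origin)

  def main(args: Array[String]): Unit =
    shapes.foreach(s => println(s.productPrefix + " has arity " + s.productArity))
}
```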

case trait X would add to the often mentioned regularity of the language. If the case modifier can be attached to a class and an object, then why couldn’t it be attached to a trait? Together with trait parameters, the Scala compiler could auto-generate a valid unapply method in the companion object of a case trait. That would be a viable alternative to subclassing case classes, which is already prohibited in Scala 2.12.

tarsa · May 12, 2019, 10:22am

Scala in 2018 - The State of Developer Ecosystem by JetBrains - section: Which frameworks / libraries do you regularly use? (%)

Akka: 47 %
Spark: 42 %

They are the two most used libraries in the Scala world.

Also, there’s a prevalent opinion that Apache Spark is the cause of a big influx of Scala programmers. The Scala authors already did Apache Spark developers a favor by not changing a feature of Scala that is often used in the Spark world: Proposal to deprecate and remove symbol literals

odersky · May 12, 2019, 10:45am

Serializable is indeed a train wreck, which is even acknowledged by Oracle. That said, it is still ubiquitous, unfortunately. I believe the best way to treat Serializable is as a runtime thing. Don’t try to make sense of it as a type, it will be impossible since it’s not a typeclass. Instead, just slap it on basically everything and work out things at runtime as best as you can. It’s a bad state of affairs, and the best option is to just acknowledge that.

Product is different. It’s actually at the core of typeclass derivation. We need Product to get sane alternatives to Java serialization.

On the other hand, I agree that we want to see neither in inferred types, ideally.

tarsa · May 12, 2019, 12:20pm

Why? AFAIK serialization libraries use either reflection or macros. I would say Product is rather used for pretty-printing or runtime analysis. However, I used Product once to implement a very simple CSV serializer - it’s easy to convert productIterator to a raw String containing a valid CSV row, provided that all fields are directly representable as a CSV column. But that’s not a typical use case - it lacked a deserializer and support for nested data types (which are super common in general).
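The CSV idea can be sketched in a few lines. Row and toCsvRow are hypothetical names, and the sketch carries the same limitations: no escaping, no nesting, no deserializer.

```scala
final case class Row(id: Int, name: String, score: Double)

object CsvDemo {
  // Assumes every field's toString is already a valid CSV cell
  // (no commas, quotes or newlines) and that nothing is nested.
  def toCsvRow(p: Product): String = p.productIterator.mkString(",")

  def main(args: Array[String]): Unit =
    println(toCsvRow(Row(1, "ada", 9.5))) // prints 1,ada,9.5
}
```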

OTOH implementing Product by hand would make it somewhat useless.

oscar · May 12, 2019, 5:21pm

I have only seen people discussing data serialization here, but in fact the harder problem is closure/code serialization. It’s not possible to write an encoder for Function1 or for Monoid. But Spark, Scalding, Flink etc. do serialization of logic using Serializable.
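To illustrate, a small sketch that pushes a function value through plain Java serialization. It assumes Scala 2.12 or later, where lambdas are compiled to serializable objects; javaSerialize and increment are hypothetical names, and only the write side is shown:

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}

object ClosureDemo {
  // Writes any object with plain Java serialization and returns the bytes.
  def javaSerialize(obj: AnyRef): Array[Byte] = {
    val buffer = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buffer)
    try out.writeObject(obj) finally out.close()
    buffer.toByteArray
  }

  // There is no field-by-field encoder here: the runtime serializes a
  // reference to the compiled code plus any captured values.
  val increment: Int => Int = x => x + 1

  def main(args: Array[String]): Unit =
    println(javaSerialize(increment).nonEmpty)
}
```

Reading the bytes back on another node additionally requires the same compiled classes on its classpath, which is the part frameworks like Spark take care of.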

This maybe doesn’t change the discussion about case classes extending Serializable, but I wanted to bring it up as we are lamenting Serializable.

I’d love spores to be part of scala 3, but even that only answers Function serialization, it doesn’t address cases where you want to send a typeclass to another node to execute.

Sciss · May 12, 2019, 5:41pm

Ahm, no. I use Product in many cases, and I suspect others do, too.

nafg · May 12, 2019, 6:07pm

Isn’t the broader problem that the compiler shouldn’t infer A & B if A <: B?

nrinaudo · May 12, 2019, 6:58pm

Isn’t it more that both A and B subtype C, D and F, and so their lub is C & D & F instead of C?

RichType · May 12, 2019, 6:58pm

Currently case classes do not automatically inherit from Product2[A1, A2], Product3[A1, A2, A3] etc., which means manually extending them and overriding the _1, _2, etc. members. That’s not an insignificant amount of boilerplate for a large percentage of my case classes.

nafg · May 12, 2019, 9:43pm

True but then it’s not really incorrect.

Imagine we got rid of Serializable and renamed Product to CaseClass. So you’d get e.g. Color & CaseClass. That doesn’t seem very wrong to me.

Although, I’m not sure it’s better than sticking to e.g. Red | Green.

nrinaudo · May 13, 2019, 7:27am

Oh the inferred type is absolutely correct, that was never my point. It’s just extremely confusing to newcomers and not very useful.

Whether this is enough to warrant “fixing” it is obviously up for debate, but my anecdotal evidence is the amount of Scala code that declares ADTs with a root type extending Product with Serializable to avoid the weirdness.

RichType · May 13, 2019, 3:36pm

Funnily enough, I just ran into this problem of the compiler inferring

List[TraitX with Serializable with Product]

I would definitely like Serializable and Product removed from case classes. What if you want some of your leaf classes to be case classes and some not to be case classes? However, I have long thought it would be good if traits and classes could be marked as non-inferable and require an explicit type ascription. That could then be applied to other types like AnyVal.

morgen-peschke · May 13, 2019, 4:04pm

I disagree with this WRT Serializable. Using Product as a LUB is reasonable, as all subclasses extend it.

The biggest Serializable headaches I’ve had were when working with Spark, and they all had the same root cause: Serializable isn’t an interface which can be verified at compile time, it’s a promise made by the programmer, and automatically inferring this guarantees nothing.

It’s entirely possible, even easy, to create an ADT where all of the values are known at compile time (no generics), the compiler “knows” the classes extend Serializable, and the ADT cannot be serialized.
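A minimal sketch of such an ADT, with hypothetical names throughout (Handle, Job, Local, Noop, canSerialize). Everything type checks as Serializable, yet one of the two cases fails at runtime:

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical resource type that does not implement Serializable.
class Handle

sealed trait Job
final case class Local(handle: Handle) extends Job
case object Noop extends Job

object PromiseDemo {
  // Returns true if Java serialization accepts the value at runtime.
  def canSerialize(value: AnyRef): Boolean = {
    val out = new ObjectOutputStream(new ByteArrayOutputStream())
    try { out.writeObject(value); true }
    catch { case _: NotSerializableException => false }
    finally out.close()
  }

  def main(args: Array[String]): Unit = {
    // The compiler accepts this: every case class extends Serializable.
    val job: Job with Serializable = Local(new Handle)
    println(canSerialize(job))  // prints false: the Handle field is rejected
    println(canSerialize(Noop)) // prints true
  }
}
```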

Backwards compatibility concerns may prevent removing this inference, but I don’t see how this could be considered “correct” behavior.

curoli · May 13, 2019, 4:05pm

How about having each case class extend a trait CaseClassBase, which extends Product with Serializable?

morgen-peschke · May 13, 2019, 4:09pm

There’s an open issue tracking this: 1799.
The TL;DR is:

Everyone pretty much agreed it’d be a good thing.
Implementing it turned out more difficult than expected.
At the moment there aren’t sufficient contributors to get it working.

curoli · May 13, 2019, 4:21pm

Why would Serializable be incorrect? Are you saying case classes do not extend Serializable, or are you just saying that Serializable is not useful?

odersky · May 13, 2019, 4:35pm

One possibility would be to use a special form of extension. For lack of a better idea, let’s use the protected keyword for now. E.g.

class C extends A, protected Serializable, protected Product { ... }
class D extends A, protected Serializable, protected Product { ... }

We could then specify that upper bounds don’t take protected superclasses into account. So we’d infer type A for x in

val x = if (???) C() else D()

instead of A & Serializable & Product. But normal subtyping would not depend on protectedness. E.g. the following would work OK:

def f(p: Product) = ...
f(C())

The extension syntax is admittedly clunky, but that should not be a big issue since most of the problematic cases are auto-generated anyway from case classes and enum cases.

~~

A separate question is whether protected inheritance should mean anything in addition to “omit from upper bounds”. One possible refinement could be to require that a protected superclass only adds protected members to its subclass. I.e. the following would be OK

trait A { def a: Int }
trait B { def a: Int = ???; protected def b: Int = ??? }
class C extends A, protected B

Here, the members of C are a and b, and the latter is protected. But if B was defined like this

trait B { def a: Int = ???; def b: Int = ??? }

then the definition of C would give a compile time error, since the public b member comes from a protected superclass.

The advantage of this rule is that it gives us a way to state and check that a trait such as IndexedSeqOptimized is only inherited for performance.

Unlike in C++'s private inheritance, I do not propose to retroactively change the visibility of members of inherited classes. See also the discussion in Allow traits to be transparently or "invisibly" mixed in.