It seems quite the opposite. Switching Akka’s default away from Java serialization has not shipped yet. AFAIR Java serialization is also not removed from Akka 2.6, so some people migrating to Akka 2.6 might choose it to ease migration. Therefore it’s hard to conclude that Java serialization is vestigial in the Scala world.
The question I’m asking is whether this is the right trade-off for a language feature.
What is gained by Scala itself automatically extending Serializable to types? It seems to me that the only thing users relying on Java serialisation gain is not having to type extends Serializable themselves.
What is the cost of Scala itself automatically extending Serializable to types?
- It can cause type inference problems, especially in invariant contexts – I’ve experienced this myself and seen this cause confusion, especially for juniors.
- Classes that users might not actually want to be serialisable are made serialisable without their control/consent/knowledge. Akka users might want more control to declare which objects are meant for transport and which are not.
- It’s implicit “magic” hidden from users and learners, which doesn’t help the reputation of a language already criticised for too much implicit magic (even though it’s not literally an implicit in the keyword sense). IIRC it’s the source of a few Scala puzzlers.
So is it the right trade-off for a language feature? For me, it doesn’t seem so. I don’t want Akka users to go anywhere, I still want them to be able to do their thing, and I think manually adding extends Serializable is a very reasonable request that would allow us to simplify Scala and avoid the costs of this feature, especially the type inference issues which I’ve seen confuse and frustrate people.
Secondly, does anyone have any reliable stats or survey results that could help us understand roughly what percentage of Scala users are using Akka or Spark? That might help us weigh the arguments with less bias.
Honestly, Serializable should be removed and instead we should be using the Encoder/Decoder pattern (i.e. what Circe uses), where you use a typeclass-like system to optionally define how to serialize a certain type. The Scala compiler can provide an automated way of doing this for case classes; it makes zero sense, however, to make everything extend Serializable.
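As a rough sketch of that style (the Encoder trait and names below are illustrative, not Circe’s actual API):

```scala
// A minimal typeclass-style encoder: serialization support is opt-in,
// instead of every type silently extending java.io.Serializable.
trait Encoder[A] {
  def encode(a: A): String
}

object Encoder {
  def apply[A](implicit e: Encoder[A]): Encoder[A] = e

  implicit val intEncoder: Encoder[Int]       = _.toString
  implicit val stringEncoder: Encoder[String] = s => "\"" + s + "\""
}

final case class User(id: Int, name: String)

object User {
  // Written by hand here; the compiler (or a library) could derive this for case classes.
  implicit val userEncoder: Encoder[User] =
    u => s"""{"id": ${Encoder[Int].encode(u.id)}, "name": ${Encoder[String].encode(u.name)}}"""
}

// Encoder[User].encode(User(1, "Ada")) == """{"id": 1, "name": "Ada"}"""
```

A type without an Encoder instance simply cannot be encoded, and that is checked at compile time rather than discovered at runtime.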
Even if the majority agree on what to do with Serializable, there’s still Product, which AFAIR is used by ScalaTest to produce output that can e.g. be copy-pasted into Scala code.
Maybe instead of removing Product and Serializable, add a construct that adds them automatically? I propose case trait X - it would mean you can subclass X only with a case class, case object or another case trait. case trait X would expand to trait X extends Product with Serializable.
case trait X would add to the often-mentioned regularity of the language. If the case modifier can be attached to class and object, then why couldn’t it be attached to a trait? Together with trait parameters, the Scala compiler could auto-generate a valid unapply method in the companion object of a case trait. That would be a viable alternative to subclassing case classes, which is already prohibited in Scala 2.12.
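To make the proposal concrete, here is a sketch of what the hypothetical case trait syntax could desugar to in today’s Scala (Json and its cases are made-up names):

```scala
// Hypothetical source:
//   case trait Json
// would expand to roughly:
trait Json extends Product with Serializable

// ...and could only be subclassed by case classes, case objects or other case traits:
final case class JsonNumber(value: Double) extends Json
case object JsonNull extends Json
```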
Scala in 2018 - The State of Developer Ecosystem by JetBrains - section: Which frameworks / libraries do you regularly use? (%)
- Akka: 47 %
- Spark: 42 %
They are the two most used libraries in the Scala world.
Also, there’s a prevalent opinion that Apache Spark is the cause of a big influx of Scala programmers. The Scala authors already did Apache Spark developers a favor by not changing a Scala feature that is often used in the Spark world: Proposal to deprecate and remove symbol literals
Serializable is indeed a train wreck, which is even acknowledged by Oracle. That said, it is still ubiquitous, unfortunately. I believe the best way to treat Serializable is as a runtime thing. Don’t try to make sense of it as a type; that will be impossible, since it’s not a typeclass. Instead, just slap it on basically everything and work things out at runtime as best as you can. It’s a bad state of affairs, and the best option is to just acknowledge that.
Product is different. It’s actually at the core of typeclass derivation. We need Product to get sane alternatives to Java serialization.
On the other hand, I agree that we want to see neither in inferred types, ideally.
Why? AFAIK serialization libraries use either reflection or macros. I would say Product is rather used for pretty-printing or runtime analysis. However, I did once use Product to implement a very simple CSV serializer - it’s easy to convert productIterator to a raw String containing a valid CSV row, provided that all fields are directly representable as CSV columns. But that’s not a typical use case - it lacked a deserializer and support for nested data types (which are super common in general). OTOH implementing Product by hand would make it somewhat useless.
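For illustration, a minimal sketch of that kind of Product-based CSV row writer (the names are made up, not from the original code):

```scala
// Render any case class (i.e. any Product) as a single CSV row.
// Assumes every field's toString is directly usable as a CSV column:
// no escaping, no nested data types, and no deserializer - the limitations noted above.
def toCsvRow(p: Product): String =
  p.productIterator.map(_.toString).mkString(",")

final case class Transaction(id: Int, amount: Double, currency: String)

// toCsvRow(Transaction(1, 9.99, "EUR")) == "1,9.99,EUR"
```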
I have only seen people discussing data serialization here, but in fact the harder problem is closure/code serialization. It’s not possible to write an encoder for Function1 or for Monoid. But Spark, Scalding, Flink etc. do serialization of logic using Serializable.
This maybe doesn’t change the discussion about case classes extending Serializable, but I wanted to bring it up as we are lamenting Serializable.
I’d love spores to be part of Scala 3, but even that only answers Function serialization; it doesn’t address cases where you want to send a typeclass to another node to execute.
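As a rough illustration of the mechanism such frameworks lean on: plain JDK serialization of a function value. This is a sketch under the assumption that the function literal compiles to a serializable lambda (which Scala function literals generally do), not a description of how any particular framework ships code:

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

// Round-trip a closure through JDK serialization - shipping logic, not just data.
val increment: Int => Int = x => x + 1

val bos = new ByteArrayOutputStream()
val out = new ObjectOutputStream(bos)
out.writeObject(increment)
out.close()

val in = new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray))
val revived = in.readObject().asInstanceOf[Int => Int]

assert(revived(41) == 42)
```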
Ahm, no. I use Product in many cases, and I suspect others do, too.
Isn’t the broader problem that the compiler shouldn’t infer A & B if A <: B?
Isn’t it more that both A and B subtype C, D and F, and so their lub is C & D & F instead of C?
Currently case classes do not automatically inherit from Product2[A1, A2], Product3[A1, A2, A3] etc., which means manually extending them and overriding the _1, _2, etc. members. That’s not an insignificant amount of boilerplate for a large percentage of my case classes.
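For example, a sketch of the boilerplate being described (Point is an illustrative name):

```scala
// Today, making a two-field case class usable as a Product2 has to be done by hand:
final case class Point(x: Int, y: Int) extends Product2[Int, Int] {
  override def _1: Int = x
  override def _2: Int = y
}

// With automatic ProductN inheritance, the extends clause and the overrides would disappear.
def swap[A, B](p: Product2[A, B]): (B, A) = (p._2, p._1)
// swap(Point(1, 2)) == (2, 1)
```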
True, but then it’s not really incorrect.
Imagine we got rid of Serializable and renamed Product to CaseClass. So you’d get e.g. Color & CaseClass. That doesn’t seem very wrong to me.
Although, I’m not sure it’s better than sticking to e.g. Red | Green.
Oh the inferred type is absolutely correct, that was never my point. It’s just extremely confusing to newcomers and not very useful.
Whether this is enough to warrant “fixing” it is obviously up for debate, but my anecdotal evidence is the amount of Scala code that declares ADTs with a root type extending Product with Serializable to avoid the weirdness.
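For concreteness, a small sketch of both the confusing inference and that common workaround (the type names are illustrative; inferred types are shown as Scala 2 reports them):

```scala
// Without the workaround, the compiler widens to the full least upper bound:
sealed trait Shape
case object Circle extends Shape
case object Square extends Shape

val s = if (util.Random.nextBoolean()) Circle else Square
// inferred: Shape with Product with Serializable

// The workaround: bake Product with Serializable into the root type,
// so the least upper bound collapses back to just the root.
sealed trait Color extends Product with Serializable
case object Red extends Color
case object Green extends Color

val c = if (util.Random.nextBoolean()) Red else Green
// inferred: Color
```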
Funnily enough, I just ran into this problem of the compiler inferring List[TraitX with Serializable with Product].
I would definitely like Serializable and Product removed from case classes. What if you want some of your leaf classes to be case classes and some not to be? However, I have long thought it would be good if traits and classes could be marked as non-inferable and require an explicit type ascription. That could then be applied to other types like AnyVal.
I disagree with this WRT Serializable. Using Product as a LUB is reasonable, as all subclasses extend it.
The biggest Serializable headaches I’ve had were when working with Spark, and they all had the same root cause: Serializable isn’t an interface which can be verified at compile time, it’s a promise made by the programmer, and automatically inferring it guarantees nothing.
It’s entirely possible, even easy, to create an ADT where all of the values are known at compile time (no generics), the compiler “knows” the classes extend Serializable, and the ADT still cannot be serialized.
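A sketch of that failure mode (the names are made up; the non-serializable field is the point):

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// An ADT with no generics: the compiler "knows" every case extends Serializable...
sealed trait Event extends Product with Serializable
final case class Connected(socket: java.net.Socket) extends Event // Socket is not serializable
final case class Disconnected(reason: String) extends Event

// ...yet actually serializing a value blows up at runtime.
val out = new ObjectOutputStream(new ByteArrayOutputStream())
try out.writeObject(Connected(new java.net.Socket()))
catch { case e: NotSerializableException => println(s"runtime failure: $e") }
```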
Backwards compatibility concerns may prevent removing this inference, but I don’t see how this could be considered “correct” behavior.
How about having each case class extend a trait CaseClassBase, which extends Product with Serializable?
There’s an open issue tracking this: 1799.
The TL;DR is:
- Everyone pretty much agreed it’d be a good thing.
- Implementing it turned out more difficult than expected.
- At the moment there aren’t sufficient contributors to get it working.
Why would Serializable be incorrect? Are you saying case classes do not extend Serializable, or are you just saying that Serializable is not useful?
One possibility would be to use a special form of extension. For lack of a better idea, let’s use the protected keyword for now. E.g.

```scala
class C extends A, protected Serializable, protected Product { ... }
class D extends A, protected Serializable, protected Product { ... }
```
We could then specify that upper bounds don’t take protected superclasses into account. So we’d infer type A for x in

```scala
val x = if (???) C() else D()
```

instead of A & Serializable & Product. But normal subtyping would not depend on protectedness. E.g. the following would work OK:

```scala
def f(p: Product) = ...
f(C())
```
The extension syntax is admittedly clunky, but that should not be a big issue since most of the problematic cases are auto-generated anyway from case classes and enum cases.
---
A separate question is whether protected inheritance should mean anything in addition to “omit from upper bounds”. One possible refinement could be to require that a protected superclass only adds protected members to its subclass. I.e. the following would be OK:

```scala
trait A { def a: Int }
trait B { def a: Int = ???; protected def b: Int = ??? }
class C extends A, protected B
```
Here, the members of C are a and b, and the latter is protected. But if B was defined like this:

```scala
trait B { def a: Int = ???; def b: Int = ??? }
```
then the definition of C would give a compile-time error, since the public b member comes from a protected superclass.
The advantage of this rule is that it gives us a way to state and check that a trait such as IndexedSeqOptimized is only inherited for performance.
Unlike in C++'s private inheritance, I do not propose to retroactively change the visibility of members of inherited classes. See also the discussion in Allow traits to be transparently or "invisibly" mixed in.