Case Class toString new behavior proposal (with implementation)

joshlemer · July 15, 2018, 8:18pm

Implementation: https://github.com/scala/scala/pull/6936

I am proposing to change the toString behaviour of case classes so that field names are shown alongside field values. For example:

case class A()
A().toString // previously: "A()", now: "A()"

case class B(i: Int)
B(1).toString // previously: "B(1)", now: "B(i=1)"

case class C(a: Int, b: Int, c: Int, d: Int, e: Int, f: Int, g: Int)
C(1,1,1,1,1,1,1).toString
// previously: "C(1,1,1,1,1,1,1)"
// now: "C(a=1, b=1, c=1, d=1, e=1, f=1, g=1)"

For prior art, look at Kotlin’s data classes:

data class A(val a: Int, val b: String)

fun main(args : Array<String>) {
    println(A(1, "hello")) // A(a=1, b=hello)
}

And Rust’s structs:

#[derive(Debug)]
struct A {
    a: i32, b: i32, c: i32
}

fn main() {
    println!("{:?}", A { a: 1, b: 2, c: 3});
    // A { a: 1, b: 2, c: 3 }
}

sjrd · July 15, 2018, 8:27pm

Repeating my comment from the PR at https://github.com/scala/scala/pull/6936#issuecomment-405114583

Sorry, but I have to voice a negative vote on this one. This is going to break so much code, if only existing tests, that the convenience is never going to be worth it.

If only we had done that while we could, like in 2.9.x or something. But now it is too late.

oscar · July 15, 2018, 9:18pm

We could imagine a

trait Debug[A] {
  def debug(a: A): String
}

object Debug {
  def debug[A: Debug](a: A): String = implicitly[Debug[A]].debug(a)
}

and you could probably use a macro to generate instances for case classes that have the behavior you want.

You could also imagine adding this to the standard library and have an instance for case classes automatically generated along with all standard java and scala types.
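As a rough illustration of the idea above, here is a minimal sketch with a hand-written instance standing in for what a macro might generate. The `Point` case class and the `Int` instance are assumptions made up for this example; nothing here is an existing standard-library API.

```scala
// Type class from the post above.
trait Debug[A] {
  def debug(a: A): String
}

object Debug {
  def debug[A: Debug](a: A): String = implicitly[Debug[A]].debug(a)

  // Assumed instance for a standard type; a real library would ship many more.
  implicit val intDebug: Debug[Int] = new Debug[Int] {
    def debug(a: Int): String = a.toString
  }
}

// Hypothetical case class, used only for illustration.
case class Point(x: Int, y: Int)

object Point {
  // Hand-written stand-in for a macro-generated instance.
  implicit val pointDebug: Debug[Point] = new Debug[Point] {
    def debug(p: Point): String =
      s"Point(x=${Debug.debug(p.x)}, y=${Debug.debug(p.y)})"
  }
}
```

With these definitions, `Debug.debug(Point(1, 2))` yields `Point(x=1, y=2)`, while `Point(1, 2).toString` keeps returning `Point(1,2)`, so existing code that depends on toString is unaffected.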

nafg · July 15, 2018, 10:48pm

Why was 2.9 better?

Anyway what about some way to opt in? A compiler flag, mixin, or annotation?

nafg · July 16, 2018, 12:13am

It could be Show, and you could import debug instances

NthPortal · July 16, 2018, 3:05am

(copying a comment left on the PR)

@‌metasim:

@sjrd Is the argument against (“is going to break so much code”) from the standpoint that people out there parse the string output of case classes? Or is there some other consideration?

sjrd · July 16, 2018, 5:41am

My main concern is about a lot of test suites that test the result of toString of their own classes, and happen to use case classes as elements. All those tests will break.

(2.9.x was not better; Scala was just not as widely used or deployed back then.)

sjrd · July 16, 2018, 5:45am

compiler flags affecting language semantics are frowned upon now;
a mixin leaks in the public API, and is not at all the right tool for this stuff;
an annotation is the least evil, but still, it’s probably going to be used more and more often, up to the point where we’ll rarely write a case class without that annotation. At that point, I hope we could have a more general mechanism.

nafg · July 16, 2018, 6:28am

One could argue that the format of the toString method isn’t language semantics, no?

What do you have in mind with “more general mechanism”?

swachter · July 16, 2018, 7:06am

I doubt that there are many tests that rely on the string representation of case classes. Case classes are constructed easily and come with equals. Therefore I guess that most of the time tests compare case class instances directly.

Krever · July 16, 2018, 7:39am

This is extremely valuable and would save thousands of man hours when experimenting, reading logs or debugging… I don’t think the argument about breaking parsers is valid, and even if it is, let’s hide this feature behind a compiler flag.

Also, IMHO, for this to have any value it would have to be the default (or at least there should be a possibility to make it a default for a project). If I have to write any code to get this better behaviour the whole change is pointless as I can already override toString.
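For reference, the manual route mentioned above looks roughly like the sketch below. `Account` is a hypothetical class invented for this example; the point is only that the format string has to be written and kept in sync by hand for every case class.

```scala
// Hypothetical case class, used only for illustration.
case class Account(id: Int, owner: String) {
  // Explicit override: the compiler then does not generate its own toString.
  // Field names must be updated by hand whenever the parameters change.
  override def toString: String = s"Account(id=$id, owner=$owner)"
}
```

Here `Account(1, "ann").toString` gives `Account(id=1, owner=ann)`.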

nafg · July 16, 2018, 8:00am

Keep in mind that even a compiler flag wouldn’t help for compiled libraries on the classpath.

sjrd · July 16, 2018, 8:41am

Your code compiles with and without the flag, but behaves differently at run time depending on that flag → definitely language semantics.

Something where I could pick and choose case class features. In particular, I would love to have “data classes” that are amenable to binary compatible evolution (see my ScalaSphere 2018 talk, or what sbt-contraband does in an out-of-band fashion). Since data classes would be a new thing, we would have the opportunity to make their toString better than that of case classes. It would also likely be more important for such data classes, as they tend to grow more parameters than case classes meant only for pattern matching (because, you know, case classes should only be used if you actually intend to case on them in pattern matching).

olafurpg · July 16, 2018, 9:04am

This is something I have wanted for a long time, thanks for opening this discussion @joshlemer. I share @sjrd’s concern that changing .toString is too big of a breaking change and I also agree that compiler flags should not affect functional behavior of programs.

Here is an alternative proposal: add a productFields: Iterator[String] method to Product. This would enable you to implement a method

def toProductString: String =
  productFields
    .zip(productIterator)
    .map { case (field, value) => s"$field=$value" }
    .mkString("(", ", ", ")")

This method could be an extension method or mixed in via a trait. I think it’s important to have a generic solution that enables custom formats. In some cases you might, for example, prefer multiline strings:

case class User(name: String, age: Int)
User("Susan", 42).toMultilineString
// user {
//   name: "Susan"
//   age: 42
// }

The method would need a default implementation so that custom subtypes of Product continue to compile unchanged. I think it makes sense to enforce the contract that productIterator and productFields must have the same length, so the default implementation could return empty string field names.
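A minimal sketch of how this could fit together, assuming a separate trait in place of a real change to Product. `ProductWithFields` and `Legacy` are hypothetical names, the field names are written by hand where the proposal would have the compiler generate them, and the class name is prepended via `productPrefix`, which the snippet above leaves out.

```scala
// Hypothetical trait standing in for the proposed addition to Product.
trait ProductWithFields extends Product {
  // Default: one empty name per element, so productFields and
  // productIterator always have the same length.
  def productFields: Iterator[String] = Iterator.fill(productArity)("")

  def toProductString: String =
    productFields
      .zip(productIterator)
      .map { case (field, value) => s"$field=$value" }
      .mkString(productPrefix + "(", ", ", ")")

  // One possible multiline format; the exact layout is an assumption.
  def toMultilineString: String =
    productFields
      .zip(productIterator)
      .map { case (field, value) => s"  $field: $value" }
      .mkString(productPrefix + " {\n", "\n", "\n}")
}

case class User(name: String, age: Int) extends ProductWithFields {
  // Written by hand here; the proposal would have the compiler emit this.
  override def productFields: Iterator[String] = Iterator("name", "age")
}

// A subtype that does not override productFields still compiles.
case class Legacy(value: Int) extends ProductWithFields
```

`User("Susan", 42).toProductString` gives `User(name=Susan, age=42)`, and `Legacy(1).toProductString` gives `Legacy(=1)`, which shows what the empty-name default would look like in practice.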

Krever · July 16, 2018, 9:20am

productFields sounds awesome and I wanted this multiple times in the past.

Regarding changing language semantics: I believe the compiler flag should only affect case classes defined in the current project. If a library author compiles a library without the flag, it would keep the old toString.

julienrf · July 16, 2018, 9:41am

I think that relying on toString is weak anyway. Instead of improving it I would suggest moving to a better solution, like the Show typeclass in cats (whose instances can be generically derived, and configured to fit multiple formats). I’m not sympathetic to the idea of having productFields, which makes case classes even heavier and has a runtime overhead.

dwijnand · July 16, 2018, 9:42am

“probably”

Sure it behaves differently, but given its behaviour isn’t specified is it a breaking change? Should your test rely on the current implementation detail?

Reminds me of “every change breaks someone’s workflow” - xkcd: Workflow

I think this change is worth experimenting with, with the annotation opt-back, and perhaps even a Scalafix “annotate all my case classes with the opt-back annotation” rewrite, as a migration tool for large codebases with buggy tests.

sjrd · July 16, 2018, 9:46am

You have no idea how much stuff in Scala isn’t technically specified, yet a lot of code relies on it. Not just in tests. I know because I’ve had to reimplement all of that stuff in Scala.js so that existing code would work.

olafurpg · July 16, 2018, 10:43am

If you want Show[T] to include field names, then you have to pay the bytecode price for the field name string literals regardless of whether they’re generated by the compiler or a macro. I think productFields complements Show[T], since Show.fromToString could match against Product at runtime to extract the field names. Other pretty-printing libraries and tools (scala REPL, jupyter notebooks, scribe, pprint/ammonite, …) could do the same.

For implementation details, I think it makes sense to have case classes generate the following method

case class User(name: String, age: Int) extends Product {
  // Synthetic: generated by the compiler, not written by the user.
  def productField(n: Int): String = n match {
    case 0 => "name"
    case 1 => "age"
    case _ => throw new IndexOutOfBoundsException(n.toString)
  }
}

and the Product trait would have a concrete implementation of final def productFields: Iterator[String] that internally calls productField(n: Int): String
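To make the runtime-matching idea concrete, here is a hedged sketch. `FieldNames` is a hypothetical trait standing in for the proposed members on Product, and `Pretty.render` is an invented helper showing how a pretty-printer might use them; neither is an existing API.

```scala
// Hypothetical stand-in for Product with the proposed members.
trait FieldNames extends Product {
  def productField(n: Int): String

  // Concrete and final, built on top of productField(n).
  final def productFields: Iterator[String] =
    Iterator.tabulate(productArity)(productField)
}

case class User(name: String, age: Int) extends FieldNames {
  // Written by hand here; the proposal has the compiler synthesize it.
  def productField(n: Int): String = n match {
    case 0 => "name"
    case 1 => "age"
    case _ => throw new IndexOutOfBoundsException(n.toString)
  }
}

// A case class without field names, to show the fallback path.
case class Plain(x: Int)

object Pretty {
  // Matches at runtime, the way a generic pretty-printer could.
  def render(value: Any): String = value match {
    case p: FieldNames =>
      p.productFields
        .zip(p.productIterator)
        .map { case (field, v) => s"$field=${render(v)}" }
        .mkString(p.productPrefix + "(", ", ", ")")
    case s: String => "\"" + s + "\""
    case other     => other.toString
  }
}
```

`Pretty.render(User("Susan", 42))` gives `User(name="Susan", age=42)`, while values without field names, such as `Plain(1)`, fall back to their ordinary toString.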

joshlemer · July 16, 2018, 12:19pm

I think that for displaying a case class / data class, it’s important that it can be done without pulling in a third-party library or depending on shapeless transitively (that’s how this works, right?). For better or worse, the “privileged” way to display objects as a String in Scala is .toString, so we should either improve that or look at ways of bringing Show[T] into the core and automatically deriving instances.