Another option is to provide an asMap[String, Any], so as to avoid the .zip(productIterator). Either way, if case classes provided something more or less like productFields, this discussion would probably be moot. People can write whatever mix-in they want to get toFancyString, etc.
@dwijnand I think I will have a go at implementing a @classicToString annotation or a ClassicToString trait which reverts to the old behavior, in the PR.
Keep in mind, it is already easy enough to implement the classic toString as an override:
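import scala.runtime.ScalaRunTime._toString
// Mix-in that restores the classic Name(a,b) rendering for any Product
trait ClassicToString { this: Product =>
  // `override` is required because toString is already defined on AnyRef
  override def toString = _toString(this)
}
case class A(i: Int, b: Int) extends ClassicToString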
It's all anecdotal guesswork, but I suspect you're wrong. It's not all that unusual to have tests that compare a bunch of printouts to an expected output. (IIUC, Dotty's own test suite does some of this.)
I'm sure it's less common than comparing values using equals, yes. But I'd still be kind of surprised if this didn't break hundreds, maybe thousands, of projects in a way that couldn't be automatically migrated using scalafix. That's a serious price...
Ultimately, I have to agree with the voices here that state changing the established behaviour of toString would be an entirely unacceptable breaking change.
We all want better string representations of case classes, but toString most certainly isn't the way to do it. Scala isn't Kotlin; we don't need to try and shoehorn such behaviour into a special-case one-size-fits-all solution like Kotlin does.
We have typeclasses.
With Dotty/Scala 3 the situation is going to be better still, with typeclasses and some degree of generic derivation baked into the heart of the language - this is the right place to target. We could easily have a range of typeclasses supporting existing behaviour, quoted strings, exposed parameter names, multi-line trees (like pprint), graphviz .dot syntax, etc, etc
There's no good reason to force just one solution on everyone, and even less reason to break the existing approach that people will be assuming in their tests.
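To make the typeclass idea concrete, here is a minimal sketch. The names Render, classicPoint and namedPoint are hypothetical, made up purely for illustration, and the instances are written by hand rather than derived; it only shows how several renderings of the same case class could coexist without touching toString.

```scala
// Hypothetical typeclass: Render and the instances below are illustrative names,
// not part of the standard library or of any proposal in this thread.
trait Render[A] {
  def render(a: A): String
}

final case class Point(x: Int, y: Int)

object Render {
  // Existing behaviour: delegate to the compiler-generated toString
  val classicPoint: Render[Point] = new Render[Point] {
    def render(p: Point): String = p.toString
  }

  // Exposed parameter names
  val namedPoint: Render[Point] = new Render[Point] {
    def render(p: Point): String = s"Point(x = ${p.x}, y = ${p.y})"
  }

  // The caller chooses a rendering by bringing one instance into implicit scope
  // or by passing it explicitly.
  def show[A](a: A)(implicit r: Render[A]): String = r.render(a)
}
```

A caller who wants field names would write Render.show(Point(1, 2))(Render.namedPoint), while code and tests that rely on Point(1, 2).toString keep working unchanged.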
Just in my own code I know of a bunch of places where I don't override toString because it does what I want, and there are probably a bunch more I've never thought about. Some of this involves unit tests against hard-coded strings, others don't.
We don't get to redo stuff like this any longer. If you want an easy way to get alternate behavior, it's worth thinking about, but changing long-standing conventions isn't.
(Also, I prefer the existing behavior regardless, because it's externally usable if you pick the name carefully; the debug version is pretty useless for everything except for debugging, and even then it's unhelpful unless you've forgotten what your case class is. But that's irrelevant; the point is that this is a major breaking change.)
(Also, the Rust comparison isn't fair, because struct initializers require the field name, while Scala case classes do not. So the stringification matches the usage.)
While I might not have your experience, I already know how true that is. People relying on the iteration order of Sets come to mind, because they unit tested with sets up to size 4. But I'm not sure if that's enough reason not to do it.
"Conventions"? It's just the current implementation, and I don't agree that it can no longer be redone.
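For anyone who hasn't run into the Set case, a small sketch of how that accidental dependency arises. It assumes the current standard library implementation, where immutable sets of up to four elements use dedicated small-set classes that happen to iterate in insertion order; that is an implementation detail, not a documented guarantee.

```scala
object SmallSetOrder {
  // With four elements or fewer, iteration currently follows insertion order,
  // so a test asserting on this list passes by accident.
  val small: List[Int] = Set(3, 1, 4, 2).toList

  // With a fifth element the set switches to a hash-based representation,
  // and the order is whatever the hashing produces.
  val large: List[Int] = Set(3, 1, 4, 2, 5).toList
}
```

A test written against small works today, and the same assumption silently breaks as soon as the data grows.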
If we add productFields to the case class instances, I could implement this as part of https://github.com/lihaoyi/PPrint without needing to change the built-in toString method.
It turns out the compiler already has commented-out code to generate a method productElementName (same as productFields) that would allow you to implement this functionality as a library. The feature was disabled out of concern for bytecode bloat. I took a stab at reviving the commented-out code in this commit here https://github.com/scala/scala/commit/8430f69f72d10e437563f3e14ed329bb246f317c and added a unit test to show how it works, then ran :javap -c to measure the bytecode size. I'm not experienced in reading javap -c output, so I would appreciate some help there.
If the bytecode overhead is acceptable, I think it's worth discussing whether to re-enable productElementName, since it makes it possible to pretty-print case classes in a far more readable way without breaking existing code that relies on toString.
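As a rough illustration of what a library could do with this, here is a sketch that assumes Scala 2.13 or later, where Product exposes productElementNames. The object name LabelledToString and the name=value layout are made up for the example; toString itself is left alone.

```scala
// Hypothetical helper; assumes Scala 2.13+, where Product has productElementNames.
object LabelledToString {
  // Pair each field name with its value and render Name(field=value, ...)
  def render(p: Product): String =
    p.productElementNames
      .zip(p.productIterator)
      .map { case (name, value) => s"$name=$value" }
      .mkString(p.productPrefix + "(", ", ", ")")
}

final case class User(name: String, age: Int)
```

LabelledToString.render(User("Susan", 42)) gives User(name=Susan, age=42), while User("Susan", 42).toString still gives the classic User(Susan,42).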
After more consideration, I am concerned productElementName doesn't go far enough for those who want to change toString, and it already goes too far for those who don't want this feature. More bloated bytecode for every case class definition would be a regression in some code-bases.
I believe a more interesting discussion is the general problem: how can we make case classes more flexible? Currently, all case classes must have the same encoding, but sometimes you prefer a custom combination of hashCode, equals, toString, copy, unapply. It's a tough problem with no obvious answer.
Intriguing. After playing around with javap a bit, I think it's about 56 + 10*fieldCount bytes.
Breakdown for each field:
- 3 bytes for the string info (the actual characters are already in the constant pool)
- 4 bytes for the tableswitch entry
- 2 bytes for the ldc to load the string
- 1 byte for areturn
Fixed cost: 56 (??) bytes for the productElementName method. Not sure about this one; ran out of time before I could run a linear regression.
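Spelled out as arithmetic, the estimate above looks like this. The constants are taken straight from the breakdown; the object and method names are just for illustration, and the fixed cost of 56 bytes is the unverified part.

```scala
object BytecodeEstimate {
  // Per-field cost from the breakdown: string info + tableswitch entry + ldc + areturn
  val perFieldBytes: Int = 3 + 4 + 2 + 1 // = 10

  // Fixed cost of the productElementName method itself (unverified guess from the post)
  val fixedBytes: Int = 56

  // Estimated extra bytecode for a case class with the given number of fields
  def estimate(fieldCount: Int): Int = fixedBytes + perFieldBytes * fieldCount
}
```

So a three-field case class would cost roughly BytecodeEstimate.estimate(3) = 86 extra bytes under this estimate.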
It makes sense that the strings are already in the constant pool; they're used for debug information, and you can already get access to them via the Paranamer library (that's what e.g. jackson-module-scala does).
In that case the additional bloat from making them available should be trivial, and we should just do it
Thank you for the analysis @adriaanm! The estimate would need to be validated on the actual produced jars, but I think if this estimate is correct then the change is absolutely worth it. If anyone is interested in picking this up, feel free to build on top of the changes in this PR here https://github.com/scala/scala/pull/6951
Is this true of linked Scala.js? I honestly have no idea, but it's a much bigger deal over there. Bloat in the JVM is kind of "meh"; bloat in JS is deadly critical, and it's not obvious to me that the strings are typically retained in fullOptJS. @sjrd?
From the discussion so far I understand that toString will not be changed but that an alternative representation may be introduced.
If that is the case, I humbly suggest actually quoting the strings in the output.
Here is an example to illustrate what I mean:
case class Point(x:Int,y:Int)
Point(1,1).toString
currently yields the string Point(1,1), which you can conveniently copy and paste back into code, especially when working from the REPL or writing tests. This doesn't change with the addition of field names.
Now consider the following:
case class Person(firstName:String,lastName:String)
Person("martin","odersky").toString
yields Person(martin,odersky). Notice how the quotes are lost? This means you can no longer copy/paste this back into code without some more editing. I wish it would output Person("martin", "odersky") instead.
I stopped counting the number of students who were surprised by that behaviour when teaching Lightbend's Fast Track to Scala/Scala Language for Professionals class.
I understand that it simply delegates to the toString call on java.lang.String, which does not include the quotes, but I still would like the behaviour changed (while still honoring null properly). It would also make empty strings stand out better. This would probably also break a lot of toString usage, but it may make more sense if a new string serialization mechanism is introduced in the language/standard library.
If no one calls productElementNames, then the Scala.js optimizer will remove that method, and the strings it contains with it. If someone does call that method, well then they're using the strings, so they'll be kept, but it can hardly be considered "bloat" when it's actually used.
@jeantil how would you treat the case Person("\"", "")?
@joshlemer The idea is for the string representation to be exactly the code typed to create the instance, so I would therefore expect the string to contain the escape. If I were to implement it, I would probably look at how it's done in https://commons.apache.org/proper/commons-lang/javadocs/api-2.6/org/apache/commons/lang/StringEscapeUtils.html#escapeJava(java.lang.String) and start from there.
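A minimal sketch of that idea, to show the shape of it. This is not what pprint or commons-lang actually do: QuotedRender is a hypothetical name, and the escaping only covers quotes, backslashes, newlines and tabs rather than everything escapeJava handles.

```scala
final case class Person(firstName: String, lastName: String)

// Hypothetical helper: renders values so that strings can be pasted back into code.
object QuotedRender {
  // Wrap a string in double quotes, escaping the characters that would break a literal.
  // Deliberately incomplete: only a handful of escapes are handled.
  def quote(s: String): String = {
    val body = s.flatMap {
      case '"'  => "\\\""
      case '\\' => "\\\\"
      case '\n' => "\\n"
      case '\t' => "\\t"
      case c    => c.toString
    }
    "\"" + body + "\""
  }

  def render(value: Any): String = value match {
    case null       => "null" // honour null instead of quoting it
    case s: String  => quote(s)
    // Any Product is rendered recursively; this also catches tuples, Option, etc.
    case p: Product => p.productIterator.map(render).mkString(p.productPrefix + "(", ", ", ")")
    case other      => other.toString
  }
}
```

With this, QuotedRender.render(Person("martin", "odersky")) gives Person("martin", "odersky"), the tricky case Person("\"", "") comes back as Person("\"", ""), and empty strings show up as "".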
I recommend you check out pprint, as it works pretty much as you're describing: http://www.lihaoyi.com/PPrint/
The PR https://github.com/scala/scala/pull/6951 adding productElementNames has been reviewed by Jason Zaugg from the compiler team with actionable feedback if someone wants to pick this up.
Thanks for the pointer. Since this thread was originally about changing the toString behaviour, I wish it would behave like pprint. I will look into integrating pprint for my projects, though that won't help with teaching Scala classes.