Can we make adding a parameter with a default value binary compatible?

This post is inspired by recent work we’ve been doing to enforce binary compatibility in the com-lihaoyi ecosystem, as well as the earlier discussion about adding an alternative to case classes that is binary compatible:

I’m personally not a fan of that approach. Binary compatibility is an important concern, but it’s an implementation limitation, not a semantic/language concern. Furthermore, that solution doesn’t apply to defs, which suffer the same problem. This post proposes an alternative.

Why not withFoo?

Going all-in on the .withFoo approach is very Java-esque. The entire reason I want to write Scala is that I can write syntax that matches exactly what I mean:

case class Person(first: String, last: String, country: String)

Person(first = "Haoyi", last = "Li", country = "Singapore")

Rather than the Java-style syntax that’s full of boilerplate and patterns to work around language weaknesses:

Person()
  .withFirst("Haoyi")
  .withLast("Li")
  .withCountry("Singapore")

In other languages like Python or SQL, adding a field (or column) with a default value is largely a backwards-compatible operation. Can we do the same for Scala?

Principles

  1. Given that binary compatibility is an implementation concern (it simply doesn’t exist in a program compiled all at once from source), we should not change the Scala language or type system to accommodate it. Some kind of annotation would be ideal, given @odersky’s stated principle that annotations are for things that do not affect typechecking.

  2. We shouldn’t have to contort our Scala source code to satisfy binary compatibility concerns. That rules out the .withFoo automation in the earlier proposal, and also rules out the very tedious way we manually perform these operations today. I want to be able to write

case class Person(first: String, last: String)

And later evolve it to

case class Person(
  first: String,
  last: String,
  country: String = "unknown"
)

or

case class Person(
  first: String,
  last: String,
  country: String = "unknown",
  number: Option[String] = None
)

Without breakage. After all, such a change is (almost) source compatible and would be backwards compatible in other languages; I would like Scala to be up to that standard.

  3. The same solution should apply to both case classes and plain defs. Both currently allow parameters, allow parameters with defaults, and cause bincompat breakage when a new parameter with a default is added. To a developer, these concepts are the same: something that takes arguments, possibly with defaults. Binary compatibility should be managed the same way for both.

  4. We want to handle the case where someone adds a parameter with a default value to the right side of a parameter list. This is the case that is already (almost) source compatible, and is backwards compatible in other languages like Python or SQL. We don’t need to handle more complex cases like changing parameter types or re-ordering parameters, which are universally backwards-incompatible across the programming landscape.

Proposal Sketch

We use a @telescopingDefaults annotation (name is arbitrary) to automate the generation of “telescoping” methods and constructors.

Defs

To begin with, let’s consider a simpler scenario: defs that we want to evolve with additional parameters, starting from:

def makePerson(first: String, last: String) = ???

To

@telescopingDefaults
def makePerson(
  first: String,
  last: String,
  country: String = "unknown") = ???

To

@telescopingDefaults
def makePerson(
  first: String,
  last: String,
  country: String = "unknown",
  number: Option[String] = None) = ???

The @telescopingDefaults annotation would generate the following forwarders:

def makePerson(
  first: String,
  last: String,
  country: String = "unknown",
  number: Option[String] = None) = ???

// forwarder matching the previous three-parameter signature,
// filling in the defaults for the parameters added since
@synthetic
def makePerson(first: String, last: String, country: String) =
  makePerson(first, last, country, None)

// forwarder matching the original two-parameter signature
@synthetic
def makePerson(first: String, last: String) =
  makePerson(first, last, "unknown", None)

Thus, any bytecode compiled against earlier versions of def makePerson with fewer parameters can continue to call those earlier signatures unchanged.

These definitions can be synthetic and hidden from the Scala compiler:

  1. Downstream code being compiled against the latest version of def makePerson always has the most recent signature available to compile and generate bytecode against.
  2. Downstream bytecode compiled earlier, against older versions of def makePerson with fewer parameters, can continue to call the @synthetic forwarders, which send the method call to the right place (see the sketch below).
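
Note that Scala already compiles default argument values into synthetic getter methods (makePerson$default$3 and so on), so rather than hard-coding the defaults as above, the generated forwarders could call those getters, letting later changes to a default expression propagate automatically. A rough sketch of that refinement, assuming the compiler’s existing $default$ naming scheme:

@synthetic
def makePerson(first: String, last: String, country: String) =
  makePerson(first, last, country, makePerson$default$4)

@synthetic
def makePerson(first: String, last: String) =
  makePerson(first, last, makePerson$default$3, makePerson$default$4)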

Case Classes

Case classes can be handled similarly, given an annotated case class:

@telescopingDefaults
case class Person(
  first: String,
  last: String,
  country: String = "unknown",
  number: Option[String] = None
)

We could generate the following additional code:

class Person(
  val first: String,
  val last: String,
  val country: String = "unknown",
  val number: Option[String] = None){

  // synthetic constructor matching the previous three-parameter signature
  @synthetic
  def this(first: String, last: String, country: String) =
    this(first, last, country, None)

  // synthetic constructor matching the original two-parameter signature
  @synthetic
  def this(first: String, last: String) =
    this(first, last, "unknown", None)

  def copy(
    first: String = this.first,
    last: String = this.last,
    country: String = this.country,
    number: Option[String] = this.number) = new Person(first, last, country, number)

  // synthetic copy overloads preserve the fields that old callers
  // could not have known about
  @synthetic
  def copy(first: String, last: String, country: String) =
    new Person(first, last, country, this.number)

  @synthetic
  def copy(first: String, last: String) =
    new Person(first, last, this.country, this.number)
}

object Person{
  def apply(
    first: String,
    last: String,
    country: String = "unknown",
    number: Option[String] = None) = new Person(first, last, country, number)

  // synthetic apply overloads matching the earlier signatures,
  // filling in the defaults for the parameters added since
  @synthetic
  def apply(first: String, last: String, country: String) =
    new Person(first, last, country, None)

  @synthetic
  def apply(first: String, last: String) =
    new Person(first, last, "unknown", None)

  def unapply(p: Person): Person = p
}

Unlike defs, which can only be called, there are three cases to consider for case classes:

  1. apply/new: These work similarly to the defs above: as new parameters with defaults are added, the old signatures are kept working via forwarders, so bytecode compiled against the old signatures continues to work.

  2. copy: This is similar to apply/new above, except that the copy method doesn’t care about the default values specified for the parameters: every parameter foo defaults to this.foo. However, we can still use the declared default values as an indicator of where backwards compatibility begins: here we generate synthetic copy overloads only down to two parameters, since no earlier version of Person had fewer than two.

  3. unapply: I think as of Scala 3 this works right out of the box: unapply no longer returns Option[TupleN[...]] as it did in Scala 2, and instead returns Person itself, with pattern matching relying on the ._1, ._2, etc. fields. Thus a p match { case Person(first, last) => ??? } call site compiled against case class Person(first: String, last: String) should continue to work even when Person has evolved into case class Person(first: String, last: String, country: String = "unknown", number: Option[String] = None), as sketched below.
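
To make that concrete, here is roughly what a two-field match compiled against the old Person desugars to under Scala 3’s name-based pattern matching (a sketch, not exact compiler output); it only ever touches ._1 and ._2, which is why it keeps working as fields are appended:

val p: Person = Person("Haoyi", "Li")

// p match { case Person(first, last) => ... } becomes, roughly:
val extracted = Person.unapply(p) // returns p itself in Scala 3
val first = extracted._1          // still present after evolution
val last  = extracted._2          // still present after evolution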


What do people think? Are there any obvious blockers that I’m missing? I haven’t actually implemented this yet, but I’m wondering if the fundamental idea is sound.

The implementation of generating forwarding proxies seems relatively straightforward. And if it can save me from constantly jumping through hoops, manually writing forwarders or avoiding case classes to preserve binary compatibility, it would definitely be worth investing in automation.

19 Likes

Yes, currently we need to add the overloaded constructors manually, while protobuf and JSON can both evolve easily.

And in Java I always use something like

@Data
@Builder
public class Message implements PubSubValue, RouteInfo {
    private static final long serialVersionUID = 1L;

    @Nullable
    private String id;

    @NonNull
    @Builder.Default
    private Integer subType = MsgSubType.normalMsg.getMsgSubType();

  //...
}

Here the default values are set, and Lombok’s builder is used to easily add properties to the class.

A link from Kotlin too

This seems reasonable, though because methods with overloaded names aren’t quite first-class w.r.t. other features, you couldn’t use an unhidden with-defaults approach at all if you already have, say,

def makePerson(first: String, last: String) = ???
def makePerson(entry: DbEntry, allowPartial: Boolean = false) = ???

Starting from that, you can’t even use the mechanism. Although you could if you allowed explicit unrolling of defaults (but it wouldn’t work for more than one default argument):

@unrolledDefault
def makePerson(entry: DbEntry, allowPartial: Boolean = false) = ???

// becomes
def makePerson(entry: DbEntry, allowPartial: Boolean) = ???
def makePerson(entry: DbEntry) = makePerson(entry, false)

But that can be done manually in the cases where it’s needed, trading off some source compatibility for binary compatibility.

It also won’t interact well, even as is, in the hopefully rare cases where an opaque type overload is already present:

opaque type Passport = String
def makePerson(first: String, last: String, passport: Passport) = ???
def makePerson(first: String, last: String) = ???

But this is a weird enough edge case that I don’t think it should derail the idea.

Yeah there definitely will be some limitations around overloading. What this proposal does is provide synthetic overloads for backwards compatibility, and if there are existing overloads then there’s always the possibility of a clash.

The proposal as written has this synthetic-forwarder-generation logic as opt-in via an annotation @telescopingDefaults, so “can’t use it on overloads” is a possible answer. We already lose a bunch of language features when overloads are present - e.g. result type inference, defining default values for each overload, etc. - so I feel like this can be an acceptable limitation that fits reasonably well into the other kinds of edge cases Scala already has.

Also, for many classes of overloads, replacing the overloaded methods with magnet-pattern implicit conversions (sketched below) is another possible workaround. That’s what I do throughout the com.lihaoyi ecosystem and it works well enough.
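
For reference, a minimal sketch of that magnet-pattern workaround, using hypothetical names built on the earlier makePerson/DbEntry example: the overloads collapse into a single method taking a “magnet” type, with implicit conversions providing each previously overloaded call shape, leaving the default parameter free to evolve:

// hypothetical stand-in for the DbEntry type from the earlier example
case class DbEntry(firstName: String, lastName: String)

class PersonSource(val first: String, val last: String)

object PersonSource {
  implicit def fromNames(names: (String, String)): PersonSource =
    new PersonSource(names._1, names._2)

  implicit def fromDbEntry(entry: DbEntry): PersonSource =
    new PersonSource(entry.firstName, entry.lastName)
}

// a single non-overloaded method, which @telescopingDefaults could then handle
def makePerson(source: PersonSource, allowPartial: Boolean = false) = ???

makePerson(("Haoyi", "Li"))
makePerson(DbEntry("Haoyi", "Li"), allowPartial = true)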

I wonder if this approach could break existing code in subtle ways.

Currently we can assume that two instances of a case class are the same if and only if their fields are the same.

With the proposed solution, this reasoning would no longer be sound, because there may be additional fields we are not aware of.
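
Concretely, using the evolved Person from above (the values are illustrative): code written against the two-field Person compares the fields it knows about and concludes the two values are interchangeable, while equals, which compares all fields, disagrees.

val a = Person("Haoyi", "Li", country = "Singapore")
val b = Person("Haoyi", "Li") // country defaults to "unknown"

a.first == b.first && a.last == b.last // true: looks "the same" to old code
a == b                                 // false: a.country != b.country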

The scheme looks quite reasonable to me at first glance. Definitely worth following up, maybe leading to a pre-SIP? I like unrolledDefaults as a name for the annotation.

3 Likes

There should be some examples with multiple parameter lists, e.g. one where the default refers to a parameter from an earlier list. It seems like it should work out; a sketch follows.
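
For instance (a sketch; the $default$ getter name and its signature follow the compiler’s existing scheme, where default getters for later parameter lists take the earlier lists as arguments):

@telescopingDefaults
def makePerson(first: String, last: String)(
    displayName: String = s"$first $last") = ???

// the synthetic forwarder for the old signature would pass the earlier
// parameters along to the default getter:
@synthetic
def makePerson(first: String, last: String) =
  makePerson(first, last)(makePerson$default$3(first, last))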

For the case classes, you also need to override fromProduct, right? It could look something like this

object Person:
  @synthetic
  def fromProduct(p: Product): Person = p.productArity match
    case 2 =>
      Person(
        p.productElement(0).asInstanceOf[String],
        p.productElement(1).asInstanceOf[String],
      )
    case 3 =>
      Person(
        p.productElement(0).asInstanceOf[String],
        p.productElement(1).asInstanceOf[String],
        p.productElement(2).asInstanceOf[String],
      )
    case 4 =>
      Person(
        p.productElement(0).asInstanceOf[String],
        p.productElement(1).asInstanceOf[String],
        p.productElement(2).asInstanceOf[String],
        p.productElement(3).asInstanceOf[Option[String]],
      )

Shout out to @armanbilge, who discovered this.

I don’t think so. AFAIK, fromProduct is only used by Mirrors (typeclass derivation). So, if you want to support typeclass derivation you have to implement a custom fromProduct, otherwise you don’t need it.

So, if you want to support typeclass derivation you have to implement a custom fromProduct, otherwise you don’t need it.

Are you saying that by not implementing a custom fromProduct, you’re essentially prohibiting typeclass derivation? Shouldn’t the recommended practice then be to implement it? What do you know about what the users of your case class want to use it for? The chance that they will want working derivation is high.
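
To see what breaks without it: Mirror-based derivation constructs instances through fromProduct, and code derived against the old two-field Person will hand it an arity-2 product. A small illustration under that assumption, where fromFields is a hypothetical helper standing in for a derived typeclass’s internals:

import scala.deriving.Mirror

def fromFields[A](fields: Product)(using m: Mirror.ProductOf[A]): A =
  m.fromProduct(fields)

// old derived code only knows two fields; without the arity-dispatching
// fromProduct shown earlier, this would fail once Person has four fields
val p = fromFields[Person](("Haoyi", "Li"))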

2 Likes

Yes. I was still seeing things along the lines of the linked Pre-SIP, which explicitly ignored the derivation use-case because it was impossible to implement correctly with that approach.

But I agree that if there is a solution that works with the approach based on parameters with default values (as described in your post), that’s good to have!

@lihaoyi do you plan to move this idea forward yourself? Otherwise, if it is not too urgent, it seems like this could be a nice and self-contained subject for a student project next semester at EPFL (September-February). What do you think?

3 Likes

I don’t have immediate plans to move this forward. Feel free to commandeer the idea and turn it into a real project!

1 Like

Would it be possible to make this behavior the default?

That would be the best programmer experience, wouldn’t it? Not having to worry about binary compatibility anymore :relaxed:

3 Likes

Just wanted to bump this again, with another concrete use case I encountered:

My last update to the com.lihaoyi::mainargs library involved adding a new default parameter to a bunch of user-facing methods. As a result, a ton of method signatures needed to be duplicated and “manually telescoped” or “manually unrolled” to maintain binary compatibility.

Some of these signatures were already duplicated twice for compatibility concerns in the past, and now are duplicated three times.

This is extremely tedious, but unavoidable: it is impossible for a library to simultaneously (a) make use of Scala language features like default argument values, (b) provide a smooth user experience free from NoSuchMethodErrors and the like, and (c) avoid this duplication. That puts library authors between a rock and a hard place, having to give up one of them:

  • Some libraries give up (a), limiting themselves to a subset of Scala that doesn’t use default arguments, and forcing additional builder-pattern boilerplate on all their users
  • Some libraries give up on (b), expecting that users will sometimes hit JVM LinkageErrors and be forced to recompile their unchanged source code against newer library versions
  • Some libraries give up on (c), and fill their implementation with boilerplate telescoping methods.

For mainargs I’ve chosen to give up (c), and decided to live with the boilerplate in exchange for providing an optimal user experience. But these telescoping/unrolled binary-compatibility shims (sketched below) are extremely mechanical, and it should be straightforward to automate their generation via a compiler plugin or annotation macro.
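
The shape being duplicated looks something like this (simplified, hypothetical signatures rather than mainargs’ actual ones):

trait Result // hypothetical result type for the sketch

// current signature, with the newly added default parameter
def parse(args: Seq[String], allowRepeats: Boolean = false): Result = ???

// manually written forwarder preserving the previous signature,
// so bytecode compiled against it keeps linking
def parse(args: Seq[String]): Result = parse(args, allowRepeats = false)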

I don’t have any concrete implementation to show yet; I just wanted to keep the conversation going as I encounter these cases in the wild.

10 Likes

Scala stewards, please take note of this :pray:
This language change is one of the few that would improve Scala users’ lives the most, especially library authors’.
The effort needed to ensure binary compatibility is painstaking. This would help a great deal.
/cc @Kordyjan

3 Likes

data-class lets you put an annotation (@since) on the first “new” parameter, which reduces the number of synthetics if the initial version already uses default arguments.

There are probably some tricky aspects here. If we have

trait T {
  @telescopingDefaults def f(x: Int, y: Int = 1) = 0
  @synthetic def f(x: Int) = f(x, 1)
}

We want t.f(42) to compile to the non-synthetic overload. We also want to hide the synthetic one in IDEs and Scaladocs. But the method should probably still be there, for example for the Mixin phase to generate forwarders. But it all looks doable to me.

I tried a few examples around overriding and couldn’t find issues, it seems the scheme would work well. Existing subclasses would override the new synthetic method, newly compiled subclasses would not be source-compatible, so they have to be rewritten to override the new signature (and get an overriding synthetic method).

I wonder if this transformation could be done completely at the bytecode level, e.g. via ASM, rather than via a compiler plugin. That would allow us to share the implementation between Scala 2 and Scala 3.

After all, generating bincompat forwarders seems like a purely JVM-level concern, and the only Scala-specific part is knowing how to call the Scala default-argument-value methods inside the forwarders. Apart from that, the Scala compiler should not need to know about these forwarders at all, and vice versa.
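
As a very rough sketch of what that post-hoc generation could look like with the ASM library (the owner class name, descriptors, and $default$ getter are illustrative assumptions about the compiled output, not verified against real scalac output):

import org.objectweb.asm._
import org.objectweb.asm.Opcodes._

// adds a static forwarder for the old two-argument makePerson signature,
// fetching the missing argument from the compiler's default getter
def addForwarder(cw: ClassWriter): Unit = {
  val mv = cw.visitMethod(
    ACC_PUBLIC | ACC_STATIC | ACC_SYNTHETIC,
    "makePerson",
    "(Ljava/lang/String;Ljava/lang/String;)LPerson;",
    null, null)
  mv.visitCode()
  mv.visitVarInsn(ALOAD, 0) // first
  mv.visitVarInsn(ALOAD, 1) // last
  // third argument: country's default value, via its $default$ getter
  mv.visitMethodInsn(INVOKESTATIC, "Example", "makePerson$default$3",
    "()Ljava/lang/String;", false)
  mv.visitMethodInsn(INVOKESTATIC, "Example", "makePerson",
    "(Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;)LPerson;", false)
  mv.visitInsn(ARETURN)
  mv.visitMaxs(3, 2)
  mv.visitEnd()
}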

1 Like

It’s also a Scala.js IR and Native IR concern. So you’d have to do the work 3 times.

1 Like

That’s true. I guess it might save effort doing it in the compiler then, though it would still need to be done twice, for Scala 2 and Scala 3.

1 Like