Can we make adding a parameter with a default value binary compatible?

This post is inspired by recent work we’ve been doing to enforce binary compatibility in the com-lihaoyi ecosystem, as well as the earlier discussion about adding some alternative to case classes which are binary compatible:

I’m personally not a fan of that approach. Binary compatibility is an important concern, but it’s an implementation limitation, not a semantic/language concern. Furthermore, that solution doesn’t apply to defs, which suffer the same problem. This post proposes an alternative.

Why not withFoo?

Going all-in on the .withFoo approach is very Java-esque. Basically, the entire reason I want to write Scala is that I can write syntax that matches exactly what I mean, like:

case class Person(first: String, last: String, country: String)

Person(first = "Haoyi", last = "Li", country = "Singapore")

Rather than the Java-style syntax that’s full of boilerplate and patterns to work around language weaknesses:

Person()
  .withFirst("Haoyi")
  .withLast("Li")
  .withCountry("Singapore")

In other languages like Python or SQL, adding a field (or column) with a default value is largely a backwards-compatible operation. Can we do the same for Scala?

Principles

  1. Given that binary compatibility is an implementation concern - it simply doesn’t exist in a program compiled all-at-once from source - we should not change the Scala Language or type system to accommodate it. Some kind of annotation would be ideal, given @odersky’s stated principle that annotations are for things that do not affect typechecking.

  2. We shouldn’t have to contort our Scala source code for binary compatibility concerns. That rules out the .withFoo automation in the earlier proposal, and also rules out the very tedious way we manually perform these operations today. I want to be able to say

case class Person(first: String, last: String)

And later evolve it to

case class Person(
  first: String,
  last: String,
  country: String = "unknown"
)

or

case class Person(
  first: String,
  last: String,
  country: String = "unknown",
  number: Option[String] = None
)

without breakage. After all, it’s (almost) source compatible; in other languages such a change would be backwards compatible, and I would like Scala to be up to that standard.

  3. The same solution should apply to both case classes and plain defs. Both of these currently allow parameters, allow parameters with defaults, and cause bincompat breakage when a new parameter with default is added. To a developer, these concepts are the same: something that takes arguments, possibly with defaults. Binary compatibility should be managed the same way for both.

  4. We want to handle the case where someone adds a parameter to the right-side of a parameter list, with a default value. This is the case that is already (almost) source compatible, and is backwards-compatible in other languages like Python or SQL. We don’t need to handle more complex cases like changing parameter types or re-ordering parameters, which are universally backwards-incompatible across the programming landscape.
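
To make the breakage concrete, here is a minimal sketch that uses Java reflection to list the JVM-level methods the compiler emits. The objects `V1` and `V2` are hypothetical stand-ins for two releases of the same library, and `SignatureCheck` is a helper invented for this illustration. It suggests why the change is source compatible but not binary compatible: after the parameter is added, only the three-parameter method exists in the bytecode, so a caller compiled against the two-parameter version has nothing to link against.

```scala
// Hypothetical stand-ins for two releases of the same library.
object V1 {
  def makePerson(first: String, last: String): String =
    s"$first $last"
}

object V2 {
  // Source-compatible evolution: existing call sites still compile unchanged,
  // because the compiler fills in the default at the call site.
  def makePerson(first: String, last: String, country: String = "unknown"): String =
    s"$first $last ($country)"
}

// Illustration-only helper, not part of any proposal.
object SignatureCheck {
  // Parameter counts of every JVM method named exactly `makePerson` on the object.
  // The default-value getter has a different name, so it is filtered out.
  def arities(target: AnyRef): Set[Int] =
    target.getClass.getMethods
      .filter(_.getName == "makePerson")
      .map(_.getParameterCount)
      .toSet
}
```

Here `SignatureCheck.arities(V1)` contains only 2 and `SignatureCheck.arities(V2)` contains only 3, even though `V2.makePerson("Haoyi", "Li")` still compiles from source.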

Proposal Sketch

We use a @telescopingDefaults annotation (name is arbitrary) to automate the generation of “telescoping” methods and constructors.

Defs

To begin with, let’s consider a simpler scenario: defs that we want to evolve with additional parameters. e.g. starting from:

def makePerson(first: String, last: String) = ???

To

@telescopingDefaults
def makePerson(
  first: String,
  last: String,
  country: String = "unknown") = ???

To

@telescopingDefaults
def makePerson(
  first: String,
  last: String,
  country: String = "unknown",
  number: Option[String] = None) = ???

The @telescopingDefaults annotation would generate the following forwarders:

def makePerson(
  first: String,
  last: String,
  country: String = "unknown",
  number: Option[String] = None) = ???

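// Each forwarder fills in the defaults of the parameters it lacks and
// calls the full four-parameter method.
@synthetic
def makePerson(first: String, last: String, country: String) = makePerson(first, last, country, None)

@synthetic
def makePerson(first: String, last: String) = makePerson(first, last, "unknown", None)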

Thus, any bytecode which was compiled against earlier versions of def makePerson with fewer parameters can continue to call those earlier signatures un-changed.

These definitions can be synthetic and hidden from the Scala compiler:

  1. Downstream bytecode being compiled against the latest version of def makePerson always has the most recent signature available to compile against and generate bytecode against.
  2. But downstream bytecode compiled earlier against older versions of def makePerson with fewer parameters will continue to be able to call the @synthetic forwarders, which will send the method call to the right place.
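
As a rough illustration of what the annotation would automate, the forwarders can be written by hand today. This is only a sketch: `PersonApi` and its `arities` helper are hypothetical names, the method bodies are placeholders, and the hand-written forwarders are ordinary visible overloads rather than hidden synthetic ones.

```scala
object PersonApi {
  // Latest signature. Only one overloaded alternative may declare defaults.
  def makePerson(
    first: String,
    last: String,
    country: String = "unknown",
    number: Option[String] = None): String =
    s"$first $last, $country, ${number.getOrElse("no number")}"

  // Forwarder keeping the three-parameter signature of the previous release.
  def makePerson(first: String, last: String, country: String): String =
    makePerson(first, last, country, None)

  // Forwarder keeping the original two-parameter signature.
  def makePerson(first: String, last: String): String =
    makePerson(first, last, "unknown", None)

  // Illustration-only helper: parameter counts of the emitted `makePerson` methods.
  def arities: Set[Int] =
    getClass.getMethods
      .filter(_.getName == "makePerson")
      .map(_.getParameterCount)
      .toSet
}
```

With all three signatures present in the bytecode, callers compiled against any of the earlier releases still find the method they were linked against, and each path produces the same result as spelling out the defaults explicitly.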

Case Classes

Case classes can be handled similarly, given an annotated case class:

@telescopingDefaults
case class Person(
  first: String,
  last: String,
  country: String = "unknown",
  number: Option[String] = None
)

We could generate the following additional code:

class Person(val first: String, val last: String, val country: String = "unknown", val number: Option[String] = None){
  // Forwarding constructors fill in the defaults of the parameters they lack.
  @synthetic
  def this(first: String, last: String, country: String) =
    this(first, last, country, None)

  @synthetic
  def this(first: String, last: String) =
    this(first, last, "unknown", None)

  def copy(
    first: String = this.first,
    last: String = this.last,
    country: String = this.country,
    number: Option[String] = this.number) = new Person(first, last, country, number)

  // copy forwarders keep the current values of the fields they do not mention.
  @synthetic
  def copy(first: String, last: String, country: String) =
    new Person(first, last, country, this.number)

  @synthetic
  def copy(first: String, last: String) =
    new Person(first, last, this.country, this.number)
}

object Person{
  def apply(
    first: String,
    last: String,
    country: String = "unknown",
    number: Option[String] = None) = new Person(first, last, country, number)

  @synthetic
  def apply(first: String, last: String, country: String) =
    new Person(first, last, country, None)

  @synthetic
  def apply(first: String, last: String) =
    new Person(first, last, "unknown", None)

  def unapply(p: Person): Person = p
}

Unlike defs, which can only be called, there are three cases to consider for case classes:

  1. apply/new: These work similarly to defs above: as new parameters with defaults are added, the old signature is kept working via forwarders, so bytecode compiled against the old signatures can continue to work.

  2. copy: This is similar to apply/new above, except the copy method doesn’t care about the default values specified for the parameters: all params foo default to this.foo. However, we can still use the default values as an indicator to when we need to start caring about backwards compatibility: e.g. here we generate synthetic copy overloads only down to 2 parameters, since we do not need to provide binary compatibility to earlier versions of Person

  3. unapply: I think as of Scala 3 this will work right out of the box: unapply no longer returns Option[TupleN[...]] as it did in Scala 2, and instead just returns Person, with pattern matching just relying on the ._1 ._2 etc. fields to work. Thus, a p match{ case Person(first, last) => ???} callsite compiled against case class Person(first: String, last: String) should continue to work even when Person has evolved into case class Person(first: String, last: String, country: String = "unknown", number: Option[String] = None)


What do people think? Are there any obvious blockers that I’m missing? I haven’t actually implemented this yet, but I’m wondering if the fundamental idea is sound.

The implementation of generating forwarding proxies seems relatively straightforward. And if it can save me from constantly jumping through hoops, manually writing forwarders or avoiding case classes to preserve binary compatibility, it would definitely be worth investing in automation.


Yes, currently we need to add the overloaded constructor manually, whereas protobuf and JSON can both evolve easily.

And in Java I always use something like:

@Data
@Builder
public class Message implements PubSubValue, RouteInfo {
    private static final long serialVersionUID = 1L;

    @Nullable
    private String id;

    @NonNull
    @Builder.Default
    private Integer subType = MsgSubType.normalMsg.getMsgSubType();

  //...
}

Here the default values are set, and Lombok’s builder is used to easily add properties to the class.

A link from kotlin too

This seems reasonable, though because methods with overloaded names aren’t quite first-class w.r.t. other features, you couldn’t use an unhidden with-defaults approach at all if you already have, say,

def makePerson(first: String, last: String) = ???
def makePerson(entry: DbEntry, allowPartial: Boolean = false) = ???

Starting from that, you can’t even use the mechanism. Although you could if you allowed explicit unrolling of defaults (but it wouldn’t work for more than one default argument):

@unrolledDefault
def makePerson(entry: DbEntry, allowPartial: Boolean = false) = ???

// becomes
def makePerson(entry: DbEntry, allowPartial: Boolean) = ???
def makePerson(entry: DbEntry) = makePerson(entry, false)

But that can be done manually in the cases where it’s needed, trading off some source compatibility for binary compatibility.

It also won’t interact well even as is in hopefully-rare cases where an opaque type overload was already present.

opaque type Passport = String
def makePerson(first: String, last: String, passport: Passport) = ???
def makePerson(first: String, last: String) = ???

But this is a weird enough edge case that I don’t think it should derail the idea.

Yeah there definitely will be some limitations around overloading. What this proposal does is provide synthetic overloads for backwards compatibility, and if there are existing overloads then there’s always the possibility of a clash.

The proposal as written has this synthetic-forwarder-generation logic as opt-in via an annotation @telescopingDefaults, so “can’t use it on overloads” is a possible answer. We already lose a bunch of language features when overloads are present - e.g. result type inference, defining default values for each overload, etc. - so I feel like this can be an acceptable limitation that fits reasonably well into the other kinds of edge cases Scala already has.

Also, for many classes of overloads, replacing the overloaded methods with Magnet pattern implicit conversions is another possible workaround. That’s what I do throughout the com.lihaoyi ecosystem and it works well enough.
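
For readers unfamiliar with the Magnet pattern, here is a minimal sketch of how it could replace the two makePerson overloads from the earlier example. All names here (`PersonSource`, `MagnetApi`, the fields of `DbEntry`) are assumptions made for illustration and are not taken from the com.lihaoyi libraries.

```scala
import scala.language.implicitConversions

// Hypothetical stand-in for the DbEntry type mentioned earlier in the thread.
final case class DbEntry(first: String, last: String)

// The "magnet": one parameter type that each former overload converts into.
final case class PersonSource(first: String, last: String)

object PersonSource {
  // Conversions live in the companion, so they are found without extra imports.
  implicit def fromNames(names: (String, String)): PersonSource =
    PersonSource(names._1, names._2)

  implicit def fromDbEntry(entry: DbEntry): PersonSource =
    PersonSource(entry.first, entry.last)
}

object MagnetApi {
  // A single non-overloaded method, so it stays free to grow defaulted parameters.
  def makePerson(source: PersonSource, country: String = "unknown"): String =
    s"${source.first} ${source.last} ($country)"
}
```

Call sites then look like `MagnetApi.makePerson(("Haoyi", "Li"))` or `MagnetApi.makePerson(DbEntry("Haoyi", "Li"), "Singapore")`. The trade-off is that the names have to be passed as a tuple rather than as two separate arguments.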

I wonder if this approach could break existing code in subtle ways.

Currently we can assume that two instances of a case class are the same if and only if their fields are the same.

With the proposed solution, this reasoning would no longer be sound, because there may be additional fields we are not aware of.
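
One way to read this concern, sketched under the assumption that "fields we are aware of" means the fields that existed when the calling code was written (`EvolvedPerson` and `EqualityCheck` are hypothetical names for this illustration):

```scala
// The evolved class: `country` was added after the calling code was written.
final case class EvolvedPerson(first: String, last: String, country: String = "unknown")

object EqualityCheck {
  // Compares only the fields that code written against the two-field version knows about.
  def sameKnownFields(a: EvolvedPerson, b: EvolvedPerson): Boolean =
    a.first == b.first && a.last == b.last

  val viaOldSignature = EvolvedPerson("Haoyi", "Li")
  val viaNewSignature = EvolvedPerson("Haoyi", "Li", "Singapore")
}
```

Here `EqualityCheck.sameKnownFields(viaOldSignature, viaNewSignature)` holds, yet the two instances are not equal under `==`, because case class equality also compares the field the older code never sees.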

The scheme looks at first glance quite reasonable to me. Definitely worth following up, maybe leading to a pre SIP? I like unrolledDefaults as a name for the annotation.


There should be some examples with multiple parameter lists, including one where the default refers to a parameter from an earlier list. It seems it should work out.

For the case classes, you also need to override fromProduct, right? It could look something like this:

object Person:
  @synthetic
  def fromProduct(p: Product): Person = p.productArity match
    case 2 =>
      Person(
        p.productElement(0).asInstanceOf[String],
        p.productElement(1).asInstanceOf[String],
      )
    case 3 =>
      Person(
        p.productElement(0).asInstanceOf[String],
        p.productElement(1).asInstanceOf[String],
        p.productElement(2).asInstanceOf[String],
      )
    case 4 =>
      Person(
        p.productElement(0).asInstanceOf[String],
        p.productElement(1).asInstanceOf[String],
        p.productElement(2).asInstanceOf[String],
        p.productElement(3).asInstanceOf[Option[String]],
      )

Shout out to @armanbilge who discovered this

I don’t think so. AFAIK, fromProduct is only used by Mirrors (typeclass derivation). So, if you want to support typeclass derivation you have to implement a custom fromProduct, otherwise you don’t need it.

So, if you want to support typeclass derivation you have to implement a custom fromProduct, otherwise you don’t need it.

Are you saying that by not implementing a custom fromProduct, you’re essentially prohibiting typeclass derivation? Shouldn’t then the recommended practice be to implement it? What do you know about what the users of your case class want to use it for? The chance that they will want working derivation is high.


Yes. I was still seeing things along the lines of the linked Pre-SIP, which explicitly ignored the derivation use-case because it was impossible to implement correctly with that approach.

But I agree that if there is a solution that works with the approach based on parameters with default values (as described in your post), that’s good to have!

@lihaoyi do you plan to move this idea forward yourself? Otherwise, if it is not too urgent, it seems like this could be a nice and self-contained subject for a student project next semester at EPFL (September-February). What do you think?


I don’t have immediate plans to move this forward. Feel free to commandeer the idea and turn it into a real project!


Would it be possible to make this behavior the default?

That would be the best programmer experience, wouldn’t it? Not having to worry about binary compatibility anymore.
