Proposal: Creator applications (going new-less)

You couldn’t have done anything better, indeed. And TASTy will indeed burn the new in the TASTy files, so that has the same consequences as the binary compatibility in your post.

Developers will think that turning a constructor into an apply is a compatible API change, whereas in fact it’s completely non-compatible from the TASTy and binary points of view.

3 Likes

Let’s follow the exact same logic without this change. I’m going to copy and paste the prior argument and modify it to show what API evolution looks like today.


I’m wondering about compatibility implications along some long-term library API evolution.

foo v1

class Animal(name: String)

At this point the user has only one way to create an animal:

val a = new Animal("Tommy")

foo v1.1

We realize that we want to provide our own apply to add some validation or business logic.

class Animal(name: String)
object Animal {
  def apply(name: String): Animal = {
    assert(name.nonEmpty && name.head.isUpper)
    val a = new Animal(name)
    DB.append(a)
    a
  }
}

This would create a library that is binary-compatible and source-compatible, yet none of the existing user code picks up the validation without source modification and recompilation.
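To make the v1.1 situation concrete, here is a minimal sketch. DB is not defined anywhere in the thread, so the in-memory DB object below is a hypothetical stand-in, and V11Demo is just a name for the runner.

```scala
import scala.collection.mutable.ListBuffer
import scala.util.Try

// Hypothetical stand-in for the DB used in the post: an in-memory list.
object DB {
  val rows: ListBuffer[Animal] = ListBuffer.empty
  def append(a: Animal): Unit = rows += a
}

// foo v1.1: public constructor plus a validating apply.
class Animal(val name: String)
object Animal {
  def apply(name: String): Animal = {
    assert(name.nonEmpty && name.head.isUpper)
    val a = new Animal(name)
    DB.append(a)
    a
  }
}

object V11Demo {
  def main(args: Array[String]): Unit = {
    new Animal("tommy")                     // v1-style call site: not validated, not recorded
    println(DB.rows.size)                   // 0
    Animal("Tommy")                         // goes through apply: validated and recorded
    println(DB.rows.size)                   // 1
    println(Try(Animal("tommy")).isFailure) // true: the assertion rejects the lowercase name
  }
}
```

Existing call sites written against v1 all look like the first line of main, which is why they keep skipping the new logic until someone edits them.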

foo v1.1.1

To enforce validation, maybe the library would then opt to force everyone to use the apply by making the constructor private.

class Animal private (name: String)
object Animal {
  def apply(name: String): Animal = {
    assert(name.nonEmpty && name.head.isUpper)
    val a = new Animal(name)
    DB.append(a)
    a
  }
}

This is slightly better, but downstream users must recompile and modify their code to use the apply method.

foo v1.1.2

To prevent this discrepancy, ultimately I’d have to treat apply(name: String) as off-limits for validation, and put the validation logic into the constructor code?

class Animal(name: String) {
  assert(name.nonEmpty && name.head.isUpper)
  DB.append(this)
}
object Animal {
  // for bincompat with v1.1
  def apply(name: String): Animal = new Animal(name)
}
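A quick sketch of why v1.1.2 closes the gap: once the checks live in the constructor body, both creation paths run them. As before, the in-memory DB is a hypothetical stand-in for whatever DB is in the post.

```scala
import scala.collection.mutable.ListBuffer
import scala.util.Try

// Hypothetical stand-in for the DB used in the post.
object DB {
  val rows: ListBuffer[Animal] = ListBuffer.empty
  def append(a: Animal): Unit = rows += a
}

// foo v1.1.2: validation and registration happen in the constructor.
class Animal(val name: String) {
  assert(name.nonEmpty && name.head.isUpper)
  DB.append(this)
}
object Animal {
  // Kept only for binary compatibility with v1.1; it just forwards.
  def apply(name: String): Animal = new Animal(name)
}

object V112Demo {
  def main(args: Array[String]): Unit = {
    println(Try(new Animal("tommy")).isFailure) // true
    println(Try(Animal("tommy")).isFailure)     // true
    new Animal("Tommy")
    Animal("Rex")
    println(DB.rows.map(_.name).toList)         // List(Tommy, Rex)
  }
}
```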

what could I have done?

I’m not really sure how I could have guarded against this. Ship every class with a private constructor and an apply as a prophylactic measure? Or just not worry too much about this? In general, parameter validation needs to be in the constructor or the constructor hidden anyway (with or without this change), and whatever side effect DB.append() is doing is a code smell I’d whack out in any code review I’m involved in.


I don’t find this API evolution argument very compelling. As you can see, the situation is only slightly different with or without this change:
If you have only a plain constructor, and add an apply method you have to be aware of what that means for your users. The real difference is that library authors will have to be aware of how this de-sugars. However, this is mostly only a risk when the apply method has side effects (including exceptions from validation), which is an area that is error prone for novices with or without this change.

Apply methods/constructors are probably ‘naked’ 90% of the time, in which case this change is a pure win. Maybe 9% of the time, there is some parameter validation, but this is best done either in the constructor, or in apply methods with private constructors, so that no invalid instances can be created.
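One way to read "apply methods with private constructors" is the sketch below. Returning Either instead of throwing is my own assumption here, not something proposed in the thread; the point is only that the private constructor leaves apply as the single way in.

```scala
// The constructor is private, so the companion's apply is the only entry point.
final class Animal private (val name: String)

object Animal {
  // Assumption for illustration: report invalid input as a Left instead of throwing.
  def apply(name: String): Either[String, Animal] =
    if (name.nonEmpty && name.head.isUpper) Right(new Animal(name))
    else Left(s"invalid animal name: '$name'")
}

object SmartConstructorDemo {
  def main(args: Array[String]): Unit = {
    println(Animal("Tommy").map(_.name)) // Right(Tommy)
    println(Animal("tommy").isLeft)      // true
  }
}
```

Outside the companion, new Animal("tommy") does not compile, so no unvalidated instance can exist.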

Lastly, some novice will inevitably put side effects in constructors or apply methods without understanding the consequences, but this proposal seems only mildly more dangerous in this case, and honestly it’s an anti-pattern that for some people needs to be a lesson learned the hard way. Experts might provide sane tools that do have effects upon construction, like Future, but someone who is at the level that they can write Future won’t be caught by surprise here. And those novices that will be caught by surprise shouldn’t be shoving side effects into constructors.

Linters, findbugs-like tools, IDEs or a future effects system can warn or error on these things. And I suspect the ability to warn on leaking uninitialized ‘this’ is very close (there was a lot of work recently to catch unsafe initialization), though not all uninitialized reference leaks are errors, there are valid reasons to do it.

This proposal does make it easier for binary-compatible behavior to differ from source-compatible behavior (e.g. dropping in an updated library jar without recompilation produces different behavior than recompiling with that jar). It’s not the only Scala feature with this quality, however.

1 Like

Yea. At this point I’m not arguing for / against this language change, but I want to understand the implication for separate compilation situations.

As I see it, def apply is a formalization of factory method pattern, which provides indirection between the creation API and the implementation details of how the interface contract may be fulfilled. In Scala 2.x, if the API started out with def apply, the underlying concrete class could potentially evolve without breaking binary compatibility (like Vector did recently).

I’m not sure if this is worth it, but if a companion object were code-generated together with def apply(...), then the usage code would contain Animal.apply(...) instead of new Animal(...), retaining the possibility of animal implementation evolution and/or caching.
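A hand-written sketch of the indirection being described: if callers only ever compile against Animal.apply, the library can later swap the concrete class or add caching behind it. The Impl class and the cache are hypothetical details for illustration, and the cache is not thread-safe.

```scala
import scala.collection.mutable

trait Animal { def name: String }

object Animal {
  // Hypothetical implementation class; callers never name it, so it can change freely.
  private final class Impl(val name: String) extends Animal

  // Hypothetical cache, for illustration only (not thread-safe).
  private val cache = mutable.Map.empty[String, Animal]

  def apply(name: String): Animal =
    cache.getOrElseUpdate(name, new Impl(name))
}

object FactoryDemo {
  def main(args: Array[String]): Unit = {
    println(Animal("Tommy") eq Animal("Tommy")) // true: the cached instance is reused
    println(Animal("Tommy") eq Animal("Rex"))   // false
  }
}
```

A call site compiled as new Animal(...) could never be redirected this way, which is the separate-compilation concern.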

2 Likes

This would also simplify the question of “does this class have a companion object”, which would be a nice win for regularity.

One downside of generating the apply method is the same as exists today for case classes: you can’t view the source equivalent to the generated code. This is more of a tooling issue than one specific to this proposal.

This approach was tried and rejected:
http://dotty.epfl.ch/docs/reference/other-new-features/creator-applications.html#discussion

1 Like

My thought was that generating a companion for every class would bloat the bytecode among other drawbacks.

Instead, a (jvm) static apply method could always be generated for each constructor, that delegates to it directly. The additional bytecode size for this is much smaller than a companion. Companion apply methods that return the class type would have their implementations in static methods on the class, which the companion delegates to.
So a library compiled to call Animal("slug") on a plain class would be wired up to the static factory method, and adding an apply method on a companion later would then modify the contents of this static method.
This would mean the binary compat vs source compat story is a bit better - add a companion apply method, and code compiled previously would pick up the implementation without a recompile.

There are some issues that make this non-trivial – jvm static method signatures can collide with instance methods, etc.

But it would help with the java interop story in general if more companion members were actually static elements on the class (with forwarders from the companion object instance). It would make it a lot easier to write wrapper-apis that allow java or other jvm languages to call Scala libraries, for example, or better, to share more of the api between a scala-idiomatic api and one that is meant for other jvm languages to use.

There are drawbacks to any approach that ‘fixes’ this issue as well. Users might be surprised when suddenly new Animal("ant") is not the same as Animal("ant") when it was in previous versions of the library, so I fall back to most of my prior arguments – adding or modifying apply methods in your api contract has consequences and there is no free lunch. Either the behavior changes at link time or compile time, but there IS a change, and the library writer needs to be aware of what it is and communicate it to users.

In that sense, the current proposal is the simplest solution possible if one desires plain classes to not require ‘new’ to be used in the source code. All solutions have drawbacks, even the status-quo. There is wisdom in choosing the simple solution that doesn’t involve modifying the compiler back-end, and so I feel that this proposal has more positives than negatives.

1 Like

Static methods are inherited.

Why are these issues prohibitive for regular classes but not for case classes?

Right now it’s the other way around. A static method on the class forwards to the instance method of the companion object. I think it’s important that the implementation is in the instance method because objects in Scala can inherit from classes and traits.

I did not draw these conclusions, so I can only guess. What I see: using the generate-an-apply-function approach can result in an ambiguity problem. This is already the case for case classes and I need to work around it from time to time. The question here is if we want to add ambiguity problems to code bases which are ambiguity free today in scala 2 when they upgrade/switch to dotty – this could make migrations much harder. Even worse, in some cases the behaviour could change as the generated apply method is more specific than the custom one. So potentially adding bugs when the code is compiled with dotty without noticing it.
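Case classes already show what the "more specific" problem looks like today. In this sketch Meters and its hand-written apply(Any) are hypothetical names; the generated apply(Int) wins overload resolution for Int arguments, so the custom factory is silently bypassed.

```scala
case class Meters(value: Int)

object Meters {
  // Hypothetical hand-written factory that accepts anything and returns a marker value.
  def apply(raw: Any): Meters = new Meters(-1)
}

object SpecificityDemo {
  def main(args: Array[String]): Unit = {
    println(Meters(5).value)   // 5: the generated apply(Int) is more specific
    println(Meters("5").value) // -1: only the custom apply(Any) fits a String
  }
}
```

If a plain class with only the custom apply(Any) suddenly gained a generated apply(Int), every Meters(5) call would change meaning on recompilation without any warning.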

IMO it would still be a feasible approach if we

  • do not generate an apply in case of ambiguity and emit a warning in 3.0 and turn it into an error in 3.1 (i.e. generate the apply in all cases)
  • emit a warning if the apply method is more specific (also for case classes)

However, I see additional problems with that approach:

  • we generate companion objects for all classes even though we don’t need them
  • sometimes I want to hide the constructor and provide my own apply method (factory method) and not provide the auto-generated one. For case classes there is no way to do this as far as I know, the public apply is always there. That’s kind of OK for case classes and for consistency reasons the same should apply to classes (if this approach is taken).

IMO the current approach, where omitting new is really the same as using new, is good, but I would like to see an easy way to delegate to the constructor from an apply (maybe something for 3.1 :wink: or did I miss something and it’s already there?).

In that case, it may make sense to also drop the generated apply method for case classes.

1 Like

Indeed :+1:, that would:

  • remove the ambiguity issues
  • allow to have a custom apply which has a higher precedence
  • allow to have a private constructor and a non-public apply

However, then we probably want an easy way to delegate to the constructor already in 3.0

It would indeed be nice if we could do this. Right now the problem is that companions of case classes implicitly extend function types which are implemented by the apply method. We’d have to also drop this concept from the language. Maybe not a bad idea, except that it would cause further migration hassles, and we already have enough of those…
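For readers who have not run into it, this is the kind of code that relies on the companion standing in for a function. Point is a hypothetical example class; in Scala 2 the assignment compiles because the synthetic companion extends Function2[Int, Int, Point], with the generated apply as its implementation.

```scala
case class Point(x: Int, y: Int)

object CompanionAsFunction {
  // The companion object itself is used where a function value is expected.
  val mk: (Int, Int) => Point = Point

  def main(args: Array[String]): Unit =
    println(mk(1, 2)) // Point(1,2)
}
```

Dropping the generated apply would take this usage with it unless something else filled the gap.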

4 Likes

A bit of digging takes me back 10 years, to where you explained that this was added for backwards compatibility: https://stackoverflow.com/a/3054165/381801

Thinking about the use case for that, is there eta expansion for constructors? On the one hand, I’d think so. On the other, Foo in call(Foo) becomes ambiguous in whether it means passing the term Foo, or passing the eta expanded constructor of the class Foo.

In current dotty, it’s the following:

class Creator(){}
def callit[A](f: () => A) = f()
callit(Creator _) //no: "Only function types can be followed by _ but the current expression has type"
callit(Creator) //no: "Not found: Creator"
callit(Creator(_)) //no: "Wrong number of parameters, expected: 0"
callit(() => Creator()) //yes

Should any of those no’s work? new makes it more obvious that you’re doing something other than just calling a method which makes it less surprising that the above no’s are no’s.

1 Like

This topic has now been open for over 30 days. If anyone wants to add anything further or make any kind of closing or summary statement for the committee, please do so this week, before we close the topic.

1 Like

I’m tentatively opposed, as I see the arguments of both sides as equal and cancelling one another out, which makes the change net neutral.

IMO, @jducoeur’s and @tarsa’s argument is the strongest argument in favor of the change – preventing developers from abusing case classes just because they provide new-less constructors.

On the other hand, I find @nafg’s and @morgen-peschke’s opposing arguments important as well – signal, not noise; blurring the distinction between regular and case classes; real work in constructors.

@Ichoran I am very much of the opinion that the new keyword serves the purpose of reminding people of things they do need to be reminded about. Well, perhaps need is a strong word in this case, but I do think the keyword is not without significance.

new tells me that I am constructing a non-value class, that has non-trivial methods that my current unit is dependent on. In many cases, I would like to be able to mock this new class when testing my unit, and I can do this only if I inject said new class into my unit (via constructor or method argument).

This is more eloquently described by Miško Hevery (already referenced in this thread) in his general post on Writing Testable Code and the more specific one about How to Think About the “new” Operator with Respect to Unit Testing.
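A small sketch of the injection point being made, with entirely hypothetical names (Mailer, Signup, RecordingMailer): because Signup receives its collaborator instead of calling new inside, a test can hand it a recording stand-in.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical collaborator with non-trivial behavior.
trait Mailer {
  def send(to: String, body: String): Unit
}

// The unit under test receives its Mailer; it never constructs one itself.
class Signup(mailer: Mailer) {
  def register(email: String): Unit = mailer.send(email, "Welcome!")
}

// Test double that records calls instead of sending anything.
class RecordingMailer extends Mailer {
  val sent: ListBuffer[(String, String)] = ListBuffer.empty
  def send(to: String, body: String): Unit = sent += ((to, body))
}

object InjectionDemo {
  def main(args: Array[String]): Unit = {
    val mailer = new RecordingMailer
    new Signup(mailer).register("a@example.com")
    println(mailer.sent.toList) // List((a@example.com,Welcome!))
  }
}
```

Had register created its own mailer internally, the test would have no seam to substitute the recorder, and a visible new at that spot is what flags the problem in review.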


If I may, I’d like to add another idea into the pile. How about an additional keyword / annotation that will make the compiler automatically generate an apply method?

@Apply
class MyLovelyFoo(bar: Bar)
// or
apply class MyLovelyFoo(bar: Bar)
// or
class apply MyLovelyFoo(bar: Bar)

This could help prevent the abuse of case classes for the sake of new-less constructors for those who do not care for the distinction between the types of classes or the significance of the keyword, while still allowing class authors who care about these things keep the distinction and the explicit constructors.

2 Likes

It does not work for us, because we broadly use factories, so I can see more disadvantages with ‘new’.

I wonder whether it would be better to use abstract classes to prevent “apply generation”

2 Likes

Perhaps with effect tracking of references we can request that a value can only be assigned with a new reference, which would also help in overload resolution of constructors vs apply:

class Foo()
object Foo:
  // here fresh implies that only a fresh value can be returned,
  // so this is not recursive, however fresh variables can not be assigned to this result
  fresh def apply(): Foo = Foo()

def cache(fresh x: Foo): Unit = ???

cache(Foo()) // calls constructor unambiguously

Here is a nice puzzler.

class Foo(block : => Unit = {println("A")}) {
  println("B")
}

val justFoo = Foo {
  println("C")
}

val newFoo = new Foo {
  println("C")
}

What do you think? Do justFoo and newFoo print the same thing?
If you find yourself straining your brain, we have a problem.

Here is a scastie of what really happens: https://scastie.scala-lang.org/WcsEtJ3GT9mHOGqCvwAR8g
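For anyone who cannot open the link, here is my reading of the two forms written out explicitly, so it runs without creator applications. The printed helper and the PuzzlerDemo wrapper are hypothetical scaffolding to capture the output; the assumption is that Foo { ... } is treated as a constructor call with the braces as the by-name argument, while new Foo { ... } is an anonymous subclass body.

```scala
import java.io.ByteArrayOutputStream

class Foo(block: => Unit = { println("A") }) {
  println("B")
}

object PuzzlerDemo {
  // Runs the body and returns the whitespace-separated tokens it printed.
  def printed(body: => Any): List[String] = {
    val out = new ByteArrayOutputStream()
    Console.withOut(out)(body)
    out.toString.trim.split("\\s+").toList
  }

  // Reading of `Foo { println("C") }`: the braces become the by-name argument,
  // which the constructor never evaluates.
  def asArgument(): Foo = new Foo({ println("C") })

  // Reading of `new Foo { println("C") }`: an anonymous subclass whose body runs
  // after Foo's constructor; the default argument is also never evaluated.
  def asSubclass(): Foo = new Foo() { println("C") }

  def main(args: Array[String]): Unit = {
    println(printed(asArgument())) // List(B)
    println(printed(asSubclass())) // List(B, C)
  }
}
```

Neither form prints A, since the by-name parameter is never forced.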

1 Like

Good point.

For clarity and consistency, the second one should be new _ extends Foo { ... }. Then there’s no surprise.

(This is orthogonal to the creator applications issue; apply methods on the companion object create exactly the same illusion in 2.x.)

1 Like