Proposal: Creator applications (going new-less)

tarsa · May 21, 2020, 8:04pm

Read my reply. If you don’t care where your side effects are then you can very well write code like that today:

// `case` modifier below is only there to allow constructing this class
// without using `new` keyword
case class ServiceA(serviceB: ServiceB) {
  private val configC = ... // load from disk
  private val serviceC = ServiceC(configC)
  ...
}

I’ve seen things like that already.

morgen-peschke · May 21, 2020, 10:07pm

I have, and I understand that this is a thing that happens. The point I disagree with you on is the implications of this change.

If people abuse the case modifier, it doesn’t necessarily follow that making it easier to abuse constructors is an improvement.

kai · May 21, 2020, 11:26pm

I think that example was of abuse of constructors (class bodies) in private val configC = ... // load from disk.

AMatveev · May 21, 2020, 11:30pm

Could you give an example? What typical code would they be writing in a constructor, why is it bad, and why can’t bad decomposition appear with an apply method?

morgen-peschke · May 22, 2020, 12:19am

@tarsa’s example is a pretty good example of this type of shenanigans.

WRT bad decomposition, it’s absolutely possible with apply methods as well. The primary difference is that in the context of a factory, you don’t have access to a this referencing the partially-initialized object (or at least it’s harder to make this happen), so there are fewer things which are easy to do badly.
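A minimal sketch of that difference, assuming a hypothetical Registry and Labeled trait that are not part of the thread. The constructor version hands out this before its field is assigned, so on the JVM the registry observes a null label; the factory version only registers a fully constructed instance.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical interface and registry, for illustration only.
trait Labeled { def label: String }

object Registry {
  val seen: ListBuffer[String] = ListBuffer.empty
  // Records the label as it looks at registration time.
  def register(w: Labeled): Unit = seen += s"label=${w.label}"
}

// Constructor version: `this` escapes before `label` has been assigned.
class LeakyWidget(name: String) extends Labeled {
  Registry.register(this)
  val label: String = name.toUpperCase
}

// Factory version: no partially-initialized `this` is in scope.
class SafeWidget private (name: String) extends Labeled {
  val label: String = name.toUpperCase
}

object SafeWidget {
  def apply(name: String): SafeWidget = {
    val w = new SafeWidget(name) // fully initialized at this point
    Registry.register(w)
    w
  }
}
```

Neither version is endorsed here as good design (both still register as a side effect); the sketch only shows why the mistake is easier to make inside a class body.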

To be clear, neither the status quo nor a change to make new optional would prevent misbehavior.

I simply don’t believe that people who are prone to abusing case to avoid new are less likely to indulge in initialization anti-patterns if they’re catered to, and I believe it’s going to be harder to educate people about the correct way to initialize a class if the language blurs the distinction between factory and constructor.

The TL;DR is that I’m wary of this change because it conflates two concepts in a way that will make it hard to teach people how not to shoot themselves in the foot.

AMatveev · May 22, 2020, 12:58am

It can be considered as pseudo code generation of a default apply method over a constructor.
I would be glad if such code generation were extended in future.
Currently I have to write a lot of typical boilerplate code in factory objects.
Typical boilerplate:

object Btk_AcItemPrivilegeApi extends ApiFactory[java.lang.Long, Btk_AcItemPrivilegeAro, Btk_AcItemPrivilegeApi]

object InfoLogException extends ExceptionFactory(new InfoLogException(_))

object C {
  def apply(): C = new C()
}

I am glad to have the ability to reduce boilerplate a little.

morgen-peschke · May 22, 2020, 1:19am

Well, yes, that’s how it’s implemented. Semantically it says, “we’ll let you pretend factories and constructors are the same thing”.

If macros eventually support code generation tied to annotations, that boilerplate goes down significantly, and maintains the distinction between the two concepts:

@simpleFactory
class C(...)

eed3si9n · May 22, 2020, 1:38pm

I’m wondering about compatibility implications along some long-term library API evolution.

foo v1

class Animal(name: String)

At this point the user could use this as either

val a = Animal("Timmy")
val b = new Animal("Tommy")

Two methods of the animal creation are expected to behave the same.

foo v1.1

We realize that we want to provide our own apply to add some validation or business logic.

class Animal(name: String)
object Animal {
  def apply(name: String): Animal = {
    assert(name.nonEmpty && name.head.isUpper)
    val a = new Animal(name)
    DB.append(a)
    a
  }
}

This would create a library that is binary-compatible, source-compatible, yet half the code would change its behavior upon recompilation.
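A minimal sketch of the v1.1 situation, under the assumption that a plain counter stands in for the DB.append side effect (DB is not defined anywhere in the thread) and that CallSites is a hypothetical downstream user. Once the companion apply exists, the two spellings no longer mean the same thing after recompilation.

```scala
class Animal(val name: String)

object Animal {
  // Hypothetical stand-in for the DB.append side effect.
  var registered: Int = 0

  def apply(name: String): Animal = {
    require(name.nonEmpty && name.head.isUpper, s"bad name: $name")
    registered += 1
    new Animal(name)
  }
}

// Hypothetical downstream code.
object CallSites {
  // Compiled against v1 under the proposal, this is a constructor call;
  // recompiled against v1.1 it resolves to Animal.apply.
  def withoutNew(name: String): Animal = Animal(name)

  // Always the raw constructor: no validation, no registration.
  def withNew(name: String): Animal = new Animal(name)
}
```

Here CallSites.withNew("tommy") still succeeds and leaves the counter untouched, while CallSites.withoutNew("tommy") is rejected by the validation.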

foo v1.1.1

To prevent the unpredictable behavior, maybe the library would then opt to force everyone to use the apply by making the constructor private.

class Animal private (name: String)
object Animal {
  def apply(name: String): Animal = {
    assert(name.nonEmpty && name.head.isUpper)
    val a = new Animal(name)
    DB.append(a)
    a
  }
}

This is slightly better, but for binary semantic compatibility, the situation hasn’t really changed. Your old downstream libraries/plugins are still not using the right validation.

foo v1.1.2

To prevent this discrepancy, ultimately I’d have to treat that apply(name: String) is now forbidden, and put the validation logic into constructor code?

class Animal(name: String) {
  assert(name.nonEmpty && name.head.isUpper)
  DB.append(this)
}
object Animal {
  // for bincompat with v1.1
  def apply(name: String): Animal = new Animal(name)
}

what could I have done?

I’m not really sure how I could have guarded against this. Ship every class with a private constructor and apply as a prophylactic measure? Or just not worry too much about this?

what would TASTy linking do?

I’m guessing that at the TASTy level, apply would be desugared into new, so the new call is carved in stone for TASTy compat.

sjrd · May 22, 2020, 2:30pm

You couldn’t have done anything better, indeed. And TASTy will indeed burn the new in the TASTy files, so that has the same consequences as the binary compatibility in your post.

Developers will think that turning a constructor into an apply is a compatible API change, whereas in fact it’s completely non-compatible from the TASTy and binary points of view.

scottcarey · May 22, 2020, 5:38pm

Let’s follow the exact same logic without this change. I’m going to copy-paste the prior argument and modify it to show what API evolution looks like today.

I’m wondering about compatibility implications along some long-term library API evolution.

foo v1

class Animal(name: String)

At this point the user ~~could use this as either~~ has

val a = new Animal("Tommy")

~~Two methods of the animal creation are expected to behave the same.~~

foo v1.1

We realize that we want to provide our own apply to add some validation or business logic.

class Animal(name: String)
object Animal {
  def apply(name: String): Animal = {
    assert(name.nonEmpty && name.head.isUpper)
    val a = new Animal(name)
    DB.append(a)
    a
  }
}

This would create a library that is binary-compatible, source-compatible, yet ~~half the code would change its behavior upon recompilation~~ none of the existing user code picks up the validation without source modification and recompilation.

foo v1.1.1

To ~~prevent the unpredictable behavior~~ *enforce validation*, maybe the library would then opt to force everyone to use the apply by making the constructor private.

class Animal private (name: String)
object Animal {
  def apply(name: String): Animal = {
    assert(name.nonEmpty && name.head.isUpper)
    val a = new Animal(name)
    DB.append(a)
    a
  }
}

This is slightly better, ~~but for binary semantic compatibility, the situation hasn’t really changed. Your old downstream libraries/plugins are still not using the right validation.~~ but downstream users must recompile and modify their code to use the apply method.

foo v1.1.2

To prevent this discrepancy, ultimately I’d have to treat that apply(name: String) is now forbidden, and put the validation logic into constructor code?

class Animal(name: String) {
  assert(name.nonEmpty && name.head.isUpper)
  DB.append(this)
}
object Animal {
  // for bincompat with v1.1
  def apply(name: String): Animal = new Animal(name)
}

what could I have done?

~~I’m not really sure how I could have guarded against this. Ship every class with a private constructor and apply as a prophylactic measure? Or just not worry too much about this?~~ In general, parameter validation needs to be in the constructor or the constructor hidden anyway (with or without this change), and whatever side effect DB.append() is doing is a code smell I’d whack out in any code review I’m involved in.

I don’t find this api evolution argument very compelling, as you can see, the situation is only slightly different with or without this change:
If you have only a plain constructor, and add an apply method you have to be aware of what that means for your users. The real difference is that library authors will have to be aware of how this de-sugars. However, this is mostly only a risk when the apply method has side effects (including exceptions from validation), which is an area that is error prone for novices with or without this change.

Apply methods/constructors are probably ‘naked’ 90% of the time, in which case this change is a pure win. Maybe 9% of the time, there is some parameter validation, but this is best done either in the constructor, or in apply methods with private constructors, so that no invalid instances can be created.
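A short sketch of the two validation placements mentioned above. Cat and Dog are hypothetical names chosen so both variants can sit side by side; the validation rule is the one used in the Animal example earlier in the thread.

```scala
// Option 1: validate in the constructor, so `new Cat(...)` and any
// apply-less spelling go through the same check.
class Cat(val name: String) {
  require(name.nonEmpty && name.head.isUpper, s"invalid name: '$name'")
}

// Option 2: hide the constructor so the factory is the only entry point.
class Dog private (val name: String)

object Dog {
  def apply(name: String): Dog = {
    require(name.nonEmpty && name.head.isUpper, s"invalid name: '$name'")
    new Dog(name)
  }
}
```

In both variants there is no way for outside code to obtain an instance with an invalid name, which is the property the paragraph above asks for.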

Lastly, some novice will inevitably put side-effects in constructors or apply methods without understanding the consequences, but this proposal seems only mildly more dangerous in this case, and honestly it’s an anti-pattern that for some people needs to be a lesson learned the hard way. Experts might provide sane tools that do have effects upon construction, like Future, but someone who is at the level that they can write Future won’t be caught by surprise here. And those novices that will be caught by surprise shouldn’t be shoving side effects into constructors.

Linters, findbugs-like tools, IDEs or a future effects system can warn or error on these things. And I suspect the ability to warn on leaking uninitialized ‘this’ is very close (there was a lot of work recently to catch unsafe initialization), though not all uninitialized reference leaks are errors, there are valid reasons to do it.

This proposal does make it easier for binary compatible behavior to differ from source compatible behavior (e.g. dropping in an updated library jar without recompilation produces different behavior than recompiling with that jar). It’s not the only Scala feature with this quality, however.

eed3si9n · May 22, 2020, 6:32pm

Yea. At this point I’m not arguing for / against this language change, but I want to understand the implication for separate compilation situations.

As I see it, def apply is a formalization of factory method pattern, which provides indirection between the creation API and the implementation details of how the interface contract may be fulfilled. In Scala 2.x, if the API started out with def apply, the underlying concrete class could potentially evolve without breaking binary compatibility (like Vector did recently).

I’m not sure if this is worth it, but if a companion object is generated together with def apply(...), then the usage code would have Animal.apply(...) instead of new Animal(...), retaining the possibility of animal implementation evolution and/or caching.
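As an illustration of the kind of evolution that indirection permits, here is a minimal sketch in which the factory later starts caching instances. The cache is a hypothetical addition, is not thread-safe, and is only meant to show that call sites written as Animal(...) would not need to change.

```scala
import scala.collection.mutable

class Animal private (val name: String)

object Animal {
  // Hypothetical intern cache; illustration only, not thread-safe.
  private val cache = mutable.Map.empty[String, Animal]

  // Callers keep writing Animal("ant"); whether a fresh instance or a
  // cached one comes back is an implementation detail of the factory.
  def apply(name: String): Animal =
    cache.getOrElseUpdate(name, new Animal(name))
}
```

A call site that was compiled down to new Animal(...) could never pick up this change without being recompiled, which is the concern raised above.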

morgen-peschke · May 22, 2020, 7:12pm

This would also simplify the question of “does this class have a companion object”, which would be a nice win for regularity.

One downside of generating the apply method is the same as exists today for case classes: you can’t view the source equivalent to the generated code. This is more of a tooling issue than one specific to this proposal.

robstoll · May 22, 2020, 7:28pm

This approach was tried and rejected:
http://dotty.epfl.ch/docs/reference/other-new-features/creator-applications.html#discussion

scottcarey · May 22, 2020, 9:07pm

My thought was that generating a companion for every class would bloat the bytecode among other drawbacks.

Instead, a (jvm) static apply method could always be generated for each constructor, that delegates to it directly. The additional bytecode size for this is much smaller than a companion. Companion apply methods that return the class type would have their implementations in static methods on the class, which the companion delegates to.
So a library compiled to call Animal("slug") on a plain class would be wired up to the static factory method, and adding an apply method on a companion later would then modify the contents of this static method.
This would mean the binary compat vs source compat story is a bit better - add a companion apply method, and code compiled previously would pick up the implementation without a recompile.

There are some issues that make this non-trivial – jvm static method signatures can collide with instance methods, etc.

But it would help with the java interop story in general if more companion members were actually static elements on the class (with forwarders from the companion object instance). It would make it a lot easier to write wrapper-apis that allow java or other jvm languages to call Scala libraries, for example, or better, to share more of the api between a scala-idiomatic api and one that is meant for other jvm languages to use.

There are drawbacks to any approach that ‘fixes’ this issue as well. Users might be surprised when suddenly new Animal("ant") is not the same as Animal("ant") when it was in previous versions of the library, so I fall back to most of my prior arguments – adding or modifying apply methods in your api contract has consequences and there is no free lunch. Either the behavior changes at link time or compile time, but there IS a change, and the library writer needs to be aware of what it is and communicate it to users.

In that sense, the current proposal is the simplest solution possible if one desires plain classes to not require ‘new’ to be used in the source code. All solutions have drawbacks, even the status-quo. There is wisdom in choosing the simple solution that doesn’t involve modifying the compiler back-end, and so I feel that this proposal has more positives than negatives.

curoli · May 22, 2020, 10:04pm

Static methods are inherited.

Jasper-M · May 23, 2020, 1:53pm

Why are these issues prohibitive for regular classes but not for case classes?

Right now it’s the other way around. A static method on the class forwards to the instance method of the companion object. I think it’s important that the implementation is in the instance method because objects in Scala can inherit from classes and traits.

robstoll · May 23, 2020, 4:16pm

I did not draw these conclusions, so I can only guess. What I see: using the generate-an-apply-function approach can result in an ambiguity problem. This is already the case for case classes and I need to work around it from time to time. The question here is if we want to add ambiguity problems to code bases which are ambiguity free today in Scala 2 when they upgrade/switch to Dotty – this could make migrations much harder. Even worse, in some cases the behaviour could change because the generated apply method is more specific than the custom one. That could silently introduce bugs when the code is compiled with Dotty.
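A small sketch of the "more specific" case, using a case class because that is where a generated apply already exists today. Tag and its normalizing factory are hypothetical. The generated apply(String) wins overload resolution over the custom apply(Any) for String arguments, so the custom logic is skipped without any warning.

```scala
case class Tag(value: String)

object Tag {
  // Custom factory that normalizes its input.
  def apply(value: Any): Tag = new Tag(value.toString.trim)
}

object TagDemo {
  // Resolves to the generated apply(String): no trimming happens.
  val fromString: Tag = Tag(" x ")

  // Only the custom apply(Any) is applicable here.
  val fromInt: Tag = Tag(42)
}
```

If plain classes also got a generated apply, an existing custom factory like this one could start being bypassed in the same way.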

IMO it would still be a feasible approach if we

- do not generate an apply in case of ambiguity and emit a warning in 3.0 and turn it into an error in 3.1 (i.e. generate the apply in all cases)
- emit a warning if the apply method is more specific (also for case classes)

However, I see additional problems with that approach:

- we generate companion objects for all classes even though we don’t need it
- sometimes I want to hide the constructor and provide my own apply method (factory method) and not provide the auto-generated one. For case classes there is no way to do this as far as I know, the public apply is always there. That’s kind of OK for case classes and for consistency reasons the same should apply to classes (if this approach is taken).

IMO the current approach where omitting new is really the same as using new is good, but I would like to see an easy way to delegate to the constructor from an apply (maybe something for 3.1 or did I miss something and it’s already there?).

martijnhoekstra · May 23, 2020, 4:31pm

In that case, it may make sense to also drop the generated apply method for case classes.

robstoll · May 23, 2020, 4:38pm

Indeed, it would:

- remove the ambiguity issues
- allow to have a custom apply which has a higher precedence
- allow to have a private constructor and a non-public apply

However, then we probably want an easy way to delegate to the constructor already in 3.0

odersky · May 23, 2020, 5:43pm

It would indeed be nice if we could do this. Right now the problem is that companions of case classes implicitly extend function types which are implemented by the apply method. We’d have to also drop this concept from the language. Maybe not a bad idea, except that it would cause further migration hassles, and we already have enough of those…