Let's fix Scala's initialization order when overriding values

soronpo · February 3, 2019, 11:03pm

Yes, I know trait parameters are on their way which can mitigate the situation, but I find Scala’s initialization order very counter-intuitive. Consider the following example:

trait Foo {
  val a : Int
  println(a)
}

class Bar extends Foo {
  val a : Int = 1
}

class LazyBar extends Foo {
  lazy val a : Int = 1
}

new Bar //prints `0`
new LazyBar //prints `1`

As I look at it, using lazy val to change the initialization order isn’t what lazy is for, and looks more like a hack. For me, lazy is simply a function that runs only once.

Take the example above: Foo only works correctly if its subclass defines a as lazy. This doesn’t make sense to me. In this case, I cannot even create a Foo without declaring a value for a. But even if a had an explicit initial value, I would expect the latest successor to win the initialization battle.

smarter · February 3, 2019, 11:31pm

Note that your code won’t compile with Dotty as is: you can’t implement a val with a lazy val anymore, val a in Foo needs to be replaced by either a def or a lazy val (the latter won’t work in Scala 2 currently: https://github.com/scala/scala-dev/issues/583).
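As a rough sketch of the `def` variant mentioned above (an illustration only; the field `seenDuringInit` is a hypothetical stand-in for the original `println(a)` so that the observed value can be inspected), declaring `a` as a `def` in `Foo` lets both a `val` and a `lazy val` implement it. It does not, by itself, change what the eager `Bar` observes during construction:

```scala
trait Foo {
  def a: Int
  // Hypothetical helper: records what `a` returns while Foo is being initialized.
  val seenDuringInit: Int = a
}

class Bar extends Foo {
  // Eager field: assigned only after Foo's initializer has already run.
  val a: Int = 1
}

class LazyBar extends Foo {
  // Lazy field: computed on first access, which happens inside Foo's initializer.
  lazy val a: Int = 1
}

val eagerSeen = new Bar().seenDuringInit     // 0, the default value of an Int field
val lazySeen  = new LazyBar().seenDuringInit // 1
```

So the `def` only fixes the compilation issue; the initialization-order surprise for the eager case is unchanged.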

Anyway, what exactly do you propose as a fix? There’s no easy solution here. My best hope is that Fengyun and Aggelos’s work on having the compiler warn about all potential uses of uninitialized values becomes usable enough that we can enable it by default.

soronpo · February 3, 2019, 11:48pm

I can describe the rule using lazy semantics:
All values should be considered “lazy” for initialization when overriding, but are evaluated on construction as if they were accessed.
So every trait/class can be rewritten as follows:

//From `Foo`
trait Foo {
  val a : Int
}

//Change to `ProperFoo`
trait ProperFoo {
  lazy val a : Int //assuming dotty's abstract lazy is accepted
  a //touching `a`, since we want it evaluated and not really lazy
}

odersky · February 4, 2019, 6:52am

I considered that solution early on in the life of Scala, and I remember discussing it with Don Syme. He had the counter argument that sometimes you need a guaranteed initialization order, and lazy vals everywhere don’t deliver that. E.g.

    val x = initGraphicsDriver()
    val y = graphicsDriverStatus()

soronpo · February 4, 2019, 11:49am

How does this example contradict the rule above? Although I mentioned the rule should be applied only in case of overriding values, it still does not matter. Using your example, if I had the following class:

class GraphicsExample {
  private var gc = 5
  def initGraphicsDriver() = gc = 55
  def graphicsDriverStatus() = println(gc)
  val x = initGraphicsDriver()
  val y = graphicsDriverStatus()
}

I propose changing it to:

class GraphicsExample {
  private var gc = 5
  def initGraphicsDriver() = gc = 55
  def graphicsDriverStatus() = assert(gc == 55)
  lazy val x = initGraphicsDriver()
  x
  lazy val y = graphicsDriverStatus()
  y
}

new GraphicsExample{} //all is OK

The idea behind my proposal is that val x and val y could have remained abstract in a parent trait and initialized in a successor without any problems.

curoli · February 4, 2019, 1:02pm

First of all, lazy vals are expensive. Under the hood, they require an extra flag whether they have been initialized, and a lock on that flag needs to be obtained each time the lazy val is accessed. You really don’t want that by default.
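To make the cost concrete, here is a hand-written approximation of the bookkeeping a lazy val needs. This is a sketch, not the compiler's actual encoding, which differs between Scala versions; the class name `ManualLazy` and its members are hypothetical, and taking the lock only on the first access is a design choice of this sketch.

```scala
// Hypothetical, simplified stand-in for what a lazy val has to track.
final class ManualLazy[A](compute: () => A) {
  // Extra flag recording whether the value has been computed yet.
  @volatile private var initialized = false
  private var value: Option[A] = None

  def get: A = {
    if (!initialized) {
      // Slow path: take the lock, then re-check the flag under the lock.
      this.synchronized {
        if (!initialized) {
          value = Some(compute())
          initialized = true
        }
      }
    }
    value.get
  }
}

var evaluations = 0
val cell = new ManualLazy(() => { evaluations += 1; 42 })
val first = cell.get
val second = cell.get // served from the cached value; `compute` is not run again
```

Even in this minimal form, every access pays for at least a volatile read, and every instance carries the extra flag.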

Second, lazy vals only help you if your initialization is acyclic, and since there is really no way to enforce that, complex initialization with lazy vals is a landmine waiting to kill you.

The only legitimate reason for lazy vals I can see is to delay or omit an expensive computation.

Regarding initialization, you just have to keep it simple or else terrible things will happen.

soronpo · February 4, 2019, 1:21pm

We don’t need the extra flag here, since the lazy is evaluated immediately upon construction. As I mentioned, the lazy here is just a hack for what I consider the proper value initialization order when overriding is applied (no need to apply a “lazy” mechanism otherwise). I haven’t been given a counter-example showing that the proposed behavior is bad.

Can you provide an example where the proposal does this?

That’s my claim in the OP. I believe using lazy val as a fix for initialization order is a hack. This is what I intuitively expect as the proper initialization order. If I (or a user of the class) override a value, then I expect the overriding value to take effect during the initialization phase as well.

I consider my proposal simpler than what we currently have. I’m constantly tripped up by this, and others seem to be as well:
https://www.tapad.com/news/engineering-blog/here-there-be-dragons-dangers-of-initialization-order-in-scala
https://stackoverflow.com/questions/14568049/scala-initialization-order-of-vals

soronpo · February 4, 2019, 1:58pm

BTW, I’m not saying doing this won’t break code. But since we are going for something better in Scala 3, I’m proposing what I think is more intuitive.

Here is an example that breaks under the proposal above:

object Counter {
  private var c = 0
  def inc : Int = {c += 1; c}
}
class Foo1 {
  val cnt = Counter.inc
}
class Foo2 extends Foo1 {
  override val cnt = Counter.inc
}
class Foo3 extends Foo2 {
  override val cnt = Counter.inc
}
val foo3 = new Foo3
println(foo3.cnt) //prints 3

is changed into

object Counter {
  private var c = 0
  def inc : Int = {c += 1; c}
}
class Foo1 {
  lazy val cnt = Counter.inc
  cnt
}
class Foo2 extends Foo1 {
  override lazy val cnt = Counter.inc
  cnt
}
class Foo3 extends Foo2 {
  override lazy val cnt = Counter.inc
  cnt
}
val foo3 = new Foo3
println(foo3.cnt) //prints 1

Yes, this example is now broken, but I claim such code is broken to begin with. To me, it makes no sense for an overridden value to be evaluated at every level of its hierarchy. Its initialization should be done once.

curoli · February 4, 2019, 3:12pm

I just wrote a longer response, but the forum rejected it, saying it contained too many images. The funny thing is, I did not include any images, all the images were automatically included by quoting the response. We should have a forum somewhere dedicated to venting frustration about the forum platform.

Basically, my response was about the following example. Keep in mind that each type is compiled separately.

trait A {
  def initA(): Int
  lazy val a: Int = initA()
  a
}

trait B {
  def initB(): Int
  lazy val b: Int = initB()
  b
}

class C extends A with B {
  override def initA(): Int = b
}

class Boom extends C {
  override def initB(): Int = a
}

soronpo · February 4, 2019, 4:07pm

OK, I now understand what you are referring to. As I mentioned in the last example with the counter, I’m well aware that the proposal can break existing code (or even hang it, as you demonstrated), but I’m searching for real-world cases where writing such code is recommended in the first place. I mean, we’re only discussing semantic changes of val. I think my semantic version of val better fits the intended programming model we have in our heads when using override.
Let’s look at your example as it would have been originally, before the semantic change (I changed it a little so it would compile).

trait A {
 def initA(): Int = 0
 val a: Int = initA()
}

trait B {
 def initB(): Int = 0
 val b: Int = initB()
}

class C extends A with B {
 override def initA(): Int = b
}

class Boom extends C {
 override def initB(): Int = a
}

To me, writing something like this, even without laziness, looks like just writing:

trait A {
 def initA(): Int = 0
 val a: Int = initA()
}

trait AChild extends A {
 override def initA() : Int = a
}

It does not make sense.

So that we can reason about the code objectively rather than subjectively, what would you say if I applied the rule above to the 2.12 community build and checked the results? I believe that a high percentage of the build will succeed without any intervention.

shawjef3 · February 4, 2019, 4:26pm

I’ve run into this initialization problem plenty of times and have become used to using early initializers.

// Entering paste mode (ctrl-D to finish)

class EarlyBar extends {
  override val a : Int = 1
} with Foo

new EarlyBar

// Exiting paste mode, now interpreting.

1
defined class EarlyBar
res25: EarlyBar = EarlyBar@2c29aa59

soronpo · February 4, 2019, 4:27pm

FWIW, early-initializers are on their way out and trait parameters are on their way in.

curoli · February 4, 2019, 4:34pm

Sorry if I’m being unclear, but my argument is this:

If you have initialization order issues, then rather than modifying the language to better support complex initialization schemes, try to make your initialization scheme simpler.

We do know that overriding a val with a val is weird. The solution is not to do that.

Relying on laziness will make accidental cycles more likely. Instead, organize your initialization to make the order clearer.

odersky · February 4, 2019, 6:24pm

Also worth noting that @liufengyun is making good progress with a static analysis that discovers missing initializations.

liufengyun · February 5, 2019, 12:44pm

I am sympathetic to the problem raised by @soronpo . However, I think changing Scala semantics is not the best direction to go.

First, it complicates the semantics of Scala: programmers need to understand more rules, and compiler writers have to implement those rules.

Second, the problem raised by @soronpo is just the tip of an iceberg of problems related to initialization in Scala, Java, C++ and other languages. It is not scalable to address such problems by changing language semantics.

The following code is an initialization problem in Java:

abstract class Parent {
  int a;
  Parent() {
    a = init();
  }
  abstract int init();
}

class Child extends Parent {
  int b;
  Child() {
    super(); b = 10;
  }
  int init() { return 2 * b; }
}
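For readers who think in Scala, the Java snippet above can be rendered roughly as follows. This is an illustrative translation, not part of the original example; it only mirrors the same names so the failure mode can be observed directly.

```scala
abstract class Parent {
  // Runs during Parent's constructor, before any field of a subclass is assigned.
  val a: Int = init()
  def init(): Int
}

class Child extends Parent {
  val b: Int = 10
  // Called from Parent's constructor, when `b` still holds its default value 0.
  def init(): Int = 2 * b
}

val child = new Child
val observedA = child.a // 0 rather than 20
val observedB = child.b // 10, once construction has finished
```

The virtual call from the parent constructor reads `b` before the child's own initializer has run, so `a` is computed from the default value.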

The following is an example in Scala that causes problems:

    abstract class Base {
      def name: String
      val message = "Hello, " + name
    }
    trait Mixin {
      val name = "Scala"
    }
    class Child extends Base with Mixin
    println((new Child).message)  // "Hello, null" instead of "Hello, Scala"
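A small side-by-side sketch of the example above, with the names `EagerMixin`, `LazyMixin`, `EagerChild` and `LazyChild` invented here for illustration: the only difference between the two variants is whether `name` is an eager or a lazy val.

```scala
abstract class Base {
  def name: String
  // Evaluated in Base's constructor, before any mixin fields are assigned.
  val message = "Hello, " + name
}

trait EagerMixin {
  val name = "Scala" // assigned only after Base's constructor has finished
}

trait LazyMixin {
  lazy val name = "Scala" // computed on first access, even from Base's constructor
}

class EagerChild extends Base with EagerMixin
class LazyChild extends Base with LazyMixin

val eagerMessage = (new EagerChild).message // "Hello, null"
val lazyMessage  = (new LazyChild).message  // "Hello, Scala"
```

This is the workaround discussed earlier in the thread; it makes the result correct here, but it does not protect against cyclic dependencies between lazy vals.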

When we talk about initialization, there are mainly three concerns:

safety
expressiveness
friendliness

There is a tension among the three goals: for safety, we have to restrict expressiveness; safety may require more annotations or impose a new system on programmers, which harms friendliness.

Currently, Scala, Java and C++ are extremely expressive in supporting flexible initialization patterns. I think what is missing is a programmer-friendly initialization checker. We are working in that direction, with three goals: (1) check initialization errors in Scala with zero annotations, (2) require programmers to understand only Scala semantics, and (3) support all reasonable Scala initialization patterns.

liufengyun · February 5, 2019, 1:14pm

We can make it safe without early initializers: just make the field a class parameter.

trait Foo {
  val a : Int
  println(a)
}

class Bar private (val a: Int) extends Foo {
  def this() = this(1)
}

Class parameters are always initialized early.

mdedetrich · March 7, 2019, 4:31pm

I disagree here: that is precisely one of the use cases for lazy, which is to not care about initialization order. If you look at Haskell (which is lazy by default), one of the reasons for this decision was that you shouldn’t care about the order of definitions/initialization unless you are actually doing sequential programming (i.e. you are inside do notation).

Also, having the lazy keyword designed the way it is right now allows for some beautiful things when it comes to initialization order, i.e. implicit lazy vals with traits gives you DI that is checked at compile time.
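One possible reading of the DI remark, sketched under assumptions: all names below (`Config`, `Database`, `UserService`, the `*Module` traits and `Wiring`) are hypothetical and not taken from any library. Each module exposes its component as an implicit lazy val and declares what it needs through a self-type, so the order in which the modules are mixed in does not matter.

```scala
final case class Config(url: String)
final class Database(implicit config: Config) {
  def url: String = config.url
}
final class UserService(implicit db: Database) {
  def describe: String = s"users@${db.url}"
}

trait ConfigModule {
  implicit lazy val config: Config = Config("jdbc:example")
}

trait DatabaseModule { self: ConfigModule =>
  // Picks up `config` implicitly; created on first use, not in mixin order.
  implicit lazy val database: Database = new Database
}

trait ServiceModule { self: DatabaseModule =>
  implicit lazy val userService: UserService = new UserService
}

// Mixed in "backwards" on purpose: laziness makes the order irrelevant.
object Wiring extends ServiceModule with DatabaseModule with ConfigModule

val description = Wiring.userService.describe
```

Leaving out `ConfigModule` from `Wiring` is rejected by the compiler because the self-type of `DatabaseModule` is not satisfied, which is the compile-time check referred to above.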