Proposal to add top-level definitions (and replace package objects)

lihaoyi · March 26, 2019, 9:05am

I think this is a reasonable thing that everyone would agree on for now, in the context of top-level definitions: we preserve the current property of *.scala files that all top level declarations only take effect when referenced, and there are no file-level side effects that can kick in at unpredictable times.

Simplifying main methods, or converging on a *.sc syntax, is an orthogonal issue and can be discussed separately without getting in the way of top-level declarations.

drdozer · March 26, 2019, 9:50am

IMHO, allowing println("Hi Mum") in a .scala file at top-level is a horror show waiting to happen. Now, you raise the issue of the semantics of the scripty scala form. There seem to be two issues here that I think are orthogonal:

default environment
evaluation semantics

The default environment issue I think can be addressed by either having specific extensions (e.g. .sbt, .am) or bundling them up into a standardised environment import, e.g. import ammonite.environment, much like the language feature flags are. The mechanism through which this works is plumbing, and largely boring (e.g. it could be importing a scala.language.ScriptEnvironment instance with a well-named macro that injects the environment).

As for the evaluation semantics, there appear to be two of them. The first one runs the statements just as you would in main. The other collects a dictionary (or bag) of values associated with names, and then makes this available to some down-stream process. In the case of interactive worksheets, this dictionary is used to decorate the IDE with the evaluated values. In ammonite or the scala repl, it gives you interactive values to play with as you continue to type. In SBT, this dictionary becomes the parameterised build commands/environment. But fundamentally it’s the same deal - you evaluate each statement, and record a memoisation of the result against any identifier, minting a new one if the statement is anonymous. The main semantics then reduces to the special case where you decline to do anything with that dictionary, and run it purely for the side-effects.
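A minimal sketch of the "dictionary" semantics described above, assuming a hypothetical Statement type (an optional name plus a thunk) and a hypothetical ScriptRunner. Neither is a real Ammonite, REPL or sbt API. Each statement is evaluated in order, its result is stored under its name, and anonymous statements get a minted name of the form resN.

```scala
import scala.collection.mutable

// Hypothetical model of one script statement: an optional name and the code to run.
final case class Statement(name: Option[String], body: () => Any)

object ScriptRunner {
  /** Evaluates statements in order and returns name -> memoised result. */
  def run(statements: Seq[Statement]): mutable.LinkedHashMap[String, Any] = {
    val bindings = mutable.LinkedHashMap.empty[String, Any]
    var fresh = 0
    statements.foreach { stmt =>
      // Anonymous statements get a freshly minted identifier (res0, res1, ...).
      val key = stmt.name.getOrElse {
        val minted = s"res$fresh"
        fresh += 1
        minted
      }
      bindings(key) = stmt.body()
    }
    bindings
  }
}
```

In this sketch a worksheet or REPL would consume the returned map, while the plain main semantics corresponds to calling run and discarding the result.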

lihaoyi · March 26, 2019, 9:53am

I wrote up a separate post:

Proposal: Simplifying the Scala getting started experience

Perhaps we can assume that we will prohibit top-level side-effecting statements/vals/vars in this thread on top-level definitions, and move discussion on the main-method-entrypoint stuff to that post.

soronpo · March 26, 2019, 10:30am

For me, val sideEffect = println("Hi Mum") is the same. To allow one but not the other is very weird, IMO. So I think the top level should either allow only lazy val and def, or we should remove any other restrictions and allow statements.

sjrd · March 26, 2019, 10:32am

I don’t think they’re the same. For a top-level statement

println("Hi Mum")

the only reasonable naive expectation is that it is executed “when the program starts”, which is unrealistic to implement.

For

val sideEffect = println("Hi Mum")

it is easier to convince oneself that the side effects will only be executed once sideEffect or one of its siblings is first accessed, the same way the constructor of an object is only executed once that object is accessed for the first time.
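The object analogy can be sketched with an ordinary object, which is how this already behaves today. EffectLog and Greetings are hypothetical names used only to observe when the initialiser runs.

```scala
// Hypothetical log used to observe when initialisation happens.
object EffectLog {
  var events: List[String] = Nil
  def record(msg: String): Unit = {
    events = events :+ msg
  }
}

object Greetings {
  // Runs as part of the object's initialiser, not at program start.
  val sideEffect: Unit = EffectLog.record("Hi Mum")
  val sibling: Int = 42
}
```

Nothing is recorded until some member of Greetings is touched. Reading Greetings.sibling is enough to run the side effect, and later accesses do not run it again.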

julienrf · March 26, 2019, 10:33am

I share this concern. I think top-level val definitions might introduce too much confusion.

julienrf · March 26, 2019, 10:34am

Are top-level definitions allowed in the “empty package”?

drdozer · March 26, 2019, 12:40pm

val sideEffect = println("Hi Mum")

I’d need convincing that this was sane. I’ve had experience in the past with Java libraries that load in side-effecting values (e.g. from files) into static variables, and it results in incredibly brittle behaviour as it’s unclear what programs will or won’t trigger a resource to be loaded, and therefore which may or may not result in the side effect raising exceptions during class loading. I don’t feel that restricting top-level vals to being lazy is that onerous, and it does force the person writing the top level value to pause and think if they are doing something sane.
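A small sketch of the lazy alternative, assuming a hypothetical Resources object whose loader always fails (standing in for a missing file). Because settings is a lazy val, the failure surfaces only when settings itself is read, not when an unrelated member is used.

```scala
object Resources {
  // Hypothetical loader standing in for reading a file; it always fails here.
  private def load(): Map[String, String] =
    throw new java.io.FileNotFoundException("settings.conf")

  // Deferred: load() is not called until settings is first accessed.
  lazy val settings: Map[String, String] = load()

  // Unrelated member that stays usable even though settings cannot be loaded.
  val version: String = "1.0"
}
```

With an eager val in place of the lazy val, the exception would instead be thrown while the object is being initialised, so even Resources.version would fail.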

scalway · March 26, 2019, 12:52pm

What we have then:

var is bad design and we should avoid it anyway. DON'T ALLOW
val has problems with side effects, and it is hard to say when it is initialized. DON'T ALLOW
lazy val is safe. OK
def is safe. OK

Do we care about top-level val and var only because we want to drop package objects with this proposal? Is there any other reason not to allow only lazy vals and defs at the top level?

Jasper-M · March 26, 2019, 1:56pm

sjrd:

For
val sideEffect = println("Hi Mum")
it is easier to convince oneself that the side effects will only be executed once sideEffect or one of its siblings is first accessed, the same way the constructor of an object is only executed once that object is accessed for the first time.

But who are the siblings? They used to be all vals, vars and defs in the same namespace. In this proposal suddenly the semantics change depending on which file contains which definitions. The only way to understand why side effects happen when they do is by knowing how they are compiled into an object per file. i.e. top-level eager side effects leak an implementation detail.
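To illustrate the concern, here is a sketch that models the per-file wrapping by hand. FileA and FileB are hypothetical stand-ins for whatever wrapper objects a compiler would synthesise for two files in the same package, and InitLog is only there to observe initialisation.

```scala
// Hypothetical log used to observe which wrapper has been initialised.
object InitLog {
  var events: List[String] = Nil
  def record(msg: String): Unit = {
    events = events :+ msg
  }
}

// Stand-in for the wrapper of top-level definitions in "a.scala".
object FileA {
  val greeting: String = { InitLog.record("a.scala initialised"); "hello" }
  def helper(n: Int): Int = n + 1
}

// Stand-in for the wrapper of top-level definitions in "b.scala", same package.
object FileB {
  val farewell: String = { InitLog.record("b.scala initialised"); "bye" }
}
```

Calling FileA.helper triggers the side effect of greeting because both live in the same wrapper, while farewell stays untouched. Moving helper to the other file would change which side effect runs, even though all three definitions appear to share one namespace.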

odersky · March 26, 2019, 2:59pm

There’s a proposal for new implicit resolution rules that offer better ways to prioritize than by location of original definition: #6071 (comment).

tarsa · March 26, 2019, 9:34pm

Making initialization order more unintuitive than it is now will only make things worse. Right now Scala has inherited the class initialization order from Java (hey, unexpected nulls in vals), but has also added an unintuitive initialization order for nested objects. An inner object can be initialized without initializing the outer object, like here:

object Outer {
  println("outer")
  
  object Inner {
    println("Inner")
    
    val x = 5
  }
}

println(Outer.Inner.x)

Above code prints only this:

Inner
5

Adding extra rules for the initialization order of top-level definitions will only make things more confusing as a whole, especially when hunting for bugs (and beginners write a lot of bugs, partly because they write low-quality code).

RichType · March 27, 2019, 2:24pm

Will top-level implicit objects be allowed under this proposal? It would be nice if we could use a trait’s or class’s companion object as the implicit evidence, rather than just as a container for the implicit evidence object.

drdozer · March 27, 2019, 11:28pm

Yes. You can put implicit/implied declarations at the top level. It works great for things that add syntax to types defined elsewhere.
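As a rough illustration of "adding syntax to types defined elsewhere", here is a sketch using Scala 2 style implicit classes. Celsius, TemperatureSyntax and CelsiusOps are hypothetical names. The wrapper object is kept only so the sketch compiles without top-level definitions; under the proposal the implicit class could presumably sit directly at the top level of the file.

```scala
// A type defined elsewhere (hypothetical library class).
final case class Celsius(degrees: Double)

// Wrapper kept for compatibility; the proposal would make it unnecessary.
object TemperatureSyntax {
  // Adds a toFahrenheit method to Celsius without modifying Celsius itself.
  implicit class CelsiusOps(private val c: Celsius) {
    def toFahrenheit: Double = c.degrees * 9.0 / 5.0 + 32.0
  }
}
```

After import TemperatureSyntax._, an expression such as Celsius(100.0).toFahrenheit resolves through the implicit class.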

pagoda_5b · March 29, 2019, 11:11am

Or using the underscore for a name, thus removing the risk of conflicts and the need for the package object syntax. The underlying name might be synthesised by the compiler to reduce conflict risks.

package myLib

object _ {
  ...
}

This would reflect the fact that you’re somehow doing import myLib._

Still, I’m bugged by the rules about how the enclosing file should be named…

scalway · March 29, 2019, 3:38pm

One more meaning of _. Please don’t.

soronpo · April 16, 2019, 4:23am

I think that export (see export dotty docs) covers my proposal quite well, so I’m dropping it.