Proposal to add top-level definitions (and replace package objects)

I think this is a reasonable thing that everyone would agree on for now, in the context of top-level definitions: we preserve the current property of *.scala files that all top level declarations only take effect when referenced, and there are no file-level side effects that can kick in at unpredictable times.

Simplifying main methods, or converging on a *.sc syntax, is an orthogonal issue and can be discussed separately without getting in the way of top-level declarations.

IMHO, allowing println("Hi Mum") in a .scala file at top-level is a horror show waiting to happen. Now, you raise the issue of the semantics of the scripty scala form. There seem to be two issues here that I think are orthogonal:

  • default environment
  • evaluation semantics

The default environment issue can, I think, be addressed either by having specific extensions (e.g. .sbt, .am) or by bundling the defaults up into a standardised environment import, e.g. import ammonite.environment, much like the language feature flags are. The mechanism through which this works is plumbing, and largely boring (e.g. it could be importing a scala.language.ScriptEnvironment instance with a well-named macro that injects the environment).

As for the evaluation semantics, there appear to be two of them. The first one runs the statements just as you would in main. The other collects a dictionary (or bag) of values associated with names, and then makes this available to some downstream process. In the case of interactive worksheets, this dictionary is used to decorate the IDE with the evaluated values. In Ammonite or the Scala REPL, it gives you interactive values to play with as you continue to type. In SBT, this dictionary becomes the parameterised build commands/environment. But fundamentally it’s the same deal: you evaluate each statement and record a memoisation of the result against its identifier, minting a new one if the statement is anonymous. The main semantics then reduces to the special case where you decline to do anything with that dictionary, and run it purely for the side effects.
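To make the "dictionary" reading concrete, here is a minimal sketch. Everything in it (Stmt, ScriptEval, the res0/res1 naming) is a hypothetical model for illustration, not how any of the tools mentioned actually work internally:

```scala
// Hypothetical model of one script statement: an optional name plus a deferred body.
final case class Stmt(name: Option[String], body: () => Any)

object ScriptEval {
  // Evaluate each statement in order and record its result against its name,
  // minting res0, res1, ... for anonymous statements.
  def evaluate(stmts: Seq[Stmt]): Vector[(String, Any)] = {
    var fresh = 0
    stmts.toVector.map { stmt =>
      val key = stmt.name.getOrElse {
        val minted = s"res$fresh"
        fresh += 1
        minted
      }
      key -> stmt.body()
    }
  }

  // The "main" semantics as a special case: run everything for its
  // side effects and decline to do anything with the dictionary.
  def runAsMain(stmts: Seq[Stmt]): Unit = {
    evaluate(stmts)
    ()
  }
}
```

A worksheet, REPL or build tool would then differ only in what it does with the result of evaluate, while runAsMain simply throws it away.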

I wrote up a separate post:

Perhaps we can assume that we will prohibit top-level side-effecting statements/vals/vars in this thread on top-level definitions, and move discussion on the main-method-entrypoint stuff to that post

For me, val sideEffect = println("Hi Mum") is the same. To allow one but not the other is very weird, IMO. So I think the top level should either allow only lazy val and def, or we should remove all other restrictions and allow statements.

I don’t think they’re the same. For a top-level statement

println("Hi Mum")

the only reasonable naive expectation is that it is executed “when the program starts”, which is unrealistic to implement.

For

val sideEffect = println("Hi Mum")

it is easier to convince oneself that the side effects will only be executed once sideEffect or one of its siblings is first accessed, the same way the constructor of an object is only executed once that object is accessed for the first time.
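A small sketch of that object analogy, using an ordinary object as a stand-in for whatever wrapper the compiler would generate for a file (the names InitLog and FileWrapper are made up for illustration):

```scala
import scala.collection.mutable.ListBuffer

// Records which initialisers have run, so the timing is observable.
object InitLog {
  val events: ListBuffer[String] = ListBuffer.empty[String]
}

// Stand-in for a per-file wrapper object (an assumption, not the actual encoding).
object FileWrapper {
  // Runs as part of FileWrapper's initialiser, not at program start.
  val sideEffect: Unit = {
    InitLog.events += "sideEffect initialised"
    ()
  }

  // A sibling definition living in the same wrapper.
  val sibling: Int = 42
}
```

Nothing is logged until some member of FileWrapper is touched. Reading FileWrapper.sibling runs the whole initialiser, so the side effect happens then, exactly once, even though sideEffect itself was never referenced by name.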

I share this concern. I think top-level val definitions might introduce too much confusion.

Are top-level definitions allowed in the “empty package”?

val sideEffect = println("Hi Mum")

I’d need convincing that this was sane. I’ve had experience in the past with Java libraries that load side-effecting values (e.g. from files) into static variables, and it results in incredibly brittle behaviour, as it’s unclear which programs will or won’t trigger a resource to be loaded, and therefore which may or may not result in the side effect raising exceptions during class loading. I don’t feel that restricting top-level vals to being lazy is that onerous, and it does force the person writing the top-level value to pause and think about whether they are doing something sane.
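As a rough sketch of what the lazy restriction buys you, assume a hypothetical Settings object whose loader always fails, the way a missing file would:

```scala
import java.io.IOException

// Hypothetical holder for a value loaded from an external resource.
object Settings {
  // Counts load attempts so the deferred behaviour is observable.
  var attempts: Int = 0

  // Stand-in loader that always fails, as a missing file would.
  private def load(): String = {
    attempts += 1
    throw new IOException("settings file not found")
  }

  // Not evaluated when Settings is initialised, only on first access.
  lazy val fromFile: String = load()

  // An unrelated member that stays usable even though the load would fail.
  val version: Int = 1
}
```

Settings.version works without the loader ever running, and the IOException only surfaces at the point where Settings.fromFile is actually read, where it can be handled like any other exception. With an eager val the same failure would happen while the object itself is being initialised.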

What we have then:

  • var is bad design and we should avoid it anyway. DON'T ALLOW
  • val has a problem with side effects, and it is hard to say when they are initialized. DON'T ALLOW
  • lazy val is safe. OK
  • def is safe. OK

Do we bother about top-level val and var only because we want to drop package objects with this proposal? Is there any other reason not to allow only lazy vals and defs at the top level?

But who are the siblings? They used to be all vals, vars and defs in the same namespace. In this proposal the semantics suddenly change depending on which file contains which definitions. The only way to understand why side effects happen when they do is by knowing how they are compiled into an object per file, i.e. top-level eager side effects leak an implementation detail.
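To illustrate the leak, here is a sketch that models two source files of the same package as two hand-written wrapper objects. FileA and FileB are stand-ins chosen for illustration, and the actual compiler encoding may differ:

```scala
import scala.collection.mutable.ListBuffer

// Shared log of which wrapper initialisers have run.
object Trace {
  val seen: ListBuffer[String] = ListBuffer.empty[String]
}

// Stand-in for the wrapper generated for one file of the package.
object FileA {
  Trace.seen += "FileA initialised"
  val a: Int = 1
}

// Stand-in for the wrapper generated for another file of the same package.
object FileB {
  Trace.seen += "FileB initialised"
  val b: Int = 2
}
```

From the user's point of view a and b sit side by side in one namespace, yet reading FileB.b only runs FileB's side effects. Moving a definition from one file to the other would change when its effects happen, without changing any name the caller sees.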

There’s a proposal for new implicit resolution rules that offer better ways to prioritize than by location of the original definition: #6071 (comment).

Making initialization order more unintuitive than it is now will only make things worse. Right now Scala has inherited the class initialization order from Java (hello, unexpected nulls in vals), but has also added an unintuitive initialization order for nested objects. An inner object can be initialized without initializing the outer object, like here:

object Outer {
  println("outer")
  
  object Inner {
    println("Inner")
    
    val x = 5
  }
}

println(Outer.Inner.x)

The code above prints only this:

Inner
5

Adding extra rules for the initialization order of top-level definitions will only make things more confusing as a whole, especially when hunting for bugs (and beginners write a lot of bugs, partly because they write low-quality code).

Will top-level implicit objects be allowed under this proposal? It would be nice if we could use a trait’s/class’s companion object as the implicit evidence, rather than just as a container for the implicit evidence object.

Yes. You can put implicit/implied declarations at the top level. It works great for things that add syntax to types defined elsewhere.

Or using the underscore for a name, thus removing the risk of conflicts and the need for the package object syntax. The underlying name might be synthesised by the compiler to reduce conflict risks.

package myLib

object _ {
  ...
}

This would reflect the fact that you’re somehow doing import myLib._

Still, I’m bugged by the rules about how the enclosing file should be named…

One more meaning of _. Please don’t.

I think that export (see the Dotty docs on export) covers my proposal quite well, so I’m dropping it.
