Hello everyone,
The discussion of the proposal to add top-level definitions has revived the question of top-level statements, i.e., what easy syntax we should offer for writing a program.
This is a pre-SIP for a solution, which I came up with, with input from @odersky, @AleksanderBG, @nicolasstucki, @densh and @OlivierBlanvillain.
Problem statement
We want an easy and simple way to define a main program, i.e., an entry point for the execution of an application. Without any special support from the language or standard library, what we need to write is an object with a main method:
package foo
object Bar {
  def main(args: Array[String]): Unit = {
    val who = "world"
    println(s"Hello $who!")
  }
}
The above has two main issues:
- It contains several concepts that are irrelevant to a beginner who just wants to print something on the screen, so it is awkward to teach;
- It contains a lot of boilerplate, which annoys even experienced developers who write a lot of such entry points.
Previous approaches
Scala has had two different approaches to solving this problem, both library-based.
scala.Application
A developer could write a main object as follows:
package foo
object Bar extends Application {
  val who = "world"
  println(s"Hello $who!")
}
with the trait Application being straightforwardly defined as
trait Application {
  def main(args: Array[String]): Unit = ()
}
The main method was mixed into Bar, and did nothing. However, calling it would force the constructor of Bar to execute, which would run the program.
That approach had a very severe issue: the entire application would run within the constructor of the object, which means inside a static initializer, which means within the initialization lock of the class. This causes deadlocks if multiple threads try to access members of the main object.
Another small problem is that the args are never accessible.
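To see why calling the empty main still runs the program, here is a minimal sketch. Since scala.Application is no longer in the standard library, LegacyApplication is a hypothetical stand-in with the same shape as the trait above; InitLog and Greeter are likewise illustrative names, and the log exists only to make the initialization order observable.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical helper: records events so the initialization order can be observed.
object InitLog {
  val events: ListBuffer[String] = ListBuffer.empty[String]
}

// Hypothetical stand-in for the scala.Application trait shown above.
trait LegacyApplication {
  def main(args: Array[String]): Unit = ()
}

object Greeter extends LegacyApplication {
  // Everything here is constructor code: it runs when Greeter is first
  // accessed, i.e. during the initialization of the object.
  val who = "world"
  InitLog.events += s"Hello $who!"
}
```

Calling Greeter.main(Array.empty[String]) forces Greeter to initialize, and that initialization is what appends to the log. The main method itself does nothing, and the body has no way to read the arguments it received.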
scala.App
To remedy the above two issues, another trait was introduced. At use site, it looks exactly the same:
package foo
object Bar extends App {
  val who = "world"
  println(s"Hello $who!")
}
however its implementation is very different:
import scala.collection.mutable.ListBuffer

trait App extends DelayedInit {
  /** The command line arguments passed to the application's `main` method. */
  protected final def args: Array[String] = _args
  private[this] var _args: Array[String] = _
  // Constructor bodies captured by delayedInit, in registration order.
  private[this] val initCode = new ListBuffer[() => Unit]
  override def delayedInit(body: => Unit): Unit =
    initCode += (() => body)
  final def main(args: Array[String]) = {
    this._args = args
    for (proc <- initCode)
      proc()
  }
}
It relies on the DelayedInit mechanism (which was invented specifically for App) to move the body of the constructor of Bar into the initCode lambdas. The main method can then store the args and run the deferred constructor bodies.
DelayedInit however has several unfixable issues, which led to it being deprecated in 2.11.0. Support for App has nevertheless been preserved until a better solution comes along, but so far that has not happened.
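The deferral that App relies on can be imitated by hand, which may help to see what DelayedInit does behind the scenes. The following is only a sketch: DeferredApp, deferred and Echo are hypothetical names, and the explicit call to deferred stands in for the wrapping that the compiler performs automatically for DelayedInit.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical trait: a hand-written imitation of the App pattern above.
trait DeferredApp {
  private var _args: Array[String] = Array.empty[String]
  private val initCode = ListBuffer.empty[() => Unit]

  /** The command line arguments passed to `main`. */
  protected final def args: Array[String] = _args

  // With DelayedInit the compiler performs this wrapping implicitly;
  // here the program body has to opt in by calling `deferred` itself.
  protected final def deferred(body: => Unit): Unit =
    initCode += (() => body)

  final def main(args: Array[String]): Unit = {
    this._args = args
    for (proc <- initCode) proc()
  }
}

object Echo extends DeferredApp {
  var output: String = ""
  deferred {
    // Runs inside main, after construction has finished, so args is set.
    output = args.mkString(" ")
  }
}
```

After Echo.main(Array("a", "b")), the field Echo.output holds "a b". The body no longer runs inside the object's initializer, and it can see the arguments.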
Proposed solution
I propose the following solution to the main entry point problem. We introduce a new soft keyword program, which is only a keyword when directly enclosed by a package block (remember that package “statements” open an implicit package block). Its usage looks like:
package foo
program Bar = {
  val who = "world"
  println(s"Hello $who!")
}
Intuitively, program Bar above introduces a main entry point, whose fully qualified name is foo.Bar. The right-hand side after the = sign is the definition of the program, and is executed as a main method. Once compiled, it is possible to invoke it with
$ scala -cp . foo.Bar
for example.
The right-hand side has its own scope, which is a local scope like that of methods. Within that scope, the identifier args of type Array[String] is visible and refers to the command-line arguments.
Formally, program is defined as a straightforward, purely syntactic desugaring. The general form
program X = <expr>
is rewritten as
object X {
  def main(args: _root_.scala.Array[_root_.scala.String]): _root_.scala.Unit = <expr>
}
This has the following consequences:
- program X introduces a term X
- The identifier args is visible in <expr>
- The identifier main is also visible in <expr> (it refers to the main method itself). This is not strictly speaking desirable, but allowing this “leak” greatly simplifies the specification and the implementation
- Local definitions inside the <expr>, such as who in the example above, are kept local to the synthesized main method. They are never visible nor importable outside of the program declaration.
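As a sketch of these consequences, the following is what a program declaration would look like after the rewriting described above, written by hand in today's Scala. Countdown and its body are hypothetical; the point is only to show args and main in scope and the local definition staying local.

```scala
// Hand-written equivalent of a hypothetical declaration
//   program Countdown = { ... }
object Countdown {
  def main(args: _root_.scala.Array[_root_.scala.String]): _root_.scala.Unit = {
    // `args` is in scope because it is the parameter of the synthesized method.
    val n = args.headOption.map(_.toInt).getOrElse(3)
    if (n > 0) {
      println(n)
      // `main` is in scope too: this is the "leak", used here for recursion.
      main(Array((n - 1).toString))
    } else println("liftoff")
  }
}

// `n` is local to main, so `import Countdown.n` would not compile.
```

Countdown itself is an ordinary term, so other code can refer to it or call Countdown.main directly.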
Alternatives
The “magical” introduction of args can be seen as problematic. An alternative would be to use _ as the parameter name, therefore making it invisible instead. This is a bit less magical, but then there is no way to get access to the command-line arguments if we want to, limiting the usefulness of program. Besides, we leak main anyway, so it does not seem a stretch to “leak” args.
An alternative syntax does not include the = sign:
package foo
program Bar {
  val who = "world"
  println(s"Hello $who!")
}
Given the shape of existing constructs in Scala, this alternative suggests that definitions inside the program would somehow be importable from the outside, such as import Bar.who. That is however not possible. The = sign solves this wart by clearly marking the body as something similar to the body of a def or the rhs of a val (and in fact it is the body of a def). It also removes the need for the {} altogether if the body is a single expression/statement, so the following is legal:
program Bar = println("Hello world!")
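Under the rewriting rule given earlier, this one-liner would correspond to the following object. This is a hand-written sketch in current Scala, not compiler output:

```scala
// Hand-written equivalent of: program Bar = println("Hello world!")
object Bar {
  // The single expression becomes the whole body of main, without braces.
  def main(args: Array[String]): Unit = println("Hello world!")
}
```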
Backward compatibility
Since program only takes a special meaning when directly inside a package declaration, this proposal does not break any existing code. It is indeed illegal, currently, for anything to start with the token program in a top-level position.
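For illustration, here are uses of program as an ordinary identifier that, per the rule above, would be unaffected because none of them is directly enclosed by a package block. Scheduler and its members are hypothetical names.

```scala
// `program` used as a plain identifier in positions that are not
// directly inside a package block.
object Scheduler {
  // A member named program, nested inside an object.
  val program: String = "nightly-backup"

  // A parameter named program, nested inside a method signature.
  def describe(program: String): String = s"running $program"
}
```

Both definitions compile today, and neither starts a top-level declaration with the token program.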
Implementation effort
Trivial, as it is literally a syntactic rewriting.