@sjrd you raise a good point. Unlike Python (or Ammonite), which runs top-level code any time the module is imported, Scala would only run it when a top-level val/var/def is referenced, but not when a top-level class/object/type is referenced. That is surprising.
Presumably this surprisingness is already present in package objects, but those are uncommon and used much less than we expect top-level definitions to be.
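To make the surprise concrete, here is a minimal sketch. It assumes the top-level vals and defs of a file end up inside a wrapper object (as they do for package objects), while top-level classes live outside it. The names `Log`, `UtilsWrapper`, `Helper` and `InitDemo` are made up for illustration.

```scala
import scala.collection.mutable.ListBuffer

// Records side effects so we can observe when initialization happens.
object Log {
  val events: ListBuffer[String] = ListBuffer.empty[String]
}

// Hypothetical stand-in for the wrapper holding a file's top-level vals/defs.
// Its body only runs the first time one of its members is referenced.
object UtilsWrapper {
  Log.events += "top-level code ran"
  val greeting: String = "hello"
  def shout(s: String): String = s.toUpperCase
}

// Hypothetical top-level class from the same file; it sits outside the wrapper.
class Helper {
  def twice(n: Int): Int = n * 2
}

object InitDemo {
  def main(args: Array[String]): Unit = {
    val h = new Helper
    println(h.twice(21))       // using the class does not touch UtilsWrapper
    println(Log.events.toList) // so nothing has been logged yet
    println(UtilsWrapper.shout(UtilsWrapper.greeting)) // first reference to a val/def
    println(Log.events.toList) // the wrapper body has now run, exactly once
  }
}
```

In Python the equivalent module body would have run at import time, before `Helper` was ever used.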
There is also the question of how, given we want to use this top-level code as program entrypoints, we change the various Scala runners to specify which top-level code to run. These top-level code blocks basically become main methods, and will need to be specifiable in scala, SBT, Mill, and so on.
Perhaps we could consider a slightly more limited scope:
- Top-level statements can only be used in `*.sc` files; these are picked up by the Scala compiler similar to `*.scala` files
- `*.sc` files automatically generate a Java-compatible main method with the name of the class being the name of the file, e.g. `Foo.sc` generates a class `Foo` with a main method (perhaps mangled in some way to avoid collisions?)
- We ban top-level `var`s and `val`s within `*.scala` files, as @nafg suggested. It’s not the end of the world to label the `val`s with `lazy` to get a more predictable initialization semantic, and top-level mutable state is rare enough that the boilerplate of stuffing it in an `object` is no big deal.
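As a rough sketch of the second point, this is one way a `Foo.sc` file could be expanded. The encoding is purely an assumption on my part (an `object` with a `main`, no name mangling, top-level vals turned into locals); the actual scheme would need to be designed.

```scala
// Hypothetical source file Foo.sc, as the author would write it:
//
//   def greet(who: String): String = s"hello $who"
//   val name = "world"
//   println(greet(name))
//
// One possible expansion (an assumption, not a specified encoding):
object Foo {
  def greet(who: String): String = s"hello $who"

  // The top-level statements become the body of a JVM-compatible main method.
  def main(args: Array[String]): Unit = {
    val name = "world"
    println(greet(name))
  }
}
```

With something like this, existing runners could launch `Foo` the same way they launch any other main class.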
This would have the following consequences:
- Standalone `*.sc` files become code that people can run via `scala` (this is already possible), or via alternate runners like `amm` (to the extent that they are compatible, which they mostly are)
- `*.sc` files can also serve as entrypoints to larger applications, with the benefit that the entrypoint of a large codebase can trivially be seen from the filesystem without needing to dig through individual files to hunt for `def main` methods (or `extends App`, …). Essentially, you could start off with a standalone script, and as it grows seamlessly incorporate it into a multi-file project with a proper build tool by adding `*.scala` files.
- `*.scala` “library” files maintain their current “statelessness”: you cannot accidentally trigger a top-level side effect when dealing with a `*.scala` file, only by calling their defined functions, instantiating their classes or referencing their (lazy) `object`s or `lazy val`s. This also follows the best practice in other languages which allow top-level code, which generally discourage you from having top-level side-effecting code in any imported “library” files and only use top-level code in the application entrypoint.
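A small sketch of what the “stateless library file” rule would look like in practice. Everything lives inside objects here so that it compiles on any Scala version; `Counter`, `Helpers` and `AppEntry` are made-up names, and `AppEntry` stands in for the expanded form of a hypothetical `*.sc` entrypoint.

```scala
// Hypothetical library file Helpers.scala: no bare top-level val or var.

// Mutable state is stuffed into an object instead of a top-level var.
object Counter {
  var hits: Int = 0
}

object Helpers {
  // Stand-in for a top-level `lazy val`: the initializer runs on first access only.
  lazy val banner: String = {
    Counter.hits += 1 // side effect, to observe when initialization happens
    "=== demo ==="
  }

  def area(w: Int, h: Int): Int = w * h
}

// Hypothetical entrypoint, i.e. what a Main.sc might expand to.
object AppEntry {
  def main(args: Array[String]): Unit = {
    println(Helpers.area(3, 4)) // calling a function does not run the lazy initializer
    println(Helpers.banner)     // first access runs the initializer
    println(Helpers.banner)     // second access reuses the cached value
    println(Counter.hits)       // the initializer ran only once
  }
}
```

The only eagerly-run code is the entrypoint body; everything in the library file runs because something explicitly asked for it.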
Essentially, we would take the convenient “just run code” part of scripting languages, while enforcing the “avoid top level code in imported library files” best practice that already exists, and avoiding any confusion about exactly when top-level code evaluates when non-entrypoint *.scala files are used.
The “seamlessly go from one-file script to multi-file project with build tool” path would be a nice experience for people used to Python’s “just import helper code” style of growing out their initial scripts. SBT would already support it (since it allows Scala files in the project root), and Mill and even Ammonite’s script runner could be similarly tweaked to conform to such a "*.sc is entrypoint, *.scala is library" convention with the limitations described above.
In this world, we wouldn’t consolidate to a single Scala syntax, but at least we can get everyone to converge towards the same two *.sc/*.scala file extensions with their associated semantics.
This is the best I can come up with so far, unless we can find some way of harmonizing the behavior of top-level code in imported files with that of other languages (i.e. it runs the first time anything in the file is used) to avoid the confusion Sébastien brought up.