Remove `-d .jar` and some backend flags

Hi

I’m doing a bit of a cleanup in the backend and stumbled over a few features that can possibly go away. There’s no deeper reason than spring cleaning, so anything that’s actually being used will stay.

Please let me know if you feel that any of these features should stay.

Compile to jar

Using -d out.jar, the backend generates a jar file instead of separate classfiles. The jar has a simple manifest that sets Main-Class. The main class can be specified using -Xmain-class pkg.Main. Without a given -Xmain-class, if the compiler finds a single valid main method, its containing class will be used.
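To make the shape of that output concrete, here is a minimal sketch of a jar whose manifest only sets Main-Class, built with the JDK's `java.util.jar` API. It is not the compiler's actual implementation; `JarManifestSketch`, `buildJar` and `mainClassOf` are hypothetical names chosen for illustration.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream}
import java.util.jar.{Attributes, JarEntry, JarInputStream, JarOutputStream, Manifest}

// Hypothetical helper, not part of scalac.
object JarManifestSketch {
  // Builds an in-memory jar whose manifest names `mainClass` (e.g. "pkg.Main").
  // `classes` maps entry names such as "pkg/Main.class" to their bytes.
  def buildJar(mainClass: String, classes: Map[String, Array[Byte]]): Array[Byte] = {
    val manifest = new Manifest()
    val attrs = manifest.getMainAttributes
    // Manifest-Version has to be present, otherwise the attributes are not written.
    attrs.put(Attributes.Name.MANIFEST_VERSION, "1.0")
    attrs.put(Attributes.Name.MAIN_CLASS, mainClass)

    val bytes = new ByteArrayOutputStream()
    // This constructor writes META-INF/MANIFEST.MF as the first entry.
    val jar = new JarOutputStream(bytes, manifest)
    try {
      for ((name, content) <- classes) {
        jar.putNextEntry(new JarEntry(name))
        jar.write(content)
        jar.closeEntry()
      }
    } finally jar.close()
    bytes.toByteArray
  }

  // Reads the Main-Class attribute back out of a jar, if there is one.
  def mainClassOf(jarBytes: Array[Byte]): Option[String] = {
    val in = new JarInputStream(new ByteArrayInputStream(jarBytes))
    try {
      Option(in.getManifest)
        .flatMap(m => Option(m.getMainAttributes.getValue(Attributes.Name.MAIN_CLASS)))
    } finally in.close()
  }
}
```

A jar built this way carries the classfiles and the manifest and nothing else, which is the limitation mentioned further down.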

From what I hear, writing directly to jar can speed up compile times, especially on (certain versions of?) Windows where the file system is slow dealing with many small files. The feature was added 6 years ago. There’s an sbt plugin.

Issues when using this feature:

  • it doesn’t work with incremental compilation. The sbt plugin always re-compiles all sources. It might be technically possible to improve that aspect by patching an existing jar file.
  • the resulting jar only contains the simple manifest generated by the Scala compiler and no other metadata. In a standard workflow with sbt, a jar often contains more files (resources, more metadata).
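As a rough illustration of what "patching an existing jar file" could look like, the sketch below uses the JDK's zip file system provider to add or replace individual entries while leaving the others untouched. This is an assumption about one possible approach, not what the compiler or the sbt plugin does, and it makes no claim about performance. `JarPatchSketch`, `patch` and `read` are hypothetical names.

```scala
import java.net.URI
import java.nio.file.{FileSystem, FileSystems, Files, Path}
import java.util.{HashMap => JHashMap}

// Hypothetical helper, not part of scalac, sbt or zinc.
object JarPatchSketch {
  // Opens `jar` as a zip file system (creating the file if it does not exist)
  // and runs `f` against it; closing the file system flushes the changes.
  private def withJar[A](jar: Path)(f: FileSystem => A): A = {
    val env = new JHashMap[String, String]()
    env.put("create", "true")
    val fs = FileSystems.newFileSystem(URI.create("jar:" + jar.toUri), env)
    try f(fs) finally fs.close()
  }

  // Adds or replaces only the entries in `changed`; other entries are kept.
  def patch(jar: Path, changed: Map[String, Array[Byte]]): Unit =
    withJar(jar) { fs =>
      for ((name, bytes) <- changed) {
        val entry = fs.getPath("/", name)
        Files.createDirectories(entry.getParent)
        Files.write(entry, bytes) // default options: create, truncate, write
      }
    }

  // Reads a single entry back, e.g. to check the result of a patch.
  def read(jar: Path, name: String): Array[Byte] =
    withJar(jar)(fs => Files.readAllBytes(fs.getPath("/", name)))
}
```

An incremental compiler would call something like `patch` with only the classfiles produced by the recompiled sources. Removing entries for deleted classes is left out of this sketch.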

-Ydump-classes

-Ydump-classes /path makes the compiler also write classfiles to /path, in addition to the output directory. This is potentially useful when compiling to memory instead of the file system (when the compiler is invoked programmatically). Added in https://github.com/scala/scala/commit/7eb45e79b8.

-Ygen-asmp

-Ygen-asmp /path makes the compiler emit the bytecode as text files in asm’s format (similar to javap). This can easily be achieved by invoking asm separately; I personally use an asm shell script. For comparing classfiles or jars, I can recommend the recent jardiff tool by Jason.

I used -d jar once this year. I don’t remember the context at work, trying something quick and dirty but effective, something I might otherwise have left as a script. I remember being grateful for the feature, and a bit surprised that it turned out to be useful.

Ironically I used -d jar to try out Jason’s jardiff tool :slight_smile:

In another post, @stuhood argued for better incremental compilation into JARs:

@jvican is still interested in doing this, yea: https://github.com/sbt/zinc/issues/305

used -d jar to try out Jason’s jardiff tool

That shouldn’t be necessary, jardiff also works on directories with classfiles.

@stuhood it’s not entirely clear to me whether this proposal requires the compiler to compile to jar (incrementally) or just requires sbt/zinc to update packaged jars incrementally.

sbt uses packageBin to create jars, and as far as I can see in the source code this is independent of -d. Provided this is the case (I double-checked, but someone else should double-check too), we can remove -d in the compiler.

And yes, in the coming days I’ll provide an implementation to perform incremental compilation for jars. :slight_smile:

Just to repeat my previous remark: I often don’t use sbt, so -d my.jar is useful in normal command line scalac.

I’ll say more: sbt requires a project and a dir. But scala is scalable, which means I use it for micro-tasks: scripting, or something more than scripting. A few times, a POC script at work graduated to, “Oh, this is a feature”.

Although there is talk about how the command line scalac isn’t useful for newbies, it’s very useful for folks who like to try something out, develop it a bit, and then it’s a thing. This issue could be the turning point where we decide, is Scala REPL just a way to test the compiler? Is Ammonite the tool useful for actually doing stuff?

I like the idea of having a core tool that always uses compiler internals, and a second tool like Ammonite that might be my daily shell but might also exhibit a different set of bugs from what scalac would report. Let alone what happens in a Spark shell.

If you want to write miscellaneous scripts using Scala, and have it scale to larger sizes with third party deps and multiple files, be portable enough that you can send the source to anyone else and they can run it as-is or trivially make changes, automatically cache and invalidate them when the code changes, watch reload & re-run the script code when it changes, take CLI arguments, and provide good error messages when you mess up when doing any of this stuff…

You really should be using Ammonite :smiley:

I can imagine that a VFS abstraction for writing compiler outputs would remain useful, as it might allow new outputs to head straight to a JarOutputStream (for example). Whether that is exposed to the CLI is perhaps less important? But @jvican would know better, so.
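A minimal sketch of what such an abstraction might look like, using only the JDK: the code that produces classfiles writes to a small interface, and one implementation sends each entry straight to a `JarOutputStream`. All names here (`ClassSink`, `MemorySink`, `JarSink`, `SinkSketch`) are hypothetical and are not the compiler's actual output API.

```scala
import java.io.OutputStream
import java.util.jar.{JarEntry, JarOutputStream}
import scala.collection.mutable

// Hypothetical abstraction: where the backend sends each generated classfile.
trait ClassSink extends AutoCloseable {
  def write(entryName: String, bytes: Array[Byte]): Unit
}

// Keeps outputs in memory, in the order they were written.
final class MemorySink extends ClassSink {
  val files = mutable.LinkedHashMap.empty[String, Array[Byte]]
  def write(entryName: String, bytes: Array[Byte]): Unit = files.update(entryName, bytes)
  def close(): Unit = ()
}

// Streams every output directly into a jar, without intermediate files.
final class JarSink(out: OutputStream) extends ClassSink {
  private val jar = new JarOutputStream(out)
  def write(entryName: String, bytes: Array[Byte]): Unit = {
    jar.putNextEntry(new JarEntry(entryName))
    jar.write(bytes)
    jar.closeEntry()
  }
  def close(): Unit = jar.close()
}

object SinkSketch {
  // Stand-in for the backend: emits all outputs, then closes the sink.
  def emitAll(sink: ClassSink, outputs: Seq[(String, Array[Byte])]): Unit =
    try outputs.foreach { case (name, bytes) => sink.write(name, bytes) }
    finally sink.close()
}
```

With this shape, whether a jar target is reachable from the command line becomes a separate decision from whether the backend can write to one.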

Isn’t this already available programmatically? e.g. you can do

val vd = new scala.reflect.io.VirtualDirectory("(memory)", None)

settings.outputDirs.setSingleOutput(vd)

to have Scalac write files to an in-memory virtual filesystem. Both Ammonite and scalafiddle.io make use of this to avoid round-tripping to disk unnecessarily when compiling a user’s commands and snippets.

@lrytz After pondering this for a while, I think we should not remove -d because:

  • Existing solutions like sbt-tojar cannot replace -d completely. sbt-tojar is not performant enough: it waits for the scala compiler to output class files and then post-processes them.
  • It is not possible to outsource the implementation of -d to a build tool (in an efficient way, that is, straight-to-jar compilation).

Having this option in the compiler instead of build tools makes sense to me, especially for large codebases.

In addition to this, I have two more remarks:

  • Zinc analysis for jars is on the way.
  • We can write an sbt plugin like sbt-tojar that uses -d and improves existing tooling for monorepos.

Also, -d seems to be needed for this scalac pull request: https://github.com/scala/scala/pull/5088.