Scala build process

See https://twitter.com/jroper/status/1062869489073573888 for example.

Yeah, sorry about that. After trying it again on freshly cloned sources, it all worked as you say. I think some of the problem is that sbt clean does not delete the build directory, so the pack/ directory contains the scripts and jars even after a clean, but the quick/ directory seems to be emptied of files. So running dist/mkPack first then clean then compile made it look like compile also produced scripts and jar files.
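Since sbt clean appears to leave build/pack in place, one way to avoid stale scripts and jars is to remove the output directories by hand before rebuilding. A minimal sketch, assuming the build/pack and build/quick layout described above; the function name is hypothetical:

```shell
# Hypothetical helper: remove leftover build output that `sbt clean`
# reportedly does not delete. $1 is the root of the Scala checkout
# (defaults to the current directory).
wipe_build_output() {
  root="${1:-.}"
  rm -rf -- "$root/build/pack" "$root/build/quick"
}

# Possible usage from the checkout root:
#   wipe_build_output . && sbt dist/mkPack
```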

But the question that remains is the version number, which looks like:

Welcome to Scala 2.13.0-20190604-151517-43e040f

after a dist/mkPack. How can I change that, to something more debianish such as:

Welcome to Scala 2.13.0-debian0.10-p12

or something?

Also, when building Scala to pack it inside the Debian package, I suppose I can use the dist/mkPack command to produce what is needed. As far as I can see, we don’t need to bootstrap when we have the correct previous version of the binary to build the new version with. What do you think?

They might have been thinking of sbt-pack, which does provide a pack command.

This is controlled by baseVersionSuffix.

Agree. Since the Scala org’s own CI includes the “stability test” I’ve already described, you already know that recompiling the new compiler with itself isn’t going to make any difference.

Agree, but note that dist/mkPack only gets you the contents of the bin and lib directories. There are other files in the full distribution, and in a full distribution the file naming in the lib directory is also a little different.

scala/scala-dist on GitHub (the sbt project that packages the Scala distribution) is where the additional steps happen.

As far as I can see from a quick look, scala-dist adds the license, readme, doc/ and man/, in addition to packing the archives for the different target types. The jar files are a bit different; I haven’t looked into whether they are packed differently (e.g. merged, or uber-jars etc.), just renamed, or a bit of both, as you point out.

But in essence the functionality is complete in the scripts and jar files from build/pack/; it’s just naming and packaging that is added, right?

In the final jar files, there are a number of properties files containing version number strings, named differently and so on, but many of them seem to be the same. Why aren’t they just using shared files instead? Also, why are there two properties named osgi version and maven version?

But more importantly, are these version and other property strings used for anything functionally important, or are they just printed on screen for information? (I just need to understand whether they affect anything or are used by sbt, the compiler, or similar during compilation of a project.)

Starting the Scala build from debian/rules will need to be something like:

$ sbt -Dversion.number=2.13.1 'set every baseVersionSuffix := "debian-p12"' dist/mkPack

It results in:

$ ./build/quick/bin/scala -version
Scala code runner version 2.13.1-debian-p12 -- Copyright 2002-2019, LAMP/EPFL and Lightbend, Inc.

I’m not sure whether baseVersionSuffix supports dots in the value. It seems to complain about that:

java.lang.IllegalArgumentException: Invalid syntax for version: 2.13.1.v20190822-215206-debian0.10-p12-a02eafe
	at aQute.bnd.version.Version.<init>(Version.java:46)
	at aQute.bnd.osgi.Analyzer.augmentExports(Analyzer.java:1711)
	at aQute.bnd.osgi.Analyzer.analyze(Analyzer.java:229)
	at aQute.bnd.osgi.Builder.analyze(Builder.java:352)
	at aQute.bnd.osgi.Analyzer.calcManifest(Analyzer.java:618)
	at aQute.bnd.osgi.Builder.build(Builder.java:81)
	at scala.build.Osgi$.bundleTask(Osgi.scala:73)
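The stack trace is consistent with the OSGi version syntax, in which the qualifier (the fourth segment) may contain only letters, digits, `_` and `-`, so the dot in debian0.10 would make the version unparsable. A minimal sketch of a pre-flight check that debian/rules could run, assuming the suffix ends up inside the OSGi qualifier as the trace suggests; both function names are hypothetical:

```shell
# Hypothetical helper: succeeds only if the suffix is non-empty and uses
# characters that are legal in an OSGi version qualifier
# (letters, digits, '_' and '-').
is_osgi_safe_suffix() {
  case "$1" in
    ''|*[!A-Za-z0-9_-]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Hypothetical helper: replace every character outside that set with '_'.
sanitize_suffix() {
  printf '%s' "$1" | tr -c 'A-Za-z0-9_-' '_'
}
```

With this, a suffix such as debian0.10-p12 would be rejected by the check and rewritten to debian0_10-p12 by the sanitizer.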

Did you run that command in Debian testing or on your local computer?

Yes, I did notice that. I don’t know what the actual Debian version string will be; I just remember it’s something like that.

@SethTisue, I have a question about the actual compiling. I know the build process is complex, but I have experimented with creating a simplified shell script to compile 2.12.0 using scala 2.11.12. It compiles the library and reflect modules without error, but fails for the compiler module. I see from the sbt log that the classpath argument references scala-asm and scala-xml as dependencies.

So the first question is: does the compiler module depend on anything other than the library, reflect, scala-asm and scala-xml modules? The compile process is a bit overwhelming, to say the least, so I am asking to check whether I have missed something.

The second question is about the scalac arguments when compiling the language. I assume one major point is not to mix code from the reference compiler into the new compiler build, i.e. the compiler source must only depend on library classes from its own library version and not use anything from the reference compiler SDK. I understand it’s impossible not to use anything from the reference compiler SDK, otherwise the library module can’t be built at all.
Hence I assume that bootclasspath is used to control which runtime is used for compiling the compiler, and the classpath argument is used to force it to use only the library and reflect class files from the previous step in the build process. Is that correct and complete, or is there something important I have missed?

I don’t know, I’ve never understood that OSGi stuff. There might be some insight in the git history, but it might take some digging; some of that stuff is quite old. (Also, a lot of things were carried over from the ant build without necessarily being reexamined or rethought.)

Not sure what “other property strings” might refer to.

The version number suffix is just informational. Some people’s builds and things will check the beginning of the version number to see if it starts with “2.13.”, that kind of thing.

The -bin- and -pre- parts of the version numbers for prereleases are significant to sbt, though; -bin- means binary compatible, -pre- means not.
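To make the -bin-/-pre- rule concrete, here is a small sketch that a packaging script could use to warn about a suffix that changes how sbt treats the version. It only mirrors the rule as stated above and does not reproduce sbt’s real version parsing; the function name and the labels it prints are hypothetical:

```shell
# Hypothetical classifier based solely on the rule stated in this thread:
# "-bin-" marks a binary-compatible prerelease, "-pre-" marks one that is not.
classify_scala_version() {
  case "$1" in
    *-bin-*) echo "binary-compatible" ;;
    *-pre-*) echo "not-binary-compatible" ;;
    *)       echo "no-marker" ;;
  esac
}
```

A Debian-style suffix such as 2.13.1-debian-p12 contains neither marker, so it falls into the last case.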

https://repo1.maven.org/maven2/org/scala-lang/scala-compiler/2.12.9/scala-compiler-2.12.9.pom lists the runtime dependencies. In addition to the ones you listed, there’s also jline and jansi. 2.13 drops the scala-xml dependency.

Well, not precisely. There is a double requirement: the compiler-and-standard-library must be compilable by the reference compiler-and-standard-library, and compilable by itself.

(This restricts the kinds of changes we can make without re-bootstrapping. And even when we do re-bootstrap, we can’t just freely change anything.)

I’m having some trouble responding to this since the terminology isn’t precise, e.g. when you say “using bootclasspath”, I don’t really know what you’re asking about, exactly.

Perhaps this will help: there is no separation in our compilation process between “the standard library the running compiler is using” and “the standard library the new compiler is being compiled against”. (This is arguably unfortunate, but that’s a separate discussion.)

I’m not sure if I really answered your questions, but hopefully this’ll help get us to the next set of questions, at least.

I wonder if that’s still true / accurate. I remember that I’ve wondered about that potential conflation in experiments years ago but the success in compiling 2.12.9 with 2.10.x after a few changes seems like evidence to the contrary. In particular, I had to revert removals of some annotations that were present in the 2.10.x library and that were hard-coded in the 2.10.x compiler (and that were of course present in the 2.10.x library that the 2.10.x compiler used itself).

As an experiment I also tried compiling only the 2.12.9 compiler with 2.10.x but also against the library from 2.10.x. To make that work I had to explicitly add an external dependency to the library to the build and remove the internal dependency. It seems like that might work as well but would need additional changes to the 2.12.9 compiler code to make it work with the old library.

The compiler that is being executed has its runtime classpath (that’s the java -cp of the JVM running the Scala compiler). The Scala library on the runtime classpath matches the compiler version, e.g., everything is 2.13.0.

Then there’s the compilation classpath (scalac -cp), which is where the compiler looks for symbols used in the sources being compiled. In a “normal” project, the compilation classpath contains the Scala library of the same version as the compiler being used.

When compiling the Scala project itself however, the compilation classpath is empty when compiling the Scala library. The Scala compiler requires certain symbols from the standard library to exist in the symbol table, for example scala.Option. To make this work with an empty classpath, the Scala library is compiled with the -sourcepath compiler flag, which makes the compiler look up symbols from source files, assuming the file path matches the package and name of the symbol (scala.Option is defined in scala/Option.scala).

The runtime classpath and the compilation classpath don’t need to be the same: they can contain different versions of the standard library, or -sourcepath can be used.
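The distinction becomes visible when the compiler is started directly on the JVM instead of through the scalac launcher. A minimal sketch that prints such an invocation, using scala.tools.nsc.Main as the compiler entry point; the function name and the example paths are hypothetical:

```shell
# Print a JVM invocation of the compiler that keeps the two classpaths apart.
#   $1  runtime classpath: jars of the compiler that is being executed
#   $2  compilation classpath: where symbols for the sources are looked up
#   $@  remaining arguments: source files
compiler_cmd() {
  runtime_cp="$1"
  compile_cp="$2"
  shift 2
  echo "java -cp $runtime_cp scala.tools.nsc.Main -classpath $compile_cp $*"
}

# Example: a 2.11.12 compiler at runtime, newly built library classfiles
# on the compilation classpath (both paths are placeholders).
compiler_cmd "scala-2.11.12/lib/*" "build-exp/classes/library" Foo.scala
```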

To continue the compilation of the Scala project, the reflect module is then compiled with the classfiles of the library module on the compilation classpath (still using the reference compiler), so here we have an example where the two Scala versions (runtime classpath, compilation classpath) are not the same. The compiler module is then compiled with the classfiles of library and reflect on the compilation classpath.
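The three stages can be summarised as a sketch that prints one scalac command line per module instead of running it. It assumes a src/<module> source layout and the build-exp/classes output root used elsewhere in this thread, and it leaves out everything else a real build needs (asm, forkjoin, Java sources, further flags); REF_SCALAC, OUT and stage_cmd are hypothetical names:

```shell
# Sketch only: print (rather than run) one scalac command line per stage.
REF_SCALAC="${REF_SCALAC:-scalac}"   # reference compiler, e.g. 2.11.12
OUT="${OUT:-build-exp/classes}"      # output root (assumption)

stage_cmd() {
  module="$1"
  case "$module" in
    # library: no classfiles to depend on yet, symbols come from sources
    library)  paths="-sourcepath src/library" ;;
    # reflect: sees the freshly compiled library classfiles
    reflect)  paths="-classpath $OUT/library" ;;
    # compiler: sees both library and reflect classfiles
    compiler) paths="-classpath $OUT/library:$OUT/reflect" ;;
    *) echo "unknown module: $module" >&2; return 1 ;;
  esac
  echo "$REF_SCALAC -d $OUT/$module $paths <sources under src/$module>"
}

for m in library reflect compiler; do stage_cmd "$m"; done
```

Printing the commands keeps the ordering and the classpath of each stage easy to inspect before wiring them into a real script.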

:+1: Great, thanks for that information, @lrytz.

Thanks @lrytz, I am battling with those details and don’t know if I am getting it right yet, but I will go through all dependencies and rt/cp/src-paths to see if I can find the cause of the compilation problems in the compiler module…

Hi, I have gotten a bit further, so now I am able to compile by hand (i.e. a bash script using the 2.11.12 scalac command) asm, forkjoin, library, reflect and now lastly the compiler module.

One thing I was struggling with was that the compiler module was using Flags.class which it retrieved from the 2.11.12 jar file, which was lacking a couple of definitions compared to 2.12.0-M1.

So here is the question: I couldn’t make scalac’s -bootclasspath argument work; it did not override the bootclasspath at all. What I did was use -javabootclasspath and list all JRE and Scala 2.11.12 jars except the reflect.jar (which contains Flags.class). Instead of the 2.11.12 reflect.jar, I added the path to my newly compiled reflect module from 2.12.0-M1. That did the trick.

So why didn’t scalac’s -bootclasspath work? Is it faulty, or did I use it wrong? (See the disclaimer at the bottom.)

The faulty scalac command:

scalac -J-Xmx4G -d build-exp/classes/compiler -bootclasspath build-exp/classes/compiler:build-exp/classes/reflect:build-exp/classes/library:build-exp/classes/forkjoin:build-exp/classes/asm -classpath $HOME/.ivy2/cache/org.apache.ant/ant/jars/ant-1.9.4.jar:$HOME/.ivy2/cache/org.apache.ant/ant-launcher/jars/ant-launcher-1.9.4.jar -sourcepath src/reflect $SRC_COMPILER_MODULE_FILES

I tried this one with both classpath and bootclasspath, but it did not work.

The modified scalac command that worked:

scalac -J-Xmx4G -d build-exp/classes/compiler -javabootclasspath /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/resources.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/jsse.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/jce.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/charsets.jar:/home/tofi/src/zz_open_source_projects/scala-build-process/releases/scala-2.11.12/lib/akka-actor_2.11-2.3.16.jar:/home/tofi/src/zz_open_source_projects/scala-build-process/releases/scala-2.11.12/lib/config-1.2.1.jar:/home/tofi/src/zz_open_source_projects/scala-build-process/releases/scala-2.11.12/lib/jline-2.14.3.jar:/home/tofi/src/zz_open_source_projects/scala-build-process/releases/scala-2.11.12/lib/scala-actors-2.11.0.jar:/home/tofi/src/zz_open_source_projects/scala-build-process/releases/scala-2.11.12/lib/scala-actors-migration_2.11-1.1.0.jar:/home/tofi/src/zz_open_source_projects/scala-build-process/releases/scala-2.11.12/lib/scala-compiler.jar:/home/tofi/src/zz_open_source_projects/scala-build-process/releases/scala-2.11.12/lib/scala-continuations-library_2.11-1.0.2.jar:/home/tofi/src/zz_open_source_projects/scala-build-process/releases/scala-2.11.12/lib/scala-continuations-plugin_2.11.12-1.0.2.jar:/home/tofi/src/zz_open_source_projects/scala-build-process/releases/scala-2.11.12/lib/scala-library.jar:/home/tofi/src/zz_open_source_projects/scala-build-process/releases/scala-2.11.12/lib/scala-parser-combinators_2.11-1.0.4.jar:/home/tofi/src/zz_open_source_projects/scala-build-process/releases/scala-2.11.12/lib/scala-xml_2.11-1.0.5.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/ext/cldrdata.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/ext/sunec.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/ext/zipfs.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/ext/sunpkcs11.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/ext/localedata.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/ext/nashorn.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/ext/dnsns.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/ext/sunjce_provider.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/ext/icedtea-sound.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/ext/jaccess.jar:build-exp/classes/compiler:build-exp/classes/reflect:build-exp/classes/library:build-exp/classes/forkjoin:build-exp/classes/asm -classpath $HOME/.ivy2/cache/org.apache.ant/ant/jars/ant-1.9.4.jar:$HOME/.ivy2/cache/org.apache.ant/ant-launcher/jars/ant-launcher-1.9.4.jar -sourcepath src/reflect $SRC_COMPILER_MODULE_FILES
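A long path list like the one above could be assembled by the script rather than typed out. A minimal sketch, assuming all that is needed is every jar from a set of directories minus the reference reflect jar; the function name and the JAVA_HOME/SCALA_HOME variables in the usage comment are hypothetical:

```shell
# Hypothetical helper: join all jars found in the given directories with ':',
# skipping any jar whose file name matches the glob pattern in $1.
jars_except() {
  exclude="$1"
  shift
  result=""
  for dir in "$@"; do
    for jar in "$dir"/*.jar; do
      [ -e "$jar" ] || continue          # directory without jars
      case "${jar##*/}" in
        $exclude) continue ;;            # unquoted on purpose: glob match
      esac
      result="${result:+$result:}$jar"
    done
  done
  printf '%s\n' "$result"
}

# Possible usage (variables are assumptions, not part of the original script):
#   BOOT="$(jars_except 'scala-reflect*.jar' \
#             "$JAVA_HOME/jre/lib" "$SCALA_HOME/lib" "$JAVA_HOME/jre/lib/ext")"
#   scalac -javabootclasspath "$BOOT:build-exp/classes/reflect:..." ...
```

The newly compiled class directories are appended by hand after the generated list, mirroring the order used in the working command.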

Disclaimer:

I haven’t tried it in 2.12 or 2.13, because it’s a bit of work with the current script, but when the script is more automated I can compare it then.
