Scala build process

Hi

I am trying to understand the build file for the Scala source (specifically the Ant file), and I see mentions of different build types and build stages, such as minimal-scala, locker, bootstrap and so on, but I don’t quite understand their order or what their purposes are.

Can anybody give me a quick overview of the general build process and the different types of build, so it’s easier to investigate the structure of the different types of builds?

Regards

Thomas

It’s documented a bit in the README, but it’s legacy so you’d need to find a copy before the switch to sbt:

https://github.com/scala/scala/blob/2.11.x/README.md
https://github.com/scala/scala/blob/v2.11.8/README.md

There were similar pages about running the Ant build, but you’d need to use the Wayback machine:

http://web.archive.org/web/20170615070251/http://docs.scala-lang.org/scala/
http://web.archive.org/web/20161003074035/http://www.scala-lang.org/contribute/hacker-guide.html

To expand on the (implied) Ant point: the only currently available build for scalac and the stdlib is the sbt build. There is no longer an Ant build for it.

I’d be happy to discuss this in the context of the sbt build, but if it’s about ant, then I think you’re going to have a lot of trouble finding someone who both:

  • remembers anything about the ant build, which we got rid of a long time ago
  • considers it worth their while to answer questions about it, unless you explain what the context and goal are

Since I am basically looking to understand the build process so that I can try to reproduce it from the command line, I don’t really care whether it’s Ant or sbt. I just assumed that the overall process was the same, even though the details are different.

I am mostly interested in the parts needed to build a Debian package of the SDK.

Since I won’t have sbt available, I would have to create a shell script to reproduce the build, if possible. Since @eed3si9n mentioned last time that a similar script, using only scala commands, would be interesting for sbt, I assumed the same would be possible for Scala itself.

Also can you explain a bit more what the bootstrapping part entails?

Sure. Our documentation on that is here:

But see also the changes in this PR I just submitted: “modernize the readme a bit (areas: IDEs, bootstrapping)” by SethTisue, Pull Request #8351, scala/scala

What else would you like to know?

Note that the Scala 2.13 bootstrap is simpler than the 2.12 one, because in 2.12 the Scala compiler had a circular dependency on scala-xml. The circular dependency made everything more complicated. So, I’d strongly suggest you tackle 2.13 first, and then decide later whether you want to go back and try to extend your solution to cover 2.12 as well.

It would be very difficult, I think.

There are 3 steps to produce a Scala compiler.

  1. We retrieve a publicly available Scala compiler + Scala library binaries from Maven Central to be “starr”. Let’s call this Bootstrap1. In case of Scala 2.13.0 final, bootstrap1 version was 2.13.0-RC3.
  2. Using Bootstrap1, we locally build and publish Scala compiler + Scala library on a specific Git commit/tag, like v2.13.0, using sbt. This produces 2.13.0-pre-43e040f, or Bootstrap2.
  3. Next, we restart sbt using Bootstrap2 by specifying -Dstarr.version=2.13.0-pre-43e040f, and rebuild Scala compiler + Scala library on the same Git commit/tag. This time the artifacts that were rebuilt become the dist 2.13.0 version.

In other words, if you want to reproduce Scala 2.13.0 in a clean room, you need to first build 2.13.0-RC3. What did we use to build 2.13.0-RC3, you might ask? As you might have guessed it was 2.13.0-RC2.
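The two local passes described above can be sketched as a small shell script. This is only an illustration, not the supported procedure: it prints the sbt invocations instead of running them, the function names are made up for this example, and the choice of `publishLocal` for the first pass and `dist/mkPack` for the second is an assumption. Only the `-Dstarr.version` property and the intermediate version string come from the description above; the README remains the authority on the exact tasks.

```shell
#!/bin/sh
# Sketch: print (not run) the sbt commands for a two-pass bootstrap.
# bootstrap_pass1 and bootstrap_pass2 are hypothetical helper names.

# Pass 1: build with the default starr (fetched from Maven Central)
# and publish the result locally. The task name is an assumption.
bootstrap_pass1() {
  echo "sbt publishLocal"
}

# Pass 2: restart sbt with the locally published build as starr and
# rebuild on the same commit. $1 is the version produced by pass 1.
bootstrap_pass2() {
  starr_version="$1"
  echo "sbt -Dstarr.version=${starr_version} dist/mkPack"
}

bootstrap_pass1
bootstrap_pass2 "2.13.0-pre-43e040f"
```

Printing the commands keeps the sketch safe to run anywhere; in a real checkout you would execute them one after the other, feeding the version string reported by the first pass into the second.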

It says here that sid contains 2.11.12-4, but I don’t know how they did that.

Thanks @eed3si9n for the description.

This means that the starr version used to build 2.13.0-M1 is 2.12.9, and so on all the way back to 2.11.12 which is the version available in sid. It will be a bit of work, but as soon as all the details are figured out, all work is done in Sid by just replacing the previous src-package with the new one that depends on the previous binary package.

As you mention, I don’t know how 2.11.12 got into any Debian version, and I won’t ask them at the moment. I am afraid the answer would be counterproductive, because I think someone hacked the Debian build in some way to make it technically, legally and morally acceptable. In essence, I suspect they used a public version of 2.11.11 somehow, but not quite.

I haven’t investigated the source package yet, mainly because it does not matter right now. No matter what, I still need to understand the Scala build process to be able to build a proper Scala SDK binary that can be used properly in Debian. Then I can start thinking about how to get that into a Debian src-package, and after that I can start to think about how that package can compile the Debian way.

The readme @SethTisue linked above also contains a link to an extensive discussion on the mailing list from years ago that shows some interesting viewpoints on bootstrapping: https://groups.google.com/forum/#!topic/scala-internals/gp5JsM1E0Fo/discussion

Here are also older sources:

https://sources.debian.org/src/scala/

Doesn’t seem anything special was done for the versions older than 2.11.x, so they probably just used the starr binaries inside of the sources.

Thanks, Seth. One question I have is: what is the difference between the results of the sbt commands compile, pack and mkPack? They all produce the scripts and executable JAR files of the Scala runtime, compiler and libraries, just as the release tarball contains.

The only two differences I have noticed so far are that the compiler.jar file from “compile” is slightly smaller than the compiler.jar from “mkPack”, and that the version number shown in the Scala REPL’s welcome text is not proper. Mind you, I have not tested them extensively yet; I just typed in some simple lines and checked the result.

Which is an infraction of the rules. So why they say the Scala package is OK but the sbt package is not, even though the same is done there, I don’t understand. It might come down to how it’s technically done, but that is far beyond what I know about Debian packaging for now.

Unless they include the source code along with the binary, as the rules might hint at (depending on how they are interpreted). But that is not what they told me: any binary used must be compiled from source code included in the package, by pre-existing tools available in the Debian release. So it’s OK to compile a “starr” version in the source package, as long as the “starr” version’s source code is in the src-package and it can be compiled by another compiler already in the Debian release. But that’s the ideal. How 2.11.12 got in legally, I don’t know…

Edit: I just noticed the README file mentions starr in the same sentence as “reference compiler”. I seem to remember something about reference compilers possibly being allowed, but I don’t understand how. All the rules are confusing and sometimes seem to negate each other…

For the 2.11 builds, it seems they disabled fetching starr from maven and instead added a build dependency on a recent Scala version that was already available, i.e. they already did proper bootstrapping (given the former libraries that were built using the in-tree starr…).

i.e. they used external binaries located inside the src-package… (which is what sbt also did) this is very confusing and conflicting…

Only for Scala < 2.11 (or so), afterwards proper build dependencies for already existing Scala versions were added.

You mean that once they had a scala version in the Debian release, they used that as the starr version for the next scala version built? But then they still cheated the first time, by adding external binaries in the first package, only to replace it with the new src-packages that depended on the now “legal” scala version in the release.

So that might be what one of them hinted at: as long as the bootstrapping occurs in sid, and the next version uses the first sid package as a dependency, then the second build in sid produces a legal package that can be promoted to unstable??? That’s still cheating, according to the Debian principles and guidelines, and also statements made on the mailing list. (We must be misunderstanding something here… I hope, or maybe I hope not :) )

There isn’t any command called pack.

compile is a standard sbt task. It only makes class files, in the build/quick/classes directory (normally it would be the target directory, but the Scala build uses peculiar directories, for historical reasons).

package is also a standard sbt task. It takes those class files and puts them in JARs.

dist/mkPack is unique to the Scala build. It calls package to make JARs, and then also calls dist/mkBin to make the launch scripts (scala, scalac, and so on) in build/pack/bin.
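To see which of these tasks has actually run in a checkout, a minimal shell sketch can inspect the two output directories named above. The function name `build_stage` and its labels are made up for this example, and it deliberately does not try to detect the output of package alone, since the location of the JARs is not stated here.

```shell
#!/bin/sh
# Sketch: report how far a scala/scala checkout has been built, judging
# only by the directories mentioned in this thread.
#   sbt compile      -> class files in build/quick/classes
#   sbt dist/mkPack  -> JARs plus launch scripts in build/pack/bin
# build_stage is a hypothetical helper; $1 is the checkout root.
build_stage() {
  root="$1"
  if [ -d "$root/build/pack/bin" ]; then
    echo "mkPack"      # launch scripts exist, so dist/mkPack has run
  elif [ -d "$root/build/quick/classes" ]; then
    echo "compile"     # class files only, no launch scripts yet
  else
    echo "none"        # nothing built yet
  fi
}

# Default to the current directory when no path is given.
build_stage "${1:-.}"
```

Run from the root of a checkout, this prints `compile` after a plain `sbt compile` and `mkPack` once the launch scripts exist, which is the stage a distributable layout needs.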

Unless reference compilers are allowed, you are doomed to failure, as we already covered in the other thread, and as has been covered multiple times over the years, every time we have this discussion.

In the interest of not getting confused, I suggest we leave that aspect of the discussion in the other thread, and keep this thread focused on how the actual Scala 2.13 build actually works.

I’m not able to reproduce that. I don’t know what you mean.

compile doesn’t make any JARs, it only writes class files (to build/quick/classes).

Selecting a bootstrap compiler is not the only variable in building the Scala library and compiler; there is also the question of compiler flags, and the question of metadata (e.g., OSGi).

Let me add that personally, I would advise anyone against using a Scala release that was built by a non-standard process. I think there should be one single scala-library-version-x.jar out there in the world.
