Reproducibility of the Scalac compiler

My goal is to have deterministically built artifacts, so that multiple entities can build and then sign their output and the signatures will agree. This depends on byte-for-byte identical output from sbt and scalac. Assuming that the builds don’t inject the build date or do anything else that obviously breaks determinism, and that both entities use the same versions of the build toolchain, can I expect deterministic output?

Hi

The compiler is stable: we test this by running it a third time on the library and compiler sources and comparing the class files with the ones from the second round (https://github.com/scala/scala/blob/v2.12.2/scripts/jobs/integrate/bootstrap#L467-L485).
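
To make the "compare the class files from two rounds" idea concrete, here is a minimal sketch of such a check. It is not the logic of the bootstrap script linked above; the object name `CompareClassfiles` and the directory layout are assumptions made for illustration.

```scala
import java.io.File
import java.nio.file.Files
import java.security.MessageDigest

// Hypothetical helper, not part of scalac or sbt.
object CompareClassfiles {
  // Recursively collect all .class files under `dir`.
  private def classFiles(dir: File): List[File] =
    Option(dir.listFiles).map(_.toList).getOrElse(Nil).flatMap { f =>
      if (f.isDirectory) classFiles(f)
      else if (f.getName.endsWith(".class")) List(f)
      else Nil
    }

  // Map each class file (by path relative to `root`) to the SHA-256 of its bytes.
  def digests(root: File): Map[String, String] =
    classFiles(root).map { f =>
      val sha = MessageDigest.getInstance("SHA-256").digest(Files.readAllBytes(f.toPath))
      root.toPath.relativize(f.toPath).toString -> sha.map(b => "%02x".format(b)).mkString
    }.toMap

  // Relative paths that exist on only one side or whose contents differ.
  def differences(a: File, b: File): Set[String] = {
    val da = digests(a)
    val db = digests(b)
    (da.keySet ++ db.keySet).filter(k => da.get(k) != db.get(k))
  }
}
```

An empty result from `CompareClassfiles.differences(new File("round2/classes"), new File("round3/classes"))` (both paths are placeholders) would mean the two rounds produced identical class files.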

I’m not sure whether this holds under incremental compilation. Also, packaging jars in sbt may not be stable.


This isn’t true under incremental compilation, nor when you feed sources into the compiler in a different order. See https://github.com/scala/bug/issues/10343. A similar (and harder to debug) issue exists for lambdas and anonymous classes, which I haven’t had time to report yet.

Would it be feasible (for users wanting reproducible builds) to stick to clean builds and sorted sources?
Building from scratch sounds very feasible; sorting sources might depend on the build tool, but is hopefully easier to hack than fixing Scalac?
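
As a rough sketch of the "sort sources" half of this idea: the helper below orders source files by their path relative to the project root, so the order does not depend on where the project is checked out or on how the filesystem lists files. `SourceOrder` and `stableOrder` are hypothetical names, not sbt or scalac API, and whether sorted input alone is enough depends on the issues discussed in this thread.

```scala
import java.nio.file.Path

// Hypothetical helper, not part of sbt or scalac.
object SourceOrder {
  // Sort by the '/'-separated path relative to `base`. String ordering here is
  // plain code-unit comparison, so it is independent of locale settings.
  def stableOrder(base: Path, sources: Seq[Path]): Seq[Path] = {
    val root = base.toAbsolutePath.normalize
    sources.sortBy { p =>
      root.relativize(p.toAbsolutePath.normalize).toString.replace('\\', '/')
    }
  }
}
```

The sorted sequence would then be handed to the compiler in that order, for example as the file arguments of a from-scratch `scalac` invocation.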

EDIT:

IIUC, if stability depends on which source features are used, it’s only tested on the compiler and library, which in turn are somewhat conservative and rely on more solid features than, say, shapeless or other more unusual libraries. Of course, that’s likely good enough for 90% of the code out there.

What about testing build stability on the community build, once known bugs are squashed?

That would probably work.

I have opened a ticket in scala-dev: https://github.com/scala/scala-dev/issues/405


I see this has made it to 2.13.0-M5, great!

I’ve been playing with an sbt plugin for scrubbing the artifact and sharing/checking build results across different machines/environments (https://github.com/raboof/sbt-reproducible-builds) - would love your feedback!
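
For readers wondering what "scrubbing" a jar can involve, here is a minimal sketch of one possible approach: re-pack the archive with entries in a fixed order and a fixed timestamp. This is not taken from sbt-reproducible-builds; `ScrubJar`, the chosen timestamp, and the entry ordering are all assumptions, and `setTimeLocal` requires Java 9 or later. The compressed bytes may still differ between JDKs with different deflate implementations.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, InputStream}
import java.time.LocalDateTime
import java.util.zip.{ZipEntry, ZipInputStream, ZipOutputStream}
import scala.collection.mutable.ArrayBuffer

// Hypothetical helper, not the plugin's implementation.
object ScrubJar {
  // Arbitrary fixed timestamp (an assumption); set as local date-time so the
  // stored value does not depend on the machine's time zone.
  private val FixedTime = LocalDateTime.of(2010, 1, 1, 0, 0)

  private def readAll(in: InputStream): Array[Byte] = {
    val buf = new Array[Byte](8192)
    val out = new ByteArrayOutputStream()
    var n = in.read(buf)
    while (n != -1) { out.write(buf, 0, n); n = in.read(buf) }
    out.toByteArray
  }

  // Re-pack a jar: META-INF entries first, then all others, each group sorted by name.
  def scrub(jar: Array[Byte]): Array[Byte] = {
    val in = new ZipInputStream(new ByteArrayInputStream(jar))
    val entries = ArrayBuffer.empty[(String, Array[Byte])]
    var e = in.getNextEntry
    while (e != null) {
      entries += (e.getName -> readAll(in))
      e = in.getNextEntry
    }
    in.close()

    val bytes = new ByteArrayOutputStream()
    val out = new ZipOutputStream(bytes)
    for ((name, data) <- entries.sortBy { case (n, _) => (!n.startsWith("META-INF/"), n) }) {
      val entry = new ZipEntry(name)
      entry.setTimeLocal(FixedTime)
      out.putNextEntry(entry)
      out.write(data)
      out.closeEntry()
    }
    out.close()
    bytes.toByteArray
  }
}
```

Two jars with the same contents but different entry order or modification times should then scrub to identical bytes on the same JDK.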