Improving Scala 3 forward compatibility

Kordyjan · October 25, 2021, 4:48pm

Scala 3 has very good backward compatibility guarantees between minor versions. On the other hand, after the recent release of Scala 3.1, we can see that library authors need to be really cautious about updating the compiler version, because doing so forces the same bump on every user of that library. We do not want library authors to be stuck on old versions of the compiler: that would mean either that they are locked out of many bugfixes, or that we would need to spend enormous effort on backporting every bugfix to all past release lines.

To mitigate that, we propose the following plan.

Goals

A newer version of the compiler should be able to generate output that can be consumed by older ones
Library authors should be able to declare parts of the API that require a newer version of the language than the rest of the library
Users should be able to safely use symbols added to the language API in 3.x, when they are using 3.y as their output version, as long as those symbols do not require language features added after 3.y, even if x > y.

Implementation

Output version

We propose that the compiler should accept a flag, --scala-target, that takes a Scala version as its argument. The specified output version needs to be lower than or equal to the version of the compiler in use. The default value for this flag is its maximal value: the version of the compiler itself.

Specifying the output version makes sure that the TASTy files produced during compilation have a version matching the output version. It also makes sure that no symbol is used that was added to the language API in any version newer than the specified output version. If the code uses any language feature that was added in a version newer than the output version, a compilation error should be raised. All new symbols added to the stdlib in future versions need to be marked with some annotation specifying the version, e.g. @since("3.1").
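The symbol check described above can be sketched roughly as follows. This is only an illustration under stated assumptions: Version, ApiSymbol and checkOutputVersion are hypothetical names, not part of the compiler, and the since field stands in for whatever the @since-style marker would record.

```scala
// Hypothetical model of the proposed check; none of these names come from the compiler.
final case class Version(major: Int, minor: Int) extends Ordered[Version]:
  def compare(that: Version): Int =
    Ordering[(Int, Int)].compare((major, minor), (that.major, that.minor))
  override def toString: String = s"$major.$minor"

// A library symbol together with the version recorded in its @since-style marker.
final case class ApiSymbol(name: String, since: Version)

def checkOutputVersion(
    compiler: Version,
    output: Version,
    referenced: List[ApiSymbol]
): Either[String, Unit] =
  if output > compiler then
    // The output version may not exceed the version of the compiler itself.
    Left(s"output version $output is newer than the compiler ($compiler)")
  else
    // Reject the first referenced symbol that is newer than the output version.
    referenced.find(_.since > output) match
      case Some(sym) =>
        Left(s"${sym.name} was added in ${sym.since}, which is newer than output version $output")
      case None => Right(())
```

With a 3.1 compiler and output version 3.0, a reference to a symbol marked as added in 3.1 would yield a Left, while references to 3.0 symbols would pass.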

Local output version

It should be possible to mark any file in the project as using a higher output version than the one specified in the project configuration. This could be implemented with a top-level import, such as:

import language.target.`3.1`

This should override other means of specifying output versions.

A symbol can reference a symbol from other files if and only if the file containing referencing symbols has a higher or equal output version than the file containing referenced symbols. This allows us to sort compiled sources at the start of the compilation. This means that if multiple files would be compiled together, the compiler first processes files with output version 3.0, then those with 3.1 (remembering symbols from previous files), then those with 3.2, and so on.
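A minimal sketch of the reference rule and the resulting compilation order, assuming versions are modelled as (major, minor) pairs. Source, canReference and compilationOrder are names invented for this illustration.

```scala
// Hypothetical model: a source file and the output version it targets, as (major, minor).
final case class Source(path: String, target: (Int, Int))

// A file may reference symbols from another file only if its own
// output version is higher than or equal to that of the referenced file.
def canReference(from: Source, to: Source): Boolean =
  Ordering[(Int, Int)].gteq(from.target, to.target)

// Process lower output versions first, so that symbols they define are
// already known when files with a higher output version are compiled.
def compilationOrder(sources: List[Source]): List[Source] =
  sources.sortBy(_.target)
```

For example, a file targeting 3.1 can reference a file targeting 3.0 but not the other way round, and compilationOrder places the 3.0 file first.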

This feature gives library maintainers the possibility to gradually extend APIs with features requiring a newer compiler, without modifying the core of the library, which can still be used with older compilers. This has some limitations, e.g. it does not allow adding a new supertype to an existing type. On the other hand, adding new types, new given instances and extension methods for existing types is possible, and should be sufficient for relatively fast and stable improvement of a library's API.
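To illustrate the kind of extension that stays possible, here is a sketch with the two files shown one after the other. Show, Point and render are hypothetical names, and the target import is left commented out because it is only the syntax proposed in this post, not an existing import.

```scala
// --- Core.scala: compiled with the project-wide output version (say, 3.0) ---
trait Show[A]:
  def show(a: A): String

final case class Point(x: Int, y: Int)

// --- Extras.scala: would opt in to a newer output version ---
// import language.target.`3.1`   // proposed syntax from this post, not an existing import

// A new given instance for an existing type.
given Show[Point] with
  def show(p: Point): String = s"(${p.x}, ${p.y})"

// A new extension method on an existing type.
extension (p: Point)
  def render(using s: Show[Point]): String = s.show(p)
```

Users on the older output version would see only Show and Point. Users on the newer one would additionally get the given instance and render. Making Point extend a new supertype, by contrast, would require changing Core.scala itself.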

Adding local output versions means that it becomes a normal situation for the compiler to encounter TASTy files with versions newer than the output version specified for the current compilation run. The compiler shouldn’t raise an error in that case, but should instead just ignore those TASTy files together with the corresponding classfiles. In the future, we may want to read the classfiles to know what symbols can potentially reside there, and use that information to give better error messages for unresolved references that could be resolved with a higher output version.

Language API compatibility

Many symbols that are added to the standard library in new releases do not require any new features of the language. Even so, they won’t be available in projects with a lower output version. To mitigate that, with every new release of the compiler we can also release a compat artifact, which would contain all symbols added between 3.0 and the current version of the compiler. Each of them would be defined in a file whose output version matches the earliest version of the language in which that symbol can be defined. They would have the same qualified names as the “true” symbols.

The compat artifact can then be added as an ordinary dependency to any project. This ensures that the symbols are available both at compile time and at runtime. The compat symbols need to be marked in some way so that the compiler recognizes them and ignores them if the “true” symbols are also on the classpath. This case can occur when library B depends on A, B has a higher output version than A, and A uses compat as a dependency. We can implement this marking either with a new annotation or with some information in the metadata of the compat artifact.
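The preference for “true” symbols over compat ones could look roughly like this. It is a sketch under the assumption that each classpath entry carries a marker flag. Candidate, isCompat and resolve are hypothetical names, and the post leaves open whether the marker is an annotation or artifact metadata.

```scala
// Hypothetical model: a symbol found on the classpath, with a flag saying
// whether it comes from the compat artifact.
final case class Candidate(qualifiedName: String, origin: String, isCompat: Boolean)

// For every qualified name, prefer the "true" symbol when one is present
// and fall back to the compat definition otherwise.
def resolve(classpath: List[Candidate]): Map[String, Candidate] =
  classpath
    .groupBy(_.qualifiedName)
    .map { (name, candidates) =>
      name -> candidates.find(!_.isCompat).getOrElse(candidates.head)
    }
```

In the B-depends-on-A scenario above, both definitions of a symbol would be on B's classpath, and resolve would pick the one that does not come from compat.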

The action plan

Implementing the --scala-target flag and the @since annotation, and releasing them in Scala 3.1.1.
Gathering community feedback and implementing the local output version. This will be released no earlier than Scala 3.2.
Creating compat artifacts. Those can be published independently of Scala’s release schedule. Moreover, we do not need to work on them right now: only a minimal number of symbols were added to the stdlib in 3.1, so for now the benefit of a compat artifact would be negligible.

Jasper-M · October 26, 2021, 8:50am

If the backwards compatibility guarantees are very good, is it really a problem that users are automatically upgraded to a more recent version? This seems to run the risk of becoming at least as burdensome as the Scala 2 versioning system.

Wouldn’t it be sufficient if the compiler supports a compatibility mode that guarantees it can consume source code written for an older version? So if a user has e.g. -Xcompat:3.1 enabled it shouldn’t matter that his compiler version is upgraded to 3.2. Some parts of a library that use 3.2 features may not work in that mode but that would be the case either way.

smarter · October 26, 2021, 12:42pm

That’s impossible to guarantee (and has never been guaranteed between patch releases of Scala 2), any bug fix can potentially break code that someone relies on.

julienrf · November 3, 2021, 1:09pm

Thank you @Kordyjan for writing this detailed plan.

It is not clear to me why the @since annotation is necessary, though. Is it solely for documentation purposes? Or will some parts of the compiler chain rely on it? If this is just for documentation, what do you think of using the existing @since Scaladoc tag instead?

The compiler flag --scala-target seems very promising to me. I believe it will solve most of the issues that end-users may face in practice, typically by giving them the ability to upgrade a library (because it has an important fix or feature) without having to upgrade the compiler (because that causes compilation errors that would take a lot of effort to fix). This corresponds to your goal 1.

I am not sure about the ability to choose the language target per compilation unit. First, the use case of a library in which some parts use new language features (goal 2 in your post) could already be addressed without special compiler support, by splitting the library into two modules, where the first module targets, say, Scala 3.0, and the second module depends on the first module and targets Scala 3.1.

The only place where we might need such a mechanism is in the scala3-library, which can not be split into several libraries. An alternative solution could be to customize the build of the scala3-library so that it is made of a single jar containing compilation products with TASTy files of various versions. So, in the same jar we would have TASTy files targeting Scala 3.0 whenever possible, and targeting Scala 3.1 or above when necessary. That would also achieve your goal 3.

Last, I believe another item should be part of the plan: the way library management tools resolve the scala3-library should be changed so that in case a project depends on libraries that depend on different versions of scala3-library, we pick the highest one (just like we resolve transitive dependencies). Currently, library management tools resolve the scala3-library version only based on the version of the Scala compiler used in the project.
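The resolution rule suggested here amounts to taking a maximum. A minimal sketch, assuming versions are plain (major, minor, patch) triples; Version and resolveScalaLibrary are hypothetical names, not the API of any build tool:

```scala
// Hypothetical model: scala3-library versions as (major, minor, patch) triples.
type Version = (Int, Int, Int)

// Pick the highest version among the one implied by the project's compiler
// and the ones required by its (transitive) dependencies.
def resolveScalaLibrary(compiler: Version, requiredByDependencies: List[Version]): Version =
  (compiler :: requiredByDependencies).max
```

A project compiled with 3.0.2 whose dependencies require 3.1.0 and 3.0.0 would then resolve scala3-library 3.1.0, instead of the 3.0.2 implied by the compiler alone.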

mpilquist · November 3, 2021, 5:08pm

I maintain lots of libraries and really, really don’t want to have to maintain submodules for each minor release I want to integrate with / use a new API from. I don’t have specific feedback on the per-compilation unit proposal but I’m encouraged that folks are thinking about how to make this a language/compiler/tooling problem instead of making it something library authors and users have to deal with.

sjrd · November 3, 2021, 11:20pm

This would only be necessary if you want part of your library to use new language features, that have no equivalent in the older tasty format, and in a subset of your API that targets only users of the newer compiler. The use cases to even be in this situation in the first place should be few and far between, outside of the standard library itself. I would hope that most libraries will stick to one minimum required language version for the whole library.

mpilquist · November 3, 2021, 11:39pm

Understood, perhaps this use case is rare enough to not warrant the feature – that’s one reason I didn’t want to provide any specific feedback on that part of the proposal. However, if faced with adding an additional module or bumping the entire library to a new required --scala-target, I’ll take the latter almost every time as it’s more sustainable for library developers and users (despite the risk of forcing a compiler upgrade on everything downstream).

prolativ · December 21, 2021, 3:33pm

The implementation of the first part of the proposal is getting close https://github.com/lampepfl/dotty/pull/14156

nafg · February 1, 2022, 5:59pm

Why did --scala-target become -Yscala-release? The latter name is a lot less self-documenting. There’s no way to know what it does without having learned about it, while --scala-target is self-explanatory.

The other thing I’m confused about is I thought TASTy was supposed to be a lot more stable than classfile encodings and hardly ever have to change in a breaking way. And as for non-breaking changes (like new AST node types) why can’t that work dynamically? Why can’t the consuming scala compiler worry about it if and when it finds such a node? That way the producing compiler wouldn’t need a flag at all unless it could generate the same thing in an older encoding, or if I want to statically guarantee that I support a certain version (so if I accidentally use a newer feature I’ll know to take it out).

It seems like that would make life simpler for a lot of people.

A slightly different take is if the producing compiler knows what version of Scala introduced each TASTy node, it can at the end put in the file the minimum required TASTy version to read that particular file.
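As a rough sketch of that idea, assuming the compiler had a table of when each node kind was introduced (TastyNode, introducedIn and minimumRequiredVersion are hypothetical, and Scala minor versions stand in for TASTy versions here):

```scala
// Hypothetical model: a node in the produced file and the version that introduced its kind.
final case class TastyNode(kind: String, introducedIn: (Int, Int))

// The file needs at least the newest version among the nodes it actually contains,
// and never less than the baseline format version.
def minimumRequiredVersion(
    nodes: List[TastyNode],
    baseline: (Int, Int) = (3, 0)
): (Int, Int) =
  (baseline :: nodes.map(_.introducedIn)).max
```

A file containing only nodes that existed in 3.0 would be stamped 3.0 even when produced by a newer compiler, while a single newer node would raise the stamp for that file only.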

As for the standard library annotations, I’m confused why the standard library can’t be treated like just another library here. If my library depends on a newer version than some application using it, why can’t the normal eviction rules apply? Especially if my previous suggestion were implemented, that would enable me to use anything from the standard library that I got bumped to unless it uses a newer TASTy feature, at which point I would be told to upgrade my compiler version.

Kordyjan · February 1, 2022, 6:12pm

The flag is named -scala-release to make it look similar to the -release flag, which does exactly the same thing but at the bytecode level instead of the TASTy level. The Y is there to mark the flag as unstable, and will be dropped in 3.2.

For now, the standard library is treated in a special way by the compiler. In the future, it may be possible to treat stdlib as any other dependency, and then annotations will be only used for documentation purposes. We are thinking about that.

Jasper-M · February 2, 2022, 8:56am

I thought that the target flag specifies which version of bytecode to emit, and release specifies which version of the java std library to compile against. Or is that different in Scala 3 vs Scala 2?

prolativ · February 2, 2022, 9:38am

For javac, both -target and -release allow you to specify the target version of the JDK to produce bytecode for. The difference is that -release additionally checks that the parts of the stdlib API that you reference actually exist in this JDK version. -target is more like YOLO mode, hence it was renamed to -Xtarget in Scala 3 and its usage is discouraged; -release should be used instead whenever possible. Since we also perform the checks when compiling to older TASTy, the new flag is more like -release than -Xtarget.

smarter · February 2, 2022, 11:15am

Yeah, I think Scala 2 -release doesn’t set the target, but both Scala 3 and javac -release do. IMO the behavior of Scala 2 should be changed to align with the others.

eed3si9n · February 15, 2022, 7:11pm

I also find the term -Yscala-release confusing, even though I think I am relatively familiar with JDK tooling. In JDK, the idea of release is (and has been) a thing – JDK Releases for example lists all JDK releases and their associated release type, release support timeline.

JEP 238: Multi-Release JAR Files and JEP 247: Compile for Older Platform Versions extended the Java compiler toolchain to support multi-release JAR files that contain multiple A.class files for the same Java class A, each targeting a different JDK (JDK 8, JDK 11, etc.). In that context the flag --release kind of makes sense, because

javac --release 8 -d classes src\main\java\A.java
javac --release 11 -d classes-11 src\main\java11\A.java

denote JDK releases, and the outputs are often compiled from different source code, with one using Unsafe etc.

Since 2018 (Scala 2.12.5) Scala 2.x has supported --release 8 in the same semantics (https://github.com/scala/scala/pull/6362), which allows downgrading of JDK target and generating multiple *.class.

I don’t think -Yscala-release 3.0 translates well into the Scala version context, chiefly because we don’t use the term “release” to describe minor versions within a binary compatible series. It might also be confusing because forward compatibility was something Scala 2.x library authors took for granted.

I am commenting here because there’s now a proposal that translates -Yscala-release name into an sbt setting (Add support for Scala 3 -scala-output-version flag by prolativ · Pull Request #6814 · sbt/sbt · GitHub):

ThisBuild / scalaVersion := "3.1.2-RC1"
ThisBuild / scalaReleaseVersion := "3.0.2"

The notion that the word release implies better checking of API compatibility while target doesn’t is an implementation detail that I don’t think most people appreciate or care about, since most libraries probably just use an older JDK to build and do not create multi-release JARs.

As an alternative, I think something like

ThisBuild / scalaVersion := "3.1.2-RC1"
ThisBuild / targetScalaVersion := "3.0.2"

or

ThisBuild / scalaVersion := "3.1.2-RC1"
ThisBuild / compatibleScalaVersion := "3.0.2"

would be more intuitive. Alternatively:

ThisBuild / compileTimeScalaVersion := "3.1.2-RC1"
ThisBuild / scalaVersion := "3.0.2"

should also be considered, I think. The primary objection to that (Add support for Scala 3 -scala-output-version flag by prolativ · Pull Request #6814 · sbt/sbt · GitHub) is that we’d then have to rewrite builds that use scalaVersion to pick compiler options, which is fair.

Kordyjan · February 16, 2022, 8:42am

I really like scalaVersion / compatibleScalaVersion pair for an sbt setting. I find it much more intuitive than any other proposed option. Now I’m thinking about what should be used as a name for the compiler flag.
-compatible 3.0, -compatibility 3.0, -compatibility-version 3.0, -compatible-version 3.0? Any other ideas?

smarter · February 16, 2022, 10:33am

Whatever we choose, I think we should consider having a matching sbt setting that sets -release X in Scala 3 / javac and -release X -target X in Scala 2 (because in Scala 2 -release alone doesn’t imply -target, as discussed above). So for example, if we go with compatibleScalaVersion, I would call that setting compatibleJavaVersion.

prolativ · February 16, 2022, 10:50am

How about outputScalaVersion or scalaOutputVersion setting and -scala-output compiler flag?

eed3si9n · February 17, 2022, 6:32am

I am ok with either

ThisBuild / scalaVersion := "3.1.2-RC1"
ThisBuild / outputScalaVersion := "3.0.2"

or

ThisBuild / scalaVersion := "3.1.2-RC1"
ThisBuild / compatibleScalaVersion := "3.0.2"

For compiler flags:

scalac --compatible-java 8 --compatible-scala 3.0 -d classes src\main\scala\A.scala

scalac --output-java 8 --output-scala 3.0 -d classes src\main\scala\A.scala

Both work well here as well I think.

soronpo · February 17, 2022, 8:43am

Instead of compatible, I propose compat

prolativ · February 17, 2022, 9:03am

One argument against compat(-ible/-ibility/-) is that people might think that the compiled code will either work only with this particular version or work only with versions that are chronologically equal or later. That is not entirely true if we take patch versions into account: code compiled with compatScalaVersion := "3.0.2" would work with code compiled with the 3.0.0 compiler, but it would bump the version of the stdlib to 3.0.2.