I’ve been in a lot of debates on Scala 3 features, often arguing strongly for compatibility and less churn over novelty and experimentation. Here are some thoughts I’ve put down about the Scala 3 migration, from both my OSS and Professional contexts, that may be useful to see where I’m coming from. Somewhat long, TLDR at the bottom.
OSS Migration
I maintain a suite of OSS libraries and tools:
sourcecode
geny
requests-scala
os-lib
scalatags
upickle
pprint
utest
fansi
fastparse
ammonite
mill
In many ways I am lucky: my own libraries form a self-contained dependency graph, so I have the freedom to upgrade them at my own pace. For the sake of discussion, I will focus on Ammonite, which depends on roughly all of the libraries above.
Ammonite Scala Version Support
When Scala 2.13 came out, I dropped support for Scala 2.11 in all my libraries and tools. This was not without controversy, as a very large proportion of professional use is still on 2.11 (we’ll discuss this later), but it is what it is. One way of contextualizing this is to compare the Scala version releases (Ammonite usually supports each new version within a week of release) with the Scala versions that Ammonite dropped support for:
- Scala 2.10 came out Jan 2013
- Scala 2.11 came out May 2014
- Scala 2.12 came out Nov 2016
- Scala 2.13 came out Jun 2019
- Ammonite dropped Scala 2.10 in Dec 2017
- Ammonite dropped Scala 2.11 in Sep 2019
Overall, Ammonite (and all my other libraries) aimed to support the last ~2-3 major Scala versions. This resulted in supporting each major Scala version for ~5 years, or about 3 years after the next major version comes out.
Using “5 years of support” as our rule of thumb, we can extrapolate this to Scala 3 and guess that it may look something like this:
- Scala 3.0 comes out Dec 2020/2021???
- Ammonite drops Scala 2.12 in Nov 2021???
- Ammonite drops Scala 2.13 in Nov 2024???
All these dates are just guesses, but it indicates that in a best-case scenario, for my OSS work, I expect to be in the “migrating” phase for the next half-decade before I end up fully on Scala 3. This says nothing about Scala 3: it simply extrapolates how long I’ve supported old Scala versions in the past. In fact, if the slow Scala 3 migration goes as smoothly as the slow upgrade from Scala 2.10 to Scala 2.13, I’d consider that a resounding success.
Cross Building and Cross Sources
All my OSS libraries support multiple Scala versions via cross-building: first using SBT, and then Mill. Most sources are shared between all 2-3 supported Scala major versions, up to 20 supported minor versions, along with Scala-JVM and Scala.js. For sources that differ between the entries in the build matrix, I use version-specific source folders to encapsulate version-specific logic.
Ammonite has the most of these, as it interacts with unstable compiler internals, and currently has the following version-specific source folders:
scala-2.12.0_8/
scala-2.12.10-2.13.1+/
scala-2.12.9+/
scala-2.12/
scala-2.13.1/
scala-2.13/
scala-not-2.12.10-2.13.1+/
scala-not-2.13.1/
While this seems like a lot, it’s an acceptable price to pay to support the current set of 10 different minor versions from 2.12.1 to 2.13.1, interacting with tons of internal and unstable APIs. Most of these folders have O(10s) of lines of code, so the amount of duplication is minimal.
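For the curious, here is a rough sketch of what such a cross-build looks like as a Mill build file. The module name, version list, and dependency are purely illustrative, and the exact version-specific folder-matching rules vary between Mill releases:

```scala
// build.sc (illustrative sketch, not the actual Ammonite or Mill build)
import mill._, scalalib._

// one module instance per supported Scala version
object example extends Cross[ExampleModule]("2.12.8", "2.12.10", "2.13.1")

class ExampleModule(val crossScalaVersion: String) extends CrossScalaModule {
  // CrossScalaModule compiles the shared src/ folder together with any
  // version-specific source folders that match the current crossScalaVersion
  def ivyDeps = Agg(ivy"com.lihaoyi::sourcecode:0.2.1")
}
```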
Maintaining a cross-build with version-specific folders by hand is somewhat tedious, but it’s the best compromise I have found so far. It allows library development and Scala-version upgrades to happen independently, with new library features supporting all versions of Scala, and minimal “forced” upgrades where wanting to use a new version of a library forces a downstream user to also upgrade their Scala version.
Cross-building has worked remarkably well over the past decade, across 2 different axes (ScalaVersion X ScalaPlatform), and I do not see any of the discussed alternatives (version-specific git branches, C-preprocessor directives, etc.) as an improvement over the status quo. Honestly, it works great.
One consequence of cross-building is that the oft-mentioned “auto-migration tool” is of zero value. There is no single point at which I can “cut over” to the new version entirely: rather, there will be a ~5-year cross-build period as old versions of Scala are slowly dropped, until all remaining versions are in the Scala 3.x series.
Compatibility is a continuum, not a binary property, and this shows here as well: the less compatible Scala 3 is with Scala 2, the more code has to be duplicated from the shared src/ folder into version-specific src-2/ or src-3/ folders. This accurately reflects the fact that decreasing compatibility bifurcates the codebase and increases the maintenance burden for the 5-year period until the old versions are discarded.
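As a purely hypothetical illustration of what that duplication can look like, consider a tiny compatibility shim that ends up living twice, once per language version, even though it does the same thing (the file paths and names are made up):

```scala
// src-2/example/Compat.scala (hypothetical shim, Scala 2 syntax)
package example
object Compat {
  // enrich String with a `quoted` helper via an implicit class
  implicit class StringOps(private val s: String) extends AnyVal {
    def quoted: String = "\"" + s + "\""
  }
}

// src-3/example/Compat.scala (the same shim in Scala 3 syntax)
package example
object Compat {
  // the same helper, written as a Scala 3 extension method
  extension (s: String) def quoted: String = "\"" + s + "\""
}
```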
Binary Compatibility and Macros
Almost all my libraries use macros, whether simple ones like sourcecode.Line to pick up line numbers, deriving case-class serializers using upickle.default.macroRW, doing tree transformations using utest.Tests{...}, or heavy inlining using fastparse.P{...}. While macros are nominally experimental, the reality is that the entire ecosystem depends heavily on them: Scalatest, Circe, SBT, etc. all make heavy use of macros, and libraries like Jackson-Module-Scala do not use macros but rely on scala.reflect.runtime just as heavily.
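For a sense of how small yet pervasive these macro usages are, here are two typical examples, with made-up class and value names: sourcecode materializes the call-site line number at compile time, and upickle’s macroRW derives a case-class serializer:

```scala
import upickle.default.{ReadWriter, macroRW}

// sourcecode.Line is filled in implicitly by a macro at each call site
def log(msg: String)(implicit line: sourcecode.Line): Unit =
  println(s"[line ${line.value}] $msg")

// upickle derives the JSON ReadWriter for this case class via macroRW
case class Thing(name: String, count: Int)
object Thing {
  implicit val rw: ReadWriter[Thing] = macroRW
}
```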
These macro-using libraries also happen to be the foundational ones that everyone relies on. Even downstream libraries that do not use macros themselves very likely rely on Scalatest and make use of its macro asserts!
Due to macros, the fact that Scala 3 can use libraries in a binary-compatible way isn’t all that helpful: the focus should thus be on getting these core libraries cross-built and cross-published for Scala 3 as soon as possible. These core ecosystem libraries are all macro-heavy and will need to be re-published for Scala 3: only then will the rest of the ecosystem even stand a chance at migrating.
Professional Migration
Our work codebase is perhaps 1MLOC on Scala 2.12, and 1MLOC cross-built between Scala 2.12 and 2.11, with a few stragglers still on Scala 2.10. Essentially, all our backend services are on 2.12, and all our Apache-Spark-related code must support both Scala 2.11 and Scala 2.12, since the current major version of Spark (2.4) is on Scala 2.11 and the upcoming major version (Spark 3.0) will be on Scala 2.12.
Migrating Services
Migrating our services to new versions of Scala is relatively straightforward: once all our upstream dependencies are upgraded, we can upgrade as well. We do not have concrete plans to move to 2.13, but will likely investigate it later this year and I do not anticipate any real difficulties in upgrading.
The last upgrade, from 2.10 to 2.12, took place in early-to-mid 2018: it took a few weeks of full-time work to upgrade maybe a million lines of code, and went smoothly without any issues. People loved it; it cut our jar sizes in half and our compile times in half as well!
If Scala 3 comes out, and isn’t too breaking, it should not be hard to fully cut over this code to Scala 3 as well.
We do not make use of any fancy Scala language features: we have almost no implicits of our own, and I believe we do not define a single macro. Nevertheless, we do make heavy use of libraries like Scalatest or Jackson-Module-Scala, which themselves make heavy use of scala.reflect at compile time and run time, and so we will need to wait for them to be re-published for Scala 3 before we can consider upgrading. This is not a big deal: we did the same when upgrading to 2.12, will do it again to upgrade to 2.13, and can do it once more when upgrading to 3.0.
Migrating Spark-Related Code
Migrating our Spark-related code is trickier: it is used as a library by our customers, and shares the same JVM and classpath. Thus even if Apache Spark 3.0 is out supporting Scala 2.12, and all our dependencies support Scala 2.12, we still need to support Scala 2.11 (and Spark 2.4) as long as we have customers who demand it.
For our Spark-related code, even if Apache Spark 3.0 comes out with Scala 2.12 support later this year, we are likely going to support Spark 2.4 with Scala 2.11 for many years to come. And later, when Scala 3.0 comes out, or Scala 3.1, or Scala 3.2, I fully expect we will be cross-building much of this code against Scala 2.11 and Scala 2.12 for the foreseeable future.
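In sbt terms, that kind of cross-build looks roughly like the sketch below; the version numbers and Provided scoping are illustrative rather than a description of our actual build:

```scala
// build.sbt (illustrative sketch; version numbers are examples only)
crossScalaVersions := Seq("2.11.12", "2.12.10")

libraryDependencies += {
  // pick the Spark line published for the Scala version currently being built
  CrossVersion.partialVersion(scalaVersion.value) match {
    case Some((2, 11)) =>
      "org.apache.spark" %% "spark-sql" % "2.4.4" % Provided // Spark 2.4 line
    case _ =>
      "org.apache.spark" %% "spark-sql" % "3.0.0" % Provided // Spark 3.0 line, once it ships
  }
}
```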
Other Enterprises
As part of our developer tools team, I often compare notes with other organizations using Scala, which generally range from similarly sized to somewhat larger than us (100-1000 developers). Most of our peers are still in various stages of migrating from Scala 2.11 to Scala 2.12: whether investigating it for the first time, just starting the migration, or already part way through and making good progress.
Two major things stand out when talking to people about migrating past Scala 2.11:
- Apache Spark’s current major version (Spark 2.4) is still on Scala 2.11. Thus any code that interfaces with Spark has to also be on Scala 2.11
- If you do not have cross-building capabilities in your build tool (i.e. most build tools except SBT and Mill), you are unable to have different Scala versions in the same build/repository. Thus even non-Spark-related code that happens to be in the same repository is stuck on Scala 2.11!
Neither of these properties is likely to change quickly: Apache Spark is a large project and needs time to upgrade to new versions, and cross-building in non-SBT tools will take time to appear. Thus even if Scala 3 comes out at the end of 2020 or in 2021, we might expect many of these enterprises to still be in the middle of their 2.11 -> 2.12 migrations, perhaps finally moving onto Scala 2.13 some time later and Scala 3 even further down the line.
The overall consequence of this is that if we want to support enterprise users in our open source projects, even supporting the last 2 major versions of Scala is insufficient! I expect many to still be mostly on Scala 2.x past 2025.
TLDR
- For my OSS work, I expect to be in the “migrating” phase for the next half-decade before I end up fully on Scala 3. If the slow Scala 3 migration goes as smoothly as the slow upgrade from Scala 2.10 to Scala 2.13, I’d consider that a resounding success.
- One consequence of cross-building is that the oft-mentioned “auto-migration tool” is of zero value.
- Compatibility is a continuum, not a binary property, and this shows here as well: the less compatible Scala 3 is with Scala 2, the more code has to be duplicated from the shared src/ folder into version-specific src-2/ or src-3/ folders.
- Due to macros, the fact that Scala 3 can use libraries in a binary-compatible way isn’t all that helpful: the focus should thus be on getting the core ecosystem libraries cross-built and cross-published for Scala 3 as soon as possible.
- Migrating our services to new versions of Scala is relatively straightforward: once all our upstream dependencies are upgraded, we can upgrade as well.
- For our Spark-related code, even if Apache Spark 3.0 comes out with Scala 2.12 support later this year, we are likely going to support Spark 2.4 with Scala 2.11 for many years to come.
- If we want to support enterprise users in our open source projects, even supporting the last 2 major versions of Scala is insufficient! I expect many to still be mostly on Scala 2.x past 2025.
Conclusion
Hopefully this gives some background to why I’ve been arguing in favor of compatibility and smooth (if slow) migrations, rather than hoping for a fast “big bang” upgrade with some hypothetical tooling-assistance.
In both my OSS and Professional contexts, not only are “endless” slow and smooth Scala version upgrades the norm, they’re also fine: the Scala language and implementation have improved by leaps and bounds, and everyone in my organization could tell you how much more productive they are on Scala 2.12 than Scala 2.10 due to the improved compile times.
Going forward, what I would hope for is to minimize breaking changes where we don’t need to make them, and where we do need to make them, to do so in a way that’s measured and intentional. If that means some not-fully-baked work-in-progress breaking change misses Scala 3.0, lands in Scala 3.1, has the old behavior deprecated in Scala 3.2 and removed in Scala 3.3, then so be it. Scala 3 won’t be the first version to cause some amount of breakage, and I wouldn’t expect it to be the last. And that’s OK: 2.12 dropped support for Java 7, 2.13 broke a lot of collections APIs; such is the price of progress.
From where I sit, there’s nothing special about Scala 3.0 with regard to breaking changes: it is just another upgrade in an endless series of upgrades, one that we hope to be able to upgrade to and cross-build against with minimal pain and suffering.
Thanks for reading, and I hope you found this post interesting!