Is scalafix really going to be able to manage migrations to Scala 3?

japgolly · April 11, 2020, 8:57am

Many months ago, I upgraded my 10 or so OSS libs to cross-compile to Scala 2.13. I tried to use scalafix and after a lot of effort, concluded that it just wasn’t capable. Now today I finally find myself in a position to upgrade my big, huge Scala project (ShipReq btw) to Scala 2.13 and I’m rediscovering the degree to which I can’t use scalafix at all. All of the concerns about the upgrade to Scala 3 have been responded to by saying “scalafix will handle the majority of the work” and if that’s the case, I think it’s going to need a bunch more work before it’s capable of that.

My understanding of how {,sbt-}scalafix works is as follows:

it is scoped to an SBT compilation unit, meaning each SBT module is processed separately, and for each module, main and test are processed separately
scalafix works by analysing semanticdb output and then updating files accordingly
semanticdb fails if ANY errors are encountered
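
For readers who haven’t set this up, here is a rough sketch of the kind of sbt configuration being described. It assumes sbt 1.3+ and the sbt-scalafix plugin; the plugin version and the module name `core` are placeholders, not taken from my builds.

```scala
// project/plugins.sbt
// The version below is only an example; use whatever release is current.
addSbtPlugin("ch.epfl.scala" % "sbt-scalafix" % "0.9.15")

// build.sbt
inThisBuild(
  List(
    // Have the compiler emit SemanticDB, which semantic rules read.
    semanticdbEnabled := true,
    semanticdbVersion := scalafixSemanticdb.revision
  )
)

// Each module and each configuration is a separate invocation, e.g. for a
// hypothetical module named `core`:
//   sbt "core/scalafix"        (main sources)
//   sbt "core/test:scalafix"   (test sources)
```

Since SemanticDB comes out of compilation, each of those invocations needs its module (and everything it depends on) to compile first.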

The reason this has been terrifyingly ineffective for me is as follows:

ShipReq has 39 modules. Even my OSS libs have an average of ~8 modules each.
Simply changing Scala from 2.12 (with 0 warnings) to 2.13 results in a ton of errors, not just warnings
scalafix fails because of all the errors
With my OSS stuff, I then fixed all the errors so that scalafix would work, ran it, and it ended up creating new errors which prevented it from working against other modules
I ended up upgrading all my OSS projects with just regex replacements. Now I’m staring at my huge project thinking I’m going to have to do it all manually again.

This is the polar opposite of the experience that has been advertised in the Scala community wrt both 2.13 and 3.0. It’s probably fine for tiny and/or toy projects but against the big ones (where it would be the most important) it just doesn’t work as is. I have grave doubts about the state of migrations when 3.0 arrives.

But there’s time to get this improved before 3.0! How, I’m not sure, but one idea would be to improve scalafix and semanticdb to be tolerant of imperfection. If one were able to just run scalafix against their entire project and have it fix what it could in one pass (and then chase down the rest of the errors manually), that sounds like it’d already be a huge improvement. Thoughts?

olafurpg · April 11, 2020, 9:26am

Thank you for bringing this up! It’s a valid concern.

I’m not aware of any ongoing investment in writing Scalafix migration rules to help automate the upgrade to Scala 2.13. The collection rewrites handle some cases but not everything, from what I understand. The Dotty compiler provides a -rewrite flag for Scala 3 migrations, which @odersky knows more about.

Scalafix is at its core a general-purpose refactoring and linting tool. All the utility of Scalafix comes from rules that need to be implemented on top. Non-trivial migration rules are often difficult to implement because the domain is inherently complicated. Unless there’s active investment in implementing custom rules to help automate Scala version upgrades, Scalafix won’t help with Scala version upgrades.

Semantic refactoring rules (the ones that operate on symbols and types) require the input source code to successfully compile, for better or worse. It’s still possible to implement syntactic rules that work even if the code doesn’t typecheck, but they are less capable since such rules can only inspect the syntax of the input programs. If there is interest, it’s definitely possible to extend the SemanticDB compiler plugin to generate partially incomplete SemanticDB files for the parts of the code that do typecheck successfully. I estimate it’s not too difficult to implement, but I question how useful it would be. Most of the time, the job of the migration is to fix the parts of the program that won’t compile after the upgrade.
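
To make the syntactic/semantic distinction concrete, here is a minimal sketch of a syntactic rule, assuming the Scalafix v1 rule API (`scalafix.v1._`) and Scalameta on the classpath. The rule name `SymbolLiteralToApply` is made up for illustration and is not an existing rule. It only pattern-matches on the syntax tree, so it does not need SemanticDB or a successful compile.

```scala
import scalafix.v1._
import scala.meta._

// Hypothetical rule for illustration only: rewrites symbol literals such as
// 'foo into Symbol("foo"). It inspects syntax alone, so it can run on code
// that does not typecheck.
class SymbolLiteralToApply extends SyntacticRule("SymbolLiteralToApply") {
  override def fix(implicit doc: SyntacticDocument): Patch =
    doc.tree.collect {
      case lit @ Lit.Symbol(sym) =>
        // Replace the whole literal with an explicit Symbol(...) call.
        Patch.replaceTree(lit, "Symbol(\"" + sym.name + "\")")
    }.asPatch
}
```

A semantic rule would instead extend `SemanticRule` and take a `SemanticDocument`, which is where the requirement for compiling input comes from.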

japgolly · April 11, 2020, 9:50am

Hey Olar!

The Dotty compiler provides a -rewrite flag for Scala 3 migrations, which @odersky knows more about.

Oh!.. I’ve heard @odersky refer to a migration tool in a lot of Dotty talks. Maybe I just assumed it was scalafix (?). Can someone in the know please confirm?

it’s definitely possible to extend the SemanticDB compiler plugin to generate partially incomplete SemanticDB files for the parts of the code that do typecheck successfully. I estimate it’s not too difficult to implement, but I question how useful it would be

It depends on the granularity. Say you had a big class and companion object, and there’s an error in one of the class methods. If the invalid method body invalidates the method, which invalidates the class, which invalidates the object and all callers, then that wouldn’t be useful at all. On the other hand, if it only invalidated the method body and retained the knowledge that the method exists and its type signature, then that would be super-dooper useful! I hear what you’re saying, that it’s still one’s job to resolve all errors anyway, but consider these two scenarios:

you run scalafix once, 95% of your entire project compiles, get what you can for free, then you manually fix the rest
you go through your project sub-module by sub-module (in my case I have 78), make it compile, run scalafix for the specific module, make it compile again (if you’re unlucky - it’s been “kind of often” in my experience).

For me the first case is very acceptable and the second case is very frustrating, just because it requires so much fiddling and context switching (both mentally and changing all the SBT cmds).

Finally on a more personal level, Olar, thanks heaps for all of the effort you’ve put into both tools. I have no idea how my tone above reads (especially with screaming kids all around me as I wrote it, and now), but the intent was that, yeah, I’m a bit frustrated but they are awesome tools. I think they just need some adjustment to handle bigger use cases like I’m describing.

soronpo · April 11, 2020, 10:23am

See this PR

odersky · April 11, 2020, 10:40am

I was also hoping for scalafix to improve sufficiently to be the official rewrite tool for Scala 3, so I am disappointed that it is not there (yet?).

Until now, we had overall pretty good success with using the dotty compiler itself for the rewrites to Scala 3. At least the community managed to port a sizeable number of large projects, which are now in our community build. In our experience, the hard part is porting macro code; after that it’s relatively straightforward.

The idea is to use options

-language:Scala2compat -migration

Then dotty will compile in a more forgiving way and issue migration warnings. It also offers some automatic rewrites when the -rewrite option is added.

The PR https://github.com/lampepfl/dotty/pull/8700 proposes to simplify the options that need to be given for this to just one -source option:

-source 3.0-migration
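
In an sbt build this would presumably look something like the sketch below. It assumes the module is already being compiled with the Dotty compiler and that the option ends up spelled as proposed in the PR; both the final spelling and the exact behaviour may still change.

```scala
// build.sbt (sketch, assuming this module is compiled with Dotty / Scala 3)
scalacOptions ++= Seq(
  "-source", "3.0-migration", // forgiving mode that issues migration warnings
  "-rewrite"                  // additionally apply the automatic rewrites
)
```

Note that `-rewrite` patches source files in place, so it is best run on a clean working tree.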

Should we try to refine that or invest in a rewrite system on top of scalafix? I don’t know. Either way, it would be really important to get the community’s feedback about what the problem points are and what additional rewrite rules are suggested.

olafurpg · April 11, 2020, 11:15am

Scalafix is very much capable of doing advanced rewrites. For example, the ExplicitResultTypes rewrite inserts readable type annotations for public members with inferred types and automatically inserts missing imports where needed based on the scope of the refactored position. Scalafix also provides rich infrastructure to productively develop, test and distribute custom rules.

The biggest missing piece for Scala 3 migration rewrites in my opinion is that I haven’t seen a comprehensive list of cases we expect the Scala 3 migration rewrites to handle. A concrete action item for the next step would be to write a set of input/output code examples documenting how Scala 2 programs should be refactored to be compatible with Scala 3. It’s important that false positive and false negative cases are also covered, not just the happy path.

This corpus of input/output code examples can be maintained in the Dotty repo. By itself, this corpus could already serve a valuable purpose as executable documentation for users looking to learn about the differences between Scala 2 and Scala 3. The structure of the sources can be like this

.
├── input
│  └── Hello.scala # Scala 2 source code before migration.
└── output
   └── Hello.scala # Expected output after migrating to Scala 3.

This corpus can be used as test cases for the migration rewrites regardless of whether they’re implemented with -rewrite or Scalafix. Once the corpus exists, I’m happy to provide an estimate of how hard it would be to implement the rewrites with Scalafix.
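
As a rough illustration of how tool-agnostic such a corpus is, here is a small sketch of a checker using only the standard library (Scala 2.13). The object name `CorpusCheck` and the `rewrite` parameter are assumptions for the example; `rewrite` stands in for whichever tool is under test.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.nio.file.{Files, Path}
import scala.jdk.CollectionConverters._

object CorpusCheck {
  private def read(p: Path): String = new String(Files.readAllBytes(p), UTF_8)

  /** Relative paths of input files whose rewritten text differs from the
    * file at the same relative path under output/.
    * Throws if an expected output file is missing.
    */
  def failures(root: Path, rewrite: String => String): List[String] = {
    val inputDir  = root.resolve("input")
    val outputDir = root.resolve("output")
    val stream    = Files.walk(inputDir)
    try {
      stream.iterator().asScala
        .filter(p => Files.isRegularFile(p) && p.toString.endsWith(".scala"))
        .map(p => inputDir.relativize(p))
        .filter(rel => rewrite(read(inputDir.resolve(rel))) != read(outputDir.resolve(rel)))
        .map(_.toString)
        .toList // force evaluation before the stream is closed
        .sorted
    } finally stream.close()
  }
}
```

An empty result means every example in the corpus is rewritten exactly as expected. False positives can be covered by adding examples whose input and output files are identical.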

odersky · April 11, 2020, 12:59pm

I think most of the easy migrations that can be described that way are already performed by Dotty.

Where scalafix would make a huge difference is if it could make everything that’s inferred by scalac explicit:

Inferred types of all fields and inferred result types of all methods
Inferred type arguments
Inferred implicit parameters
Inserted implicit conversions

Mark all inserted code so that it’s easy to see it was inserted, for instance by placing it between start/end tags in comments. For instance,

val x = 
  List(List(1, 2))

should be expanded to

val x/*SI*/: List[List[Int]]/*EI*/ = 
  List/*SI*/[List[Int]]/*EI*/(List/*SI*/[Int]/*EI*/(1, 2))

There will be some corner cases where the inferred type cannot be written down (your other post today is an example). Ideally, these unrepresentable types should be omitted by placing them in comments.

Once we have that, we can complement the tooling on the dotty side by selectively dropping inserted code in /*SI*/ ... /*EI*/ as long as everything still compiles. In most cases that should remove everything, but in some cases, some explicit type or term will still be required. That’s then much better to leave in than to have to hunt down an obscure type inference error. Also, it would give excellent guidance to the dotty team to further improve type inference and implicit inference.
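
To sketch what the dropping step could look like on the text level: the code below is only an illustration using the standard library, not how it would be implemented in dotty. The names `InsertedMarkers`, `dropWhere` and `stillCompiles` are made up, and `stillCompiles` stands in for an actual compiler run. It assumes marked spans do not nest.

```scala
object InsertedMarkers {
  // One inserted span: /*SI*/ ... /*EI*/ (non-greedy; assumes no nesting).
  private val Inserted = """(?s)/\*SI\*/.*?/\*EI\*/""".r

  /** Drops every marked span, recovering the source as originally written. */
  def dropAll(source: String): String = Inserted.replaceAllIn(source, "")

  /** Keeps the inserted code but removes the marker comments around it. */
  def keepAll(source: String): String =
    source.replace("/*SI*/", "").replace("/*EI*/", "")

  /** Tries to drop spans one at a time, left to right, keeping a span
    * whenever dropping it makes `stillCompiles` return false.
    */
  def dropWhere(source: String, stillCompiles: String => Boolean): String = {
    @annotation.tailrec
    def loop(current: String, from: Int): String =
      Inserted.findAllMatchIn(current).find(_.start >= from) match {
        case None => current
        case Some(m) =>
          val candidate = current.substring(0, m.start) + current.substring(m.end)
          if (stillCompiles(candidate)) loop(candidate, m.start) // span was redundant
          else loop(current, m.end)                              // span is needed, keep it
      }
    loop(source, 0)
  }
}
```

Applying `dropAll` to the expanded example gives back the original two lines; `dropWhere` followed by `keepAll` leaves only the annotations that were actually needed, without the marker comments.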

So if we get this to work, it would make a huge difference!

julienrf · April 21, 2020, 9:38am

Hello!

Personally, I think it’s too early to conclude whether a code rewriting tool (scalafix or in the Scala 3 compiler) will or will not be able to manage the migration to Scala 3 because we currently lack hindsight on the topic. To move forward, the Scala Center has created a collaborative Scala 3 migration guide, whose goal is to be a central place where developers can find information related to the migration to Scala 3. We encourage every Scala developer to share their knowledge and experience in migrating to Scala 3!