Pre-typer syntactic plugins in Scala 3?

fommil · February 2, 2022, 3:59pm

I have two usecases for pre-typer compiler plugins in Scala 3 that I would like the authors of the compiler to consider.

The first usecase is the ENSIME compiler plugin, which simply outputs compiler parameters for every file that is being compiled. An unfortunate limitation on Scala 3 is that the file must at least pass the typer phase successfully before the compiler parameters can be written. That means that the entire project has to typically be in a compilable state before the tooling can work. In contrast, in Scala 2, I can output the compiler parameters early which means I have access to ENSIME code completion and other features even for files and projects that have never successfully compiled.

The second usecase is to support the migration of the @deriving annotation for codebases that cross compile to both Scala 2 and Scala 3. This is effectively the derives language feature (I could write a version that more closely aligns with derives if I felt the annotation had a future) and I’m sure it is known as it was considered as part of the “review of the community macro landscape” several years ago, but cannot be implemented in the new metaprogramming model because it involves using annotations to modify the tree (and/or the companion object of that tree). There is currently no migration strategy for a cross compiling codebase, because @deriving cannot be implemented in Scala 3 and derives does not exist in Scala 2.
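For readers unfamiliar with the feature being compared, here is a minimal sketch of what `derives` looks like on the Scala 3 side. The `Show` typeclass, its `derived` method and the `Person` class are hypothetical names chosen for illustration; the Scala 2 annotation form in the comment is an assumed shape, shown only for comparison.

```scala
import scala.compiletime.{constValue, erasedValue, summonInline}
import scala.deriving.Mirror

// Hypothetical typeclass, used only for illustration.
trait Show[A]:
  def show(a: A): String

object Show:
  given Show[Int] = new Show[Int]:
    def show(a: Int): String = a.toString

  given Show[String] = new Show[String]:
    def show(a: String): String = "\"" + a + "\""

  // Collect a Show instance for each field type of a product type.
  inline def fieldInstances[T <: Tuple]: List[Show[?]] =
    inline erasedValue[T] match
      case _: EmptyTuple => Nil
      case _: (t *: ts)  => summonInline[Show[t]] :: fieldInstances[ts]

  // Non-inline helper so the anonymous class is not duplicated per call site.
  def product[A](label: String, instances: List[Show[?]]): Show[A] =
    new Show[A]:
      def show(a: A): String =
        a.asInstanceOf[Product].productIterator.toList
          .zip(instances)
          .map((value, inst) => inst.asInstanceOf[Show[Any]].show(value))
          .mkString(label + "(", ", ", ")")

  // `derives Show` on a case class expands to a call to this method.
  inline def derived[A](using m: Mirror.ProductOf[A]): Show[A] =
    product[A](constValue[m.MirroredLabel], fieldInstances[m.MirroredElemTypes])

// Scala 3 only: the instance lands in the companion object.
final case class Person(name: String, age: Int) derives Show

// Scala 2 with the annotation (assumed shape, not compilable in Scala 3):
//   @deriving(Show) final case class Person(name: String, age: Int)
```

With this in place, `summon[Show[Person]]` resolves to the derived instance. The point of the comparison is that neither spelling compiles under the other compiler, which is the cross-building gap described above.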

Could the Scala 3 authors please consider allowing compiler plugins to run before typer so that these two usecases can be supported?

som-snytt · February 2, 2022, 5:31pm

If the concern is “We don’t want people doing gnarly things before typer”, then early plug-ins could be constrained by comparing trees. Maybe no diff is permitted, or only adding annotations. Or plug-ins could be optionally quarantined and their effects only logged.
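As a rough sketch of what “comparing trees” could mean, the toy model below gates a plugin by diffing its output against its input. `Node`, `Policy`, `permitted` and `runGated` are hypothetical names over a made-up tree type, not the compiler’s untyped tree API.

```scala
// Toy syntax tree for illustration; NOT the dotty tree API.
final case class Node(
    label: String,
    annotations: Set[String] = Set.empty,
    children: List[Node] = Nil
)

enum Policy:
  case NoDiff, AnnotationsOnly

// True when `after` has the same labels and shape as `before`
// and every node keeps at least the annotations it started with.
def sameUpToAddedAnnotations(before: Node, after: Node): Boolean =
  before.label == after.label &&
    before.annotations.subsetOf(after.annotations) &&
    before.children.length == after.children.length &&
    before.children.zip(after.children).forall((b, a) => sameUpToAddedAnnotations(b, a))

def permitted(policy: Policy, before: Node, after: Node): Boolean =
  policy match
    case Policy.NoDiff          => before == after
    case Policy.AnnotationsOnly => sameUpToAddedAnnotations(before, after)

// Quarantine: a rejected result is discarded and only logged,
// so later phases still see the original tree.
def runGated(policy: Policy, plugin: Node => Node, tree: Node): (Node, Option[String]) =
  val candidate = plugin(tree)
  if permitted(policy, tree, candidate) then (candidate, None)
  else (tree, Some(s"plugin output rejected under $policy"))
```

Under `Policy.AnnotationsOnly`, a plugin that adds `deprecated` to a class node passes, while one that renames the class is rejected and the original tree is kept.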

fommil · February 2, 2022, 5:38pm

counterpoint, .sbt file support (and other forms of Scala DSLs) can be implemented as a pre-typer compiler plugin. The reason why sbt is not implemented this way seems to be historic, but it would have trivially given support for perfect .sbt file editing in Scala IDE, Metals and ENSIME out of the box without any customisation needed, had it been implemented that way instead of a custom invocation of the compiler.

fommil · February 3, 2022, 12:01pm

note that this PR technically allows pre-typer plugins https://github.com/lampepfl/dotty/pull/13173#issuecomment-1028881424

but as @smarter notes, pre-typer plugins are intentionally restricted to “research plugins” only.

I’d like to understand further why this restriction exists and what would need to be done to convince the compiler authors that making this more generally accessible would be a good thing. e.g. how many use cases are needed to tip the scales? With my two above, plus sbt files (and DSLs of that nature), that makes 3 use cases in the wild.

soronpo · February 3, 2022, 4:29pm

Another related thread:

soronpo · February 3, 2022, 4:33pm

To me this looks like just an opinionated block of “we don’t want Scala to do this”, and not a technical challenge. I can understand why plugins can’t be allowed inside of typer. But before it seems so trivial that it begs the question why not.

odersky · February 3, 2022, 5:07pm

We want to prevent language dialects. Scala 2 got a bad rep of complexity partly because it enabled extensions like that. It was certainly good for experimentation, but bad for having a simple, uniform language experience. For Scala 3 we decided we would not make the same mistake again, and to err on the side of caution, at least initially.

Note that also lots of other extensions have the same restriction. For instance, you can use a language.experimental import only in a snapshot compiler. I admit it’s inconvenient, but it’s better to be cautious. Rust has a similar policy.

So maybe the answer is to not shy away from snapshot compilers as a solution. They will always exist, so one can have a parallel track using them.

soronpo · February 3, 2022, 5:32pm

If we were to enable blackbox annotation macros, would that not just have the same “problems”? Or are you saying annotation macros are completely off the table for the non-snapshot versions? If they’re not, then I really don’t see the difference in allowing plugins to replace them instead of creating a whole API just for such macros.

IMO, snapshots are irrelevant for a library with a user-base. So currently such plugins are just an experiment or irrelevant for deployment. There is no in-between.

odersky · February 3, 2022, 5:51pm

I don’t know. We have not seen such macros yet. At first glance I would say it’s different since annotation macros will likely work on typed trees and their output will not influence the typing of the rest of the program. But in any case, too early to tell.

rssh · February 3, 2022, 6:02pm

As I understand it, the original question was about allowing the use case where a compiler plugin runs before typer but is restricted to read-only operations on the existing sources. The compiler can enforce this by passing the plugin a ‘frozen’ view of the source tree.

Plugins like that can’t create a language dialect.
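One way to picture a ‘frozen’ view is an interface whose only output channel is a list of reports, so there is no way to hand a changed tree back. The sketch below is a toy model under that assumption; `SourceUnit`, `CompilerRun`, `ReadOnlyPlugin`, `OptionsDump` and `runPreTyper` are hypothetical names, not anything the compiler provides.

```scala
// Toy model for illustration; none of these types exist in the compiler.
final case class SourceUnit(path: String, parsedTree: String)
final case class CompilerRun(options: List[String], units: List[SourceUnit])

// A read-only plugin can only return reports. The data it receives is
// immutable, and its result type cannot carry a tree back to the compiler.
trait ReadOnlyPlugin:
  def name: String
  def inspect(run: CompilerRun): List[String]

// ENSIME-like example: record the compiler options for every source file.
object OptionsDump extends ReadOnlyPlugin:
  val name = "options-dump"
  def inspect(run: CompilerRun): List[String] =
    run.units.map(u => s"${u.path}: ${run.options.mkString(" ")}")

// The run passed on to later phases is the original value, untouched.
def runPreTyper(run: CompilerRun, plugins: List[ReadOnlyPlugin]): (CompilerRun, List[String]) =
  val reports = plugins.flatMap(p => p.inspect(run).map(line => s"[${p.name}] $line"))
  (run, reports)
```

Because the plugin never sees typed trees, a check like this would work even for sources that fail to type-check, which is the property the ENSIME use case depends on.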

adampauls · February 3, 2022, 6:57pm

I don’t know. We have not seen such macros yet.

Sorry for my naivete, but isn’t the reason you don’t see them because they are impossible, or at least effectively so as @soronpo argued?

fommil · February 3, 2022, 8:15pm

Thank you for the explanation.

However, I would challenge the premise. I am aware of lots of reasons why people think Scala is “complex”, but I have never once heard anybody blame anything that could be put down to pre-typer compiler plugins. Arguably the sbt dialect is one reason for sbt’s bad rep, but that isn’t even implemented as a pre-typer compiler plugin, and it already had a bad rep long before it started doing anything funky during the parse.

With regards to the nightlies; I’ve considered that as a solution to the migration usecase, although it’s fiddly. It, however, doesn’t work for the ENSIME usecase. (The ENSIME usecase could also be satisfied by making the list of source files visible alongside the Settings themselves, so that it can do all of its work during initialisation. But that’s tangential to this discussion).

It’s also worth noting that pre-typer compiler plugins are really easy to implement in IntelliJ, and the @deriving annotation was fully supported as a result.

som-snytt · February 4, 2022, 5:11am

We need the dark scala ecosystem where these tools can flourish and teem.

“They emerge in the night, and they only work with nightlies!”

odersky · February 4, 2022, 4:25pm

Maybe that’s the way to go then. This could probably be worked into the DottyLanguageServer, if it’s not already provided.

fommil · February 4, 2022, 4:36pm

I’d certainly be happy to see improvements in that area. If you download the ENSIME source code from https://ensime.github.io/ and look through the plugin.scala in the scala-3 folder you’ll see some other hacks I had to add in there to deal with the fact that Settings can no longer be “unparsed”. It would be good to recover that Scala 2 feature as this is a really good mechanism for extracting the compiler parameters for use by any tooling that then invokes the compiler (out of band, e.g. like in the IDE usecase). You should be able to see in that short file exactly how it could be cleaned up to simply .foreach over the list of source files, if it was available, instead of being called for each compilation unit at a later compiler phase.

adampauls · February 10, 2022, 7:36am

I just wanted to see if I understood the state of things. First, I’ll name three kinds of plugins:

Read-only: plugins that read code but never change or add it. Something like Java’s FindBugs falls into this category.
Append-only: plugins that generate new code, but never alter existing code.
Read-write: plugins that add new code and change or remove existing code.

At present, full read-write (non-research) plugins are permitted, but only after the typer. This means that you can do some pretty terrible things, like take every pair of single-arg methods in a class and create a new method, named with the concatenation of the pair’s method names, that composes the two methods. Other code in the same module cannot depend on those synthesized methods, but downstream modules or other projects could. You could even just swap + and - everywhere (in cases where they have the same type signature of course). So in some limited (but still terrible) sense, these plugins can “create a dialect.”

The compiler team says that no plugins should run before the typer. I think there is a pretty clear use case for append-only code generation before the typer, with a replacement for macro annotations (in particular, @deriving) being probably the canonical example as OP said. @rssh suggested that read-only plugins should be able to run before the typer, pointing out that such plugins can’t create a language dialect. IIUC that level of power would also be sufficient for ENSIME. @som-snytt also suggested limiting the power of macros.

Is the position of the compiler team that no plugin of any kind, even read-only ones, should ever run before the typer? If it were technically possible, would append-only be palatable enough that the compiler team would allow them? And is there any reason that read-only pre-typer plugins shouldn’t be possible?

odersky · February 10, 2022, 10:58am

The compiler team has no official position in the matter. One can discuss things in the dotty repo in a feature request or issue. But you’ll have to get someone excited about it, who will actually push for the changes.

My personal opinion is that read-only plugins are much less of a problem than plugins that modify or augment the tree. But they are also very limited. Maybe a more flexible alternative would be to open up the parsing in a separate tool. That would be beneficial on its own. I.e. a parser that can be customized with the kinds of trees it generates, maybe coupled with a formatter.

FelixSelter · June 4, 2023, 7:14pm

Are there any updates on this? Scala already has many features to customize the language, macros and compiler plugins being some powerful examples. I have never experienced the old problems myself and would welcome a way to create language dialects. Everyone can decide on his own how much complexity he wants to add to the language. Currently anything like this is implemented with preprocessors that create a copy of the files, convert the dialect to actual Scala code and run the compiler afterwards. Making this an official option would allow for better tooling, automatic linting, error generation etc. This just feels more like Scala: giving you all the possibilities but also the responsibility. I mean, we also have Scala XML. Why should anything like this not be permitted? It would allow adding support for maybe Scala JSON, Scala TOML or other languages. To me this sounds like it would improve developer experience in some cases by a lot.

jxnu-liguobin · June 6, 2023, 1:52am

I also want this

littlenag · June 6, 2023, 6:37pm

I would encourage folks who want an even more expressive metaprogramming story to look at Pre-SIP: Export Macros.

Export Macros can do quite a lot of what I think folks have been asking for. And better, the feature has been implemented in my branch, so if you want to test it out you can. It’s obviously not production grade, but it is enough to get a sense of the potential, I think.