Scala 3, macro annotations and code generation

Let me try to clarify what I’m after.

I didn’t care much about macro annotations so far. I thought they were on a good path, especially as they’re already useful. Since you can now generate “internal” definitions (not visible to the rest of the program), I was under the impression that the next logical step would be to make those definitions somehow visible outside of the macro (e.g. “export” them to the outer scope) to achieve what I’m actually after.

I’m after some nice code-gen facility, coherent with the rest of the language.

That’s exactly the point!

And you can’t, by any means, do code-gen without introducing new definitions. That would fully defeat the point in most cases.

But Scala still has no proper facility for code-gen.

I really don’t understand why such an important feature is neglected!

I’ve been waiting for this feature more or less since macros were announced in Scala 2. I knew that those would go away and never invested in learning them. But since then I’ve been waiting for “the real thing”. And now? Nothing?

I’ll say it once more: Concatenating strings into source files is a joke. More or less everything else is better. Even C++ templates are better in this regard. (And C++ template magic is actually hell.)

Quite a lot of other people were so desperate that they used the interim solution in Scala 2. Why did they? Why would anybody invest in a “throw-away” solution? Especially as this meant dabbling in complex compiler internals without any stability guarantee.

Also, why are there still so many macros left that can’t be ported to Scala 3? Why did everybody use them even though it was clear that this would mean trouble when updating to a future Scala version?

The answer is imho quite clear: People desperately need code-gen.

Code-gen is everywhere. Whole frameworks in all kinds of languages are built upon it. Even the Scala compiler requires it. More or less everything prominent in the Java world is highly dependent on code-gen (as otherwise Java would be even more boilerplate-heavy than it already is). But in Java almost nobody complains, even though code-gen is inconvenient and unsafe there. Most people actually praise that Java kluge and love their magic annotations in frameworks like Spring, for example.

Rust has a quite “primitive” token stream macro system (not much better than gluing untyped symbols together). Still, Rust macros are hyped as one of the most important features of the language. For many it’s even one of the absolute killer features that Rust has and others don’t.

Could we please recognize that code-gen is a game-changing facility when built into a language? People line up to use such a feature because it solves real problems!

What I finally want in Scala 3 is a safe and convenient way to generate code from code that works fine in an IDE. I strongly suspect a lot of other people also want that. Everybody who needs to port code-gen macros to Scala 3 is waiting for it!

This self-imposed constraint was a good idea to build a very nice foundation for Scala’s new macros. I’m glad it was done this way, as it has resulted in a very clean solution so far.

But this constraint needs to be lifted in some form to make code-gen possible. As said, code-gen is pointless when you can’t generate anything other than implementations of already existing symbols / definitions. Scala still offers nothing in this regard when it comes to meta-programming code-gen.

Also, the proposal at hand seems imho a little bit self-contradictory: How does checking whether some types are present and correctly implemented not affect type checking? What kind of errors should we expect if a, let’s call it, “constraining macro annotation” doesn’t find the types it requires to exist (which the compiler needs to compute anyway, btw.)? My best guess is that this results in type errors at the use site because expected definitions / implementations are missing… So a “constraining macro annotation” would affect type checking, wouldn’t it?

I think both points are valid.

Extended editor support to look at desugarings is imho a very good idea, but orthogonal to some code-gen facility.

Code-gen should imho always end up in some generated code on disk. Only not in files that are meant to be edited by humans, land in version control, and show up in reviews! That’s the most terrible part of this proposal, imho.

But I’m also dreaming of better introspectability of some magic the compiler does under the hood. Something like that would make tooling really valuable.

One could go even one step further and enable more of this kind of editor magic. I would love if the Scala tooling could implement something like “code portals”. This is an ingenious idea nobody ever picked up, which is a shame as it fits especially well with the evaluation by substitution model of Scala. (VSCode code-lenses aren’t interactive, and can’t nest like portals).

Code-gen on the other hand is often a kind of build step. It should be independent of the IDE / editor someone uses.

This would just mean that all kinds of tools would become part of Scala the language. We’d be back in the 90s, when your language was sold together with an IDE. Moving away from this IDE meant a substantial rewrite of your code (if it was even possible to reasonably move away from the IDE; think VB6, or so).

This doesn’t seem relevant as the Kotlin compiler is tightly bound to the JetBrains IDE.

In Scala an IDE doesn’t “see” anything a compiler plugin does. Quite the contrary…

Also the initial sentiment doesn’t look very honest given this here:

They’re literally selling a meta-programming toolkit… So the “it’s too complicated for the tooling developers” argument falls apart.

That’s a good idea!

Generated “invisible” code is problematic on all kinds of axes. You can’t navigate to it or introspect it, and debugging it is a horror, as you can’t even see it to figure out what’s going on.

I don’t see this. At least this doesn’t fit my definition of “fragmentation”.

It makes absolutely no difference whether I write some boilerplate code by hand or let a robot do the work. In both cases the result will be vanilla Scala code. Code that can be fed into the currently existing compiler without issues.

Using code-gen can’t lead to “dialects”. At least this doesn’t fit my definition of “dialect”…

Only if it were possible to add or modify syntax, or to change the semantics of built-in language constructs, would this result in “dialects”. But nothing like that is possible through code-gen that is embedded into the language. All you have are the expression and declaration types the language offers anyway. Nothing can be changed or added there. It’s just a robot writing vanilla Scala at the end of the day!

One can’t build a macro that adds, for example, “do notation” to the language. Or build a macro that would finally allow me to use emojis as symbol names. That would create dialects, as such code wouldn’t be recognized by the currently existing Scala compiler. But just letting a robot write some definitions won’t create “dialects” whatsoever.

Yes it’s clever. A little bit too clever for the liking of some here, I think…

That’s not an “added convenience”. That’s what a computer is actually for: It should do the tedious work! That’s the main reason to use a computer, namely to automate things away.

I already know what needs to be there. I don’t need the compiler to check that. I need the machine to do the actual work and implement what needs to be there. That’s the whole point of automation. It’s nuts when the actual reason to use code-gen gets relabeled as “an added convenience”.

This is asking for trouble.

Someone could change generated code in ways that break its intent but don’t break its interface (which is all the compiler can reliably check). Then have fun debugging.

Generated code would be the last place to look, as it’s usually reasonable to assume that a code-gen tool works fine. Otherwise other people would have the same issues at the same time, which you would easily find out, for example by looking into the bug tracker.
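To make that failure mode concrete, here is a minimal sketch. The `Point` type and both codec objects are hypothetical, invented for illustration only; they are not taken from the proposal. The hand-edited variant keeps every signature intact, so it compiles, yet it silently breaks the round trip the generator had guaranteed.

```scala
final case class Point(x: Int, y: Int)

// Hypothetical output of a code generator: a codec pair that must stay in sync.
object GeneratedCodec:
  def encode(p: Point): String = s"${p.x},${p.y}"
  def decode(s: String): Point =
    val parts = s.split(",")
    Point(parts(0).toInt, parts(1).toInt)

// The same code after a manual "tweak". Every signature is unchanged,
// so the compiler accepts it, but encode and decode no longer agree.
object HandEditedCodec:
  def encode(p: Point): String = s"${p.y},${p.x}" // field order swapped by hand
  def decode(s: String): Point =
    val parts = s.split(",")
    Point(parts(0).toInt, parts(1).toInt)

@main def roundTrip(): Unit =
  val p = Point(1, 2)
  println(GeneratedCodec.decode(GeneratedCodec.encode(p)) == p)   // true
  println(HandEditedCodec.decode(HandEditedCodec.encode(p)) == p) // false
```

Nothing in the types distinguishes the two objects, which is why an interface check alone can’t catch this kind of edit.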

This doesn’t “work well”. All this generated code needs to be read and maintained together with the handwritten parts. Actually it’s not even clear from looking at that code which parts are auto-generated.

The very next question is: what about updates to the generated code? Now you need to use refactoring tools… Because you can’t distinguish the parts that are generated from those that are handwritten. Just changing something in the code-gen templates and regenerating the code is not possible after the initial generation. Alternatively you have // DO NOT EDIT THIS IT IS GENERATED CODE ANY CHANGES WILL BE LOST blocks everywhere, so you can see what you shouldn’t edit. Only it’s hard to enforce that. So even more tooling with more complex features is needed…

But that’s exactly the scenario code-gen is used for!

Nobody uses such a heavyweight feature to generate a few simple lines.

When you reach for code-gen you will usually generate a lot of code, and often quite complex code. Code that otherwise nobody would like to write by hand. And now exactly this kind of unwieldy code is all over the place… Come on.

But that’s the exact reason to use code-gen in the first place!!!

Just write the generated code to disk. Problem solved.

That’s also easy for tooling, as tooling barely needs to be aware of any “magic” going on. It just gets an additional folder full of source files. Java’s annotation processors work that way, and it works fine, is simple, and is easy for developers to grasp.

Code-gen is like multi-stage programming. Only that instead of having templates that are embedded in compiler output and then specialized at runtime before the actual execution, you move everything a step back and have “templates” (or the like) in your source code which get “specialized” (filled in) at compile-time into source files on disk, so those can then be picked up in the “next phase of compilation”.
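The write-to-disk workflow described above can be sketched in a few lines. This is only an illustration under stated assumptions: `Model`, `Field` and `DiskCodeGen` are hypothetical names, and the source text is rendered from a string template purely to keep the sketch short. A proper facility would build typed trees instead, which is the whole complaint about string concatenation.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}

// Hypothetical description of what we want boilerplate for.
final case class Field(name: String, tpe: String)
final case class Model(name: String, fields: List[Field])

object DiskCodeGen:
  val Header = "// GENERATED CODE, DO NOT EDIT. Changes will be lost on regeneration."

  // Renders plain Scala source. Strings are a stand-in for typed trees here.
  def render(model: Model): String =
    val params = model.fields.map(f => s"${f.name}: ${f.tpe}").mkString(", ")
    s"""$Header
       |final case class ${model.name}($params)
       |""".stripMargin

  // Writes the rendered source into a dedicated output directory, separate
  // from handwritten code, so the next compilation step can pick it up.
  def write(outDir: Path, model: Model): Path =
    Files.createDirectories(outDir)
    val target = outDir.resolve(s"${model.name}.scala")
    Files.write(target, render(model).getBytes(StandardCharsets.UTF_8))
    target
```

Because the output lands in its own folder, it can be regenerated at any time and kept out of version control and reviews, and the tooling only sees ordinary source files.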

But when you write the code out by hand it’s not a “dialect”? Come on…

THIS!

And what about other Java compiler plugins? For example:

Java is really flexible in this regard. You can even change syntax!

Nevertheless nobody ever complained about “Java dialects”.

That’s not an issue as long as you don’t have to deal with some very low-level data structure representing your code.

But this could easily happen if you need to combine some string-based Scala source rewrites because there is no other safe facility to achieve some code transformation. Back to “meta-programming with sed”.

At least not in Kotlin…

Implemented as a compiler plugin.

Exactly! We’re back in the 90s, tied to some special sauce in our IDEs.

Exactly!

You put in the annotation and nothing happens. It does not do the hard work for you as expected. That’s more than confusing. It’s frustrating.


Of course, it makes no difference how much I write here. I see, the post got way too looong anyway. Maybe because I’ve been waiting such a long time for proper code-gen and can’t stand that this long-awaited feature may fall apart in the last few meters before the finish line.

But what I can say for sure: If this “export macros” thingy does get implemented, but is only available on the fork of the compiler, I know what I’m going to use, unless there is an adequate alternative in the official Scala release. I’m a simple grug-brained developer, I will use the tool that solves my problems. (And I guess the average dev out there thinks the same. Otherwise we wouldn’t have all the currently unportable macros everywhere.)

At that point, then, we can start a serious discussion about “fragmentation” of the language, I guess.
