Scala 3, macro annotations and code generation


As a macro fan, wanted to add two cents: IMHO, the strength of macros has always been that they help eliminate boilerplate. They accomplish that by (either directly, for custom macros; or indirectly, in shapeless and magnolia) generating the boilerplate at compile time, from the smallest and/or most idiomatic way to express that boilerplate. For example, writing a case class (a small and idiomatic thing) can get you the boilerplate of a JSON serializer, or a pretty-printer, or what have you. Annotation macros are in a similar vein, but for use cases where the boilerplate can’t be captured in a value and must be a generated definition.
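To make that concrete, here is a minimal hand-written sketch of the kind of boilerplate a macro or derivation mechanism produces from a case class. The `JsonWriter` type class, the `User` type, and the given instance are illustrative only, not a real library API:

```scala
// A tiny serialization type class -- illustrative, not a real library API.
trait JsonWriter[A]:
  def write(a: A): String

// The "reduced form": small and idiomatic to write and, crucially, to read.
case class User(name: String, age: Int)

// The boilerplate that a macro (or a shapeless/Magnolia-style derivation)
// would produce at compile time, written out by hand here to show its shape:
object User:
  given JsonWriter[User] with
    def write(u: User): String =
      s"""{"name":"${u.name}","age":${u.age}}"""

val json = summon[JsonWriter[User]].write(User("Ada", 36))
// json == {"name":"Ada","age":36}
```

The point is that only the case class needs to be read and maintained; the given instance is mechanical and, with a macro, exists only at compile time.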

Physically generating the boilerplate and writing it back to the source file is not remotely the same thing. It solves the problem of writing the boilerplate, but that was only a small problem to start with compared to the problem of reading the boilerplate. After all, code is read many more times than it’s written. The reduced form of the boilerplate (e.g. the case class) is much more readable than the expanded boilerplate itself. Evidence for this can be seen in Java projects which use code generation via annotation processors – the generated code is never checked in to source control; it’s instead generated only during the compilation pipeline.

I think the idea of using the rewrite system as a code generation tool is clever (smarter, even :wink:) but it won’t find much adoption unless it can be a transparent part of the compilation pipeline (i.e. it does not modify source files, only the AST that goes to the next compilation phase – basically what annotation macros do in Scala 2).


I think there is a kind of tension here:

On the one hand generated code is “just noise” and usually nothing you would like to maintain manually, or even read in a lot of cases.

On the other hand, having “invisible code” interfere with the rest of your code-base is an issue. This was a constant complaint about the old annotation macros… Everything’s cool as long as everything works. But when it doesn’t, have fun debugging “invisible” code!

I think what Java annotation processors do is mostly sane: they generate code to disk, so it can be read and understood. The code can also be debugged easily this way. But it’s a kind of “don’t touch” code that doesn’t get versioned and therefore shouldn’t be edited manually.

I admit that the “invisible code” problem may be a pure tooling issue. But I’m not sure about that. Tooling would need to have the generated code somewhere anyway (at least in memory, likely even on disk). So in the end there’s no big difference from the annotation processor approach either way; only the internal implementation would differ. But with purely “virtual” code the tooling would need to be quite complex, I guess…

To avoid having a code-gen facility touch and change anything in manually written compilation units, I want to once more point to what C# did in this regard. This seems smart, imho.

This is a bad idea.

The field of programming languages is huge and varied, and yet ~nobody does codegen in this way. People do codegen, people have macros, but nobody has the compiler do codegen in existing source files that also contain hand-written code. It breaks so many invariants, assumptions, and conventions: handwritten and generated code in separate files, not committing generated code to source control, one-directional dataflow in your build tooling.

  1. Nobody wants to code review large swathes of generated code on GitHub, or have to skim over generated boilerplate while reading a codebase to find the code that is actually meaningful

  2. We will inevitably find colleagues accidentally editing generated code living right next to hand-written code, causing the macro annotation to become a lie. And then next time someone regenerates the code to fix it, said colleagues’ handwritten changes are wiped out

  3. We will have to add extra build steps and validation to make sure the macro expansions do what they claim they do, and haven’t drifted accidentally over time

  4. All sorts of tools assume compilation is one directional. Compiler rewrites won’t work in strict tools like Bazel/Pants/Buck, and other environments with read-only filesystems (e.g. the remote devboxes we use at work). It may trigger repeated/redundant compilation in looser tools like SBT/Mill, and inevitably hit bugs where the rewrites fail to converge to fix points, resulting in infinite compilation loops (this already sometimes happens with Scalafmt and other languages’ autoformatters)

  5. Editors such as IntelliJ or VSCode or Vim do not expect other programs to be writing code while a user is doing so, which can easily cause user work and edits to be discarded and lost (the infamous “do you want to keep in-memory changes or filesystem changes” popup in IntelliJ)

  6. What about cross version sources, e.g. a single source file built against two versions of a library, with two different versions of a macro that are semantically identical but have different generated implementations? If they’re both macro-expanding the same source file, one of them has to fail, even though both versions are correct and would succeed if run separately

Mixing hand-written and generated code in the same file goes against decades of best practices of how to deal with generated code. As does committing generated code to source control, rather than having it be generated on-the-fly by your build tool. Having the compiler make changes to source files it should only be reading breaks the convention of “unidirectional dataflow” that is one of the success stories of 21st century software engineering.

Imagine if every case class you wrote would expand into dozens or hundreds of lines of boilerplate, polluting your code review and code reading, with big fenced comments // DO NOT EDIT THIS IT IS GENERATED CODE ANY CHANGES WILL BE LOST, with “golden” tests that run in CI to ensure the generated code reflects the actual sources it was generated from. These aren’t hypotheticals; these are real workflows and requirements if you want to manage generated code committed to source control in a reasonable manner. Doable - we do in fact manage generated code at my work this way - but it’s a lot of complexity that spills all over your developer tooling and workflows, with predictable consequences and confusion any time the tooling falls short. Not a burden we should force people to take on lightly.

I know that debuggability, semantics, and tooling around macro annotations are a concern. For debugging purposes, being able to macro-expand a file to poke around would be a wonderful feature. But having the macro-expanded code be committed to source control is a whole other can of worms that I don’t think is worth the complexity it entails.


For what it’s worth, I wrote something that did exactly this (not in the compiler–it was a preprocessing phase in compilation that autogenerated code into the same file), and in some ways it was fantastic, but mostly it was a pain for exactly the reasons you describe. (It was a step on the way to something that would have been better, but the rest of the way was infeasible at least prior to LLMs, so it was a bad idea and a dead end at the time.)

So, yeah, I concur. Almost surely not a good solution. A debug phase that desugars to a given level but outputs code that compiles would be cool, but keep it away from the source tree.

I agree with your point 1, and it’s a weakness of this proposal. Point 6 seems like a pretty rare occurrence that could be addressed by using different source files for your different cross-builds. But I don’t think that points 2 to 5 apply to this proposal: I’m not suggesting that the compiler start rewriting your code under your feet; that would be pretty chaotic indeed. Instead, the compiler would simply emit an error if the annotated code did not contain the expected generated code. This is something that can already be accomplished today, as demonstrated by [Proof of Concept] Code generation via rewriting errors in macro annotations by smarter · Pull Request #16545 · lampepfl/dotty · GitHub; it’s just awkward to do because there’s no nice API for it.

The other piece of the puzzle is IDE integration via code actions generated from the compiler messages. This is something that is actively being worked on in both Scala 2 and Scala 3 for built-in rewrites (see Roadmap for actionable diagnostics), and it’s natural to expose that ability in the macro API too.

So really the only thing specific to this proposal is the idea of having some extra macro APIs around pretty-printing and diffing trees to make the proposed pattern easier to implement in a robust way. It doesn’t even need to be part of the standard library: if we can’t reach consensus that this is worth having, someone else might decide to make their own macro library that implements these APIs to promote this pattern.

I didn’t state this directly until now, but I also don’t want IDE code actions as the primary facility for code gen. Especially if the generated code would end up mixed with hand-written code! That’s just terrible!

Not only do I think that @lihaoyi’s points mostly apply; additionally, how would this work with automated tooling? You don’t use an IDE to do builds in a lot of circumstances; think CI. But code gen is especially useful when run automatically.

The initial proposal looks like a hack, to be honest: “We have some facilities in the compiler and IDE, let’s use them for something nobody else ever did this way”. That’s not a clean design. More the contrary, imho.

The best proposal I’ve seen so far regarding code-gen are the “export macros”.


It’s a two-step process: you use your IDE to apply the code action to fix the error generated by the macro; then in your CI, when you compile your code normally, the macro runs again and ensures that the generated code hasn’t been changed (otherwise it emits an error once again). (If you don’t want to use an IDE, you can also use the compiler’s -rewrite flag, but it’s not a flag you would enable by default in your build.)

I’m not saying this hack wouldn’t work, somehow.

But it’s imho a major hack nevertheless.

Some formatting or even a comment on the wrong line could break the whole thing. Especially funny when this happens in CI (people have crazy stuff in CI, like code formatters). Now all kinds of tooling need to be aware of the hack. Also, one would need sophisticated “tree-diffing” which accounts for all the ways such breakage could be caused by otherwise harmless changes.

How about compile times, when the compiler needs to regenerate code every time to do the actual diff (which isn’t free either)? Code-gen can be quite heavyweight. That’s usually not something you do on every recompile, for good reason, but with this proposal the compiler would constantly need to check whether the // DO NOT EDIT THIS IT IS GENERATED CODE ANY CHANGES WILL BE LOST blocks are still intact. Additionally, “tree-diffing” can become quite complex.

But mixing hand-written code and generated code is a K.O. anyway, imho. Nobody likes to see changing // DO NOT EDIT THIS IT IS GENERATED CODE ANY CHANGES WILL BE LOST blocks in diffs (and reviews). This would also make things like git bisect more complex, at least I guess.

Those issues could of course be fixed if the code were generated externally (as everybody else is doing). But then the question remains: why go the route of a hack with all its issues instead of building a clean solution based directly on the current macro system, as proposed in the other previously linked thread?

I understand that this proposal likely seemed simple to implement at first. So it’s a smart hack! It’s just not the right thing™ in my opinion.

If we needed something that works somehow, as fast as possible, maybe this hack would even be bearable. But why rush things? There is no reason to do so. This will be around for some time, I guess. So settling for a mediocre solution with a lot of design warts (like the old macros), which may be simple to implement in isolation but will make life with tooling / automation very hard in the end, is not very optimal.

If it did, that would be a deal-breaker indeed. I think in practice we could make it work (you can check out the PR I linked to earlier and try to break it).

Just generating the code isn’t the expensive part; it’s the rest of the compiler pipeline operating on the generated code which takes time. It’s a common trap for people to accidentally use macros in a way that generates a ton of code and then be confused by their suddenly increased compile times.

I’m not trying to rush anything, just proposing something that fits our self-imposed constraints of “macros are not allowed to create new definitions that affect subsequent typechecking” and doesn’t require new language features. Everyone is welcome to make their own proposals and explore other parts of the design space of course.

For me the big upside of this proposal is the following:
It shows the user what is going on, so that they can have a mental model of what these macros do
Which is useful for example when debugging code

But in my opinion, this should not be done by editing source files!

Since we kind-of assume IDE support, I think we should approach the problem from a different angle:
Develop tools that allow the user to inspect the generated code with their IDE

This has the benefit of also applying to already existing constructs like inline and macros, or even case classes, for and match!

I think this would be a very valuable tool for teaching as well: students could write the for they want, and see how it is desugared
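As an illustration, the desugaring such a tool would surface for a for comprehension is already fixed by the language’s standard rewriting rules (flatMap/map), so it can be shown with plain Scala, no new machinery needed:

```scala
// What the student writes:
val pairs =
  for
    x <- List(1, 2)
    y <- List(10, 20)
  yield x * y

// What the compiler desugars it to, per the standard
// for-comprehension rules:
val desugared =
  List(1, 2).flatMap(x => List(10, 20).map(y => x * y))

// Both evaluate to List(10, 20, 20, 40).
```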

My vision would be for it to work somewhat like code folding:
If you click/expand/… on a macro, it displays the code generated by that macro. Probably either as a popup or in a different color, so that it’s always clear what code is “real” and what code is generated.


And since not everyone uses a visual IDE, we can also think about how to show these expansions in CLI environments

But this is somewhat orthogonal, as the same issues would appear with the rewriting idea


Moving the manual mangling of source files from the compiler to the editor mitigates the problem a bit, but does not solve it. This assumes two things:

  1. The source files are writable at all (not always true today for ~half of my colleagues, who edit code on one machine and compile/run it on another!) and not shared (not true for most cross built code!)
  2. You can integrate with everywhere Scala code is written

Let’s consider (2). Maybe we work with VirtusLab to integrate Metals/VS Code, and JetBrains integrates IntelliJ. Then what?

  1. What about Almond/Jupyter notebooks?
  2. Zeppelin Notebooks?
  3. Polynote notebooks?
  4. Databricks notebooks?
  5. What about the REPL? Will it edit the code being submitted without compiling/running it?
  6. What about alternate REPLs, like Ammonite?
  7. What about codegen? Let’s say I generate code within an SBT, Mill, or Bazel build task on a CI machine. Who will press the “autofix” button then?
  8. Mdoc snippets?
  9. Vim/Emacs/Sublime?

There’s a long tail of places where Scala code is written and run, the above is just what I came up with off the top of my head in 30s, I’m sure there are countless others I didn’t think of. All of these places assume code is written by the user and compiled and run with a one-directional dataflow. That is how it works for 99% of other programming languages they support. Most do not expect code to flow “backwards” from the compiler back to the sources.

These problems are solvable. A similar challenge exists with scalafmt/scalafix. But it’s one thing for third-party tools to have incomplete, best-effort support across execution environments; those integrations are just icing on the cake, making your life easier. Having core language features/workflows be unsupported depending on where you write your code, and requiring special integrations to properly use a language feature at all, is something quite different.


I’ve created a separate thread about this, but I’d like to ask here: how many of the previous uses of annotation macros that modified program state actually needed to modify program state? Is it possible that the situations where code generation is absolutely needed don’t truly apply to the listed environments?

If code generation is based on non-Scala information (i.e. generating endpoints from an OpenAPI spec), then yeah, you need code generation, but I don’t think you’d actually use that with Jupyter/the REPL/etc.

However, there’s places where macro annotations were used in the past that can potentially be solved by Scala 3 features like programmatic structural types.

I’m not advocating that Scala should (or should not!) follow suit, but the approach Kotlin takes is interesting: essentially everything you might want to use macros for is implemented as a compiler plugin. Here’s a twitter thread from a couple years ago with some rationale.

I think what @smarter was advocating is a change in viewpoint.

First, I believe we should never let unrestricted macro annotations in their old form into Scala again. @smarter gave @data classes as an example. That looks like a great idea and is surely very convenient in some situations, but it will get a hard no from me. If we admit things like that, we open all doors again to language fragmentation. And this goes completely against our idea of what Scala should be.

Macro annotations are OK when it comes to codegen for interop (e.g. something like @main annotations). Embedding your Scala program into a host environment without having to write boilerplate code is great, and since none of this is visible at type checking, it will not lead to dialects and fragmentation.

Having macro annotations restrict your program is also OK. Macro annotations could check that your program is pure, or that it can be translated to SQL or Datalog, or verify any property you like. Enforced language subsets are not dialects.

The idea of @smarter, which I find quite clever, is to re-interpret a macro annotation like @data as a way to restrict your program. It now indicates that the definitions required by the @data class specification are all present. That by itself is useful and should be uncontroversial. You could manually check that all required definitions are there, but with @data the compiler does it for you, and you see at a glance what kind of class this is.

Then as an added convenience the compiler or IDE can also generate any definitions that are missing for you. You can freely replace or edit those definitions; the annotation will simply check that even after editing they are still of the right shape.
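A minimal sketch of this reading of @data, assuming (hypothetically) that its specification requires a withF method per field f, in the style of the examples elsewhere in this thread. The definitions live visibly in the source; the annotation would only verify that they have the expected shape:

```scala
// Hypothetical: suppose the @data spec says "for every field f: T of a case
// class C there must be a method with${f.capitalize}(x: T): C". The methods
// below are ordinary, editable source code; @data would merely check them.
// (The annotation itself is omitted here since it does not exist yet.)
case class Person(name: String, age: Int):
  def withName(name: String): Person = copy(name = name)
  def withAge(age: Int): Person = copy(age = age)

val older = Person("Ada", 36).withAge(37)
// older == Person("Ada", 37)
```

Because the methods are checked rather than injected, editing one to do something slightly different is fine as long as it still matches the declared shape.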

If the added code is small and easy to understand I can see this working quite well. Sort of like automatically adding getters and setters in Java IDEs. If the added code is large and convoluted that’s another matter. Then probably you should not do it. But at least we won’t have the situation that large and convoluted code gets added under the surface without this being pushed in the face of the developer.

So the point is, the rewrite aspect is really just an added convenience. It could be achieved in a number of ways or be omitted altogether. The important part is the change in viewpoint: the annotation tells you what content you can expect to see in the class.


Not sure what happened to my email reply, so I’ll repost:

The problem we’re trying to solve is that the macro should not influence the types. I think something akin to the TypeScript example given can solve that. So basically what we need is

  1. A language for expressing in the types what the shape is of the code that will be generated

  2. Allow annotation macros to be responsible for providing the implementation of methods etc. expressed in the type.

So for example, there should be a way to express, not in a Turing-complete executable macro language, but in the types, whether as part of the definition of the annotation or otherwise:

  1. For the @data example, something like “Annotates a case class C, and for every field f: T there will be a method with${f.capitalize}(p: T): C”
  2. For monocle @Lenses, something like “Annotates a case class C, takes a parameter prefix: String = "", and for every field f: T there will be a field in the companion ${prefix}f: Lens[C, T]”
  3. I would like to be able to write a macro for scalajs-react components, that can be expressed as: “Annotates a trait, class or object containing a case class that must be named Props, and a field named component, and generates an apply method with the same parameters as Props, with return type VdomElement”

Then, after macro expansion you don’t need to run the type checker, but you do need to check that there aren’t missing method bodies. (If the current pipeline doesn’t allow missing method bodies that far, you could let them have some kind of special body that will error later if not replaced, I guess.)

It seems unfortunate that this would force the macro implementation and the shape specification to be redundant in some ways. IIUC match types can also be redundant with a corresponding pattern match. Maybe this could be solved, but even if it’s not, it’s still better than anything else IMO.

Also if type-level name mangling is too much, both examples could replace the prefix with an enclosing object. Something like:

case class Person(name: String):
  object `with`:
    def name(name: String) = copy(name = name)


case class Person(name: String)
object Person:
  object lenses:
    val name = Lens[Person, String](_.name)(name => _.copy(name = name))

I think “dialects” and “fragmentation” refer to characteristics of the Scala code that people write. Facilities for boilerplate expansion will influence that, whether the boilerplate expansion happens before typechecking or afterward. And I think you’re implying that “dialects” are inherently bad, and are the kind of thing that sullied the good name of Scala 2.x. But, would you consider things like Lombok or Java annotation-driven frameworks to be “dialects” in that sense? I sure would, but they sure didn’t hurt the adoption of Java.

I’d agree that it sounds really elegant to reframe the codegen problem as a “lint failure” followed by a corresponding “code fix”. But however it’s framed, if a “dialect” is to be avoided, then it sounds like the benefit (compression of boilerplate into its essence) is what’s being avoided.

TLDR: I guess I’d argue that “dialect” is the goal, so if that’s incompatible with Scala’s principles then there’s no sense in trying to find some path toward code generation.


@nafg I am arguing against the very idea of allowing to create language dialects like @data classes. Whether you do it via macro-expansion or via a super-powerful meta-type system is secondary. If we allow that by whatever means, we will get dozens or hundreds of un-coordinated de facto language extensions. That’s the Lisp dream and the Lisp curse.

As an aside, I’m really happy to see all the work that’s gone into cleaning up metaprogramming and making it first-class. But around the edges, I think there’s been a bit too much over-correction around things that maybe someone once said confused them about Scala.

This is only my own opinion, for whatever it’s worth – but I don’t think that “dialects” from annotation macros were ever a deal-breaking confusion point for anyone in Scala 2.x. If anyone had a problem with “dialects”, it was of the variety that is still completely possible without any sort of metaprogramming.

Edit: want to make sure to note that my “air quotes” aren’t meant to be sarcastic, and I hope I’m not coming off as disrespectful. I really do appreciate all the work that’s gone into Scala 3, and I totally understand the desire to keep as many unbroken seals on it as possible.