Pre-SIP: Export Macros

One of the more powerful features of Scala 2 is obviously macro annotations. Scala 3 has greatly dialed things back on the metaprogramming front, and understandably so, but expressive metaprogramming capabilities are something that I’d really like to have access to in some form. On that note, I’ve been working on a language feature that I hope offers a compromise.

For me, the primary use case for metaprogramming is adding new declarations at compile time to existing traits, classes, and objects. For example, I should be able to write code that, at compile time, declares a case class, declares a type-level transformation of the shape of that case class, and applies that transformation to create a new case class with that new shape at the source-code level. This would help eliminate duplicated copy/paste data models that differ in mechanically expressible ways. In the language of Shapeless: if I can go from a case class to a Repr, why can’t I go from a Repr to a case class?
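As a concrete (and purely hypothetical) sketch of what I mean, assume an existing hand-written model and a macro that could compute a new shape from it; the names below are invented and nothing here exists today:

    // Existing, hand-written model:
    case class User(id: Long, name: String, email: String)

    // What I'd like to be able to generate at compile time, as a real,
    // typer-visible case class computed from User's shape
    // (here: every field wrapped in Option):
    case class UserPatch(id: Option[Long], name: Option[String], email: Option[String])

Today the second class has to be written and maintained by hand, even though it is a purely mechanical function of the first.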

I’m sure the folks here could name another hundred use cases that require the same capabilities from the compiler.

I believe the Scala 3 export keyword could be leveraged for such use in metaprogramming.

I’ve put together a fork of the dotty compiler and implemented the feature to prove it can work (see links at the end). The basic idea is to allow export to expand out the content of a macro and splice in the generated definitions. In some sense this isn’t so different from what export already does, and as proof of this, many of the same code paths are re-used in my fork.

Here is some sample code of the feature taken from a simple unit test I have:

    // In file A.scala
    import scala.quoted._

    object TestMacro {
      def dothis(b: Boolean)(using Quotes): List[quotes.reflect.Definition] = {
        import quotes.reflect.*
        if (b) {
          val helloSymbol = Symbol.newVal(Symbol.spliceOwner, "hello", TypeRepr.of[String], Flags.EmptyFlags, Symbol.noSymbol)
          val helloVal = ValDef(helloSymbol, Some(Literal(StringConstant("Hello, World!"))))
          List(helloVal)
        } else {
          val holaSymbol = Symbol.newVal(Symbol.spliceOwner, "hola", TypeRepr.of[String], Flags.EmptyFlags, Symbol.noSymbol)
          val holaVal = ValDef(holaSymbol, Some(Literal(StringConstant("Hola, World!"))))
          List(holaVal)
        }
      }
    }

    // In file B.scala
    class Foo {
      export ${TestMacro.dothis(false)}._
      // expands to: val hola = "Hola, World!"
    }

Depending on the value of the boolean argument to the macro dothis, one of two different definitions will end up spliced into the resulting class. And because the expansion is done as part of the typer phase (which is where export is processed normally), these definitions are visible to other code and are part of the type signature of the surrounding class/object/trait.
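To make the visibility point concrete, a hypothetical third file compiled against Foo (with the forked compiler) could use the generated member like any hand-written one:

    // In file C.scala (hypothetical, continuing the example above)
    val foo = new Foo
    val greeting: String = foo.hola  // typechecks: hola is part of Foo's signature
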

One huge benefit of re-using export in this way is that user-code as written remains untouched, and in fact cannot be altered. The definitions produced by the macro have to play by the same rules as the normal ones would.

What do folks think?

Links:

  • Dotty Fork: GitHub - littlenag/dotty at export-macro
    ** I want to stress that the fork (in branch export-macro) isn’t synced with the latest from upstream and was created a long while back. It also isn’t very clean in its implementation, and there are debug printlns galore.
  • Longer write up: Expressive Metaprogramming for Scala 3 · GitHub
    ** I’ve been thinking about this feature for a while now. Linked is a much more in-depth write up.
12 Likes

How should tooling work with this? If you auto-generate definitions and then refer to them, how does navigation and hyperlinking work? What about incremental compilation? How can I keep track of what changed and what needs to be recompiled if definitions are auto-generated?

Color me skeptical. I believe Scala has had too many efforts like this, where metaprogramming facilities for language dialects were created and then fell short for tooling purposes.

In general, the Scala 3 motto is NOT to enable dialects. We allowed that in Scala 2 and it caused much pain, and I believe it is one of the main causes of the backlash against the language that we are seeing. Scala 3 is powerful enough as it is, generally. I admit there are always niche cases where someone wants more power, but we have to realize that’s a tradeoff. There’s value in standardization, too.

The choice to reuse export was driven in part by the fact that Scala 3 tooling will already have to integrate with export regardless, and by the hope that the lift from what export already does (which is itself a very limited form of metaprogramming) to macro-generated definitions wouldn’t be that large.

For tools that leverage the pickled type signatures (or could), the story should be identical between this proposal and what already exists around export. It’s possible some tools might already have “support” for this feature. I think that’s pretty cool.

Re: incremental compilation, the compiler already has to track upstream targets changing and affecting an export, so I don’t see how this feature makes that any different.

I do agree that there are issues around discoverability. When a developer sees a class name or method name in code they should be able to quickly navigate either to its macro invocation in source or to a more useful definition. It isn’t immediately clear how that could or should work, but I’m not sure that’s a problem unique to this proposal, or within the scope of this proposal to fix. I’m happy to outline ideas, but that feels like something with a very long tail.

If you have specific tools in mind that you would want to see this feature tested with, or have an integration story for, then I’m happy to investigate more deeply. Certainly I would need to ensure compatibility with what we use at $DAY_JOB.

2 Likes

There is a fundamental difference. Currently, exports are decided by specified, declarative rules. With macros, what gets exported is decided by a Turing-complete program. That completely breaks all the assumptions that the incremental compiler is built on.

Perhaps. I would be surprised if the lift was that high, given that the incremental compiler already has to think about export and macros in general for each file that gets compiled. I’ll investigate and see what the story is.

1 Like

There’s a very similar issue with type- and “term-aliases”.

So nothing new here, imho.

In case I understand the proposal correctly there is no big problem with tooling. The generated code would end up in some form on disk, wouldn’t it? Or is the generated code purely “virtual”?

Nevertheless there is imho a need for such a feature. Scala is sorely lacking an unrestricted code-gen feature!

Languages like Java have that: annotation processors. (Don’t try to use that from Scala though if you’re not keen on much pain in the *piep*).

It’s imho quite “funny” that such a powerful language as Scala has no mechanism to cut down mechanical boilerplate when even a boilerplate-hell language like Java can do that.

The current “solutions” in Scala emit concatenated Strings as code files… That approach couldn’t be more primitive and error prone! :face_exhaling:

Please @odersky consider some form of proper code gen for Scala! I know your gut feelings are mostly right, but this issue at hand needs a solution. A solution that’s worth being in such a great language as Scala. Show the world how an outstanding feature in such regard could look like instead of leaving it to hacks like “emitting code as plain strings to disk”.

This proposal here looks really promising. (Almost) nobody complained, but it instantly got 11 likes, which is quite a lot for this forum. (I read this as: the proposal is so convincing everybody is just shouting “take my money”. Usually you hear from the naysayers even in the case of otherwise great proposals.) It’s even already implemented halfway… @nicolasstucki, the macro maintainer, seems to like the idea presented here too, as I infer from some GitHub comments.

Btw., regarding tooling for code gen: maybe Scala could get some inspiration from C# here. They introduced a feature called “partial classes” to aid with auto-generated compilation units.

  • When working with automatically generated source, code can be added to the class without having to recreate the source file. Visual Studio uses this approach when it creates Windows Forms, Web service wrapper code, and so on. You can create code that uses these classes without having to modify the file created by Visual Studio.
  • When using source generators to generate additional functionality in a class.

I’m not sure this should be imitated verbatim. But it’s interesting food for thought.

Everybody involved here: Thanks for finally looking into such an important feature like code gen!

I’m not sure if this is possible yet, but if macro annotations are allowed to fill in methods of a trait, I think I’ve come around to the idea that you should avoid generating declarations that the typer relies on. That is, val hola = ${MyMacro.dothis(false)} is great, as is

trait HasSomeMethod {
  def classNameAndMethodCount(): (String, Int)
}

@fillInHasSomeMethod
class Foo extends HasSomeMethod {
  def method1(): Int = 5
  def method2(): Boolean = false
  // `@fillInHasSomeMethod` quietly generates:
  // def classNameAndMethodCount(): (String, Int) = ("Foo", 2)
}

Tooling might still get a little confused about where to bring you for the definition of Foo.classNameAndMethodCount, but it’s no longer totally magical: it could just bring you to the trait.

That will be allowed once macro annotations become stable, yes. See also “Scala 3, macro annotations and code generation” for a proposal attempting to bridge the gap between filling in definitions and generating new members.

Re: codegen, wouldn’t it be possible to use macros + show to do codegen? Quotes + splices are a much more principled form of codegen, and you can likewise use reflection to build up code, though I don’t think reflection-generated code would be checked the way spliced code is now.

Maybe a new method could be added to Expr called .check that would emit compiler errors if the Expr doesn’t make sense. That would let you create an Expr from the compile time reflection API, check that it’s sane code, and then use .show to emit the code to a file.
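As a sketch of that flow (the `.check` method is hypothetical and does not exist today; `Expr.show` is part of the standard Quotes API), inside a macro implementation it might look like:

    import scala.quoted.*

    def emitSource(using Quotes): String =
      val expr: Expr[Int] = '{ 1 + 2 }  // or built up via quotes.reflect
      // expr.check                     // hypothetical: report errors on ill-formed trees
      expr.show                         // render back to Scala source, ready to be
                                        // written to a .scala file by a build step
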

Re: codegen, wouldn’t it be possible to use macros + show to do codegen?

I don’t think this works.

The point about the current macros is that they don’t allow you to introduce any new “typer-visible” definitions.

The new macro annotations won’t be able to do that either.

So at the end of the day you can’t generate arbitrary code (like, for example, class hierarchies), besides filling out string templates and writing those to disk as an external (pre-)build step.

This is a big issue! Code gen is vital to all kinds of tasks.

For example: you have some IDL (like, say, ProtoBuf or Smithy) and want to create classes based on it. The only Scala 3 way of doing this that I know of would be to do code gen with string templates (or, even worse, by writing some kind of byte-code emitter). That’s imho ridiculous!

I guess you could hand-craft TASTy with the help of some Scala 3 APIs. But that’s more or less the same as fiddling with a byte-code emitter.

Type checking plus code gen is of course a hard problem, I admit. So I wouldn’t even expect some 100% “sound solution”. But I would like a solution that’s good enough that, most of the time, tooling could help keep macro writing sane.

In the end some form of templating would be good, I guess.

But actually I don’t care about the how (-> here possibility for research! <-), as long as code could be generated in a better way than writing raw string templates.

Maybe this isn’t even a language thing but a tooling issue. If there were tooling that helped with writing code templates (with some IDE-like features), maybe that would be enough. Like I said, I have no clue about the how. I’m not an expert on these topics. I’m just a user who would like to be able to generate code based on some other code (e.g. I feed in a data structure and get back case classes plus type classes and their instances defined for them; I guess the “IDL → code” example is a good one here).

To expand a little bit (I just had an idea): wouldn’t sound type checking be possible even in generated code if one assumed a closed world for that code? Say code-gen always produced kind of “separate” and (mostly, maybe besides the stdlib?) self-contained compilation units. You should be able to fully type check those, I think! Other code (which isn’t explicitly imported into the “template scope”) shouldn’t have any impact on what’s going on inside the template. OTOH the generated parts shouldn’t interfere with outside definitions other than in the way any other external module does. All the template-generated code would always end up in some dedicated compilation unit or module (package). This would be good enough for most code-gen usage. “Splicing in” code generated this way wouldn’t be directly possible like it is with the current macro inline defs, but you could import symbols.

1 Like

Current macros can in fact introduce new definitions, and those definitions can affect your program. The thing is that current macros only produce expressions at the end of the day, meaning that any definitions you create are not visible outside of the Expr that contains them.

You can in fact do that with current macros, you just cannot have them visible outside of the Expr. However, since we would be using Expr.show to produce writeable files instead of splicing Expr into your code, that ends up not being a problem anymore.
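For instance, a present-day macro implementation can already conjure definitions, as long as they live entirely inside the returned expression (a sketch of a macro implementation body; as usual it would have to be compiled separately from its call sites):

    import scala.quoted.*

    def localDefsImpl(using Quotes): Expr[Int] =
      '{
        // This class exists only inside the spliced expression;
        // nothing outside the resulting Expr can see or import it.
        class Local { def n: Int = 41 }
        new Local().n + 1
      }
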

The problems at present are:

  1. The requisite api is still experimental: SymbolModule

  2. The requisite api also doesn’t allow non-empty constructors

  3. The macro system only checks the sanity of Exprs borne from Terms when used normally.

Which completely defeats the point.

Think auto-generated implementations of some IDL…

This sounds interesting. So you could have something like “template expansion”, at least as a separate (pre-)build step.

This goes in the direction of what I’ve just added as an idea to my previous post.

If this didn’t require build setup but were a compiler feature, we would be almost there, I guess.

I think this would be better suited as build setup plus a library that extends some base functionality of Scala, rather than a compiler feature. Building it like that makes this template expansion very regular.

  1. The code that accompanies / is on the classpath of your templates is available to your templates, but your templates are not available to that code.
  2. Your templates are available downstream to code that consumes the sourceGenerators from your templating library.
  3. This allows multiple competing templating libraries rather than sticking us with a templating that’s embedded in the Scala compiler

Regarding this templating, and having it be powered by the current quotes and Scala compile-time reflection, maybe it would be beneficial to have a new type defined: CompilationUnit. Scala macros would still be the expansion of an Expr[?] into an inline method, and this CompilationUnit type would be something that could contain declarations, packaging, etc., could be checked for sanity (valid code), and emit strings that can be written as one or more .scala files.
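A rough sketch of what such a CompilationUnit type might offer (entirely hypothetical; none of these members exist today):

    import scala.quoted.*

    // Hypothetical API, for illustration only:
    trait CompilationUnit {
      def check(using Quotes): List[String]  // compiler errors, empty if the code is valid
      def show: String                       // render to Scala source, to be written
                                             // out as one or more .scala files
    }
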

1 Like

Yes, this sounds good!

CompilationUnit would in this case be the “closed world” I was reasoning about… You couldn’t splice it in like an expression, but the definitions it contains could still be regularly imported from within any “regular” compilation unit.

Maybe the expansion mechanism should in fact be, at least in parts, external to the compiler. Then proper IDE support would boil down to mostly a tooling issue.

But CompilationUnit sounds like a language level feature. So at least some compiler support would be needed. Also some quote support for CompilationUnits would be nice to have. This would require also some tweaks, I guess.

Additionally, CompilationUnit could make it possible to define macros in the same file as the code using them. Currently you need separate files, which is a little unergonomic. But wrapping the macro definitions in a CompilationUnit could maybe solve that issue too.

Just to meta up a level: broadly speaking, I think @MateuszKowalewski and @markehammons are both discussing macros at the level of rewriting token streams, i.e. basically what Rust does for its macro system. I think the success of Rust (and of C, and of D) shows that that kind of macro system works really, really well for certain kinds of repetitive boilerplate.

However, what I’ve suggested here is an altogether different kind of macro system. The macros I’ve implemented here work at the next level up, and can have knowledge of actual Scala types!

I know the example I chose to use didn’t demonstrate this, and I apologize for the lack of clarity.

But what this means, for example, is that you can generate new type definitions as a function of your already existing type definitions. To do this in Rust, or any other “templating” language, would mean that basically all your data definitions now live outside your language. While that can work, it leads to lots of issues, feels wrong to me, and prompts me to ask whether something better can be done.

I think we can do better. Export macros are a fundamentally different abstraction, and one that I prefer over “simple” token manipulation, given that I can stay in Scala for more of the work.

But I think the difference between these two systems, and which one the community prefers, is really the crux of the issue. Scala 3’s macros operate over well-typed expressions, not token streams. It would make sense to stick with that flavor of idea when extending the system to handle more metaprogramming tasks. OTOH, token-stream manipulation is much better known, and while not perfect, it solves the issues industry has and moves us forward.

Sorry for the confusion, but I at least wasn’t really talking about “token stream” macros. Those aren’t much better than “raw strings”: no IDE support for the language, as token streams are (at first) opaque to the compiler. Also, as I understand it, hand-crafting TASTy would be more or less the same thing as assembling a “token stream”. So this is already possible, only very inconvenient, and of course completely “unsafe”.

Instead I’ve stated a few times that it would be good if things would be part of the regular language, and of course typed.

The point is to be able to import generated definitions (when they’re exported from the macro / template scope). So as a first step there needs to be some possibility to generate definitions. This seems solved with the SymbolModule. The remaining issue is being able to actually import definitions from the macro / template scope into regular code…

My understanding was that export macros would make exactly this possible.

@markehammons proposed something on top that would make the use of this feature more convenient.

The quote talks explicitly about building this on top of the current macro features.

This CompilationUnit would, as I understand it, do in the end what was proposed in the beginning. Just that it could make the usage as convenient as having a solution based on “classical” templates. CompilationUnit would provide quote syntax for declarations, packages, macro exports, etc. Also you could use it inside a file as a kind of scope, I think, so the requirement for external files for macros could be lifted.

Only the last part of the last sentence in the previous quote is maybe something completely new in this thread. That’s also something I still don’t fully get: will the code from “export macros” be “virtual”, or would it end up generated on disk? Both would likely work fine from a technical perspective. But a big complaint about the old macro annotations (which often did code gen under the hood) was that you could get “really funny compiler errors” from code you can’t even see. “One can’t even see the code” is imho indeed an issue! How do you step through it in a debugger? How do you see whether it changed in unintended ways? How do you understand it at all when you can’t read it directly? Generating to disk would solve those issues. Or, at least, the tooling would need to be extended so it can “show virtual code”. But I guess just writing it out to disk would make this much, much easier. Then again, maybe this should be just a facility for debugging, and “virtual” code would be fine most of the time?

No matter the details, the proposal of “export macros” (maybe with a CompilationUnit abstraction on top) would finally bring Scala, in this regard, into a position on par with C++, Rust, or actually even Lisp with its staging capabilities. Compile-time code-gen is vital to a lot of tasks and can reduce boilerplate dramatically. With the clean design of its macro system, Scala could even provide powerful tooling for this feature, better than other languages with code-gen capabilities! With proper tooling for such macros, Scala would imho be light-years ahead of the rest of the pack.

But it’s not me who needs to be convinced… I’m fully sold on the idea presented here. :smiley:

I was nagging to bring this important topic to the attention of some people who are quite skeptical about unrestricted code-gen. So I’ve tried to present the perspective of a poor user who wants to automate the generation of boilerplate code, as that’s a very common requirement! There are examples of that even in the Scala compiler. OTOH, code-gen by string concatenation is imho a big joke (especially in the context of a language like Scala!), so we finally need a proper solution. This here seems to go in exactly the right direction. I really hope for some usable results out of the effort presented here! :pray: