Annotation macros

This post is a follow-up to http://scala-lang.org/blog/2017/10/09/scalamacros.html, intended to initiate a discussion on whether macro annotations should be included in an upcoming SIP proposal on macros. Please read the blog post for context.

Macro annotations have the ability to synthesize publicly available definitions, accommodating the public type provider pattern. Popular macro annotation libraries include:

  • simulacrum: first-class syntax support for type classes in Scala
  • Freestyle: a cohesive & pragmatic framework of FP-centric Scala libraries

I’m sure I’ve missed many other popular libraries using macro annotations.

To show an example macro annotation, consider the following @deriving macro
annotation from the Stalactite library:

@deriving(Encoder, Decoder)
case class User(name: String, age: Int)

// expands into
case class User(name: String, age: Int)
object User {
  implicit val encoder: Encoder[User] = DerivedEncoder.gen
  implicit val decoder: Decoder[User] = DerivedDecoder.gen
}

The unique feature of macro annotations is that they can synthesize publicly available definitions such as User.encoder/User.decoder. Neither whitebox nor blackbox def macros have this capability. It is generally considered best practice to place implicit typeclass instances in the companion object. Compared to full derivation at each call site, this pattern significantly improves compile times and prevents code bloat.

Certainly, it’s possible to manually write out the implicit val decoderBar: Decoder[Bar] = DerivedDecoder.gen parts. However, for large business applications with many domain-specific data types and typeclasses, such boilerplate hurts readability and often falls prey to typos, leading to bugs.
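To make the pattern concrete, here is a minimal, self-contained sketch of companion-object typeclass instances, using a toy Encoder in place of a real derivation library (all names here are illustrative, not Stalactite’s API):

```scala
// A toy typeclass standing in for a real JSON encoder.
trait Encoder[A] { def encode(a: A): String }

final case class User(name: String, age: Int)

object User {
  // Placing the instance in the companion means it is found by implicit
  // search without any imports at the call site.
  implicit val encoder: Encoder[User] =
    u => s"""{"name":"${u.name}","age":${u.age}}"""
}

def toJson[A](a: A)(implicit e: Encoder[A]): String = e.encode(a)
```

Because the instance lives in the companion, it is derived exactly once per type rather than re-derived at every call site.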

Code generation via scripting can be used as an alternative to macro annotations in many cases. It is not a perfect solution, however: code generation traditionally comes with a non-trivial build tax. Perhaps those limitations can be addressed with better tools for scripted code generation, avoiding the need to include macro annotations in the Scala Language Specification.
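As an illustration of what scripted code generation might look like, here is a hypothetical sketch that writes derived-instance boilerplate into a managed-sources directory (the type names, `Derived*` helpers, and paths are all assumptions, not an endorsed setup):

```scala
import java.nio.file.{Files, Paths}

// Types we want instances for (hypothetical domain model).
val types = List("User", "Order")

// Emit one implicit val per type; a build tool would compile this output.
val code = types.map { t =>
  s"implicit val encoder$t: Encoder[$t] = DerivedEncoder.gen"
}.mkString("\n")

val out = Paths.get("target/src_managed/Derived.scala")
Files.createDirectories(out.getParent)
Files.write(out, code.getBytes("UTF-8"))
```

The “build tax” is visible here: the script must run before compilation, and the build must know to include `target/src_managed` as a source directory.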

What do you think: should macro annotations be included in the macros v3 SIP proposal? In particular, please try to answer the following questions:

  • towards what end do you use macro annotations?
  • why are macro annotations important for you and your users?
  • can you use alternative metaprogramming techniques such as
    code generation scripts or compiler plugins to achieve the same
    functionality? How would that refactoring impact your macro annotations?
3 Likes

should macro annotations be included in the macros v3 SIP proposal?

Yes please

towards what end do you use macro annotations?

@case-style functionality for Scala.js js.Objects, and a typesafe GraphQL client in Scala.js

@casejsTraitNative trait Variables extends js.Object {
  val first: Int
}

which expands to

trait Variables extends js.Object {
  val first: Int
}
object Variables {
  @inline def apply(first: Int): Variables = {
    val p = FunctionObjectNativeMacro()
    p.asInstanceOf[Variables]
  }
  def copy(source: Variables, first: OptionalParam[Int] = OptDefault): Variables = {
    val p = FunctionCopyObjectNativeMacro()
    p.asInstanceOf[Variables]
  }
}

Currently I am using scala.meta (1.8) macro annotations.

why are macro annotations important for you and your users?

This reduces a lot of boilerplate.

can you use alternative metaprogramming techniques such as
code generation scripts or compiler plugins to achieve the same
functionality?

I don’t know. My goal is to put an annotation on a trait/class and have it generate a companion object with some members, add new members to the original trait/class, and make the newly generated companion (and its members) available in IDE autocompletion.

I am unable to rewrite many of my own macro annotations as compiler plugins because post-namer, pre-typer compiler plugins do not have a reliable mechanism for identifying the correct annotation.

I’d be delighted to rewrite them as compiler plugins if this could be addressed (they work much better in ENSIME and Scala IDE, for a start!)

In addition, I often require the parameters to my annotations to be resolved to FQNs. Ideally Symbols (it might be nice to inspect the target symbol when it comes from a third-party jar), but FQNs should be enough for my immediate purposes. Consider, for example, the @deriving macro that takes a list of companions of typeclasses to derive: we need access to the FQNs of these typeclasses, not just the raw source names.

BTW, a thing that would be awesome here is if it were possible to have an annotation on a sealed trait and be able to generate code for all the subtypes in the compilation unit, and fail if the user does anything funky (basically, only allow final case classes). That would be a really killer feature for Stalactite.
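To illustrate, here is a hand-written sketch of what such a derivation over a sealed family’s subtypes might produce (Show is a stand-in typeclass; all names are assumed, and a real macro would reject non-final subtypes):

```scala
trait Show[A] { def show(a: A): String }

// The sealed family the hypothetical annotation would be placed on.
sealed trait Shape
final case class Circle(r: Double) extends Shape
final case class Square(side: Double) extends Shape

object Shape {
  // One instance per known final case class...
  implicit val showCircle: Show[Circle] = c => s"Circle(${c.r})"
  implicit val showSquare: Show[Square] = s => s"Square(${s.side})"

  // ...plus a dispatcher for the sealed parent, covering every subtype.
  implicit val showShape: Show[Shape] = new Show[Shape] {
    def show(a: Shape): String = a match {
      case c: Circle => showCircle.show(c)
      case sq: Square => showSquare.show(sq)
    }
  }
}
```

The hard part for a macro is exactly the enumeration of subtypes: it requires seeing the whole compilation unit, not just the annotated definition.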

2 Likes

Post-namer, pre-typer compiler plugins? I once tried that, but my experiment failed. It is my understanding that you cannot run anything between namer and typer, only before or after the frontend (which comprises namer, packageobjects and typer). So maybe this is the issue you experienced.

I’m happy to be proven wrong. Despite reading the code handling phases, I’m not 100% sure of this claim.

You can definitely run something post-namer, pre-typer 🙂 (I have some examples if you want to see, e.g. the tests of pcplod)

The problem is that “naming” doesn’t really name stuff, so everything is pretty much verbatim what’s in the source code. We need another official phase in there that typechecks annotations.

Even if something can be rewritten as a compiler plugin, it makes the experience less pleasant for the user. One way to mitigate this somewhat is for authors to release an sbt plugin that sets up the right compiler flags, but that only helps sbt users. So it’s a step down both for users and for authors.
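For instance, a library author might ship an sbt plugin that adds something like the following to the user’s build (the coordinates and plugin flag here are entirely hypothetical):

```scala
// Hypothetical build.sbt fragment an sbt plugin could inject for the user:
addCompilerPlugin("com.example" %% "my-compiler-plugin" % "0.1.0")
scalacOptions += "-P:myplugin:verbose" // plugin-specific flag, assumed
```

Users of other build tools would still have to wire up the `-Xplugin` classpath by hand.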

What exactly is the cost of supporting annotation macros?

4 Likes

@fommil Are you sure? I’ve cloned the repo and NoddyPlugin runs after parser and before namer. Maybe you meant other tests?

I would be surprised if you can actually run things in the middle of the frontend. It seems that PhaseAssembly should respect the hard links in packageobjects and typer specified by runsRightAfter. Otherwise a globalError is thrown (IIRC, this is what I got when I forced the hard link in my plugin too).

We can move this discussion to Gitter to avoid hijacking this thread. @fommil

Annotation macros have never been officially supported by scalac; they have always required an external compiler plugin. Supporting annotation macros is not free: there are challenges that need to be solved

  • semantic APIs: annotation macros can only be robustly supported with syntactic APIs. Exposing semantic APIs to macro annotations introduces a cyclic dependency between expanding annotation macros and typechecking (which depends on public members synthesized by annotations)
  • tooling: IntelliJ, the REPL, scaladoc, zinc, etc., need to accommodate the annotation macro expansion pipeline. For scaladoc, annotations should be able to attach docstrings to synthesized public members.
  • annotation macro signature and discovery, see https://github.com/scalacenter/macros/issues/6

All of these problems are solvable, but addressing them means less time spent on other improvements. I would say that scripted code generation is a stronger contender to annotation macros than compiler plugins. Compiler plugins are not portable and therefore suffer from the tooling problem.

Thank you for sharing your thoughts @chandu0101. Can you estimate how easy it would be to migrate your library from annotation macros to code generation via scripting? What do you think would be the tradeoffs from that change?

@olafurpg

Can you estimate how easy it would be to migrate your library from annotation macros to code generation via scripting? What do you think would be the tradeoffs from that change?

If out-of-the-box macro annotation support is going to be a huge amount of work, then I don’t mind going with scripting for now. Could you please share some material (links) on Scala code generation via scripting?

1 Like

Would it make sense to have macro annotations that do not change the annotated class, but do code generation instead? For me, having an official language-integrated way to do code generation would be an enormous advantage (compared to scripting) and it would cover most of my use cases for macro annotations.

As an example usage, one could have a set of annotated classes in some model package, that automatically generate corresponding transformed classes in some sibling gen package.

There are several advantages:

  • The whole code base can be type-checked, and then the macro annotations can be invoked (they would obviously have to generate code into a different project). Many macro annotations would benefit from being able to see type-checked trees with return types inferred, symbols available, and fully-resolved names.

  • This would prevent “abuses” of macro annotations (such as this) where one uses syntactic forms that are not valid Scala but pass the parser and can be manipulated by syntactic macro annotations. I think this was one of the original design goals of macro annotations, but it was also recognized as being too flexible and making Scala look not like Scala.

Would it make sense to have macro annotations that do not change the annotated class, but do code generation instead?

Would this support cases like @deriving above that inserts members into the companion object?

I think so, as long as the generated class mirrors everything defined in the original (model) class but adds members to the companion object. Users would always use the gen version; the model version would exist only to ease code generation.
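A hypothetical sketch of that model/gen split, with objects standing in for the two packages (all names invented for illustration):

```scala
// "Source of truth" the annotation would be placed on.
object model {
  final case class User(name: String)
}

// What the code generator would emit: a mirror of model.User,
// plus extra members in its companion.
object gen {
  final case class User(name: String)
  object User {
    val fieldNames: List[String] = List("name")
    def fromModel(u: model.User): User = User(u.name)
  }
}
```

Application code would depend only on `gen.User`, so the generated companion members are ordinary, fully type-checked definitions as far as tooling is concerned.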

material (links) on Scala code generation via scripting

It’s quite a general topic, and seems to be more popular in other programming languages. Here are some Scala projects using code generation

In other languages

Code generation definitely has problems, but it’s a capable replacement for macro annotations in many scenarios.

1 Like

Macro annotations are instrumental for frameworks like Freestyle, and as far as I can tell what we do today is not possible with codegen.

Freestyle modifies the definitions of the annotated classes to remove boilerplate and to make those classes implement the patterns it needs, so that programs remain compatible in vanilla Scala contexts where Freestyle is not in use.

For example the following definition:

@free trait Algebra {
  def doSomething: FS[Int]
}

is expanded to (simplified):

trait Algebra[F[_]] extends EffectLike[F] {
  def doSomething: FS[Int] = FreeS.liftF(...)
}

This could not be done with codegen without creating additional classes and types.
In the context of Free and Tagless, carrying over an F[_] (representing, in the former, the total Coproduct of algebras and, in the latter, the target runtime to which programs are interpreted) is just boilerplate for most users. Removing it via the FS dependent type materialized by EffectLike is one of Freestyle’s features and one of the foundations on which the entire library and its companion libraries are built.
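A heavily simplified, hypothetical sketch of the dependent-type trick being described (Freestyle’s real EffectLike and FS are much richer than a plain alias):

```scala
import scala.language.higherKinds

// A type member fixed by the parent hides the F[_] plumbing from
// algebra authors: methods can say FS[Int] instead of threading F.
trait EffectLike[F[_]] {
  type FS[A] = F[A] // simplification; the real FS is not a bare alias
}

trait Algebra[F[_]] extends EffectLike[F] {
  def doSomething: FS[Int]
}

// Interpreting into a concrete effect, here Option for illustration.
val alg: Algebra[Option] = new Algebra[Option] {
  def doSomething: FS[Int] = Some(1)
}
```

The point is that users of the annotated trait never write `F[_]` themselves; the macro (or, here, the hand-written parent trait) supplies it.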

There are many other frameworks that modify annotated classes in place, doing trivial things like adding companions or members to companions. Not being able to do that in place, and instead code-generating things, would change not only the usage but also the semantics of where implicits are placed, and could potentially result in other undesired behaviors that users would be responsible for fixing.

Not supporting annotation macros would be a major breaking change and would break a ton of user code, not only from a compilation standpoint but also in the semantics and usage of these projects’ public APIs.

6 Likes

There are a few points I’d like to clarify about my understanding of what the community requirements are, just so we’re all on the same page:

  • annotation macros are currently provided by a third-party plugin, are notoriously broken in build tooling (presentation compiler, IntelliJ, coverage, etc.), and have some very bizarre behaviour in many cases when types get involved.
  • We have many existing macro libraries using annotation macros, e.g. simulacrum, deriving.
  • meta paradise offered an improved API over macro paradise, but unfortunately the tooling breakages are very bad (no support for for-comprehensions, crashes in the presentation compiler, scaladoc, etc.), with technical challenges mounting.
  • compiler plugins allow placement of meta programming at a specific phase in the compiler and are therefore more stable for tooling (only intellij requires custom support, and it is not hard to write it). But unfortunately quasiquotes (and the meta API) are not supported for compiler plugins and the ability to typecheck arbitrary trees is not available (which is only really “best efforts” in macros anyway, discovered through trial and error).
  • the difference for a Scala developer between using a compiler plugin and a macro is completely negligible: it’s a one-liner in both cases and involves no code changes.
  • we should strive to use codegen where we can (e.g. the creation of new ADTs), but there are many places where we simply cannot (e.g. when modifying a compilation unit to have access to knownSubTypes, existing classes, and companions)
  • many current macro annotations require access to type information that is not available from early phases in a compiler plugin.

The obvious options seem to be:

  1. add support for annotation macros to the Production Ready Macros, including full support for all tooling (the presentation compiler, IntelliJ, etc.).
  2. or, make a meta-like API available to compiler plugin authors and add some earlier typechecking / fqn / dealiasing phases (and the testkit!), such that existing plugins like simulacrum, freestyle and deriving can all be easily ported.
2 Likes

Annotation macros were suggested as a solution to a concern regarding opaque types; while the issue might also be solvable using scripted code gen, it would be substantially less user-friendly for the simple use case in question.

2 Likes

Synthesising members that need to be visible to other code currently being typechecked is a very delicate procedure; e.g. our implementation of case classes in the Namer is pretty tough to understand. The hard part is that you want to base the logic of the macro on types, rather than just syntax, but getting types might trigger typechecking of code that will observe a scope before your macro has added to or amended it. This can manifest as a fatal cyclic error, or as a failure to typecheck code that expects the synthetic members to be available.

How can we provide API and infrastructure support to macro authors who try to walk this tightrope? This is a core question, but one that requires a serious investment of research.

This question is orthogonal to the question of which API should be used to interrogate or synthesize trees and types, and to whether macros can be executed in the IntelliJ presentation compiler (both Hard Problems™️ in and of themselves!)

One small part of this research I’d like to see is a review of Java’s annotation processors (http://docs.oracle.com/javase/9/docs/api/javax/annotation/processing/package-summary.html#package_description). What can we learn from the design? What restrictions are imposed that would be too strict for our uses? How (if at all) do Java annotation processors integrate into IntelliJ’s presentation compiler?

6 Likes

The last time I looked at Java annotation processors, they didn’t work in IntelliJ, and maintaining a separate implementation of each expansion was necessary (e.g. Lombok).

However, I’m coming around to the idea of having a separate implementation for the IDE and the compiler, for performance reasons. For example, in Stalactite (the deriving macro) we don’t do any implicit derivations in IntelliJ, which dramatically speeds up the dev cycle. Errors are only discovered during the actual compile, and a lot of people prefer that because it doesn’t get in the way, and it means you’re interacting with the real compiler to fix tricky implicit problems. I’m in favour of fewer false red squigglies, letting more through to the real compiler to catch.

To add to this, Scalameta annotation macros and the inline/meta proposal don’t expose semantic APIs to annotation macros. They are purely syntactic. This limits the capabilities of annotation macros but avoids introducing a cyclic dependency between typechecking and macro expansion. It seems that syntactic APIs are still sufficient to accomplish many impressive applications.