Scala 3, macro annotations and code generation

Yes, something like that! :smiley:

How it works in the end under the hood, I don't care actually.

But the "templating" needs to be sane, safe, and convenient even for less skilled people.

My impression was that the current quote stuff in Scala, with its Expr[?] abstraction, would make a really great "templating language". It's the best I've seen so far, as it's type-safe!

The trigger points that would deliver the data to the "templates" would be hand-written annotated definitions (of, for example, case classes).

The results of the triggered code-gen need to be "material", as anything else would be way too much opaque magic that can't be debugged reasonably.

And yes, such a feature would be extremely useful! The lives of lesser beings consist in large part of writing repetitive, boilerplate-heavy code. Cutting this down to the bare minimum would make Scala especially attractive to the average John Doe programmer. It would be almost a killer feature for some jobs, making mundane tasks really easy, without compromising on safety or tooling support (unlike the stringy code templates that are currently the only way to achieve the stated goal in Scala).

Just think about the large market share of poor web devs, working in all kinds of languages, who do mostly nothing other than write this kind of "boilerplate": defining entities, plus the code that brings them over the wire and persists them on the server side. Most of it is copy-paste, just with entity and field names replaced. A framework that could abstract this away would be a game changer! A Spring killer...

Thanks a lot for trying to understand what the pain points are, and what would make things substantially better! That's something I love Scala for. People are listening. (You sometimes just need to cry loudly enough... :grin:)

3 Likes

It's worth noting that annotation processors are among the most popular tools in Java-land right now (MapStruct, Immutables, Micronaut), and that somehow the annotation processor produces Java files whose code is visible in the same compilation unit: you are able to use the generated definitions in the same file where you introduced the annotations that produce the generated code.
I don't know how this magic happens, but it is there, and it is very necessary for the general usage of annotation processors, in Java at least.

1 Like

You can use Kotlin compiler plugins in other contexts too, including Maven and REPL, but it is nice that IDEA's error highlighting doesn't get too confused by the syntactic absence of generated stuff.

Someone already linked the Dotty example. Some examples from my own code:

  1. uPickle generates JSON serializers for each arity of tuple (upickle/build.sc at f9bf9984e5175e5f4b2020db17f99e26a3037250 · com-lihaoyi/upickle · GitHub). Not sure if this will fully go away with Scala 3, or whether we'll need to keep the current implementation for performance.

  2. Templatized generics: files like upickle/ujson/templates-jvm/DoubleToDecimalElem.java at main · com-lihaoyi/upickle · GitHub have their Elem string replaced by Byte or Char, effectively specializing/monomorphizing them and avoiding the boxing that would arise with generics. This is similar to what's done in Java-land for specialized collections like fastutil or Koloboke Collections.

  3. IDL codegen: at work we do build-time codegen from .proto schemas, OpenAPI specs, and AWS API specs to provide typed RPCs. The goal here is primarily to provide type-safe access to something defined outside the Scala codebase. In my personal projects, I've also used ScalablyTyped, which works via codegen.

There are also places where I haven't bitten the bullet and used codegen, but where there's tons of boilerplate that cannot otherwise be made to go away:

  1. Defining a whole bunch of related case classes with the same field, e.g. all Exprs in the Sjsonnet config compiler have a pos: Position field (sjsonnet/sjsonnet/src/sjsonnet/Expr.scala at master · databricks/sjsonnet · GitHub). They all extend a trait, so .pos can be used seamlessly in downstream code, but it's tedious to have to include pos: Position in every single case class declaration when there are a lot of them.

  2. Injecting implicits throughout all methods in an object, e.g. FastParse's def number[$: P] context bound. Having a context bound or implicit/using param isn't a big deal when you have a few of them, but when you have hundreds of them, one on every line, even the smallest amount of boilerplate gets old.

    • Multiple implicits can easily be combined into a single implicit via wrappers (e.g. here), but in many cases, such as for FastParse rules, even a single implicit is a ton of boilerplate when it happens on every single line (hence the contortions around context bounds to try and minimize it)
  3. Dependency injection via implicits, somewhat similar to above. I wrote a compiler plugin back in the day to automatically add (implicit foo: Foo) to all definitions in annotated files (GitHub - lihaoyi/sinject: SInject is a Scala compiler plugin which helps auto-generate implicit parameters), to remove the boilerplate of tediously declaring the implicit over and over and over.

  4. Re-using parameter lists between functions, without forcing the user to construct and pass in a config object, e.g. in Requests-Scala, the same parameter list is copy-pasted 4 times (1 2 3 4) with minor tweaks. There are some other places where the copy-pasta happens in expressions that aren't particularly amenable to being solved via macros (1 2).

    • If the method signatures are exactly the same, then I could get away with defining a class Foo { def apply(...) } and instantiating Foo multiple times. That is what is done for requests.get/post/etc. to re-use the signatures. But in the case of .get/.get.stream/Request(...), the signatures are slightly different, which means I have to copy-paste-edit the whole thing each time I want a new one

    • These cases could be resolved by some kind of **kwargs keyword-argument-expansion language feature as exists in Python: both at the call site, "expanding" a case class via foo("hello", **bar) into a bunch of keyword arguments foo("hello", qux = bar.qux, baz = bar.baz), and on the definition side, where one could define def foo(s: String, bar: MyCaseClass**) and have it automatically expand into a bunch of keyword parameters (with types and defaults): def foo(s: String, qux: Int, baz: String)
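To make the first item in the list above concrete, here is a minimal sketch (names simplified from the real Sjsonnet AST, so purely illustrative) of the kind of repetition a macro annotation could, in principle, inject automatically:

```scala
case class Position(line: Int, col: Int)

// Every AST node must re-declare the same `pos: Position` field by hand,
// even though the trait already promises that it exists.
sealed trait Expr { def pos: Position }
case class Num(pos: Position, value: Double)  extends Expr
case class Str(pos: Position, value: String)  extends Expr
case class BinaryOp(pos: Position, op: String, lhs: Expr, rhs: Expr) extends Expr
// ...and so on for dozens more node types, each repeating `pos: Position`
```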

I also do a bunch of faux-macro-annotations that don't need to introduce stuff visible to the typer, but bundle up metadata or definitions for use at runtime. These requirements are probably satisfied by the "transparent" macro annotations that run purely after the typer:

  1. mainargs @main
  2. Cask @get, @post, @postJson, @websockets, etc.
  3. Mill def myCommand() = T.command{ ... } (not quite an annotation - it is discovered based on return type instead - but it works basically the same way)
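As an illustration of the first item, the mainargs flavor of this looks roughly like the following (a sketch from memory of the mainargs README, so exact parameter names may differ). The point is that @main doesn't need to introduce new definitions visible to the typer; it just tags methods whose names, parameters, and defaults get bundled up for CLI parsing:

```scala
import mainargs.{main, arg, ParserForMethods}

object Cli:
  // @main records the method's signature so the parser can map
  // command-line flags onto it; no new definitions are injected.
  @main
  def greet(
      @arg(doc = "who to greet") name: String,
      count: Int = 1
  ): Unit =
    for _ <- 0 until count do println(s"Hello, $name!")

  def main(args: Array[String]): Unit =
    ParserForMethods(this).runOrExit(args)
```

Running e.g. `greet --name Alice --count 2` would then print the greeting twice, with all the flag parsing derived from the method signature.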

Now, I wonā€™t say that the way Scala allows you to abstract over definitions is bad. You can get surprisingly far with traits, type parameters, higher-kinded types, implicits, and so on. But thereā€™s definitely a gap there.

In other languages you might not even notice this boilerplate, because everything is so boilerplatey it kind of blends together. But in Scala, given how nice we can make a lot of our expression-related code with functions, HoFs, by-name params, and macros, these areas of clunkiness really stand out and are probably the motivation for a lot of the requests for macro annotations.

5 Likes

Perhaps another set of places where the current Scala features for abstracting over definitions are insufficient is the area around ORMs:

Slick

final case class Coffee(name: String, price: Double)
// Next define how Slick maps from a database table to Scala objects
class Coffees(tag: Tag) extends Table[Coffee](tag, "COFFEES") {
  def name  = column[String]("NAME")
  def price = column[Double]("PRICE")
  def * = (name, price).mapTo[Coffee]
}
// The `TableQuery` object gives us access to Slick's rich query API
val coffees = TableQuery[Coffees]

ScalikeJDBC

import java.time._
case class Member(id: Long, name: Option[String], createdAt: ZonedDateTime)
object Member extends SQLSyntaxSupport[Member] {
  override val tableName = "members"
  def apply(rs: WrappedResultSet) = new Member(
    rs.long("id"), rs.stringOpt("name"), rs.zonedDateTime("created_at"))
}

In general, ORMs need a few things:

  1. They need some kind of case class representing a row in the database table, with each field in the case class representing a single entry in the corresponding database column

  2. They need some kind of object representing the database table itself, with each field in that object representing the entire database column as-a-whole. This may have table-level or column-level configuration, and support table-level or column-level operations

The case class and the object usually have a lot of similarities, but it is impossible to encapsulate this boilerplate using normal Scala language features.

  1. You cannot, for example, define a trait and use that to auto-generate the case class signature and object members.

  2. You might be able to use a sufficiently abstract trait to enforce that the case class and object have matching sets of column definitions. But as has been discussed earlier, merely enforcing that the boilerplate matches a particular pattern is not enough. People want to encapsulate the boilerplate so they don't see it!
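A sketch of that second point (all names here are invented for illustration, not a real library): a trait can force the two halves to stay in sync, but the user still has to write both halves out by hand:

```scala
// Hypothetical minimal "ORM": the trait can *enforce* that a table
// object exists for the row type, but it cannot *generate* either half.
case class Column[A](name: String)

trait TableDef[Row]:
  def tableName: String

// The row shape, written out once...
case class Coffee(name: String, price: Double)

// ...and then essentially the same shape again, as column definitions.
object Coffees extends TableDef[Coffee]:
  val tableName = "COFFEES"
  val name  = Column[String]("NAME")
  val price = Column[Double]("PRICE")
```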

What ends up happening is one of two things:

  1. Listing out all the columns in the database table twice: once for the case class and once for the object.

  2. Moving the configurability from the object into magic annotations on the case class, and generating the object using an expression-macro. This gives up considerable flexibility and introduces its own weird DSL: there is no standard for annotations, and annotations can do just about anything. This is what Squeryl and Quill do with their table[T] and query[T] macros, respectively:

Squeryl

class Book(
  val id: Long,
  var title: String,
  @Column("AUTHOR_ID") // the default 'exact match' policy can be overridden
  var authorId: Long,
  var coAuthorId: Option[Long]
) {
  def this() = this(0, "", 0, Some(0L))
}

val books = table[Book]

Squeryl also supports more dynamic configuration via schema objects, as a sort of look-aside table containing an odd DSL:

object Library extends Schema {
  on(borrowals)(b => declare(
    b.numberOfPhonecallsForNonReturn defaultsTo(0),
    b.borrowerAccountId is(indexed),
    columns(b.scheduledToReturnOn, b.borrowerAccountId) are(indexed)
  ))

  on(authors)(s => declare(
    s.email is(unique, indexed("idxEmailAddresses")), // indexes can be named explicitly
    s.firstName is(indexed),
    s.lastName is(indexed, dbType("varchar(255)")), // the default column type can be overridden
    columns(s.firstName, s.lastName) are(indexed)
  ))
}

Quill

case class Circle(radius: Float)

val areas = quote {
  query[Circle].map(c => pi * c.radius * c.radius)
}

Quill goes a different way: instead of annotations, configuration gets pulled in via implicit resolution:

def example = {
  implicit val personSchemaMeta = schemaMeta[Person]("people", _.id -> "person_id")

  ctx.run(query[Person])
  // SELECT x.person_id, x.name, x.age FROM people x
}

These workarounds work, but they're not ideal. People want to define their database table as a case class and object pair: the object representing the entire table and the case class representing a row within it, with many similarities but also many differences, and with either one configurable separately if you need something unusual.

People don't want to jump through hoops with annotations that get read by magic expression-macros to do their thing, or be forced to define their config in some look-aside data structure, or have the configuration of their database table be pieced together via implicit resolution. But given the boilerplate of duplicating all definitions N times to set up the case class/object pair, the weird ad-hoc workarounds become attractive.

If we could allow users to write an annotation macro that expands predictably into a case class/object pair, with some programmable defaults and user-definable overrides, that would obviate the need for a lot of these crazy contortions that ORM libraries go through to let users define and configure their schema in a type-safe way.
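For instance (all names here are entirely hypothetical, sketching the idea rather than any existing library), such a macro annotation could let the user write the row type once and expand it, predictably and materially, into the familiar pair:

```scala
// What the user might write:
//   @table("COFFEES")
//   case class Coffee(name: String, price: Double)
//
// What the annotation might visibly expand to:
case class Coffee(name: String, price: Double)

object Coffee:
  case class Column[A](name: String)
  val tableName = "COFFEES"
  // One column value per case class field, derived from the field names.
  val name  = Column[String]("NAME")
  val price = Column[Double]("PRICE")
```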

9 Likes

Thank you for expanding in such detail on my remark that "I want to abstract away the persistence layer"!

It shows exactly where all the boilerplate is. :+1:

To give the example more weight: imagine something like a web CMS. There you often have dozens or even hundreds of flat tables. The "logic" operating on them is almost always the same. Basically CRUD, with some hooks.

In Scala you would currently need to write all the semi-complex repetitive code out by hand.

Compare with something like Java's JPA:

@Entity
public class T {
    @Id private K id;
    // ... rest of Java boilerplate for the data type
}

@Repository
public interface TRepository extends JpaRepository<T, K> {}

// somewhere else:
private final TRepository tRepository; // gets injected...
// ...
tRepository.findById(id);

(And the above could be abstracted even further, in the case of something like the mentioned CMS, if you had some kind of code templating... Just spit out this code snippet for all kinds of Ts.)

You get basically everything for free, just from some magic annotations. That's why people use stuff like Spring. Even a junior dev can be very productive with it because things are really simple and straightforward!

Java-land is code-gen land. Same for Go (even if it has gotten better since they got "generics").

The many uses of Rust macros are also prominent examples.

1 Like

As a small addition, since I just realized that this wasn't mentioned anywhere here:

Python, a language marketed as simple and approachable even for beginners, also has excellent meta-programming features that everybody uses on a day-to-day basis.

https://python-3-patterns-idioms-test.readthedocs.io/en/latest/PythonDecorators.html

https://python-3-patterns-idioms-test.readthedocs.io/en/latest/Metaprogramming.html

The whole purpose of metaclasses is, of course, the programmatic introduction of new class definitions.

Also, you can see "decorators" (~ macro annotations) everywhere in Python! From data classes to serialization frameworks, all kinds of boilerplate reduction, down to validation, logging, and debugging aids.

It's really hard to write real-world Python without at least some decorators.

People seem to love them.

Please note that decorators are, in some regards, quite similar to Scala's implicit functions. Over there, people think you can't live without them and use them everywhere. In Scala, implicit functions are "dreaded" because someone overused them somewhere in the past and someone else complained very loudly... And what happened then was the complete overreaction already mentioned above.

(The same goes, by the way, for implicit conversions: when you look at them in C#, everybody loves them! But Scala has been trying to fight them lately... The "issues" with them are imho mostly a marketing thing. At some point, Google's auto-suggest even returned "bad" as the top continuation of the query "Scala implicit conversions"; but for "C# implicit conversion" it spat out positively annotated and helpful content every time I tried.)

2 Likes

Eureka! After (literal) years of experimentation (though, luckily for my psyche, extremely intermittent experimentation), I've stumbled upon a solution* to this problem. *At least, a solution for my particular problem subspace.

The Problem

I've been pining for one specific aspect of Scala 2's meta-programming: the ability to outfit a companion object with a set of methods, loosely derived from the structure of the underlying trait or case class.

I'm recalling the anguish of writing lenses in longhand:

case class Person(name: String, age: Int)

object Person:
  val name = Lens[Person](_.name)(p => name => p.copy(name = name))
  val age  = Lens[Person](_.age)(p => age => p.copy(age = age))

Whereas, in Scala 2, we had:

@deriveLenses
case class Person(name: String, age: Int)

Person.name // Lens[Person, String]
Person.age  // Lens[Person, Int]

Similarly, a common pattern of boilerplate besmirching many a ZIO codebase is that of "accessors":

trait ExampleService:
  def add(x: Int, y: Int): Task[Int]

object ExampleService:
  def add(x: Int, y: Int): ZIO[ExampleService, Throwable, Int] = 
    ZIO.serviceWithZIO(_.add(x, y))

So much needless RSI, when we simply could've written:

@deriveAccessors
trait ExampleService:
  def add(x: Int, y: Int): Task[Int]

Solution

I've taken five or six abortive stabs at this problem throughout the years. The closest I'd found previously is the Selectable pattern. The pattern, described in this issue, didn't work at first due to the lack of autocomplete support (hence the issue, which has since been addressed :partying_face:).

So, until now, the best I'd had was this (copy-pasted from the issue, so ignore the commented caveat):

case class Person(name: String, age: Int)

object Person {
  val lenses = Lenses.gen[Person]
}

Person.lenses.name // For this to be tenable, this would need to autocomplete with the type Lens[Person, String]

You know, this ain't too bad. But that little gap of convenience, of needing to call through some intermediate Selectable value, has been gnawing at me. I wanted to call Person.name or ExampleService.method directly.

And so, I've finally concocted a way of doing this. I'm surprised it works at all, to be honest. It is, essentially, the daisy-chaining together of the Selectable pattern with a Conversion and a given macro, allowing for arbitrary macro-generated extension methods. It's a neat trick, and I'm lucky to have found it, because there are about 12 subtle variations which all fail spectacularly. I was on the verge of giving up when it finally compiled.

With this trick in place, we get the following:

case class Person(name: String, age: Int, isAlive: Boolean)
object Person extends DeriveLenses

@main
def example(): Unit =
  val person  = Person("Alice", 42, true)
  val name    = Person.name.get(person)
  val age     = Person.age.get(person)
  val isAlive = Person.isAlive.get(person)

  println(s"Name: $name, Age: $age, Is Alive: $isAlive")

It's still not perfect, as one must extend the companion object, which means one must still define the companion object even if it's otherwise unnecessary. Yet, save for that blemish, this long-sought-after syntactic summit is finally reachable.

It works for the ZIO accessor pattern as well:

trait ExampleService:
  def launchRockets(): Task[Unit]
  def addNumbers(a: Int, b: Int): UIO[Int]

object ExampleService extends DeriveAccessors

object Example extends ZIOAppDefault:
  val program: ZIO[ExampleService, Throwable, Unit] =
    for
      _ <- ExampleService.addNumbers(1, 2)
      _ <- ExampleService.launchRockets()
    yield ()

  val run =
    program.provide(ExampleServiceLive.layer)

The other downside, of course, is that the transparent inline defs required to make this work only truly work with Metals. So IDEA is decidedly uninvited to the party. I really hope this changes before long, but that's a separate issue.

Code

The implementation of DeriveLenses is here: quotidian/examples/shared/src/main/scala/quotidian/examples/lens/LensMacros.scala at main · kitlangton/quotidian · GitHub

As you can see, it's a thin, yet necessary, wrapper around some other macro-generated bits.

trait DeriveLenses:
  given conversion( 
      using
      cc: CompanionClass[this.type],
      lenses: LensesFor[cc.Out]
  ): Conversion[this.type, lenses.Out] =
    _ => lenses.lenses

I hope the pattern can be simplified somewhat (open to suggestions!), but at least it's nice and clean at the call-site.

Final Entreaty

Of course, what I'd really love is the reinstatement of this particular subset of annotation macros. It sure would be neat if they could once again extend the companion object with arbitrary helper methods.

Luckily, one can achieve the same effect with this combination of non-experimental Scala 3 macros + other mechanisms. Therefore, if anyone fears the consequences of such a feature, well: be afraid now! :stuck_out_tongue_winking_eye: The only issue is that we're about 2% shy of syntactic perfection.

Anyhow, thanks for reading! I hope this was useful/entertaining/distracting-from-some-chronic-pain-now-reminded-of. And, just to end on a positive note: any frustration I express, now or ever, over the Scala 3 macro system is born of pure joy and love. It's been so much fun messing around with (and trying to break) it over all these years. :heart: Endless gratitude to all who build and maintain it.

20 Likes

I believe you could use the "computed field names" support in dotty/docs/_docs/reference/experimental/named-tuples.md at named-tuples-2 · dotty-staging/dotty · GitHub to avoid the conversion.

3 Likes