Scala 3, macro annotations and code generation

Yes, something like that! :smiley:

How it works in the end under the hood, I don't care actually.

But the "templating" needs to be sane, safe, and convenient even for less skilled people.

My impression was that the current quote stuff in Scala, with its Expr[?] abstraction, would make a really great "templating language". It's the best I've seen so far, as it's type-safe!

The trigger points that would deliver the data to the "templates" would be hand-written annotated definitions (of, for example, case classes).

The results of the triggered code-gen need to be "material", as anything else would be way too much opaque magic that can't be debugged reasonably.

And yes, such a feature would be extremely useful! The lives of lesser beings consist in large part of writing repetitive, boilerplate-heavy code. Cutting this down to the bare minimum would make Scala especially attractive to the average John Doe programmer. It would be almost a killer feature for some jobs, making mundane tasks really easy, without compromising on safety or tooling support (unlike the stringy code templates that are currently the only way to achieve the stated goal in Scala).

Just think about the large market share of poor web devs, working in all kinds of languages, who do mostly nothing other than write this kind of "boilerplate": defining entities, plus the code that brings them over the wire and persists them on the server side. Most of it is copy-paste, just with entity and field names replaced. A framework that could abstract this away would be a game changer! A Spring killer...

Thanks a lot for trying to understand what the pain points are, and what would make things substantially better! That's something I love Scala for. People are listening. (You sometimes just need to cry loudly enough... :grin:)

3 Likes

It's worth noting that annotation processors are among the most popular tools in Java-land right now (MapStruct, Immutables, Micronaut), and that somehow the annotation processor produces Java files whose code is visible in the same compilation unit: you are able to use the generated definitions in the same file where you introduced the annotations that produce the generated code.
I don't know how this magic happens, but it is there, and it is very necessary for the general usage of annotation processors, in Java at least.

1 Like

You can use Kotlin compiler plugins in other contexts too, including Maven and REPL, but it is nice that IDEA's error highlighting doesn't get too confused by the syntactic absence of generated stuff.

Someone already linked the Dotty example. Some examples from my own code:

  1. uPickle generates JSON serializers for each arity of tuple (upickle/build.sc at f9bf9984e5175e5f4b2020db17f99e26a3037250 · com-lihaoyi/upickle · GitHub). Not sure if this will fully go away with Scala 3, or whether we'll need to keep the current implementation for performance.

  2. Templatized generics: files like upickle/ujson/templates-jvm/DoubleToDecimalElem.java at main · com-lihaoyi/upickle · GitHub have their Elem string replaced by Byte or Char, effectively specializing/monomorphizing them and avoiding the boxing that would arise with generics. This is similar to what's done in Java-land for specialized collections like fastutil or Koloboke Collections.

  3. IDL codegen: at work we do build-time codegen from .proto schemas, OpenAPI specs, and AWS API specs to provide typed RPCs. The goal here is primarily to provide type-safe access to something defined outside the Scala codebase. In my personal projects, I've also used ScalablyTyped, which works via codegen.

There are also places where I haven't bitten the bullet and used codegen, but where there's tons of boilerplate that cannot otherwise be made to go away:

  1. Defining a whole bunch of related case classes with the same field, e.g. all Exprs in the Sjsonnet config compiler have a pos: Position field (sjsonnet/sjsonnet/src/sjsonnet/Expr.scala at master · databricks/sjsonnet · GitHub). They all extend a trait, so .pos can be used seamlessly in downstream code, but it's tedious to have to include pos: Position in every single case class declaration when there are a lot of them.

  2. Injecting implicits throughout all methods in an object, e.g. FastParse's def number[$: P] context bound. Having a context bound or implicit/using param isn't a big deal when you have a few of them, but when you have hundreds of them, one on every line, even the smallest amount of boilerplate gets old.

    • Multiple implicits can easily be combined into a single implicit via wrappers (e.g. here), but in many cases, such as for FastParse rules, even a single implicit is a ton of boilerplate when it happens on every single line (hence the contortions around context bounds to try and minimize it)
  3. Dependency injection via implicits, somewhat similar to above. I wrote a compiler plugin back in the day to automatically add (implicit foo: Foo) to all definitions in annotated files (GitHub - lihaoyi/sinject: SInject is a Scala compiler plugin which helps auto-generate implicit parameters), to remove the boilerplate of tediously declaring the implicit over and over and over.

  4. Re-using parameter lists between functions, without forcing the user to construct and pass in a config object, e.g. in Requests-Scala, the same parameter list is copy-pasted 4 times (1 2 3 4) with minor tweaks. There are some other places where the copy-pasta happens in expressions that aren't particularly amenable to being solved via macros (1 2).

    • If the method signatures are exactly the same, then I could get away with defining a class Foo { def apply(...) } and instantiating Foo multiple times. That is what is done for requests.get/post/etc. to re-use the signatures. But in the case of .get/.get.stream/Request(...), the signatures are slightly different, which means I have to copy-paste-edit the whole thing each time I want a new one

    • These cases could be resolved by some kind of **kwargs keyword-argument-expansion language feature as exists in Python: both at the call site, "expanding" a case class via foo("hello", **bar) into a bunch of keyword arguments foo("hello", qux = bar.qux, baz = bar.baz), and on the definition side, where one could define def foo(s: String, bar: MyCaseClass**) and have it automatically expand into a bunch of keyword parameters (with types and defaults): def foo(s: String, qux: Int, baz: String)
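To make the first item in the list above concrete, here is a minimal sketch (names simplified from the real Sjsonnet AST, so purely illustrative) of the kind of repetition a macro annotation could, in principle, inject automatically:

```scala
case class Position(line: Int, col: Int)

// Every AST node must re-declare the same `pos: Position` field by hand,
// even though the trait already promises that it exists.
sealed trait Expr { def pos: Position }
case class Num(pos: Position, value: Double)  extends Expr
case class Str(pos: Position, value: String)  extends Expr
case class BinaryOp(pos: Position, op: String, lhs: Expr, rhs: Expr) extends Expr
// ...and so on for dozens more node types, each repeating `pos: Position`
```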

I also do a bunch of faux-macro-annotations that don't need to introduce stuff visible to the typer, but bundle up metadata or definitions for use at runtime. These requirements are probably satisfied by the "transparent" macro annotations that run purely after the typer:

  1. mainargs @main
  2. Cask @get, @post, @postJson, @websockets, etc.
  3. Mill def myCommand() = T.command{ ... } (not quite an annotation - it is discovered based on return type instead - but it works basically the same way)
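As an illustration of the first item, the mainargs flavor of this looks roughly like the following (a sketch from memory of the mainargs README, so exact parameter names may differ). The point is that @main doesn't need to introduce new definitions visible to the typer; it just tags methods whose names, parameters, and defaults get bundled up for CLI parsing:

```scala
import mainargs.{main, arg, ParserForMethods}

object Cli:
  // @main records the method's signature so the parser can map
  // command-line flags onto it; no new definitions are injected.
  @main
  def greet(
      @arg(doc = "who to greet") name: String,
      count: Int = 1
  ): Unit =
    for _ <- 0 until count do println(s"Hello, $name!")

  def main(args: Array[String]): Unit =
    ParserForMethods(this).runOrExit(args)
```

Running e.g. `greet --name Alice --count 2` would then print the greeting twice, with all the flag parsing derived from the method signature.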

Now, I wonā€™t say that the way Scala allows you to abstract over definitions is bad. You can get surprisingly far with traits, type parameters, higher-kinded types, implicits, and so on. But thereā€™s definitely a gap there.

In other languages you might not even notice this boilerplate, because everything is so boilerplatey it kind of blends together. But in Scala, given how nice we can make a lot of our expression-related code with functions, HoFs, by-name params, and macros, these areas of clunkiness really stand out and are probably the motivation for a lot of the requests for macro annotations.

5 Likes

Perhaps another set of places where the current Scala features for abstracting over definitions are insufficient is the area around ORMs:

Slick

final case class Coffee(name: String, price: Double)
// Next define how Slick maps from a database table to Scala objects
class Coffees(tag: Tag) extends Table[Coffee](tag, "COFFEES") {
  def name  = column[String]("NAME")
  def price = column[Double]("PRICE")
  def * = (name, price).mapTo[Coffee]
}
// The `TableQuery` object gives us access to Slick's rich query API
val coffees = TableQuery[Coffees]

ScalikeJDBC

import java.time._
case class Member(id: Long, name: Option[String], createdAt: ZonedDateTime)
object Member extends SQLSyntaxSupport[Member] {
  override val tableName = "members"
  def apply(rs: WrappedResultSet) = new Member(
    rs.long("id"), rs.stringOpt("name"), rs.zonedDateTime("created_at"))
}

In general, ORMs need a few things:

  1. They need some kind of case class representing a row in the database table, with each field in the case class representing a single entry in the corresponding database column

  2. They need some kind of object representing the database table itself, with each field in that object representing the entire database column as-a-whole. This may have table-level or column-level configuration, and support table-level or column-level operations

The case class and the object usually have a lot of similarities, but it is impossible to encapsulate this boilerplate using normal Scala language features.

  1. You cannot, for example, define a trait and use that to auto-generate the case class signature and object members.

  2. You might be able to use a sufficiently abstract trait to enforce that the case class and object have matching sets of column definitions. But as has been discussed earlier, merely enforcing that the boilerplate matches a particular pattern is not enough. People want to encapsulate the boilerplate so they don't see it!
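A sketch of that second point (all names here are invented for illustration, not a real library): a trait can force the two halves to stay in sync, but the user still has to write both halves out by hand:

```scala
// Hypothetical minimal "ORM": the trait can *enforce* that a table
// object exists for the row type, but it cannot *generate* either half.
case class Column[A](name: String)

trait TableDef[Row]:
  def tableName: String

// The row shape, written out once...
case class Coffee(name: String, price: Double)

// ...and then essentially the same shape again, as column definitions.
object Coffees extends TableDef[Coffee]:
  val tableName = "COFFEES"
  val name  = Column[String]("NAME")
  val price = Column[Double]("PRICE")
```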

What ends up happening is one of two things:

  1. Listing out all the columns in the database table twice: once for the case class and once for the object.

  2. Moving the configurability from the object into magic annotations on the case class, and generating the object using an expression-macro. This gives up considerable flexibility and introduces its own weird DSL: there is no standard for annotations, and annotations can do just about anything. This is what Squeryl and Quill do with their table[T] and query[T] macros, respectively:

Squeryl

class Book(
  val id: Long,
  var title: String,
  @Column("AUTHOR_ID") // the default 'exact match' policy can be overridden
  var authorId: Long,
  var coAuthorId: Option[Long]
) {
  def this() = this(0, "", 0, Some(0L))
}

val books = table[Book]

Squeryl also supports more dynamic configuration via schema objects, as a sort of look-aside table containing an odd DSL:

object Library extends Schema {
  on(borrowals)(b => declare(
    b.numberOfPhonecallsForNonReturn defaultsTo(0),
    b.borrowerAccountId is(indexed),
    columns(b.scheduledToReturnOn, b.borrowerAccountId) are(indexed)
  ))

  on(authors)(s => declare(
    s.email is(unique, indexed("idxEmailAddresses")), // indexes can be named explicitly
    s.firstName is(indexed),
    s.lastName is(indexed, dbType("varchar(255)")), // the default column type can be overridden
    columns(s.firstName, s.lastName) are(indexed)
  ))
}

Quill

case class Circle(radius: Float)

val areas = quote {
  query[Circle].map(c => pi * c.radius * c.radius)
}

Quill goes a different way: instead of annotations, configuration gets pulled in via implicit resolution:

def example = {
  implicit val personSchemaMeta = schemaMeta[Person]("people", _.id -> "person_id")

  ctx.run(query[Person])
  // SELECT x.person_id, x.name, x.age FROM people x
}

These workarounds work, but they're not ideal. People want to define their database table as a case class and object pair: the object representing the entire table and the case class representing a row within it, with many similarities but also many differences, and with either one configurable separately if you need something unusual.

People don't want to jump through hoops with annotations that get read by magic expression-macros to do their thing, or be forced to define their config in some look-aside data structure, or have the configuration of their database table be pieced together via implicit resolution. But given the boilerplate of duplicating all definitions N times to set up the case class/object pair, the weird ad-hoc workarounds become attractive.

If we could allow users to write an annotation macro that expands predictably into a case class/object pair, with some programmable defaults and user-definable overrides, that would obviate the need for a lot of these crazy contortions that ORM libraries go through to let users define and configure their schema in a type-safe way.
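For instance (all names here are entirely hypothetical, sketching the idea rather than any existing library), such a macro annotation could let the user write the row type once and expand it, predictably and materially, into the familiar pair:

```scala
// What the user might write:
//   @table("COFFEES")
//   case class Coffee(name: String, price: Double)
//
// What the annotation might visibly expand to:
case class Coffee(name: String, price: Double)

object Coffee:
  case class Column[A](name: String)
  val tableName = "COFFEES"
  // One column value per case class field, derived from the field names.
  val name  = Column[String]("NAME")
  val price = Column[Double]("PRICE")
```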

9 Likes

Thank you for expanding in such detail on my remark that "I want to abstract away the persistence layer"!

It shows exactly where all the boilerplate is. :+1:

To give the example more weight: imagine something like a web CMS. There you often have dozens or even hundreds of flat tables. The "logic" operating on them is almost always the same. Basically CRUD, with some hooks.

In Scala you would currently need to write all the semi-complex repetitive code out by hand.

Compare with something like Java's JPA:

@Entity
public class T {
    @Id private K id;
    // ... rest of Java boilerplate for the data type
}

@Repository
public interface TRepository extends JpaRepository<T, K> {}

// somewhere else:
private final TRepository tRepository; // gets injected...
// ...
tRepository.findById(id);

(And the above could be abstracted even further, in the case of something like the mentioned CMS, if you had some kind of code templating... Just spit out this code snippet for all kinds of Ts.)

You get basically everything for free, just from some magic annotations. That's why people use stuff like Spring. Even a junior dev can be very productive with it because things are really simple and straightforward!

Java-land is code-gen land. Same for Go (even if it has gotten better since they got "generics").

The many uses of Rust macros are also prominent examples.

1 Like

As a small addition, since I just realized that this wasn't mentioned anywhere here:

Python, a language marketed as simple and approachable even for beginners, also has excellent meta-programming features that everybody uses on a day-to-day basis.

https://python-3-patterns-idioms-test.readthedocs.io/en/latest/PythonDecorators.html

https://python-3-patterns-idioms-test.readthedocs.io/en/latest/Metaprogramming.html

The whole purpose of metaclasses is, of course, the programmatic introduction of new class definitions.

Also, you can see "decorators" (~ macro annotations) everywhere in Python! From data classes to serialization frameworks, all kinds of boilerplate reduction, down to validation, logging, and debugging aids.

It's really hard to write real-world Python without at least some decorators.

People seem to love them.

Please note that decorators are, in some regards, quite similar to Scala's implicit functions. Over there, people think you can't live without them and use them everywhere. In Scala, implicit functions are "dreaded" because someone overused them somewhere in the past and someone else complained very loudly... And what happened then was the complete overreaction already mentioned above.

(The same goes, by the way, for implicit conversions: when you look at them in C#, everybody loves them! But Scala has been trying to fight them lately... The "issues" with them are imho mostly a marketing thing. At some point, Google's auto-suggest even returned "bad" as the top continuation of the query "Scala implicit conversions"; but for "C# implicit conversion" it spat out positively annotated and helpful content every time I tried.)

2 Likes

Eureka! After (literal) years of experimentation (though, luckily for my psyche, extremely intermittent experimentation), I've stumbled upon a solution* to this problem. *At least, a solution for my particular problem subspace.

The Problem

I've been pining for one specific aspect of Scala 2's meta-programming: the ability to outfit a companion object with a set of methods, loosely derived from the structure of the underlying trait or case class.

I'm recalling the anguish of writing lenses in longhand:

case class Person(name: String, age: Int)

object Person:
  val name = Lens[Person](_.name)(p => name => p.copy(name = name))
  val age  = Lens[Person](_.age)(p => age => p.copy(age = age))

Whereas, in Scala 2, we had:

@deriveLenses
case class Person(name: String, age: Int)

Person.name // Lens[Person, String]
Person.age  // Lens[Person, Int]

Similarly, a common pattern of boilerplate besmirching many a ZIO codebase is that of "accessors":

trait ExampleService:
  def add(x: Int, y: Int): Task[Int]

object ExampleService:
  def add(x: Int, y: Int): ZIO[ExampleService, Throwable, Int] = 
    ZIO.serviceWithZIO(_.add(x, y))

So much needless RSI, when we simply could've written:

@deriveAccessors
trait ExampleService:
  def add(x: Int, y: Int): Task[Int]

Solution

I've taken five or six abortive stabs at this problem throughout the years. The closest I'd found previously is the Selectable pattern. The pattern, described in this issue, didn't work at first due to the lack of autocomplete support (hence the issue, which has since been addressed :partying_face:).

So, until now, the best I'd had was this (copy-pasted from the issue, so ignore the commented caveat):

case class Person(name: String, age: Int)

object Person {
  val lenses = Lenses.gen[Person]
}

Person.lenses.name // For this to be tenable, this would need to autocomplete with the type Lens[Person, String]

You know, this ain't too bad. But that little gap of convenience, of needing to call through some intermediate Selectable value, has been gnawing at me. I wanted to call Person.name or ExampleService.method directly.

And so, I've finally concocted a way of doing this. I'm surprised it works at all, to be honest. It is, essentially, the daisy-chaining together of the Selectable pattern with a Conversion and a given macro, allowing for arbitrary macro-generated extension methods. It's a neat trick, and I'm lucky to have found it, because there are about 12 subtle variations which all fail spectacularly. I was on the verge of giving up when it finally compiled.

With this trick in place, we get the following:

case class Person(name: String, age: Int, isAlive: Boolean)
object Person extends DeriveLenses

@main
def example(): Unit =
  val person  = Person("Alice", 42, true)
  val name    = Person.name.get(person)
  val age     = Person.age.get(person)
  val isAlive = Person.isAlive.get(person)

  println(s"Name: $name, Age: $age, Is Alive: $isAlive")

It's still not perfect, as one must extend the companion object, which means one must still define the companion object even if it's otherwise unnecessary. Yet, save for that blemish, this long-sought-after syntactic summit is finally reachable.

It works for the ZIO accessor pattern as well:

trait ExampleService:
  def launchRockets(): Task[Unit]
  def addNumbers(a: Int, b: Int): UIO[Int]

object ExampleService extends DeriveAccessors

object Example extends ZIOAppDefault:
  val program: ZIO[ExampleService, Throwable, Unit] =
    for
      _ <- ExampleService.addNumbers(1, 2)
      _ <- ExampleService.launchRockets()
    yield ()

  val run =
    program.provide(ExampleServiceLive.layer)

The other downside, of course, is that the transparent inline defs required to make this work only truly work with Metals. So IDEA is decidedly uninvited to the party. I really hope this changes before long, but that's a separate issue.

Code

The implementation of DeriveLenses is here: quotidian/examples/shared/src/main/scala/quotidian/examples/lens/LensMacros.scala at main · kitlangton/quotidian · GitHub

As you can see, it's a thin, yet necessary, wrapper around some other macro-generated bits.

trait DeriveLenses:
  given conversion( 
      using
      cc: CompanionClass[this.type],
      lenses: LensesFor[cc.Out]
  ): Conversion[this.type, lenses.Out] =
    _ => lenses.lenses

I hope the pattern can be simplified somewhat (open to suggestions!), but at least it's nice and clean at the call-site.

Final Entreaty

Of course, what I'd really love is the reinstatement of this particular subset of annotation macros. It sure would be neat if they could once again extend the companion object with arbitrary helper methods.

Luckily, one can achieve the same effect with this combination of non-experimental Scala 3 macros + other mechanisms. Therefore, if anyone fears the consequences of such a feature, well: be afraid now! :stuck_out_tongue_winking_eye: The only issue is that we're about 2% shy of syntactic perfection.

Anyhow, thanks for reading! I hope this was useful/entertaining/distracting-from-some-chronic-pain-now-reminded-of. And, just to end on a positive note: any frustration I express, now or ever, over the Scala 3 macro system is born of pure joy and love. It's been so much fun messing around with (and trying to break) it over all these years. :heart: Endless gratitude to all who build and maintain it.

20 Likes

I believe you could use the "computed field names" support in dotty/docs/_docs/reference/experimental/named-tuples.md at named-tuples-2 · dotty-staging/dotty · GitHub to avoid the conversion.

3 Likes