Support for arbitrary evaluation of `Expr[A]` -> `A` at compile-time

Current Support

Currently, the only way to go from Expr[A] -> A in a Scala 3 macro is to use a FromExpr. To implement a FromExpr[A], you essentially have to pattern match on an arbitrary tree, which severely limits what you can support.

For example, you get a FromExpr[String] by default, which can extract only literal strings. You could define your own FromExpr[String] that also matches on something like String + String and String.toUpperCase/String.toLowerCase, but then the expression is "hi there, " + Person.toString, and your ability to match on it is gone with the wind.
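
To make that limitation concrete, here is a minimal sketch of such a hand-written extractor. The name ExtendedStringFromExpr is hypothetical, and the three tree shapes it recognizes are just the ones mentioned above; anything else falls back to the default literal-only extraction via expr.value.

```scala
import scala.quoted.*

// Hypothetical extractor, not part of the standard library.
// It recovers a String only from a small, fixed set of tree shapes.
object ExtendedStringFromExpr extends FromExpr[String] {
  def unapply(expr: Expr[String])(using Quotes): Option[String] =
    expr match {
      // a + b, where both sides can themselves be extracted
      case '{ ($a: String) + ($b: String) } =>
        for { x <- unapply(a); y <- unapply(b) } yield x + y
      case '{ ($a: String).toUpperCase } => unapply(a).map(_.toUpperCase)
      case '{ ($a: String).toLowerCase } => unapply(a).map(_.toLowerCase)
      // anything else: fall back to the default FromExpr[String] (literals only)
      case _ => expr.value
    }
}
```

Every additional shape needs another case, and a call such as Person.toString has no case at all, so the extractor returns None for it.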

Context

In order to explain all of this, I will need to set up a well-defined example. I will start with a lightweight version of the "Macro Mirror" helper I generally use, and then an example typeclass to derive. Please note that, for the sake of brevity, some of these examples are severely simplified compared to how you would really implement them.

Lightweight Macro Mirror

This helper, whose usage is demonstrated below, makes it extremely easy to grab all the bits you generally need to derive typeclass instances.

import scala.quoted.*

trait CaseClass[A] {

  val name: String
  val tpe: Type[A]
  val fields: List[Field[?]]

  def instantiate(exprs: List[Expr[?]]): Expr[A]

  def mapFields[O](f: [b] => Type[b] ?=> Field[b] => O): List[O] =
    fields.map { _field =>
      type B
      val field: Field[B] = _field.asInstanceOf[Field[B]]
      f[B](using field.tpe)(field)
    }

  def optionalAnnotationExpr[T <: scala.annotation.Annotation: Type](using Quotes): Option[Expr[T]]

  trait Field[B] {

    val name: String
    val tpe: Type[B]

    def select(expr: Expr[A])(using Quotes): Expr[B]
    def summonInstance[F[_]: Type](using Quotes): Expr[F[B]]

    def optionalAnnotationExpr[T <: scala.annotation.Annotation: Type](using Quotes): Option[Expr[T]]

  }

}
object CaseClass {

  def of[A: Type](using Quotes): CaseClass[A] = ???

}

Example Typeclass

A basic schema. Things can be NonObjectLike, or ObjectLike. Things that are ObjectLike have a list of Field, which has a name and the MySchema for that field.

import scala.quoted.*
import scala.reflect.ClassTag

sealed trait MySchema[A] {

  val typeName: String
  def transform[B](ab: A => B, ba: B => A)(using ct: ClassTag[B]): MySchema[B]

}
object MySchema {

  case class flattenFields() extends scala.annotation.Annotation

  trait NonObjectLike[A] extends MySchema[A] {
    override final def transform[B](ab: A => B, ba: B => A)(using ct: ClassTag[B]): MySchema.NonObjectLike[B] = MySchema.TransformNonObjectLike(this, ct.runtimeClass.getName, ab, ba)
  }

  final case class TransformNonObjectLike[A, B](a: NonObjectLike[A], typeName: String, ab: A => B, ba: B => A) extends NonObjectLike[B]

  trait ObjectLike[A] extends MySchema[A] {
    def fields: List[Field[?]] // declared as a def so that implementations can override it with a lazy val
    override final def transform[B](ab: A => B, ba: B => A)(using ct: ClassTag[B]): MySchema.ObjectLike[B] = MySchema.TransformObjectLike(this, ct.runtimeClass.getName, ab, ba)
  }

  final case class TransformObjectLike[A, B](a: ObjectLike[A], typeName: String, ab: A => B, ba: B => A) extends ObjectLike[B] {
    override lazy val fields: List[Field[?]] = a.fields
  }

  final case class Field[B](
      name: String,
      schema: MySchema[B],
  )

  private def derivedImpl[A: Type](using Quotes): Expr[MySchema[A]] = ???

  inline def derived[A]: MySchema[A] = ${ derivedImpl[A] }

}

Example Types

  given MySchema[Int] = ???
  given MySchema[String] = ???
  given MySchema[Boolean] = ???

  final case class CaseClass1(int: Int) derives MySchema

  final case class CaseClass2(string1: String, string2: String)
  object CaseClass2 {

    final case class Repr(string: String) derives MySchema

    private val reg = "^([^:]*):([^:]*)$".r

    given MySchema[CaseClass2] =
      summon[MySchema[CaseClass2.Repr]].transform(
        _.string match {
          case reg(a, b) => CaseClass2(a, b)
          case _         => ???
        },
        cc2 => CaseClass2.Repr(s"${cc2.string1}:${cc2.string2}"),
      )

  }

  final case class CaseClass3(cc1: CaseClass1, cc2: CaseClass2, boolean: Boolean) derives MySchema

Problem Statement

You can get a lot, and I mean A LOT, done with the current situation. You are limited, but there is almost always a way to "get it done". In this context, though, "getting it done" usually means pushing error validation and/or optimization from compile-time to run-time.

Here are a few things you could do with such a schema, where you are forced into the run-time phase unless your macro essentially implements an entire compiler. It is my hope that the compiler could implement the compiler :joy: .

Generating and Validating SQL

  private def allFieldNames(prefix: String, schema: MySchema[?]): List[String] =
    schema match {
      case _: MySchema.NonObjectLike[?]   => prefix :: Nil
      case schema: MySchema.ObjectLike[?] =>
        schema.fields.flatMap { field =>
          val newPrefix = if prefix.isEmpty then field.name else s"${prefix}_${field.name}"
          allFieldNames(newPrefix, field.schema)
        }
    }

  def sqlSelectAllQuery(schema: MySchema[?]): String =
    s"SELECT ${allFieldNames("", schema).mkString(", ")} FROM ${schema.typeName}"

  def conflictingFieldNames(schema: MySchema[?]): Set[String] = {
    val allFields: List[String] = allFieldNames("", schema)

    allFields.groupBy(identity)
      .iterator
      .flatMap { case (k, vs) => Option.when(vs.size != 1)(k) }
      .toSet
  }

Once you have this schema, it is very easy to write a function which could generate a select all SQL query or validate that fields are distinct. The problem? In your macro, you are only able to get Expr[MySchema[?]], not MySchema[?]. With only an Expr[MySchema[?]], it is not possible to do that generation at compile-time. Sure, you could do:

  def sqlSelectAllQueryCompileTime(schema: Expr[MySchema[?]])(using Quotes): Expr[String] = '{ sqlSelectAllQuery($schema) }
  def conflictingFieldNamesCompileTime(schema: Expr[MySchema[?]])(using Quotes): Expr[Set[String]] = '{ conflictingFieldNames($schema) }

But that requires you to wait until run-time to do that validation.
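
To illustrate what those functions produce once real values are available, here is a small run-time sketch. SimpleSchema, Leaf and Obj are hypothetical stand-ins for MySchema (no type parameter, no transform), and the type names are simplified; the traversal logic mirrors the functions above.

```scala
// Hypothetical, simplified stand-in for MySchema, for illustration only.
sealed trait SimpleSchema { def typeName: String }
final case class Leaf(typeName: String) extends SimpleSchema
final case class Obj(typeName: String, fields: List[(String, SimpleSchema)]) extends SimpleSchema

// Collects the underscore-joined path of every leaf field.
def allFieldNames(prefix: String, schema: SimpleSchema): List[String] =
  schema match {
    case _: Leaf => prefix :: Nil
    case Obj(_, fields) =>
      fields.flatMap { case (name, child) =>
        val newPrefix = if prefix.isEmpty then name else s"${prefix}_${name}"
        allFieldNames(newPrefix, child)
      }
  }

def sqlSelectAllQuery(schema: SimpleSchema): String =
  s"SELECT ${allFieldNames("", schema).mkString(", ")} FROM ${schema.typeName}"

// Names that occur more than once after flattening the paths.
def conflictingFieldNames(schema: SimpleSchema): Set[String] =
  allFieldNames("", schema).groupBy(identity).collect { case (k, vs) if vs.size != 1 => k }.toSet

val cc1 = Obj("CaseClass1", List("int" -> Leaf("Int")))
val cc2 = Obj("CaseClass2", List("string" -> Leaf("String")))
val cc3 = Obj("CaseClass3", List("cc1" -> cc1, "cc2" -> cc2, "boolean" -> Leaf("Boolean")))

// A field literally named "a_b" collides with the nested path a -> b.
val clash = Obj("Clash", List("a_b" -> Leaf("Int"), "a" -> Obj("A", List("b" -> Leaf("Int")))))

val query     = sqlSelectAllQuery(cc3)       // "SELECT cc1_int, cc2_string, boolean FROM CaseClass3"
val conflicts = conflictingFieldNames(clash) // Set("a_b")
```

With an actual schema value both results are trivial to compute; the point is that a macro holding only an Expr of the schema cannot compute either of them during compilation.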

Json Field Flattening

  def deriveCaseClass[A: Type](using Quotes): Expr[MySchema.ObjectLike[A]] = {
    val caseClass: CaseClass[A] = CaseClass.of[A]

    val childFieldExprs1: List[Expr[List[MySchema.Field[?]]]] =
      caseClass.mapFields[Expr[List[MySchema.Field[?]]]] { [b] => _ ?=> (field: caseClass.Field[b]) =>
        val flatten: Boolean = field.optionalAnnotationExpr[MySchema.flattenFields].nonEmpty
        val instance: Expr[MySchema[b]] = field.summonInstance[MySchema]
        // Yes, I summoned `MySchema[b]` instead of conditionally summoning `MySchema.ObjectLike[b]`. I will explain why later.

        if flatten then generateFlattenedFields[A, b](caseClass)(field, instance)
        else generateNestedField[A, b](caseClass)(field, instance)
      }

    val childFieldExprs2: Expr[List[MySchema.Field[?]]] =
      '{ ${ Expr.ofList(childFieldExprs1) }.flatten }

    '{
      new MySchema.ObjectLike[A] {
        override val typeName: String = ${ Expr(caseClass.name) }
        override lazy val fields: List[MySchema.Field[?]] = $childFieldExprs2
      }
    }
  }

  private def generateNestedField[A: Type, B: Type](caseClass: CaseClass[A])(field: caseClass.Field[B], instanceExpr: Expr[MySchema[B]])(using Quotes): Expr[List[MySchema.Field[B]]] =
    '{
      List(
        MySchema.Field[B](
          name = ${ Expr(field.name) },
          schema = $instanceExpr,
        ),
      )
    }

  private def generateFlattenedFields[A: Type, B: Type](caseClass: CaseClass[A])(field: caseClass.Field[B], instanceExpr: Expr[MySchema[B]])(using Quotes): Expr[List[MySchema.Field[?]]] =
    '{
      val instance: MySchema[B] = $instanceExpr
      instance match
        case instance: MySchema.ObjectLike[B] => instance.fields
        case _: MySchema.NonObjectLike[B]     => throw new RuntimeException(s"Can not flatten `MySchema.NonObjectLike` for field ${${ Expr(field.name) }}")
    }

Again, since we are unable to go from Expr[MySchema[A]] -> MySchema[A] at compile-time, we are stuck holding the bag and have to rely on doing it at run-time. In a very simple example like this one, we could have been stricter and required that we were able to summon a MySchema.ObjectLike[b], but let’s say that is too much to ask of the user of our macro in a more complicated case. Even if it were a realistic ask, it doesn’t help with the SQL example above. It’s the same underlying issue: we cannot arbitrarily go from Expr[A] -> A.

Failed Attempts

I attempted to use the scala3-staging library to do this, as it has the exact function signature I am looking for: Expr[A] => A. That didn’t work, as the Expr did not come from quotes provided by the Compiler.

[error]     |Exception occurred while executing macro expansion.
[error]     |scala.quoted.runtime.impl.ScopeException: Cannot use Expr oustide of the macro splice `${...}` or the scala.quoted.staging.run(...)` where it was defined

Desired Functionality

A function with the following signature, which works without a FromExpr:

  object compiler {
    def compileTimeEval[A](expr: Expr[A])(using Quotes): A = ???
  }

Which could be used in the following way:

  trait Tags[A] {
    def tags: Set[String]
  }
  object Tags {

    def const[A](t: String*): Tags[A] =
      new Tags[A] {
        override val tags: Set[String] = t.toSet
      }

    final case class Both[A, B](a: Tags[A], b: Tags[B]) extends Tags[A & B] {
      override lazy val tags: Set[String] = a.tags ++ b.tags
    }

    private def tagsForBothImpl[A: Type, B: Type](using quotes: Quotes): Expr[Tags[A & B]] = {
      import quotes.reflect.*
      val aInstanceExpr: Expr[Tags[A]] = Expr.summon[Tags[A]].getOrElse { report.errorAndAbort(s"Missing Tags for ${Type.show[A]}") }
      val bInstanceExpr: Expr[Tags[B]] = Expr.summon[Tags[B]].getOrElse { report.errorAndAbort(s"Missing Tags for ${Type.show[B]}") }
      val aInstance: Tags[A] = compiler.compileTimeEval { aInstanceExpr }
      val bInstance: Tags[B] = compiler.compileTimeEval { bInstanceExpr }
      val overlap: Set[String] = aInstance.tags & bInstance.tags
      
      // it could also be possible here to have some sort of flag or ENV var which would flip this back and forth from compile-time to run-time (DEV/PROD).
      if overlap.nonEmpty then
        report.errorAndAbort(
          s"""Overlap between Tags for ${Type.show[A]} & ${Type.show[B]}
           |  ${Type.show[A]} : ${aInstance.tags.mkString(", ")}
           |  ${Type.show[B]} : ${bInstance.tags.mkString(", ")}
           |  overlap: ${overlap.mkString(", ")}
           |""".stripMargin,
        )

      // here, you could technically choose to just inline a call to `Tags.const`,
      // but I arbitrarily chose to keep the original Expr[A]/Expr[B].
      '{ Tags.Both($aInstanceExpr, $bInstanceExpr) }
    }
    inline def tagsForBoth[A, B]: Tags[A & B] = ${ tagsForBothImpl[A, B] }

  }
  final class ClassA
  final class ClassB
  final class ClassC

  given aTags: Tags[ClassA] = Tags.const("A", "extra")
  given bTags: Tags[ClassB] = Tags.const("B")
  given cTags: Tags[ClassC] = Tags.const("C", "extra")

  val ab: Tags[ClassA & ClassB] = Tags.tagsForBoth[ClassA, ClassB] // inlines: Tags.Both(aTags, bTags)
  val bc: Tags[ClassB & ClassC] = Tags.tagsForBoth[ClassB, ClassC] // inlines: Tags.Both(bTags, cTags)
  val ac: Tags[ClassA & ClassC] = Tags.tagsForBoth[ClassA, ClassC] // compile error, overlap: extra
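
For contrast, here is a sketch of what the same overlap check looks like when it has to happen at run-time, which is what is possible today. The method tagsForBothRuntime is hypothetical and exists only for this illustration; the rest repeats the definitions above so the snippet stands alone.

```scala
trait Tags[A] {
  def tags: Set[String]
}
object Tags {

  def const[A](t: String*): Tags[A] =
    new Tags[A] {
      override val tags: Set[String] = t.toSet
    }

  final case class Both[A, B](a: Tags[A], b: Tags[B]) extends Tags[A & B] {
    override lazy val tags: Set[String] = a.tags ++ b.tags
  }

  // Hypothetical run-time counterpart of tagsForBoth: the same check, but the
  // overlap is reported as a Left when the program runs, not as a compile error.
  def tagsForBothRuntime[A, B](using a: Tags[A], b: Tags[B]): Either[String, Tags[A & B]] = {
    val overlap: Set[String] = a.tags & b.tags
    if overlap.nonEmpty then Left(s"overlap: ${overlap.toList.sorted.mkString(", ")}")
    else Right(Both(a, b))
  }

}

final class ClassA
final class ClassB
final class ClassC

given aTags: Tags[ClassA] = Tags.const("A", "extra")
given bTags: Tags[ClassB] = Tags.const("B")
given cTags: Tags[ClassC] = Tags.const("C", "extra")

val abRuntime = Tags.tagsForBothRuntime[ClassA, ClassB] // Right(...), no overlap
val acRuntime = Tags.tagsForBothRuntime[ClassA, ClassC] // Left("overlap: extra"), found only when executed
```

The ClassA/ClassC conflict compiles cleanly here and only surfaces once the value is evaluated.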

It is very confusing to me why a compileTimeEval like this is not already allowed and supported.

  • There is an extension library to the compiler which is able to evaluate an arbitrary Expr[Expr[Expr[A]]] at runtime, which would presumably mean supporting arbitrary conversion of Expr[A] -> A.
  • The compiler must also already support using code within the same project at compile time if you can define a class in a project, and use it within a macro.

Given this, why would the Scala 3 macro system not give you an Expr[A] -> A? I understand it’s complicated, but it seems like all the necessary bits to support it are already there, and it’s just a deliberate decision to not allow it?

Scala already has one of the best macro systems in any language, and besides a more usable API, not having this compile-time eval functionality feels like it’s the only thing keeping it from being essentially perfect.

I would love compile-time evaluation of arbitrary expressions (incl. lambdas, i.e. Expr[A => B] => (A => B)). I asked @nicolasstucki about it at ScalaDays 2023. I recall him noting it would require a full TASTy interpreter (i.e. a lot of work) and, I think, handling some edge cases (maybe regarding calling out to Java?).

I wish someone could undertake this project, as it would give Scala compile-time multi-stage programming.

How do things like this get done? The Scala core team deems it valuable enough and prioritizes it?

The part that I don’t get is that there is this runtime stage compiler which already exists, and is able to have multiple levels of staging, which can generate code, load it, execute it, and have that generate more code, repeat. This already exists. In order to do that, it gives you a quotes instance. You already have a quotes instance at compile-time.

I must be severely under-thinking this, because it feels like all the pieces already exist for this to happen. If someone knows why the pieces that already exist for runtime multi-stage programming, plus the fact that macros can load code from within the same project, are not enough to make this happen without tons of extra work, an explanation would be greatly appreciated :smiley:

Given the existing restriction that you cannot call a macro in the same file that defines it, it seems to me that the requirement would be that, if you have an Expr[?], the Term underlying that Expr cannot:

  1. Be in the same file as the macro
  2. Be in the same file as the macro caller
  3. Have a direct or transitive dependency on the file calling the macro

And as long as that’s the case, you just compile that file, and shove it into the class loader the same way the runtime multi-stage compiler works?

My sense is that big language features need to be worth someone’s PhD or post-doc project, otherwise, they don’t get done. Sigh.

I think that your proposed constraints might suffice to implement it on top of the existing tools. Which would be cool and not require a PhD project :slight_smile:

In particular, I think the main difference between staging.run and a macro is that, in the following example:

val x: Boolean = <big-fat-expression>

staging.run {
  if x then Expr("x is true") else Expr("x is false")
}

// vs.

myMacro {
  if x then "x is true" else "x is false"
}

the Expr[String] to be compiled and run by staging.run does not depend on anything in the same file (indeed, the Expr[String] is simply either Expr("x is true") or Expr("x is false")).

On the other hand, when myMacro would try to evaluate its argument (of type Expr[String]) by compiling and running it, it would hit a reference to x which is not yet compiled.

Note, however, that the restriction of not depending on anything in the same file also excludes your intended use:

final class ClassA
final class ClassB

given aTags: Tags[ClassA] = Tags.const("A", "extra")
given bTags: Tags[ClassB] = Tags.const("B")

// error: expressions passed to compiler.compileTimeEval reference
// the not-yet-compiled ClassA, ClassB, aTags, bTags
val ab: Tags[ClassA & ClassB] = Tags.tagsForBoth[ClassA, ClassB]

For this simple example, it would not work, assuming that’s all in one file. It could easily be split into ClassA.scala, ClassB.scala and util.scala/Main.scala, and then it would work.

As for the if x then "" else "" example, the reference to the same file seems to fit into the category of only being in the same file by choice. It seems x could easily be defined elsewhere. I could definitely see how this might decrease usability, but it sounds WAY better to me than not having this feature at all… Maybe an initial implementation could work for this use case, and then potentially be relaxed by covering simple cases (IDK, just a thought).

I agree. Would love to see a constrained version of this.