Pre-SIP: Usable quotes API in std lib

Motivation

After much effort as a user of the Scala language, it seems apparent to me that the way the quotes API is defined makes it unfortunately hard to use. Several factors cause this.

  1. The manner in which the std lib makes all the types within quotes.reflect abstract makes the API extremely difficult to use.

    1. Intellij cannot resolve the extension methods at all. So the many developers who use Intellij and would like to program against this API are severely limited in their code completion. This is debilitating when trying to use an API with compiler-level complexity. Maybe Metals is better; I’m not sure.

    2. One might say “Intellij just needs to fix its completion”, but that does not change the fact that this style of definition is (A) strange and (B) completely loses exhaustive pattern matching. Now imagine trying to write symbol.tree mat..., looking to match on the tree of a symbol: not only can the IDE not find .tree, but even if you tell it with (symbol.tree: Tree) mat.., you can’t exhaustively match on the options. Many people would probably quit here. Speaking from personal experience, I gave up on this API twice before I finally wrote an entire wrapper around it and was then able to make actual progress.

      type DefDef <: ValOrDefDef
      
      given DefDefTypeTest: TypeTest[Tree, DefDef]
      
      val DefDef: DefDefModule
      
      trait DefDefModule { this: DefDef.type =>
        def apply(symbol: Symbol, rhsFn: List[List[Tree]] => Option[Term]): DefDef
        def copy(original: Tree)(name: String, paramss: List[ParamClause], tpt: TypeTree, rhs: Option[Term]): DefDef
        def unapply(ddef: DefDef): (String, List[ParamClause], TypeTree, Option[Term])
      }
      
      given DefDefMethods: DefDefMethods
      
      trait DefDefMethods:
        extension (self: DefDef)
          def paramss: List[ParamClause]
      
          def leadingTypeParams: List[TypeDef]
      
          def trailingParamss: List[ParamClause]
      
          def termParamss: List[TermParamClause]
      
          def returnTpt: TypeTree
      
          def rhs: Option[Term]
        end extension
      end DefDefMethods
      
  2. The scoped way in which types like Term and TypeRepr are defined makes them very restrictive to work with at scale. When writing a large or complicated program, it’s necessary to be able to split code up into multiple files, and to define types and abstractions for your domain logic.

    1. Imagine trying to define the following type:

      final case class Function(
          rootTree: Tree,
          params: List[Function.Param],
          body: Term,
      )
      object Function {
      
            final case class Param(
                name: String,
                tpe: TypeRepr,
                tree: Tree,
                fromInput: Option[Expr[Any] => Expr[Any]],
            )
      
      }
      
    2. In order to do this, you have to either define it within your function body, like:

      def myCode(using quotes: Quotes): Any = {
        import quotes.reflect.*
        final case class Function // ...
        
        // do stuff with Function
      }          
      
    3. Or within a class, like:

      final class MyCode(using val quotes: Quotes) {
        import quotes.reflect.*
        final case class Function // ...
      
        // do stuff with Function
      }
      
    4. It would be insane to try to define all this within a single function, so many open-source libs go with an approach like #3. But even then, it’s very easy to end up with multi-thousand-line files inside something like MyCode, because the type system makes it very difficult to split things out. It is sometimes possible, if you try really hard, to split things into separate files, but even then there are limitations, and it makes things very messy:

      final class Types1(using val quotes: Quotes) {
        import quotes.reflect.*
      
        final case class Function(
            rootTree: Tree,
            params: List[Function.Param],
            body: Term,
        )
        object Function {
      
          final case class Param(
              name: String,
              tpe: TypeRepr,
              tree: Tree,
              fromInput: Option[Expr[Any] => Expr[Any]],
          )
      
          def parse(term: Term): Function =
            ??? // TODO (KR) :
      
        }
      
      }
      
      final class Types2[Q <: Quotes](using val quotes: Q) {
        import quotes.reflect.*
      
        final case class Function(
            rootTree: Tree,
            params: List[Function.Param],
            body: Term,
        )
        object Function {
      
          final case class Param(
              name: String,
              tpe: TypeRepr,
              tree: Tree,
              fromInput: Option[Expr[Any] => Expr[Any]],
          )
      
          def parse(term: Term): Function =
            ??? // TODO (KR) :
      
        }
      
      }
      
      final class Logic1(using val quotes: Quotes) {
        val types: Types1 = Types1(using quotes)
        import quotes.reflect.*
        import types.*
      
        def getFunction(term: Term): Function =
          Function.parse(term) // error, wrong `Quotes` type
      
      }
      
      final class Logic2(using val quotes: Quotes) {
        val types: Types1 = Types1(using quotes)
        import types.*
        import types.quotes.reflect.* // this matters
      
        def getFunction(term: Term): Function =
          Function.parse(term)
      
      }
      
      final class Logic3(using val quotes: Quotes) {
        val types: Types2[quotes.type] = Types2(using quotes)
        import quotes.reflect.*
        import types.*
      
        def getFunction(term: Term): Function =
          Function.parse(term)
      
      }
      
    5. It seems like it should be very intuitive to be able to do something along the lines of:

      import scala.quoted.ast.*
      
      final case class Function(
          rootTree: Tree,
          params: List[Function.Param],
          body: Term,
      )
      object Function {
      
        final case class Param(
            name: String,
            tpe: TypeRepr,
            tree: Tree,
            fromInput: Option[Expr[Any] => Expr[Any]],
        )
      
        def parse(term: Term): Function =
          ??? // TODO (KR) :
      
      }
      
      def myCode(expr: Expr[Any])(using quotes: Quotes): Function =
        Function.parse(expr.asTerm)
      
    6. As a general principle, it seems that if using an API borderline forces you to define any related logic in a single file, it is not designed properly.
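The difference between Logic1 and Logic3 above can be reproduced outside of a macro. The following is a minimal sketch, assuming a toy Universe trait as a stand-in for Quotes. Universe, StringUniverse, Helpers1, and Helpers2 are hypothetical names that only mirror the shape of the classes above.

```scala
// Toy stand-in for Quotes: a module with an abstract type member.
trait Universe {
  type Term
  def mkTerm(name: String): Term
  def show(term: Term): String
}

object StringUniverse extends Universe {
  type Term = String
  def mkTerm(name: String): Term = name
  def show(term: Term): String = s"term($term)"
}

// Like Types1: the field is typed as plain Universe, so the singleton type is lost.
final class Helpers1(val u: Universe) {
  def render(term: u.Term): String = u.show(term)
}

// Like Types2: the type parameter can carry the singleton type of the instance.
final class Helpers2[U <: Universe](val u: U) {
  def render(term: u.Term): String = u.show(term)
}

def logic(u: Universe): String = {
  val term: u.Term = u.mkTerm("x")

  val h1 = new Helpers1(u)
  // h1.render(term) // does not compile: found u.Term, required h1.u.Term

  val h2 = new Helpers2[u.type](u)
  h2.render(term) // compiles: h2.u has type u.type, so h2.u.Term is u.Term
}
```

The only thing that changes between the two helpers is whether the type of the wrapped module remembers which instance it is, which is the same distinction as Types1 versus Types2[quotes.type].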

Potential Downfalls

It is possible that there is something inherent to the Quotes API that forces all instances to be scoped to the same Quotes instance, but this seems unlikely, for a few reasons:

  1. The API enforces that the only way to get an instance of Quotes is via an inline def plus a spliced implementation, so it’s not as if there are many instances of Quotes coming from different roots. You only ever get an initial instance from one place, and any nested instances are derived from that one.
  2. If you really care about the exact instance of Quotes which a Symbol or Tree belongs to, it seems far more detrimental to have an API that encourages files thousands of lines long, with dependent types everywhere, and the only thing making it usable is a global import quotes.reflect.* at the top. Therefore, if you are quoting and splicing Exprs, and have helper types with something like final case class MyType(repr: TypeRepr), then any MyType created in some nesting technically has the wrong Quotes instance.

Suggested Design


package scala.quoted.ast

trait Quoted private[ast] {
  def quotes: Quotes
}

trait Symbol private[ast] extends Quoted

sealed trait Tree extends Quoted {

  def symbol: Symbol

}

sealed trait Statement extends Tree

sealed trait Term extends Statement

sealed trait Definition extends Statement

sealed trait ValOrDefDef extends Definition

trait ValDef private[ast] extends ValOrDefDef
object ValDef {
  
  def apply(symbol: Symbol, rhs: Option[Term])(using quotes: Quotes): ValDef =
    quotes.reflect.ValDef.apply(symbol, rhs)
  
}

This way, you still need an instance of Quotes to create instances of things, but you are not burdened with AST nodes being scoped as an inner class. And then the implementations can happen elsewhere, like:

package scala.quoted.ast.impl

private[quoted] trait Tree { self: ast.Tree =>

  def symbol: ast.Symbol = implemented
  
}

private[quoted] final case class ValDef(quotes: Quotes, /* ... */) extends ast.ValDef

Also, without the limitation of the inner classes, you can have nice top level definitions like:

final class Expressions[F[_]] // ...

type Id[A] = A

trait ProductMirror[A] {

  val tpe: Type[A]
  val label: String
  val fields: Seq[ProductMirror.Field[?]]

  final case class Field[B](
      idx: Int,
      name: String,
      sym: Symbol, // no quotes nesting, just a normal class
      tpe: Type[B],
      get: Expr[A] => Expr[B],
  ) {

    def getExpr[F[_]](expressions: Expressions[F]): F[B] = ??? // ...

    def typeClass[F[_]]: Expr[F[B]] = ??? // ...

  }

  def typeClasses[F[_]]: Expressions[F] = ??? // ...

  def instantiate(f: [b] => Field[b] => b): A = ??? // ...

  def instantiateEither[L](f: [b] => Field[b] => Either[L, b]): Either[L, A] = ??? // ...

  // and many other very easily usable builders, no fighting with scope

}

And derivation is just as easy:

trait Show[A] {
  def show(a: A): String
}
object Show {

  def product[A](g: ProductMirror[A])(using quotes: Quotes): Expr[Show[A]] = {
    def fields(a: Expr[A]): Seq[Expr[String]] =
      g.fields.flatMap { f =>
        Seq(Expr(f.name + " = "), '{ ${ f.typeClass[Show] }.show(${ f.get(a) }) })
      }
    def all(a: Expr[A]): Seq[Expr[String]] =
      Seq(
        Seq(Expr(g.label + "(")),
        fields(a),
        Seq(Expr(")")),
      ).flatten

    '{
      new Show[A] {
        def show(a: A): String = ${ Expr.ofSeq(all('a)) }.mkString
      }
    }
  }

}

In a world without this nesting constantly getting in your way, IMO, there is really no need for mirrors. All the mirrors do is give you a boatload of asInstanceOf, and uncertainty about what is inlined and what isn’t.

With a usable API, it’s actually easier and more type safe to implement generic type classes directly with quotes/exprs, instead of type-level summon functions and mirrors. But doing this currently requires jumping through all sorts of extra hoops.

It also gives you far more control over the code that’s generated. And you can easily do things like caching lazy vals outside your instance, so that when you summon an instance, it just gets the cached lazy val.

TLDR: there are lots of amazing things you can do with quotes & exprs, but the way the API is defined makes the programmer’s life considerably more difficult.

I would suggest you attempt to create this as an external library first that uses the existing scheme internally.

I would say I don’t see this ever changing again, to avoid breakage for all those who have already moved to Scala 3 macros. But if you create a library that is better and sits on top, that’s the proper way forward to support the ‘old’ API and offer a better one.

@soronpo, before I posted this yesterday, this is what I was going through the exercise of doing. The reason I made the post suggesting the change be made to the std lib instead of an external library is as follows:

  1. This API is HUGE, and unless there was some kind of code generation mechanism, keeping it in sync with the std lib would be humanly impossible.
  2. The effort of converting these types back and forth was very grueling, and it compiled very slowly, so it seemed easier to do the conversion at the std lib level.

Do you think there would be an openness to contributing this to the std lib, if it could be done in a backwards compatible manner?

The std lib very rarely changes and the changes are also minor and backwards compatible. And I bet you can actually do most of the work with AI. Just write a partial API of what you want to do so a pattern is clearly understandable by the AI and ask it to complete it for you.

I really don’t see why it should be like this and even if so, I’m skeptical a new API from scratch would be much faster.

If you can make this backwards compatible, sure. I’m not sure this needs to go through a SIP process. Currently changes to the stdlib are up to the decision of the compiler team.

What would be the best place to get feedback from the compiler team? Discord? Github issue?

I feel your pain. Not only is the API hard to use, it is a footgun, too.

Regarding exhaustivity, though, I don’t think you can hope for exhaustive pattern matches, unless you freeze the language. If you are pattern matching on user-provided code, I think you have to settle for supporting only a defined subset of code constructs, and try to give a helpful error message otherwise.
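One way to read this advice is sketched below on a toy AST, since a real macro needs a separate compilation unit to run. Tree, Lit, Add, and Unsupported are hypothetical stand-ins, not quotes.reflect types. In an actual macro, the fallback branch would more likely call report.errorAndAbort than return a Left.

```scala
// Toy AST standing in for user-provided code.
sealed trait Tree
final case class Lit(value: Int) extends Tree
final case class Add(lhs: Tree, rhs: Tree) extends Tree
// Stand-in for any construct this "macro" was not written to handle.
final case class Unsupported(source: String) extends Tree

// Handles the supported subset and reports everything else with a message.
def eval(tree: Tree): Either[String, Int] = tree match {
  case Lit(value) => Right(value)
  case Add(lhs, rhs) =>
    for {
      l <- eval(lhs)
      r <- eval(rhs)
    } yield l + r
  case other =>
    Left(s"unsupported construct: $other (only literals and additions are supported)")
}
```

The catch-all case is what keeps the code working when new constructs appear, at the cost of never getting a compiler warning about the cases it silently rejects.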

Regarding the current design of Quotes as a module with type definitions inside, one can legitimately question, as you do, whether it does not cause more problems than it solves. I’d even say that your example of

final class Logic3(using val quotes: Quotes) {
  val types: Types2[quotes.type] = Types2(using quotes)

is still rather mild in terms of how complex things can get.

However, my preferred solution would be to improve usability of this sort of modular programming, as it would be useful much more broadly than just the Quotes API. There are some ideas in this thread.

To clarify, I think that the functionality provided by macros in Scala 3 is absolutely amazing. The amount of power under the hood, and the things you can achieve with it, are remarkable. The only thing I have a problem with here is usability and scoping.

On your point about not having exhaustive pattern matching, this is a huge bummer, and a totally fair point. That being said, I still think having actual traits defined, instead of a dependent type B <: A improves both:

  1. The compiler’s ability to understand the code, and generate match statements at all. Having the IDE write this for you:
    x match {
      case A => ???
      case B => ???
      case C => ???
      case _ => ???
    }
    
    is still WAY better than getting:
    x match {
      case 
    }
    
    because it can’t figure out anything.
  2. The programmer’s ability to understand the code. All of this module or dependent nesting, whether or not the semantics and usability of such a concept are improved, feels like overkill and is confusing, in my opinion. Why should the user be exposed to such levels of complexity? It feels very natural to me that I have a scala.quoted.Expr[?], and as long as I have a given Quotes instance, I can do .asTerm and get a scala.quoted.ast.Term. Then I can click on .asTerm, because the IDE actually knows that it exists, and see trait Term, defined like a normal type that I understand and see every day as a Scala developer.

@TomasMikula My 2 questions for you would be:

  1. What value do you see being derived from having these types defined within some dependent module, and just having a better way to express that, as opposed to them being top-level definitions?
  2. Could you provide an example of how your module improvement proposal would look for the following example? Admittedly, I had a bit of a difficult time following the other example, potentially because it was not as related to real examples I have experienced. Would it be possible to define the following helper as a top-level definition?
    final class K0[Q <: Quotes](using val quotes: Q) {
      import quotes.reflect.*
    
      trait ProductGeneric[A] {
    
        val fields: Seq[Field[?]]
    
        final case class Field[I](
            idx: Int,
            symRepr: Symbol,
            constructorSymRepr: Symbol,
            typeRepr: TypeRepr,
            tpe: Type[I],
            valDef: ValDef,
            get: Expr[A] => Expr[I],
        )
    
      }
    
    }
    

The quotes API was designed this way since it needs to hide the compiler. Without the compiler doing the actual work you would need to re-implement most of its functionality in the quotes implementation. This will take many years and the result will probably still not be a 100% match. So, that’s not a viable option. That leaves you with two possibilities:

  • Hide by type abstraction. That’s what’s done in the Quotes API.
  • Hide by wrapping everything. You’d need to wrap every exposed type and introduce global bijective maps to go back and forth without losing reference identity. I believe that’s also a lot of work, and I doubt the added complexity of the interface layer gets amortized by easier usage. But if you want to go ahead with the idea of an alternative facade for the quotes API that would be the way to go.
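The second option hinges on wrappers that preserve reference identity. A minimal sketch of that idea is shown below, assuming a hypothetical InternalTree as the compiler-side node and a hypothetical WrappedTree as the public facade (neither is a real compiler type), with an identity-keyed cache in between.

```scala
import java.util.IdentityHashMap

// Hypothetical compiler-internal node. It is a case class on purpose,
// so that two distinct nodes can be structurally equal.
final case class InternalTree(label: String)

// Hypothetical public facade around an internal node.
final class WrappedTree(val underlying: InternalTree)

object Wrappers {
  // Keyed by reference identity, not by equals/hashCode.
  private val cache = new IdentityHashMap[InternalTree, WrappedTree]()

  // The same internal node always maps to the same wrapper instance.
  def wrap(tree: InternalTree): WrappedTree =
    cache.computeIfAbsent(tree, t => new WrappedTree(t))

  def unwrap(wrapped: WrappedTree): InternalTree = wrapped.underlying
}
```

This sketch is not thread-safe and keeps every wrapped node strongly reachable, so a real facade would have to deal with both concerns for every exposed type, which hints at where the extra complexity comes from.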

One caveat though: There is no way a massive blob like that will land in the standard library without extensive trials in the community at large. So it will need to start life as a separate library. Then, if most people would agree that it’s an important improvement for their work, we can discuss whether to include this in stdlib at some later point.

When I say “my preferred solution would be to improve usability of this sort of modular programming”, I mean that this is where I’d prefer efforts be directed, given the current state. I didn’t mean to imply that the current design was superior.

Your K0 is already a top-level definition, so I suppose you want to make ProductGeneric top-level. But you already know how to do that, too. For example, you can add the [Q <: Quotes] type parameter or val quotes: Quotes member to the ProductGeneric trait:

trait ProductGeneric[Q <: Quotes, A] { ... }

// or 

trait ProductGeneric[A] {
  val quotes: Quotes
  ...
}

The problem arises when you need to convince the typechecker that, for example, TypeRepr inside p1: ProductGeneric is the same type as TypeRepr inside p2: ProductGeneric. The proposal (and the proposal linked from it) are supposed to help with that problem.

For example, it would allow you to define a type alias

type ProdGeneric(using q: Quotes)[A] = ProductGeneric[q.type, A]

// or

type ProdGeneric(using q: Quotes)[A] = ProductGeneric[A] { val quotes: q.type }

In any context where given q: Quotes is available, you would simply use the type ProdGeneric[A] (inferred to be ProdGeneric(using q)[A]). If you had p1, p2: ProdGeneric[A], the compiler would know that TypeRepr inside p1 is the same as TypeRepr in p2, which is the problem I was trying to solve.
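For comparison, the second alias can be approximated today by writing the refinement out by hand at every use site. The following is a minimal sketch on a toy module, not the proposed syntax: Reflect stands in for Quotes and Repr for TypeRepr, and Reflect, StringReflect, Holder, holder, and sameRepr are all hypothetical names.

```scala
// Toy stand-in for Quotes with one abstract type member.
trait Reflect {
  type Repr
  def mk(name: String): Repr
  def same(a: Repr, b: Repr): Boolean
}

object StringReflect extends Reflect {
  type Repr = String
  def mk(name: String): Repr = name
  def same(a: Repr, b: Repr): Boolean = a == b
}

// Top-level helper that carries its module as a member.
trait Holder {
  val u: Reflect
  val repr: u.Repr
}

// The refinement { val u: q.type } records which module the holder belongs to.
def holder(q: Reflect)(name: String): Holder { val u: q.type } =
  new Holder {
    val u: q.type = q
    val repr: u.Repr = q.mk(name)
  }

// Both holders are tied to q, so p1.repr and p2.repr are both q.Repr.
def sameRepr(q: Reflect)(p1: Holder { val u: q.type }, p2: Holder { val u: q.type }): Boolean =
  q.same(p1.repr, p2.repr)
```

The repeated { val u: q.type } in every signature is the noise that an alias parameterized by a given instance would presumably remove.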

As it is currently defined, a Quotes object is tightly coupled to the exact symbol and position where a macro is expanded from, so it’s not “global” in that sense.