What kinds of macros should Scala 3 support?

At the least, a handwave description of how some of the commonly-used whitebox use cases would be addressed in the new environment would probably go a long way towards alleviating the general angst about losing the old system. I think most folks have accepted that macro libraries will need rewriting – everyone just wants some confidence that the key functionality will still be possible.

(In principle, I very much like the new design – it’s much more elegant than the old system. But I can’t say I grok yet how it will apply to some of the more complex use cases…)

2 Likes

The hash is not enough for Bazel, and probably won’t work for make (and maybe Pants, I don’t know). Bazel insists on reading the declared inputs to decide whether it should rebuild. If you declare the .apihash as the input, then that is all the compile action gets to work with.

You don’t have a way to say “if this hash changes, then pass me this jar”; you can only declare your real inputs (Buck, by contrast, does allow you to customize the build-cache key).

The Bazel devs at Google are reluctant to change this because it is a non-trivial change to the model, and they are nervous about buggy implementations of this apihash. Since Bazel does aggressive distributed caching, correctness is a priority so as not to corrupt the cache.

I am still not quite sure how Tasty fits into this picture. If there is a 1-to-1 correspondence between Tasty constructs and the interface that programmers will actually use to consume and produce it (shown in definitions.scala), then what is the point of imposing Tasty in the first place? Does it not become all but an implementation detail?

It seems to me that most of the trickiness will come from those “semantic operations” added on top of the standard data structures. I imagine we can’t let each compiler implement them using its own infrastructure, as this would result in the very problems of compiler-dependent semantics that we already have. So in effect, if Scalac/nsc wants to support this new macro interface, won’t it have to basically embed the whole Dotty/dotc frontend to implement these operations?

One important point: Tasty is fully internally typed, so there should be no support for untyped trees, unlike the current macro system. This probably entails a quite different (possibly better) way of developing macros.
It is not listed in the blog post, but I assume one of the semantic operations will be to dynamically re-infer the types of reconstructed program fragments, which also seems to be a non-trivial task.
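
To make the contrast concrete, here is what “untyped trees” mean in today’s Scala 2 scala-reflect API (this is the current system, not the proposed one):

import scala.reflect.runtime.universe._

// Scala 2 quasiquotes build *untyped* trees: this is a plain Tree with
// no types attached, and `xs` does not even need to exist yet.
val tree: Tree = q"xs.map(_ + 1)"
println(showRaw(tree)) // Apply(Select(Ident(TermName("xs")), TermName("map")), ...)

In a fully-typed Tasty world there is no equivalent of this intermediate untyped state, which is exactly why re-inferring the types of reconstructed fragments would have to become a supported semantic operation.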

2 Likes

Should the dotty team be proactive and contact library authors to try and solve their issues together? Should library authors open whitebox issues on the dotty repo? (And all that is without knowing what hell goes on in private repos that use macros).

Yes to both. We will reach out to library authors to work with them to solve their issues so that their projects can be integrated into the new community build. The plan is, once we have a macro system working, we will gradually port projects, hopefully with the help of the authors. We will start with testing frameworks, because these are usually at the bottom of the dependency graph for any project, and work our way up. Opening issues in the dotty repo is a good way to start the conversation, but it might be best to hold off on that until we have a first version that works.

As far as I understand it, the problematic case is a whitebox macro or annotation macro that produces new definitions or refines the types of existing ones, where these definitions have to be accessed in the same compile. I.e. it is not feasible that all consumers of these abstractions reside in a downstream project.

@olafurpg did a survey of existing macro-uses, and classified them into categories. I can see two possible categories that would match the description above: typeclass derivation and compile-time type providers.

Typeclass derivation will be supported natively in the language, but we have more discussions and experimentation work to do before we can settle on a concrete scheme. Getting use cases and empirical data in this area would be very valuable.
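
To give a flavour only – the derives clause and derived method below are illustrative assumptions, not the settled design:

// Sketch: one possible shape for language-level typeclass derivation.
trait Eq[T] {
  def eqv(a: T, b: T): Boolean
}

object Eq {
  // In a native scheme the compiler would synthesize this from the
  // case class structure; the trivial body is a placeholder only.
  def derived[T]: Eq[T] = (a, b) => a == b
}

case class Point(x: Int, y: Int) derives Eq

// Usage: the derived instance is found like any other implicit:
// implicitly[Eq[Point]].eqv(Point(1, 2), Point(1, 2)) // true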

Compile-time type providers that produce Scala types from external sources like schema descriptions will need to be factored out into an upstream project. I.e. say you want to import a database schema S. Project A would contain the macros that read S and produce case classes that mirror S. Then all code that accesses these case classes would have to be in a different project. I believe that’s actually a saner way to go about things than to mix everything in one project. The advantage is that the generated types in project A can be inspected and verified separately - since the Doc tool info is integrated in Tasty you could even provide ScalaDocs for generated code!
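
As a sketch with hypothetical names, project A would end up exposing ordinary generated types:

// Project A: output of the schema-importing step, mirroring a
// hypothetical `users` table in schema S.
case class User(id: Long, email: String)

// Project B depends on project A and uses User like any library type:
// def primaryContact(users: List[User]): Option[String] =
//   users.headOption.map(_.email)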

I understand. The only thing I’m worried about is that we may not have the full picture of how whitebox macros are actually used. I suggest extending a “Call for whitebox macro issues” to library authors, encouraging them to open issues on the dotty repo. If a specific use case is already handled, the issue can be closed with a reference to the PR/blog post that handles it. I think it is best to properly document all the macro use cases, so we know ahead of time which Scala 3 solutions cover which macro use case (and see what we’re missing).

TL;DR, I propose:

  • Create a flag for whitebox macro support on the dotty bug tracker.
  • Encourage library authors to detail their whitebox macro use cases (maybe suggest a template), now.
  • Tickets for use cases that are already covered can be closed with a reference to the plan for handling them.

Call for whitebox macro issues

Related: Whitebox def macros

Yes, but I think the dotty tracker is a better way to handle this and also provide (very) detailed tickets.
I’m open to a separate bug tracker. That is OK too.

I added the following section to the post to summarize the relationship between Scalameta and Scala 3 macros.

Meta Programming in the Large

The future Scala 3 macro design is intended to replace the existing def macros and the scala.reflect infrastructure. But there is another meta programming system that is quite complementary to it: Scalameta provides high-quality syntactic and semantic analysis and code generation tools which are separate from the Scala compiler. As the name implies, Scalameta is run at the meta level, that is, it takes programs as input and produces syntactic or semantic information or rewritten programs as output. A macro system, by contrast, is integrated in the language and expands programs as they are compiled. There are potential synergies between the two projects. To name but two possibilities:

  • Scalameta or projects derived from it such as SemanticDB could obtain type information directly from Tasty, which would make them independent from specific compilers.
  • IDEs could use Tasty for single projects but refer to SemanticDB for more complicated multi-project and multi-language builds.
2 Likes

This is a fantastic idea, thanks for the great work. I might be a little late here, but the symbols ' and ~ seem strange to me.

From my understanding, ~ starts a block that is executed at compile time, whereas ' quotes code so that it is inserted into the program rather than evaluated.
So what about changing them to meta and $ (akin to string interpolation)?

E.g.

inline def concat[Xs <: HList, Ys <: HList](xs: Xs, ys: Ys): Concat[Xs, Ys] = 
  meta {
    case Xs =:= HNil => ${ys}
    case Xs =:= HCons[type X, type Xs1] => ${Cons(xs.hd, concat(xs.tl, ys))}
  }

2 Likes

Compile-time type providers that produce Scala types from external sources like schema descriptions will need to be factored out into an upstream project. I.e. say you want to import a database schema S. Project A would contain the macros that read S and produce case classes that mirror S. Then all code that accesses these case classes would have to be in a different project.

That seems like a huge pain – that is, I need to better understand what it looks like in real projects, but it is exactly the kind of pain that Java created with its “one class per file” rule, only here at the project level.

Please be very, very careful when introducing that kind of requirement. Builds are already hard enough to manage by themselves; it would be a real barrier (especially for newcomers) if any project wanting to demo a json-schema (or whatever) example also needed to explain how multi-project builds work…

Again, it may not be that important; for now I don’t really understand the cases where it will be needed.

If creating a multi-project build is difficult compared to creating and using a proper JSON schema, I think the fault lies with the build system. Conceptually, creating an upstream project is trivial; build tooling can be altered to make it trivial in practice, too. (Existence proof: it is trivial with cargo, Rust’s build tool. It’s not that hard with sbt either, though it does feel too much like an exotic and dangerous thing to do.)
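
For the sbt case, the two-project setup is a few lines of build definition (project names hypothetical):

// build.sbt – `schema` would hold the (macro-)generated types,
// `app` consumes them as a separate compilation unit.
lazy val schema = project.in(file("schema"))

lazy val app = project
  .in(file("app"))
  .dependsOn(schema) // app sees schema's types as a plain dependency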

However, a lot of trivial boilerplate is still onerous, so your point stands: mandating separate projects can be burdensome if it is needed frequently (e.g. if reasonable projects would end up requiring dozens of separate sub-projects).

1 Like

I believe the proposed syntax is derived from Lisp’s macro system, which used ' for “do not evaluate”, which in Lisp means “keep this as code”. I think ~ was selected because it is not heavily used; Lisp used , which would definitely be bad in Scala.

I think using meta would be reasonable, but $ would be a problem since it is actually a valid “letter” in many places in Scala (and Java). So I would be very nervous to make $ act as an operator in any context.

I’m a proponent of making anything which is rarely used relatively verbose (like asInstanceOf). So I would actually go for the older scala.meta proposal of using meta to prevent evaluation and inline to re-enable it. Any code that uses these operators a lot should probably look obviously unusual anyway.

Olafur certainly did a great job analyzing the various use cases (here’s the mentioned blog post with the results), though I think it might be worth revisiting the subject, as the analysis mostly covered open-source library code, which is usually quite different from “everyday” application code.

For example, I would expect to see a lot more usage of libraries such as circe or monocle, and much less of spire and parboiled, not to mention annotations (which I think are a dead end; people coming from Java often instinctively avoid them, but that’s another topic :wink: )

If that would be interesting for the development group, we’d be happy to help run such a more detailed survey on a larger scale.

Adam

2 Likes

This is kind of a naïve question, but I think it’s worth answering: why can’t Scala 3 allow whitebox macros by re-running the type checker after applying the macros?

It would require duplicate work, but it would only be needed when whitebox macros are actually used.

If the first run of the type checker succeeds, then running it again with the refined types will presumably also succeed, and will not achieve anything. Whitebox macros are useful precisely when they are needed for the (first) type-checking phase to complete successfully – i.e., they guide the type checker so that the right types and implicits are inferred.
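
To make that concrete, here is a hand-written stand-in for the kind of refinement a whitebox materializer synthesizes during implicit search (no macro involved; the names are illustrative):

object WhiteboxDemo {
  trait Gen[T] { type Repr }

  case class Person(name: String, age: Int)

  // A whitebox macro would synthesize this instance – crucially,
  // *including* the refinement Repr = (String, Int).
  implicit val personGen: Gen[Person] { type Repr = (String, Int) } =
    new Gen[Person] { type Repr = (String, Int) }

  def tupled[T](t: T)(implicit g: Gen[T]): g.Repr = ???

  // This line typechecks only because the refinement is visible during
  // the first pass; had the checker seen only Gen[Person], it would
  // fail here, and no later re-run could repair that failure.
  val pair: (String, Int) = tupled(Person("Ada", 36))
}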

If that would be interesting for the development group, we’d be happy to help run such a more detailed survey on a larger scale.

It would definitely be interesting!

I was wondering what the best method here would be, and I suppose any automated means of checking source code would both raise privacy concerns and be hard to do accurately (because of transitive dependencies).

So a self-reported usage report would be the way to go. We could ask people which macro libraries they (consciously :slight_smile: ) use in their projects, with a division between blackbox/whitebox/annotation macros if the library offers multiple. Maybe @olafurpg has a good starting list of such projects? Plus a free-form area where people could state if they have custom macros, and what are the use-cases.

What do you think?

Adam

When designing the survey, try to keep in mind that, at least in my experience, a lot of Scala code is close to Java and is written and maintained by people not that familiar with advanced topics such as macros and the distinctions between them. So if you are interested in real statistics, keep the questions very specific, so that the person answering doesn’t need to know what a whitebox/blackbox macro is, or whether they’re even using macros in the first place :slight_smile:

Ah yes, sure, I wouldn’t expect anyone not interested in macro development to know about the difference. What I had in mind was asking about specific features of a library, where it offers more than one kind of blackbox/whitebox/annotation macro.

@adamw But I think that users would likely not know what kinds of macros they were using – at least not the difference between whitebox and blackbox. Annotation vs. def macros is easier to answer, so this could be a useful data point.

Maybe it’s easier to ask: Can you give us a shortlist of the macros you use most often (name of macro and library where it comes from)? Given the list, we can do the classification ourselves.