Pre-typer syntactic plugins in Scala 3?

I’m against dialects, but the “code generation by string concatenation” joke has to die finally!

It’s laughable that Scala, one of the most powerful and advanced programming languages under the sun, has literally nothing to offer in that regard, so you’re back to assembling raw strings like it was done already with m4 macros almost 50 years ago.

I can’t believe the thing I’m editing is just a string.

I also don’t get it.

Why are we converting an AST back and forth to some limited string representation constantly?

This should happen only once on input (as we need to type in the code somehow), and from there the editor should work with the rich data structure that an AST is (likely even augmented by some meta-data).

Problem is of course that we don’t have proper AST editors yet.

We have some kind of hybrid, with a “stringly frontend” which is powered by a backend engine that works with the actual AST. But we don’t even cache the AST in memory afaik. We constantly convert back and forth from strings. That does indeed seem ridiculous.

I’m of course not asking to change that right now. This would need some greater move across the whole field of software development.

But what I’m asking for is some sane and safe method to generate code in Scala.

Having to resort for code gen to raw strings, without any safety, not even against typos, in one of the most powerful languages out there seems just wrong. And yes, people want code gen! But the result should be still maintainable.

I guess there is even some space for research and advancement of the state of the art in this topic. Scala should see this as a chance to become better than other languages on one more axis.

But frankly, at the moment we’re at the technical level of the m4 macro processor, in my opinion.

OK, I’m exaggerating of course: Scala already has very advanced and powerful meta-programming facilities. Only that those don’t allow unrestricted code generation. You can’t even abstract over the creation of similarly shaped data structures which differ only by some concrete names. In that case you would have to write some string template, iterate through some data to fill in the gaps, and write the code generated this way out to disk. That’s the most awful and error-prone approach possible! Even the C pre-processor had more safety guards built in…
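To make the complaint concrete, here is a minimal sketch of what such a string-template generator tends to look like. All names in it (`ClassSpec`, `render`, `specs`) are made up for illustration and do not come from any real project:

```scala
object StringCodegenDemo {
  // Hypothetical spec: a class name plus (field name, field type) pairs.
  final case class ClassSpec(name: String, fields: List[(String, String)])

  // Renders Scala source by plain string interpolation.
  // Nothing checks that the result is valid Scala.
  def render(spec: ClassSpec): String = {
    val params = spec.fields.map { case (n, t) => s"$n: $t" }.mkString(", ")
    s"final case class ${spec.name}($params)"
  }

  val specs: List[ClassSpec] = List(
    ClassSpec("User", List("id" -> "Long", "name" -> "String")),
    // Typo in the type name: the generator emits it without complaint.
    ClassSpec("Order", List("id" -> "Long", "total" -> "BigDecimel"))
  )

  // The text that would be written to disk and compiled in a later step.
  val generated: String = specs.map(render).mkString("\n")
}
```

The misspelled `BigDecimel` only surfaces later, when the generated file is compiled, far away from the template that produced it.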

And it’s not like this is some exotic feature nobody needs. Here are just some random examples of the kind of workarounds currently employed because of this limitation of the meta-programming features:

It’s not a string, it’s a rope. And you’re holding far too much of it, give me some back!

Do you mean like, enough to toss over a rafter?

There is other text for that purpose.

I’m against dialects, but the “code generation by string concatenation” joke has to die finally!

Speaking of a well-executed joke, just look at what dialects did to Switzerland.

Making bureaucracy for the government clerks more complicated to handle?

I thought the central advantage of Scala was the ability to create language dialects or domain-specific languages, external or embedded. Its ability to capture the intent of code but defer the execution has, I think, contributed to the success of Scala 2. Setting aside the purely functional camp like Scalaz/Cats (and we can bundle sbt into it too), effectively all user-land success stories are some form of dialect that enabled something that was previously not possible or difficult on the JVM.

  • Twitter’s Future was probably among the first major commercial success stories of Scala 2.x (see https://youtu.be/Jfd7c1Bfl10?t=495 for details). Long before SIP-14 added a watered-down version of Future to scala-library, Twitter implemented Future with local scheduling, root compression, and cancellation.
  • Akka code looks nothing like normal Scala code, but it encoded message-passing actors that can automatically be distributed across machines with safety mechanisms like isolation and backpressure.
  • Spark powers many of the major enterprises for distributed computation, implemented by DataFrame, which literally takes a lambda expression and bundles it up and ships it across different worker machines.
  • Morgan Stanley’s code base (see https://www.youtube.com/watch?v=BW8S92jP5sE&t=984s for details) also seems to be powered by a dialect of Scala.

    We’ve created this construct, the Node, this is an annotation that extends the Scala language, and we’ve implemented it ourselves using a compiler plugin.

So I wouldn’t say that Scala 2.x succeeded despite the dialects, but because of them. This is not to say that Scala 2.x was without warts, bad rep, and complexities. Sometimes the developer experience of using some of the above is downright confusing and horrible when things fail, because:

  1. parallel computation and concurrency are confusing, especially when you hide them from the users.
  2. they use some hacks that interfere with other language features in unobvious ways.

When we notice these sharp edges, I think the better thing for Scala to do, rather than shutting down dialects, would be to adopt them as features, like Pickling and Spores for shipping lambdas, and make it a friendlier language to implement dialects in. To put it another way, if Spark were shopping around for a host language today, we should make sure they would still choose Scala 3.

I should have been more clear. When I said dialects, I meant things that are not expressible in normal Scala, but could be expressible by changing the parser, or having a pre-typer syntactic plugin, or doing stuff with macro annotations advanced enough to confuse tooling.

The things you mention, Twitter Futures, Akka, Spark, are not dialects in this sense. They demonstrate the great syntactic flexibility that Scala has already. It’s precisely for this reason that I think we don’t want to go beyond what Scala already offers.

I agree, I wouldn’t call such things “dialects” either.

As long as the resulting code is “vanilla Scala” it’s not a dialect.

But some more flexible form of code generation would still be more than nice!

Concatenating strings as seen above is also not tooling friendly, and you lose all the things the language normally provides. So some plan for a more powerful but still safe way of generating code needs to be made, imho.

So this would exclude things like kind projector?

Btw, is Ammonite’s import $ivy. now considered a “dialect” which needs to get burned? “Vanilla Scala” won’t compile that…

https://ammonite.io/#import$ivy

I hope all the given examples show now clearly that this “no language extension” policy is complete nonsense. Scala lives by its various extensions! (Which most of the time aren’t proper dialects anyway). If you take this away you have at best Kotlin. Why would anybody use Scala then?

Edit: I just realized that Kotlin has actually quite some dialects. They use compiler extensions excessively, and this is considered “a good thing”. So once more: The issue is again just marketing!

Yes, tooling is an extremely important part of the picture. Nobody claimed otherwise. But this should go hand in hand with powerful abilities for language extensions on all kinds of axes. So the key here would be to think about some language extension mechanisms that are tooling friendly, and actually properly integrated into tooling before prime time.

A good, stable, and officially supported pre-typer plugin API seems to be the right thing. (The alternative is of course just to hack something. Nothing stops anyone from manipulating the part of the compiler that currently disallows pre-typer plugins. The JVM is a dynamic runtime, you can do all kinds of hacks, up to runtime bytecode rewriting. Should there be no official way, people will just find workarounds, because people don’t like arbitrary limitations. Especially if those are there only for ideological reasons without any true technical necessity.)

In the end people want power, not limitations. @lihaoyi is just right about that. But limiting power where needed is of course also a valid requirement. I understand this, and that’s why I’ve proposed a kind of relief to this situation in the other thread.

The real problem is arbitrary extensions that aren’t properly integrated, so people need to fall back to some kind of “hack”. So Scala should always look to pick those up when the time is ripe. Kind projector is a great example of how this can work out nicely! This never would have happened if things had been outright limited from the get-go.

FWIW I would love if this kind of thing were available in mainstream Scala code… When I’m bootstrapping an experiment, writing a build.sbt can be a major distraction…

Scala now has “magic comments” for that, which scala-cli will recognize.

I’d suggest trying it out. It’s especially useful for some quick experiments.
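For readers who haven’t seen them, a minimal sketch of such a file follows. The `//> using` lines are scala-cli’s using directives. The object name and the compiler option chosen here are only illustrative assumptions. A library dependency can be declared the same way with a `//> using dep org::name:version` line, which is roughly what Ammonite’s `import $ivy` does.

```scala
//> using option -deprecation

// Hypothetical single-file experiment; no build.sbt is needed.
object Experiment {
  // Squares of 1..n, just to have something to run.
  def squares(n: Int): List[Int] = (1 to n).map(i => i * i).toList

  def main(args: Array[String]): Unit =
    println(squares(5).mkString(", "))
}
```

Assuming the file is saved as `Experiment.scala` (a made-up name), it can be run with `scala-cli run Experiment.scala`.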

Enjoy! :smiley:

@MateuszKowalewski
Unrestricted generation of source code using a better way than string concatenation would certainly be very good, but is the compiler the right place to do it? Compilers consume source code, they don’t produce it. Producing source code is more of a job for build tools with plugins.

Scala 3 already has an AST-based representation on disk: TASTy (see scalacenter/tasty-query on GitHub). Maybe instead of only a tasty-reader we could have a tasty-writer too? That would give us a typesafe API to construct AST trees. That AST could then be decompiled by some tool to an ugly *.scala file, then reformatted using scalafmt, and finally we would have both the AST and a nicely formatted *.scala file. I’m not sure if it makes sense, though. The downside is that compilation would have to happen in phases, i.e. first you compile the code generator in some module, then you run that generator to produce code for another module, then you compile the other module (that’s why you need support from build tools). That is worse than Scala 3 macros, which AFAIU allow you to run everything in one module (I’m talking about modules in the sbt or Maven sense), but Scala 3 macros don’t let you generate code in an unrestricted way. So there’s always a tradeoff. On the upside, the pretty-printed generated code should be much easier to debug than some crazy macros giving crazy error messages.
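For reference, the “run a generator, then compile its output” step already has a hook in sbt: `sourceGenerators`. The fragment below is only a sketch of that build-tool side. The project name, file path, and generated content are placeholders, and a real setup would call the compiled generator module instead of writing a literal string:

```scala
// build.sbt (sketch; project name and generated content are hypothetical)
lazy val app = project
  .settings(
    Compile / sourceGenerators += Def.task {
      // Generated sources go under the managed source directory.
      val out = (Compile / sourceManaged).value / "generated" / "Models.scala"
      // Placeholder: a real build would run the code generator here.
      IO.write(out, "package generated\n\nfinal case class User(id: Long)\n")
      // sbt compiles the returned files together with the hand-written ones.
      Seq(out)
    }.taskValue
  )
```

sbt runs this task before compiling `app`, which is the phase ordering described above.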

To have type safe APIs this would mean you need to rebuild the Scala typer from scratch, backwards, in an external tool. This won’t fly…

Because writing TASTy directly is not more than sophisticated string concatenation. That’s back on the level of simple syntactic macros (like e.g. in Rust).

The whole point of TASTy is that it’s already typed (Typed Abstract Syntax Tree). If you construct it by hand you circumvent the typer and you can construct arbitrary garbage.
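A toy model may help to show the difference. The two tiny tree types below are invented for illustration and have nothing to do with the real TASTy format or any compiler API. The first accepts any shape. The second tracks the result type of every node, so the nonsensical tree does not compile.

```scala
object TreeDemo {
  // Untyped tree: any shape can be built, including nonsense.
  sealed trait Raw
  final case class RawInt(value: Int) extends Raw
  final case class RawStr(value: String) extends Raw
  final case class RawAdd(lhs: Raw, rhs: Raw) extends Raw

  // Accepted by the compiler, although adding an Int to a String
  // has no meaning in this toy language.
  val garbage: Raw = RawAdd(RawInt(1), RawStr("oops"))

  // Typed tree: the type parameter records what each node evaluates to.
  sealed trait Typed[A]
  final case class IntLit(value: Int) extends Typed[Int]
  final case class StrLit(value: String) extends Typed[String]
  final case class Add(lhs: Typed[Int], rhs: Typed[Int]) extends Typed[Int]

  // Add(IntLit(1), StrLit("oops")) would be a compile-time type error.
  val ok: Typed[Int] = Add(IntLit(1), IntLit(2))

  // Evaluates integer-typed trees only.
  def eval(e: Typed[Int]): Int = e match {
    case IntLit(v) => v
    case Add(l, r) => eval(l) + eval(r)
  }
}
```

Writing already-typed trees by hand corresponds to the first variant: whatever checking happens is up to the person writing the generator.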

The other point is:

Scala already has a type-safe DSL to construct Scala expressions: the macro DSL. You don’t need to reinvent anything. It’s all there. It just needs the ability to be run in “phases”, where the output from one phase is passed along to the next. Also, this isn’t a very “innovative” idea. The multi-stage feature does something similar. You would just need to adjust it so the output is code on disk, and not bytecode in JARs.

And for syntactic tweaks (which would not be possible with the above in the current state) some other mechanisms should be thought out. Maybe something like “meta-expressions” (in the form of syntactic templates) in the macro DSL?

That’s not a problem. Macros, and especially the multi-stage compilation feature in Scala 3, require this anyway. It’s already there.

This stops working when we need a dependency on the current codebase.
Example: say I have a case class MyCaseClass and want to implement Query[T].
We have a class DB with the method query[T], and when we start writing an argument for
DB.query[MyCaseClass]( ... the IDE shows the fields of MyCaseClass as possible choices.

(This kind of functionality is implemented, for example, in Mongoose for TypeScript.)

If we use a tasty-writer to generate Query[MyCaseClass], this means that we can’t use Query in the same module as MyCaseClass, which would make programming quite uncomfortable.

A real-world example of this is the source generator for scalapb. If you want the generated classes to extend some trait, you can make that work, but it’s kind of a pain.
