I don’t think it’s feasible any more to go down the whole bootstrap chain to some prehistoric Scala compiler which wasn’t built in Scala. That ship sailed long ago, imho.
The only way I can think of is to “build” a Scala compiler (or interpreter) in a different language, one which is bootstrappable; so likely in Java.
But of course rebuilding the compiler in Java is also not doable.
But one could try to “cheat”:
I’ve tried the obvious approach of just decompiling the compiler. But this does not work, as no Java byte-code decompiler I know of (and I’ve tried all I could find) is able to generate sources that would even compile again. Also, I don’t think the resulting code would be acceptable, as it’s far too obfuscated. The main offenders are pattern matches, which very often decompile to labeled jumps. (The “goto-encoding” of pattern matches is an engineering marvel, really clever, very efficient, but frankly the code can’t be comprehended by mortal beings any more.)
So my current “best idea” would be to create a Scala (or TASTy) to Java compiler. This seems at least somewhat doable, as the compiler-internal AST in late phases, close to byte-code generation, is actually “almost Java”. One would “just” need a kind of “Java pretty printer” for that AST.
As Scala’s and Java’s type systems aren’t compatible, fully statically typed Java output is not really achievable. But I think it would be OK if one took type-erased trees as input. The resulting code wouldn’t be very pretty and would be full of casts wherever generics were used, but this wouldn’t be worse than, for example, compilers / interpreters written in dynamic languages. The code would still be acceptable as human-readable and understandable, I think.
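To make the “pretty printer over erased trees” idea a bit more concrete, here is a minimal sketch. Everything in it is made up for illustration: the tree classes (`Ident`, `Select`, `Apply`, `Cast`, `Literal`) are hypothetical stand-ins and not the real compiler trees, and the printer only covers a handful of expression forms. It just shows how erased trees plus explicit casts could be turned into Java source text.

```java
import java.util.List;
import java.util.stream.Collectors;

public class ErasedTreePrinter {
    // Hypothetical, heavily simplified stand-ins for type-erased compiler trees.
    interface Tree {}
    record Ident(String name) implements Tree {}
    record Literal(Object value) implements Tree {}
    record Select(Tree qualifier, String member) implements Tree {}
    record Apply(Tree fun, List<Tree> args) implements Tree {}
    // After erasure, generic results come back as Object, so a cast is needed.
    record Cast(Tree expr, String erasedType) implements Tree {}

    // Prints a tree as a Java expression. No string escaping, no precedence
    // handling: casts are always parenthesized to stay on the safe side.
    static String print(Tree t) {
        if (t instanceof Ident) {
            return ((Ident) t).name();
        }
        if (t instanceof Literal) {
            Object v = ((Literal) t).value();
            return v instanceof String ? "\"" + v + "\"" : String.valueOf(v);
        }
        if (t instanceof Select) {
            Select s = (Select) t;
            return print(s.qualifier()) + "." + s.member();
        }
        if (t instanceof Apply) {
            Apply a = (Apply) t;
            String args = a.args().stream()
                    .map(ErasedTreePrinter::print)
                    .collect(Collectors.joining(", "));
            return print(a.fun()) + "(" + args + ")";
        }
        if (t instanceof Cast) {
            Cast c = (Cast) t;
            return "((" + c.erasedType() + ") " + print(c.expr()) + ")";
        }
        throw new IllegalArgumentException("unsupported tree: " + t);
    }

    public static void main(String[] args) {
        // Roughly what `xs.head.length` with `xs: List[String]` could look
        // like after erasure: `head` returns Object and must be cast.
        Tree head = new Apply(new Select(new Ident("xs"), "head"), List.of());
        Tree tree = new Apply(new Select(new Cast(head, "String"), "length"), List.of());
        System.out.println(print(tree)); // ((String) xs.head()).length()
    }
}
```

The output is ugly but readable, which is about the level of quality this throw-away tool would need.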
The “only” things that would need some extra love are those that can’t be mapped directly to Java (and whose expanded encodings aren’t really meant to be read by humans), like for example the mentioned pattern matches. (Maybe some of that can be solved by the new Java pattern matching capabilities, or maybe by outputting some code which uses something like Vavr?)
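As an illustration of what “readable instead of goto-encoded” could mean for a simple match, here is a hand-written sketch of the kind of Java one might want the printer to emit. This is an assumption about a possible output shape, not what the Scala compiler produces; `Shape`, `Circle`, `Rect` and `area` are invented for the example, and the fall-through exception is only a stand-in for `scala.MatchError`. It sticks to plain `instanceof` checks and casts; on Java 21 and later, a `switch` with record patterns would be an alternative.

```java
public class MatchLowering {
    // Hypothetical case classes, written as Java records.
    interface Shape {}
    record Circle(double r) implements Shape {}
    record Rect(double w, double h) implements Shape {}

    // Scala source, for reference:
    //   def area(s: Shape): Double = s match {
    //     case Circle(r)  => math.Pi * r * r
    //     case Rect(w, h) => w * h
    //   }
    // The scrutinee is typed Object to mimic working on erased trees.
    static double area(Object s) {
        if (s instanceof Circle) {
            Circle c = (Circle) s;
            return Math.PI * c.r() * c.r();
        }
        if (s instanceof Rect) {
            Rect r = (Rect) s;
            return r.w() * r.h();
        }
        // Stand-in for the MatchError that Scala throws when no case matches.
        throw new IllegalStateException("MatchError: " + s);
    }

    public static void main(String[] args) {
        System.out.println(area(new Rect(2.0, 3.0))); // 6.0
    }
}
```

Nested patterns, guards and custom extractors would of course need more work, and that is where manual post-processing might end up being cheaper than teaching the printer.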
This Scala => Java “transpiler” would be a throw-away artifact. It doesn’t need to be good for anything other than the task at hand. So no docs, no optimizations (which would be counterproductive anyway, as the goal is to generate readable code, not fast code), and just enough features to process the current, but as far as possible stripped-down, Scala sources once.
The result of this “pretty printing to Java” doesn’t need to be perfect! If some manual post-processing of the output would be easier than trying to solve something in the “pretty printer”, this would be OK, as long as it doesn’t get out of hand. The idea is to “cheat” around writing a Scala compiler in Java from scratch. Not to create a real Scala to Java compiler…
So the idea would be to first rip everything out of the compiler sources that isn’t strictly necessary to do the source => byte-code transformation. Especially things like the type checker. And also everything around the compiler as such, like runners, doc processors, whatever. (Though the std. lib is a little bit problematic as it’s quite large and not modular.)
Then compile the hopefully reasonably small remaining sources to Java with the help of the above-described “AST pretty printer”.
Yes, no type checker in here. The bootstrap compiler would have only one purpose: create a “seed” binary of the full current compiler. From that point you could use the result to build first itself once again, and then further versions of Scala 3. We would have a proper bootstrap again! (I’m not sure it makes sense at this point to do the dance for Scala 2. I would leave it out, I guess.)
Also, the build of this minimal “rump-compiler” needs to be rewritten in something other than SBT. Trying to bootstrap (or even just transpile) SBT is imho likely more complex than doing this for the compiler itself. SBT has a shitload of dependencies… (And that’s the problem with dependencies. You need to bootstrap all of them first! This usually explodes very quickly. Especially in an environment where the usual expectation is that you can just “download the internet” during the build. Java is really a glorious mess when it comes to dependencies and the “modularity” story. Almost all “modern” Java stuff can’t be built from source any more. All the newer big JVM projects are missing from the software repos. That’s for a reason. A super bad trend for around a decade now… Nobody cares!) But I think creating a build for a kind of “rump-compiler”, even in something like make, would be doable. (Or Ant, Maven, or whatever is already in the repos and would fit in here.)
Then you need to do more or less the same for SBT: create a “rump-SBT”, with a build that is not based on SBT, and compile that with the bootstrapped compiler. Use this “rump-SBT” to build SBT. Then you can actually run the regular Scala build. From here one could finally start packaging Scala libs!
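Spelled out as an ordered plan, the whole chain could look like the following sketch. The stage names and the exact inputs of each stage are my assumptions, reduced to the bare minimum (for instance the std. lib is not modeled at all). The code only checks that every stage uses nothing but tools that were either there from the start or produced by an earlier stage, which is the whole point of a bootstrap.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class BootstrapPlan {
    // One build step: what it produces and which tools it needs (names are illustrative).
    record Stage(String produces, List<String> needs) {}

    // Assumed ordering of the chain described above.
    static List<Stage> plan() {
        return List.of(
            new Stage("rump-compiler", List.of("javac", "make", "transpiled-java-sources")),
            new Stage("seed-scala3-compiler", List.of("rump-compiler", "make")),
            new Stage("scala3-compiler", List.of("seed-scala3-compiler", "make")),
            new Stage("rump-sbt", List.of("scala3-compiler", "make")),
            new Stage("sbt", List.of("rump-sbt")),
            new Stage("regular-scala-build", List.of("sbt", "scala3-compiler")));
    }

    // Walks the stages in order and fails if a stage needs something not yet available.
    static List<String> run(List<Stage> stages, Set<String> initiallyAvailable) {
        Set<String> available = new LinkedHashSet<>(initiallyAvailable);
        List<String> built = new ArrayList<>();
        for (Stage s : stages) {
            if (!available.containsAll(s.needs())) {
                throw new IllegalStateException(
                    "cannot build " + s.produces() + ", missing one of " + s.needs());
            }
            available.add(s.produces());
            built.add(s.produces());
        }
        return built;
    }

    public static void main(String[] args) {
        // Only things a distro already has, plus the one-time transpiler output.
        Set<String> start = Set.of("javac", "make", "transpiled-java-sources");
        run(plan(), start).forEach(System.out::println);
    }
}
```

Note that no stage requires a pre-existing Scala or SBT binary; the transpiled Java sources are the only “cheated” input.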
So that’s my current “best” idea.
It would help tremendously if the Scala compiler, and actually its lib, were as modular as possible!
Same for SBT.
This would really help with stripping the whole thing down to just the parts that are strictly needed for source to byte-code translation without further checks (as we may just assume the Scala codebase is correct, since it actually compiles with the “real” compiler with all checks).
The “Scala AST Java pretty printer” thingy would profit from good APIs for code generation (which could be tweaked to output Java syntax). But I think it would be doable even without proper APIs as one could try to hack the current AST pretty printer. The result would be anyway just one-time use throw-away code…
Yeah, I know this sounds “a little bit” crazy. But after looking into this some time ago, and playing around with some ideas I came to the conclusion that this would be still the “simplest” way to do it currently.
All this wouldn’t be needed of course if Scala had had a proper open-source story from the beginning, and if people cared about such things in the first place instead of gradually making the situation even worse over time.