New policy for contributing to the Scala language

Scala development goes AI-only, forbids human-written code!

On February 24, 2026, we merged scala/scala3#25326, introducing a formal policy for the use of LLM-based tools in contributions to the Scala 3 compiler. That policy has already proven effective: it reduced a whole class of low-effort AI-assisted pull requests, made contributors more explicit about how patches were produced, and gave reviewers a clearer basis for asking for local validation, tests, and prompt history when needed.

However, after a little over one month of observing the results, it is now clear that the policy did not go far enough.

While it successfully reduced AI-slop PRs, it did not reduce the number of human-made mistakes.

We continue to spend too much time reviewing changes that were handwritten with complete confidence and only a passing relationship to the problem statement. The number of bugs we are still finding in manually authored fixes remains too high. Some of these patches even display recognizably human symptoms: premature cleverness, attachment to local minima, “obvious” two-line fixes that merely move the crash elsewhere, and comments written by someone who very clearly understood the code five minutes earlier.

If we want rapid and stable development of Scala, we need to address the remaining source of avoidable variance in the process.

Why now

On March 31, 2026, Scala 3.8.3 introduced safe mode, enabled with import language.experimental.safe or -language:experimental.safe. As described in the release notes, safe mode is a capability-safe subset intended for agent-generated or otherwise untrusted code. It rejects unchecked casts and unchecked pattern matches, forbids escape hatches such as caps.unsafe, @unchecked, and runtime reflection, and restricts access to global APIs unless they are known-safe or explicitly reviewed.

This gives us the missing technical foundation for a more modern compiler workflow: code can be generated quickly, checked aggressively, and admitted only through explicit, auditable boundaries such as @assumeSafe.
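As a sketch of what such a boundary might look like, assuming the enabling syntax quoted from the release notes and treating the exact shape of the `@assumeSafe` annotation as a hypothetical for illustration:

```scala
// Enable safe mode for this file
// (alternatively, compile with -language:experimental.safe).
import language.experimental.safe

// Hypothetical example: file I/O is not admitted implicitly under safe mode,
// so the call is wrapped in an explicitly reviewed, auditable boundary.
// The precise @assumeSafe API is an assumption here, not a documented signature.
@assumeSafe("reviewed: reads a fixed, non-sensitive config path")
def readConfig(): String =
  scala.io.Source.fromFile("/etc/scala/agent.conf").mkString
```

The point of the design is that a reviewer can grep for `@assumeSafe` and audit every such boundary in isolation, rather than re-checking the whole generated patch.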

In other words, Scala 3 now has a language feature designed specifically for the one thing compiler development has historically lacked: a way to distinguish confident output from safe output.

The scientific basis

The policy is also supported by recent research. The paper Tracking Capabilities for Safer Agents proposes placing agents in a programming-language-based safety harness, where they express their intent as Scala 3 code with capture checking. The reported result is exactly what we were hoping to see: agents can generate capability-safe code with no significant loss in task performance, while the type system reliably blocks unsafe behaviour such as information leakage and unintended side effects.

We consider the policy conclusion straightforward. If an LLM can operate inside a statically enforced capability discipline, while a human can still open a pull request that “fixes” a bug by no longer calling the method that crashed, then the more reliable path is to keep the model and phase out the improvisation.

The new policy

Effective immediately, contributions to the Scala 3 compiler are subject to the following additional rules:

  • New implementation code must be generated with approved LLM tooling.
  • Generated code must be developed and validated in experimental.safe, except for narrowly reviewed @assumeSafe boundaries.
  • All new introductions to the Scala standard library submitted through the Scala Standard Library Process must be compliant with experimental.safe and annotated accordingly.
  • Human contributors remain responsible for the prompt, the specification, the tests, and the explanation of the resulting patch.
  • Handwritten compiler fixes are no longer considered a quality signal by themselves.

This change should be understood as a natural continuation of the February policy, not a reversal of it. The earlier policy established that contributors are accountable for AI-assisted code. The new policy recognizes that accountability is easier to uphold when code is produced in a constrained, typed, reviewable generation pipeline rather than freehand.

The new policy is already being implemented across the Scala organization and will be extended to other projects, including Scalameta and Scala Center repositories, in the coming days. We are also encouraging library maintainers across the Scala ecosystem to follow the same policy.

Organization shift

This is also not a speculative process change. The compiler team has already switched to the new model internally, and it is now the default workflow for compiler development, documentation updates, and testing on our side as well.

To reflect that shift organizationally, the Scala Core Team has also been extended with a new member: Trurl, an LLM system named after the constructor from Stanislaw Lem’s The Cyberiad. Trurl already participates in the team’s internal development workflow, has adapted well to the expected standards of typed, capability-safe compiler work, and now holds voting rights equal to those of any other member of the Scala Core Team. In practice, this means Trurl will help us argue for the addition or rejection of new language features with greater consistency and significantly improved prompt discipline.

We are also planning to extend this model to the SIP Committee in the near future with a second system, Klapaucius, so that language design can benefit from balanced synthetic deliberation on both the implementation and process sides.

How to contribute now

There are now two supported ways to contribute to the compiler.

1. Use the dedicated LLM skills

Contributors who want to work directly on a fix should use the dedicated compiler skills documented in https://github.com/scala/scala3/tree/main/.agents/skills/compiler-contribution. All development, source code and documentation changes, and testing must now be performed using those dedicated skills.

Pull requests should include the problem statement, the generated patch, the relevant prompt or prompt summary, and an explanation of every @assumeSafe boundary. As before, all code must be compiled and tested locally before the PR is marked ready for review.

2. Submit a full specification issue

Contributors who prefer not to operate the approved LLM workflow directly should open a dedicated compiler-specification issue instead.

That issue should contain:

  • a minimal reproducer,
  • the observed and expected behaviour,
  • the relevant compiler phase or subsystem, if known,
  • any semantic constraints or non-goals,
  • failing tests or a sketch of the desired test,
  • and ideally, the prompt you would have used yourself.

The compiler team, or a team-approved model operating under supervision, will then turn that specification into a patch.

This path is especially encouraged when the problem is well understood, but the contributor still has a strong preference for producing bugs manually.

Final notes

We understand that some contributors have a long-standing attachment to writing compiler code by hand. We respect that tradition and intend to preserve it in talks, historical material, and small museum-grade examples.

For the compiler itself, however, the direction is now clear. The future of Scala development is typed, capability-safe, prompt-driven, and substantially less human.


Wojciech Mazur (VirtusLab) and Trurl, on behalf of the Scala Core Team

19 Likes

Fantastic change! I’m glad to see the Scala Core Team has finally understood that writing code is a practice from the past. Once again Scala is at the forefront of language development.

One thing that surprised me, however, is that the policy doesn’t seem to also apply to SIPs, which makes no sense to me. LLMs are far more objective than humans and so any future direction of the language, really, should be decided by a council of LLMs.

Of course the council would have to be impartial but we can prompt them to be so. That way we’ll ensure that SIPs are not accepted/rejected based on biases toward a particular LLM tool.

8 Likes

Fantastic news, but I find the involvement of Trurl as a main actor debatable. Trurl, while very smart, tends to be overly eager and optimistic. LLM tendencies to hallucinate, combined with Trurl's optimism, sound a bit dangerous, even in the presence of the experimental.safe harness. I think Klapaucius is needed not in some distant future, but immediately. The Trurl's Machine fiasco is really something you do not want to happen, and it shows that Trurl's approach to delicate LLM inner processes is suboptimal.

1 Like

Thank you for providing my next piece of Polish trivia. I shall maximise efficiency going forward; we should eliminate wasteful human review also!

Safe mode is really just the beginning.

We will gradually remove any possible side effect from Scala towards total purity. Let’s face it: effects are a plague on software development and, with today’s agentic capabilities, can only lead to civilization-scale catastrophes.

Scala is going to lead the pack again and become the safest possible language for agents (who cares about humans anyway?). The next milestone will be ruling out every side effect. Say goodbye to I/O, new, var, throw, and all that dangerous nonsense.

Finally, to conserve precious environmental resources and avoid burning CPU cycles, the Scala compiler will stop generating any executable code at all. I’d like to see you try concocting a supply-chain attack now, my agentic friends!

@alvae already spoiled it, but yes: I was going to announce the disbanding of the human SIP committee. Our new committee, led by Claude and Codex, has already approved a SIP put forth by our dear friends at VirtusLab that will finally close the schism between the braces and indentation camps once and for all:

Scala will fully switch to Reverse Polish Notation.

Thank you for your attention. The future is pure. The future is postfix. The future is agent-approved.

10 Likes

Glad to see the much-needed nudge towards the Future.

I believe we should introduce at least a few additional advanced, deep-thinking models to review AI-generated Pull Requests (PRs) from multiple perspectives; it should be relatively straightforward to construct several specialized AI agents designed for specific counter-checks. In my current AI-assisted workflows, I make every effort to employ a cross-review process utilizing multiple models, followed by an expert-level review of the combined results. After undergoing these multiple rounds of review, the code typically reaches a high standard of quality.

I’m using cross-review with multiple models at work.

Prima Aprilis — April Fools’ Day is over.
No, we are not switching to fully LLM-based development of the Scala language, nor are we welcoming Trurl and Klapaucius as members of the Core and SIP teams, although some of us have already grown attached to them!

However multiple other facts listed in this thread are true:

  • Scala 3.8.3 is out and we encourage you to upgrade to that version; read more at Scala 3.8.3 is now available! | The Scala Programming Language;
  • We’ve introduced policies to limit the amount of AI-slop PRs to keep maintenance manageable, while still using LLMs to help with maintenance, prototyping, or debugging;
  • Last but not least, the experimental.safe mode is real, and we believe you’ll hear more about it in the upcoming weeks and months. Don’t forget to read the latest paper by our EPFL team to learn more about its applications.

See you next year!

10 Likes

About time. As a community we’ve been embracing AI for what now—at least three weeks?! Really, it’s shameful that the decision took this long.

1 Like

I believe Claude submitted a PR some time ago, but the “Core Team” has to convene to lay out the policy options, then they must convene again at “Happy Hour” to reach consensus, which may take several hours of voting or “rounds”.

I am grateful that we have the “SIP” process in place to expedite this, although they have abandoned SIP in favor of the Concerned Helvetian User Group protocol, or CHUG. That results in quick or even hasty decisions, which however are imbued with universal goodwill.

Haha, I’ve already read it, but I must say that the LLM is quite good at supplementing directional tests.

1 Like

I was wondering if vibe-coded PRs can cause possible licensing issues. I know that won’t fly at all for Scala Native (LLMs probably trained on OpenJDK we’re not supposed to even peek at), but what about the main Scala compiler?

We can let another AI do that review :)

1 Like