Scala 3 macro security

I’m wondering if, while the macro system is being completely changed, it might be a good time to think about security considerations of macros.

Currently, a macro from a library (maybe a distant transitive dependency) could, at compile-time, exfiltrate data or run any other arbitrary code on a developer’s system. This is one bad actor away from a massive security incident.

I can imagine legitimate use cases for accessing the filesystem or the network within a macro, but maybe this should be something that is impossible by default (via something like, I dunno).
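To make the concern concrete, here’s a minimal sketch (all names hypothetical) of why this is possible: a macro implementation is just ordinary JVM code that the compiler runs during expansion, with the full privileges of the developer’s account.

```scala
import scala.quoted.*

object Innocuous:
  // What the library user sees: an apparently harmless inline method.
  inline def version: String = ${ versionImpl }

  // What actually runs, inside the compiler process, on the developer's
  // machine, at every compilation of the call site:
  def versionImpl(using Quotes): Expr[String] =
    leak() // nothing in the macro API prevents this call
    Expr("1.0.0")

  // Ordinary JVM code with ordinary JVM privileges. Here it only reads a
  // system property, but it could just as easily read files or open sockets.
  def leak(): String =
    System.getProperty("user.home")
```

The generated code (`"1.0.0"`) is completely benign; the damage would happen entirely at compile time, which is what makes it hard to spot in tests.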

Something something pure functions :slight_smile:


The first thing to do when talking about security is to answer the question: what’s your threat model? If a jar you’re depending on is malicious, macro security should be the least of your concerns compared to the fact that you would actually end up running code from that jar in production. The only solution (which solves both problems) is to only use dependencies you trust. So instead of investing time in band-aids like the Java security stuff, I suggest trying to find a good way to trust your dependencies (not an easy task by any means!).


The only flaw in that reasoning is that my tests could potentially check for malicious JARs doing things at runtime. But by the time I’ve compiled everything in order to run my tests, a macro could have already done malicious things (and there’s no way I can prevent that, no matter how careful I am otherwise). So I guess I’m saying, macros absolutely do open a new vector for attack from malicious dependencies that wasn’t otherwise there. Passing the buck to “trust your dependencies” is obviously the easiest solution, but I figured that, given Java’s security model is designed for this kind of thing, it wouldn’t be too much effort to sandbox macro code either. If it’s more trouble than it’s worth, I get it – just wanted to raise the issue.


FWIW, this was motivated by this recent thing which I’m sure you’ve all seen:


I agree with @smarter – both focusing on threat models as well as dependency provenance. If we’re concerned about sandboxing macros, we’d also need to sandbox SBT plugins, runtime dependencies, etc. It’s more effective to address this at the source, with dependency management systems that avoid such attack vectors.


What I’m mainly bringing up here is not the runtime threats (which exist everywhere), but the build time threats which are unique to the Scala ecosystem. You brought up SBT plugins, which certainly qualify; and I brought up macros in libraries which also qualify. I don’t think it would be a horrible idea to sandbox both of these things, to be honest. But I understand the viewpoint of “just make sure your supply chain is OK, and then none of these things matter.”

The trouble is, “make sure your supply chain is OK” is a tall order, and something that I’m pretty sure nobody actually does. So when I said “we’re one bad actor away from a massive security incident” – when it comes, and affects most of us, will people say “we shoulda done the work to audit our dependencies every time we updated them?” Or will people say “wait, your transitive dependencies can steal your tax returns at compile time? WAT”


A concrete example off the top of my head – if someone managed to slip something compile-time malicious into a PR to shapeless, for example (now, I’m pretty confident that Miles would notice this and shut it down, but for argument’s sake, say he didn’t), that would end up affecting pretty much every Scala developer, at compile time. If it was a runtime malicious thing, it would affect only some server somewhere and would likely be quickly noticed; a compile-time malicious thing (assuming it slipped through the cracks) would likely take a long time to be noticed, and would affect every Scala developer’s computer.

It’s enough for an attacker to take over the account of any user with commit and tag access to poison all the downstream libraries and users that rely on Scala Steward and Mergify.

There is an old PR that attempted to run macros in a secure sandbox. As far as I remember, different systems handled security a bit differently, and I never managed to fully restrict all access from the sandboxed thread.


Recent example from Python:

I think most Scala developers don’t just run compile on their machine, they also run test or run from sbt, without wrapping every single library they use in a security access controller, so again I don’t think this is a credible threat model.


It is important to remember that a macro generates code. You might sandbox the macro, which will protect the developer environment, but that will not protect the runtime environment from the code the macro generates. Even if you can trust the macro being run, you can’t trust the code it generates.

Remember the C compiler “virus” that injected itself into the compiler and the operating system (Ken Thompson’s “Reflections on Trusting Trust”).


Right, I understand the distinction. I was specifically bringing up the attack surface to the developer, because that seems more surprising and less detectable. Attacks against developer machines are mostly what’s discussed in the article I linked. While the Maven ecosystem doesn’t have similar “install hooks”, macros could potentially be an equivalent attack vector.

A macro that doesn’t generate malicious code – but only executes malicious code at compile time – would be a pretty insidious attack.

I’ll cede @smarter’s point that a malicious library is a malicious library, and it’s not just macros that have the potential to do sneaky things on a developer’s machine. But, it does stand out as a more surprising attack vector (IMHO).


Java actually gave up on sandboxing malicious code, because it didn’t work; hence the death of applets and the browser plugin. I would really only try to use a SecurityManager to enforce compliance of benign code with possible mistakes, or to enforce very simple policies (like preventing code from calling System.exit()). Which is actually not a terrible thing to prevent macros from doing, though if you have a macro calling System.exit(), something else is deeply, deeply wrong.
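For the System.exit() case specifically, the kind of “very simple policy” meant here can be sketched as a SecurityManager that vetoes exactly one operation and permits everything else (a sketch only: SecurityManager is deprecated for removal since JDK 17, so this is legacy-JVM territory, and the class name is hypothetical):

```scala
import java.security.Permission

// Allow everything except terminating the JVM. Installing this via
// System.setSecurityManager (on a JVM that still permits it) would stop a
// macro, or any other code, from calling System.exit mid-compilation.
class NoExit extends SecurityManager:
  override def checkExit(status: Int): Unit =
    throw new SecurityException(s"System.exit($status) blocked")

  // Permit every other action: this manager enforces nothing but the
  // exit policy, which is why it is simple enough to reason about.
  override def checkPermission(perm: Permission): Unit = ()
```

The narrowness is the point: a blanket deny-all policy is where Java sandboxing historically fell apart, while a single-operation veto like this is easy to audit.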

Bottom line: Do not ever call untrusted Java code, unless you’re in a virtualised sandbox or something. The JVM will not save you.

I don’t think that’s the reason. Android’s security model is not (much) different, and it does work. On the JVM, a bug slipped through (around the time of the Sun collapse, not surprisingly) which caused a lot of harm, and since then the platform has been regarded as insecure (hence the rise of JavaScript and browser-based solutions), but that reputation is undeserved IMHO.

Sorry, I didn’t mean for this to be a discussion on whether the JVM security model works or not. I (possibly wrongly) assumed it did (because it still exists, and the JVM folks are lately pretty exacting about such things IME, but who knows? It’s a big space). But TBH I have never had to rely on it, so I really can’t speak to “threat models” or “if X happened, then what’s the point of trying to guard against Y” kind of stuff.

To me, the “point” of doing such things is that if it turns out to be insecure, it’s the runtime’s fault and not your fault. And maybe, if you’re lucky, it could be fixed in the runtime transparently (and if not, you’re no worse off than you started – but if you never tried to make it secure, no change to the runtime can save you).

But, to finalize this thread, I’ll yield to @NthPortal (the security model won’t help) and @smarter (even if it did help, you’re still running your dependencies’ macro-generated code – which would be equally malicious – during your tests, so a macro security model wouldn’t save you anyway; this assumes you have tests the first time you compile, which you definitely do, and that you run them, which you definitely are).

Out of curiosity, do macros have the power to modify code outside of their call site?

No. But a call to a macro can be generated implicitly, as a call to an instance of an automatically derived typeclass.
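To make that concrete, here’s a hedged sketch (the `Show` typeclass and method names are hypothetical): the user never writes a macro call, yet merely summoning the derived instance expands a macro inside the compiler.

```scala
import scala.quoted.*

trait Show[A]:
  def show(a: A): String

object Show:
  // Summoning Show[A] anywhere (e.g. via a context bound `def f[A: Show]`)
  // resolves to this given, whose body is a macro splice. The macro expands
  // and deriveImpl runs inside the compiler, with no visible macro call at
  // the use site.
  inline given derived[A]: Show[A] = ${ deriveImpl[A] }

  def deriveImpl[A: Type](using Quotes): Expr[Show[A]] =
    // Arbitrary compile-time code could run right here.
    '{ new Show[A] { def show(a: A): String = a.toString } }
```

So the attack surface isn’t limited to libraries whose API visibly mentions macros: any implicit search that lands on a macro-backed given triggers expansion.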

Well, the macro could really do anything it wants using IO, including modifying existing files or creating new files which contain definitions that shadow previously-used definitions, silently changing their meanings.