scala.reflect.internal.StdNames and FIPS 140

(I can’t argue this is a language bug. The scala/bug README suggests this forum as the appropriate place to raise such a feature request.)

Issue:
Scala uses MD5 in scala.reflect.internal.StdNames via the Java MD5 MessageDigest. In a FIPS 140 compliant environment, MD5 is not an allowed cryptographic algorithm, which is a problem for any compiled Scala program that uses reflection on the JVM.
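A quick way to check whether a given JVM will hand out an MD5 digest is to probe `MessageDigest.getInstance`, which throws `NoSuchAlgorithmException` when no installed provider offers the algorithm. This is only a minimal sketch: the names `DigestProbe` and `isAvailable` are made up for illustration, and how a FIPS-configured JVM actually reports the restriction may depend on the provider setup.

```scala
import java.security.{MessageDigest, NoSuchAlgorithmException}

object DigestProbe {
  // Hypothetical helper: true if some installed security provider offers `algorithm`.
  def isAvailable(algorithm: String): Boolean =
    try {
      MessageDigest.getInstance(algorithm)
      true
    } catch {
      case _: NoSuchAlgorithmException => false
    }

  def main(args: Array[String]): Unit =
    Seq("MD5", "SHA-256", "SHA-512").foreach { alg =>
      println(s"$alg available: ${isAvailable(alg)}")
    }
}
```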

Desired solution:
Use a hashing algorithm internally for Scala that is FIPS 140 compliant (e.g., SHA-256).

Additional:
I realize MD5 usage here isn’t in a cryptographic context. Java doesn’t seem to care about that distinction and disables MD5 by default for FIPS 140 compliance.

I guess the best option would be to inline the implementation, though I am not sure how much work that would be. I don’t think we want to switch to SHA-256 in this case since it would make things slower.

I don’t think performance is an issue here. But binary compatibility is, changing it would affect classfile names.

Edit: ah ah, cross post.

Having our own implementation of MD5 is even worse from a security standpoint than using the JDK implementation. At least the one in the JDK was audited. Don’t roll your own crypto!

I’m not concerned about speed in this case. This code path isn’t supposed to be called often. We could even switch to SHA-512.

The problem is that we need to make sure that we’re not breaking binary/TASTy compatibility when we do that. I’m not sure what the implications are here.

Yeah, something like

class Abcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabc {
  class Defdefdefdefdefdefdefdefdefdefdefdefdefdefdefdefdefdefdefdefdef {
    class Ghighighighighighighighighighighighighighighighighighighighighighighighighi {
      class Jkljkljkljkljkljkljkljkljkljkljkljkljkljkljkl
    }
  }
}

Produces the classfile

Abcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabc$$$$$c29b78f1186cc3726d75187a2341a7b2$$$$ghighighighi$Jkljkljkljkljkljkljkljkljkljkljkljkljkljkljkl.class

It’s a public class and its name cannot change in a minor release.
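To see why swapping the digest affects class names, here is a simplified sketch of this kind of shortening: keep the head and tail of an over-long name and replace the middle with a hex digest. It is not the code in `StdNames`. The names `CompactSketch` and `compact`, the placement of the `$$$$` marker, and the length parameters are all assumptions for illustration only.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.security.MessageDigest

object CompactSketch {
  private val Marker = "$$$$" // assumed separator, for illustration

  // Hex-encode the digest of `s` under the given algorithm (two chars per byte).
  def hexDigest(algorithm: String, s: String): String =
    MessageDigest.getInstance(algorithm)
      .digest(s.getBytes(UTF_8))
      .map(b => f"${b & 0xff}%02x")
      .mkString

  // Names within `maxLength` are kept; longer ones become head + digest + tail.
  def compact(name: String, maxLength: Int, edge: Int, algorithm: String): String =
    if (name.length <= maxLength) name
    else name.take(edge) + Marker + hexDigest(algorithm, name) + Marker + name.takeRight(edge)

  def main(args: Array[String]): Unit = {
    val longName = "Abc" * 40 + "$" + "Def" * 40
    println(compact(longName, 100, 12, "MD5"))
    println(compact(longName, 100, 12, "SHA-256"))
  }
}
```

The two printed names share the same head and tail but differ in the middle, and the SHA-256 one is 32 characters longer, so anything compiled against the old name would no longer find the class.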

I mean using our own MD5, which could just be inlined from Java’s existing one, would not really break a lot. And it wouldn’t actually be used for crypto.

It is a bit controversial, but I don’t see any alternative here.

Well, the alternative is the status quo: not to worsen what we have so that we can tick a box in a too-rigid certification process. :man_shrugging:

I think this is an important issue.

Scala is used especially in finance. There you need to tick all the compliance boxes. Whether it makes sense in context or not does not matter. These are hard regulatory requirements, enforced by law.

Copying an MD5 implementation into Scala has no implications for cryptographic security. MD5 has been verboten for that use case for over a decade anyway, as MD5 is broken. So doing that should be OK. (Especially as one could make it a private function anyway, used only for the class name hashing purpose.)

But you can’t copy the JDK implementation, not even if you translated it to Scala (as that would still form a derivative work).

Or is Scala finally going to become GPL software? I would welcome this, but I guess a few others less so…

Let’s ask a chatbot to write it, then it’s not our problem where it got it from right? :troll:

But more seriously, we can probably find an implementation with a compatible license, e.g., bc-java/core/src/main/java/org/bouncycastle/crypto/digests/MD5Digest.java at main · bcgit/bc-java · GitHub

I guess the other solution would be to not use reflection based libraries? We can still compile the programs, just not use scala-reflect jar?

Wouldn’t that be a much smaller issue for Scala 3?

My comment on the other thread was:

If they hadn’t removed -Xmax-classfile-length and if compactify were lazier, it might have been an easier workaround, depending on the application.

An application should be allowed to supply a strategy class that does whatever shortening is appropriate for the environment.

That is very little work. Even less burden if it is a -Y option.
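One shape such a hook could take is a small strategy interface that the name-shortening step delegates to. Everything below is hypothetical: neither the `NameShortener` trait nor any option to select an implementation exists in the compiler, and wiring it up to a `-Y` flag is left out.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.security.MessageDigest

object ShortenerSketch {
  // Hypothetical extension point: decides how an over-long name is shortened.
  trait NameShortener {
    def shorten(name: String): String
  }

  // Digest-based strategy, parameterised by a JCA algorithm name.
  final class DigestShortener(algorithm: String, edge: Int) extends NameShortener {
    def shorten(name: String): String = {
      val hex = MessageDigest.getInstance(algorithm)
        .digest(name.getBytes(UTF_8))
        .map(b => f"${b & 0xff}%02x")
        .mkString
      name.take(edge) + "$$$$" + hex + "$$$$" + name.takeRight(edge)
    }
  }

  // Applies the supplied strategy only when the name exceeds the limit.
  def classfileName(name: String, maxLength: Int, strategy: NameShortener): String =
    if (name.length <= maxLength) name else strategy.shorten(name)
}
```

A FIPS-constrained build could then pass `new ShortenerSketch.DigestShortener("SHA-256", 12)`, at the cost of producing class names that differ from those of a default build.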

The justification for removing -Xmax... was binary compat and not catering to a niche application.