Explicit Nulls and java.lang.String

MarkCLewis · February 13, 2022, 12:38am

I’ve been playing with the explicit null support in Scala 3. I’m working on updates to my textbook and I’m hoping to include that as a default setting for all of my examples. I’m basing this on the belief that this is the direction the language will move in the long run. For most of my examples, it doesn’t matter. Where it does come up, though, is when dealing with Java libraries, and one class in particular: java.lang.String.

I often have students do simple parsing of strings with split, but the return type of split with explicit nulls turned on is currently Array[String | Null] | Null, which requires a lot of rather verbose handling to deal with. This makes it pretty much unusable for introductory students. In theory, I could have them turn off the explicit nulls for the sections of code that do this, but that feels like the wrong approach.
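For illustration, here is a minimal sketch of the handling that type forces, assuming the code is compiled with -Yexplicit-nulls. The name fieldsVerbose is made up for this example; the point is only that both the array and each of its elements have to be checked before students get back a plain collection of strings.

```scala
// Hypothetical helper: unwraps both layers of nullability by hand.
def fieldsVerbose(line: String): List[String] =
  // Under -Yexplicit-nulls this is inferred as Array[String | Null] | Null.
  val raw = line.split(",")
  if raw == null then Nil
  else
    // Flow typing narrows raw to a non-null array in this branch,
    // but every element is still String | Null, so filter those as well.
    raw.toList.collect { case s: String => s }
```

Calling fieldsVerbose("a,b,c") gives List("a", "b", "c"), but that is a lot of ceremony for what is meant to be a one-line call.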

I’m not certain what the right approach is to get around this. Ideally, methods in java.lang.String (and possibly other classes in java.lang) that can’t return nulls would have a Scala signature that doesn’t include them. Almost as good would be extension methods with slightly different names that have appropriate types. This approach can’t be done for everything, but given that string literals in Scala use these methods, this is probably worth a special case.
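A rough sketch of what the extension-method option might look like follows. The name splitNonNull is hypothetical and not part of any library; the sketch relies on the standard .nn extension from Scala 3, which throws a NullPointerException if the value does turn out to be null.

```scala
extension (s: String)
  // Hypothetical name, chosen only for this illustration.
  // Asserts non-nullness of the array and of each element via .nn,
  // so callers see a plain Array[String].
  def splitNonNull(separator: String): Array[String] =
    s.split(separator).nn.map(_.nn)
```

With that in scope, line.splitNonNull(",") has type Array[String], so the rest of a collection-processing exercise does not have to mention Null.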

This is something I’d be interested in helping with if I knew what approach would be acceptable.

som-snytt · February 13, 2022, 3:37am

There are just a few methods on String that I can keep straight. Maybe less than a few, not counting toString: length, charAt, startsWith, endsWith, and indexOf (not counting the one that takes an index).

For operations with a pattern, I’d say use extensions on Pattern or Regex. I usually write "p".r.findAllMatchIn(s) and then I know how to use an Iterator and Match. It’s too bad Regex has a split that merely delegates.
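As a small sketch of that style, assuming comma-separated input: the pattern and the name fieldsByRegex are chosen only for this example. Because scala.util.matching.Regex is defined in Scala rather than Java, I would not expect its result types to pick up | Null under explicit nulls, though that is worth checking against the compiler version in use.

```scala
import scala.util.matching.Regex

// Matches each maximal run of characters that are not commas.
val fieldPattern: Regex = "[^,]+".r

// Hypothetical helper: collects every match as a plain String.
def fieldsByRegex(line: String): List[String] =
  fieldPattern.findAllMatchIn(line).map(_.matched).toList
```

One difference from split: this pattern skips empty fields, so fieldsByRegex("a,,b") gives List("a", "b") where "a,,b".split(",") would keep the empty string in the middle.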

In any case, I think training on String methods just makes computers feel like puzzlers. Sometimes that is exhilarating, especially for young people.

For regex operations, it’s also clearer that the string is just data, and the regex is the code.

aborg · February 13, 2022, 10:44am

Would it make sense to use the information already collected by other projects? Tool developers have already created external annotations for the JDK and common libraries, for example https://github.com/lastnpe/eclipse-null-eea-augments (the presentation linked in the readme is worth checking too). I am sure Kotlin also has some.
It would be nice if Scala could use these or similar annotations for its supported platforms.

MarkCLewis · February 13, 2022, 3:51pm

The thing is, this is for CS1. These are students who are working to understand basic logic for problem-solving. One of the goals is to keep the number of concepts we throw at them in check. We aren’t teaching them regular expressions. The line of code I have them use is line.split(","), where line is read from a file. They don’t know that the argument to split is a regular expression. For our purposes, it doesn’t matter and so that information would only obfuscate the goal of these lessons, which deal with processing collections.

@aborg I don’t know how familiar you are with the Explicit Nulls page of the Scala 3 Language Reference. That page describes the set of flags that I’m hoping to use, because I believe that they are where Scala should be heading. However, there is also interest in increasing the usage of Scala in teaching so more graduates leave school knowing Scala. I believe that for both of those to happen, commonly used methods from Java that never return null will need to have Scala-specific types that indicate that.

jducoeur · February 13, 2022, 5:22pm

This is reminding me a lot of the JavaScript library facades we use on the Scala.js side of the world, where the responsibility of the facade is to encode known information about the types of external methods – among other things, whether params or returned values can be null or undefined.

It feels like we want an enhancement to the Explicit Nulls stuff that allows you to tell the typer about extra nullity information that wasn’t in the original classfiles – basically, injecting @NonNull annotations from the outside, since the Scala side cares about this far more than the original Java side does. No clue whether there is any practical way to do that, though.

olhotak · February 13, 2022, 5:49pm

The compiler already supports reading annotations if they are in the Java code. The Explicit Nulls page of the Scala 3 Language Reference lists the 12 different Java null annotations that are supported.

The problem is that the Java standard library is not annotated. Nobody wants to maintain a fork of the JDK and even if we did, nobody wants to compile their projects with a non-standard JDK.

We’ve talked before about some format for external annotations that users could include in files separate from their Java libraries. It turns out that there are several such incompatible formats already out there. As @aborg pointed out above, Eclipse has one: https://github.com/lastnpe/eclipse-null-eea-augments. JetBrains has a different, incompatible one (External annotations, in the IntelliJ IDEA documentation). The Checker Framework has a third, also incompatible one (the Annotation File Format Specification). Section 5 of that document links to still more incompatible formats.

So, which format(s) to support? Which one(s) have the most inertia? Will another new standard overtake all of them?

A separate idea that has been brought up is to have a mode like unsafe nulls, but only for calls to Java methods. This is because the vast majority of issues when porting code to explicit nulls are in calls to Java.
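For comparison, the existing unsafe nulls mode can already be scoped to a single definition with a language import. The sketch below assumes compilation with -Yexplicit-nulls; the name fieldsUnchecked is made up for this example. Note that this relaxes null checking for everything inside the scope, not only for Java calls, which is the gap the Java-only idea is meant to close.

```scala
// Hypothetical helper: null checking is relaxed only inside this method body.
def fieldsUnchecked(line: String): Array[String] =
  import scala.language.unsafeNulls
  // Within this scope the nullable result of the Java split call
  // is accepted where a plain Array[String] is expected.
  line.split(",")
```

Code outside fieldsUnchecked is still checked as usual, so the unsafe region stays as small as the one Java call that needs it.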

sideeffffect · February 28, 2022, 8:06pm

Any is always better than none, right? JetBrains’ system could be a good candidate because it’s backed by a powerful Java/JVM focused company and integrates with Maven.

nafg · February 28, 2022, 8:24pm

Hasn’t anyone else done this? Has Kotlin?