[WIP] Scala with Explicit Nulls

What about compatibility with Option?

Ideally, something like type Option[T] = T | Null should be possible, to allow instantaneous codebase adaptation and to avoid the question “should I use Option or T | Null now?”. As far as I understand, to achieve that, null would have to behave like a monad’s empty case (just as None does), e.g. null.map(f: A => B) == null, and T | Null would need the corresponding methods (just as Some[T] has), e.g. (x: T | Null).map(f: A => B) == (if (x == null) null else f(x)).

If I’m not missing anything (some corner case for monad laws?) and introducing dummy methods on null is acceptable (not sure if that can be implemented technically), that would be a perfect drop-in replacement for Option.

Edit:
With the assumptions in my post, for x: Int | Null and
def foo(x: Int): Int | Null = null,
def bar(x: Int | Null): Int = if (x == null) 0 else 42 + x,
we get x.map(foo).map(bar) != x.map(foo andThen bar). So such a replacement is impossible (a runnable sketch of the counterexample follows the list below). That opens other questions:

  1. when should one use Option[T], and when T | Null?
  2. how would chaining like option.map(foo).map(bar).flatMap(baz) look with T | Null? One possible answer here is to use ?-like syntax. Personally, that seems worse than Option to me, as it doesn’t have the same chaining flexibility, and it promotes null usage (by providing special support for nulls).
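A minimal sketch of that counterexample, written against a hypothetical mapNull extension standing in for the imagined Option-like map (this assumes the proposal’s explicit nulls and flow typing; the asInstanceOf just keeps the sketch independent of how flow typing treats type parameters):

extension [A](x: A | Null)
  def mapNull[B](f: A => B | Null): B | Null =
    if x == null then null
    else f(x.asInstanceOf[A])          // or the proposal's .nn

def foo(x: Int): Int | Null = null
def bar(x: Int | Null): Int =
  if x == null then 0 else 42 + x      // x narrowed to Int by the proposed flow typing

val x: Int | Null = 1
val lhs = x.mapNull(foo).mapNull(bar)  // null: the inner null is absorbed by the second mapNull
val rhs = x.mapNull(foo andThen bar)   // 0: bar still sees foo's null result
// lhs != rhs, so the composition law map(f).map(g) == map(f andThen g) fails,
// and T | Null cannot be a drop-in replacement for Option.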

Unfortunately, this is not that simple. You should not use T | Null if T is a universally quantified type, because that T could be instantiated to Option[U], so that the Option[Option[U]] type would expand to U | Null | Null, which would be simplified to U | Null, meaning that Option[Option[U]] and Option[U] would be indistinguishable! (This has bad consequences for parametric code.)
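For concreteness, a small sketch of that collapse, using a hypothetical alias Opt for the T | Null encoding discussed above (not something the proposal defines):

type Opt[T] = T | Null

// With A instantiated to Opt[Int], the result type flattens:
//   Opt[Opt[Int]] = (Int | Null) | Null = Int | Null = Opt[Int]
def firstOrNull[A](xs: List[A]): Opt[A] =
  if xs.isEmpty then null else xs.head

val inner: Opt[Int] = null                       // an "empty" element
val r: Opt[Opt[Int]] = firstOrNull(List(inner))  // r == null
// We can no longer tell "the list was empty" apart from "its first element was empty".
// Option keeps these distinct: List(None).headOption == Some(None), while Nil.headOption == None.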

4 Likes

Conflating Null union and options is a bad idea and I believe @sjrd had a longish post somewhere about that. That being said, I think having .toOption on the null union (probably via an extension method) would be quite reasonable.
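One possible shape for such an extension (a hypothetical sketch, not part of the proposal; the asInstanceOf avoids relying on how flow typing treats type parameters):

extension [T](x: T | Null)
  def toOption: Option[T] =
    if x == null then None
    else Some(x.asInstanceOf[T])

// usage against a Java API that comes back nullable under the proposal:
val home: Option[String] = System.getenv("HOME").toOption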

8 Likes

The answer is, you should never use T|Null unless outside interoperability forces you to, or you need to do non-premature micro-optimization. Option should remain Option. It was never meant to be a safe replacement for null, but a safe construct that avoids the need for null.

2 Likes

That can be written:

(for {
   ret  <- Option(someJavaMethod())
   tmp  <- Option(ret.trim())
   tmp2 <- Option(tmp.substring(2))
 } yield tmp2.toLowerCase()
).get

But @olhotak’s point remains about migration problems.

I also think it would be useful to generalize the flow-sensitive typing, or at least leave the door open to future generalization; that is, implement it generally (which does not seem much harder, as pointed out by @odersky), even if at first it is only enabled for null checks.

I think you’re talking about this longish post: SIP Suggestion: Add ?: and ?. syntactic sugar for more convenient Option[T] usage :smile:

3 Likes

Thanks for the detailed and thoughtful reply!

Equality

I had Dotty (as opposed to scalac) in mind when I wrote this. Because of multiversal equality, some equality comparisons are disallowed in Dotty:

scala> 1 == "hello"
1 |1 == "hello"
  |^^^^^^^^^^^^
  |Values of types Int and String cannot be compared with == or !=

The current rule for null says: “allow equality comparisons with null if the value compared isn’t an AnyVal” (dotty/compiler/src/dotty/tools/dotc/typer/Implicits.scala at main · lampepfl/dotty · GitHub). What I wanted to communicate was that the rule should remain unchanged, even though reference types are now non-nullable.
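To illustrate the rule as stated (a sketch of the current behavior being described, not a proposal):

val s: String = "abc"
val b1 = s == null      // allowed: String is a reference type, not an AnyVal
// val b2 = 1 == null   // rejected: Int is an AnyVal, so comparing it with null is an error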

I agree with both your points, and I quite like (x: T | Null) == null: it makes it explicit that something has gone wrong with the supposedly non-nullable value x.

That said, even if unsound initialization isn’t a good-enough reason for allowing equality comparisons with null, backwards compatibility might be. I searched for places in the Dotty community build where there are equality comparisons involving null, and eq null and ne null seem to be quite common:

  • == null: 0
  • != null: 0
  • eq null: 582
  • ne null: 469

Full list of occurrences here: Equality comparisons involving null in Dotty community build · GitHub

Searching all public repos in Github shows many hits as well: Code search results · GitHub

So there seem to be two options here:

  1. Allow both ==/!= and eq/ne on null (both as an argument and receiver). This is backwards compatible, but seems to require the introduction of the magic RefEq trait (magic because it’s erased to Object).
  2. Allow only ==/!=. This has the advantage that we avoid RefEq, but then we need a rewrite rule that converts all the occurrences of eq/ne null above to ==/!=, respectively.

I don’t have a strong opinion either way.

Working with Null

I agree with both of your suggestions:

  1. rename .nn to .!!, which is consistent with Kotlin (https://kotlinlang.org/docs/reference/null-safety.html#the--operator). The only question would be whether !! is already in use by a popular library. (A sketch of what this helper does follows the list.)
  2. put the array implicit conversions behind an import. This is already the case, it just wasn’t mentioned in the doc: https://github.com/abeln/dotty/blob/explicit-null/library/src-bootstrapped/scala/NonNull.scala
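Hedged sketch of what the helper in (1) does (the actual definition in the PR may differ; the exception message and the asInstanceOf are mine):

extension [T](x: T | Null)
  def nn: T =
    if x == null then throw new NullPointerException("tried to strip nullability from a null value")
    else x.asInstanceOf[T]

val s: String | Null = "hello"
val t: String = s.nn       // ok: s is non-null here

val bad: String | Null = null
// bad.nn                  // would throw a NullPointerException at runtime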

Nullification Function

Like @smarter said, nf(A & B) is needed because it’s used in Java generics. I can’t quite reproduce the example that uses it, but it gets added here dotty/compiler/src/dotty/tools/dotc/core/classfile/ClassfileParser.scala at main · lampepfl/dotty · GitHub

nf(A | B) is not really needed and can be removed.
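To make nf a bit more concrete, here is roughly what it does to Java members under the proposal (illustrative only; JavaNull is the proposal’s alias marking Java-originated nullability):

val m = new java.util.HashMap[String, String]()
// Java signature: V get(Object key)
val v = m.get("k")           // typed as String | JavaNull: reference results are nullified

// value types are left alone:
val i = "abc".indexOf("b")   // Int, not Int | JavaNull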

JavaNull

I agree with @smarter and your suggestion that users should be able to write down JavaNull, so we’ll lift the restriction.

Flow-sensitive Type Inference

As per the usage stats above, the

if (x ne null) {
// do something with x, access its fields, etc
}

pattern seems quite common. See for example

(these are just arbitrary examples off github search)

Unfortunately, the type inference isn’t able to handle some of the usages: for example, if they involve a non-stable path:
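For instance (a hypothetical sketch, not one of the occurrences found in the search; assumes the proposed flow-sensitive typing):

class Box:
  var value: String | Null = null

def len(b: Box): Int =
  if b.value != null then
    // b.value.length        // rejected: b.value is not a stable path, so it is not
    //                       // narrowed to String inside the branch
    0
  else 0

// the usual workaround is to bind the field to a local val, which is stable:
def lenFixed(b: Box): Int =
  val v = b.value
  if v != null then v.length else 0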

I don’t have a good sense for what percentage of the usages the type inference can handle (and hence how much value we get from it), but from what I’ve seen so far I lean towards saying we do need it. Can you think of a different way to migrate/rewrite that usage pattern?

I like that this generalizes well. Will prototype it in the current PR.

1 Like

One section I’d particularly like to get feedback on is the binary compatibility one: https://gist.github.com/abeln/9f79774bac111d99b3ae2cb9016a33e6#binary-compatibility

To restate our approach here: when loading Scala code compiled with a pre-explicit-null compiler, we leave the types unchanged. That is, we don’t apply the nf function above to Scala types (only Java types).

This has the nice property that you can update your code with minimal changes to the explicit-nulls world, before your dependencies have updated.

Notice that the “unit of update” is whatever sources are in your build. In particular, it’s not possible to have part of a project with explicit nulls and the other part with implicit nulls. It’s also not possible to decide that some dependencies will be imported in “strict” mode (explicit null) while others won’t. I think to the user this makes for a conceptually-simpler model of what types mean, but there’s less granularity/control over the feature.

Do people have any concerns/ideas around binary compatibility?

It’s used by scala.sys.process, which is bundled with the standard library:

@ import scala.sys.process._
import scala.sys.process._

@ "ls".!!
res4: String = """LICENSE
build.sc
out
readme.md
requests
"""

Along with the following other operators:

!   !!  !!< !<  ### #&& #<  #>  #>> #|  #|| %   %%

I’d be in favor of deprecating scala.sys.process and/or moving it into a separate optional module. The code is awful, the API is crazy, and it has had approximately 0 progress made since it was merged into scala/scala 9 years ago un-reviewed.

8 Likes

The first step would be to add non-symbolic aliases for all these operators; I think a PR doing that would be accepted (though maybe not with RC1 so close now): https://github.com/scala/bug/issues/11133
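For illustration, aliases could be as thin as wrappers over the existing symbolic operators (the names below are made up, not taken from the linked issue):

import scala.sys.process._

extension (p: ProcessBuilder)
  def exitCode: Int    = p.!    // non-symbolic alias for !
  def outputOf: String = p.!!   // non-symbolic alias for !!

val code = Process("ls").exitCode
val out  = Process("ls").outputOf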

While the work on better interoperability with Java and easier migration of legacy code is still in progress, I’m wondering whether it would be desirable, and possible, to produce only warnings for null-related type errors?

I conjecture that warnings instead of errors will make the system more friendly to programmers and make migration of legacy code easier.

In the compiler, type mismatches are usually reported as errors. However, it seems a small tweak could make it report warnings instead for null-related type mismatches.

Hundreds of warnings are not programmer-friendly; what’s needed is a way to turn the checks on/off, perhaps with a language import.

Project-level and source-file-level language feature switches are definitely helpful; AFAIK they are planned. Even with all these switches, I still think warnings are better than errors.

Warnings are not acceptable here, because nullability being part of the type means that it has to be taken into account for overload resolution and implicit search. Different results can each be obtained successfully, rather than there being a single failure that could be reported as a warning.

No, anything that changes the type of things in a way that is visible to ad hoc polymorphism must report errors on failure.
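A hypothetical sketch of the kind of divergence meant here (assumes explicit nulls is enabled; the type class and names are made up):

trait Show[T]:
  def show(t: T): String

given Show[String] with
  def show(t: String): String = t

given Show[String | Null] with
  def show(t: String | Null): String = if t == null then "<null>" else t

def render[T](t: T)(using s: Show[T]): String = s.show(t)

val a: String = "hi"
val b: String | Null = null
render(a)   // selects Show[String]
render(b)   // selects Show[String | Null]
// Both calls succeed, just with different instances. If nullability were only a
// warning, there would be no error to downgrade: the program would silently
// resolve differently.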

2 Likes

We already loosen the typing rule by allowing selection on T | JavaNull for usability. I conjecture there are more places where we could report warnings instead of errors without impacting overload resolution and implicit search.
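For reference, “selection on T | JavaNull” refers to the unsound-but-convenient rule that lets Java-originated nullable values be used directly (illustrative; exact types per the proposal):

val path = System.getProperty("user.home")   // String | JavaNull under the proposal
val n = path.length                          // selection is allowed despite the possible null,
                                             // and may throw a NullPointerException at runtime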

To some extent, we already embrace the idea that typing errors related to null should be suppressed for friendliness (e.g. selection on T | JavaNull). A natural extension of that idea is to report warnings instead of errors in more places; how to do so seems to be a matter of technical detail.

I think the difference here, w.r.t. other cases which would be reported as errors from now on, is that these cases stem from three causes:

  • You’re doing something improper within your own code.
  • You’re doing something improper while using an explicit-null Scala library.
  • You’re doing something improper while using a pre-explicit-null Scala library.

In case 1, you have full control over it and you must fix it.
In case 2, you’re using a library that had its nullness laid out explicitly (although maybe not as intended, e.g. a forgotten |Null on a parameter). You can work around that, and maybe file an issue.
In case 3, you’ll wait for the library to move to explicit nulls; for parameters, (2) may partially apply.

Now with Java, you are on a different level (unless you work in a mixed-source codebase): you use a library that you do not have control over, and that doesn’t support any kind of explicit nulls. Yes, the Checker Framework and similar tools exist (and it seems explicit nulls might take advantage of them), but they are optional. If and when explicit nulls land, they are there and will stay. Scala on that version won’t work without the nullabilities anymore, but for Java and its tools, that won’t ever be the case. Hence relaxing for usability there, and only there, seems like the right decision to me.

1 Like

I think this is something we should consider only if the migration turns out to be too painful/impossible when explicit nulls cause errors. We’ll have a better idea on this as we migrate the standard library and others in the community build (we’re currently working on this).

In my experience (albeit in other languages) warnings tend to be ignored, particularly if there are lots of them (which is the hypothetical that motivates the warnings-instead-of-errors approach in the first place).

Additionally, it looks like many projects use “-Xfatal-warnings” (https://github.com/search?l=Scala&q=-Xfatal-warnings&type=Code), so if that particular flag is indeed popular, warnings won’t help.

4 Likes

An option to get warnings would still help, especially in big codebases. Those projects can drop -Xfatal-warnings when migrating to a new Scala version, especially one with so many changes, and re-enable it once the migration is done. The quicker you get to something that compiles and runs with the new compiler, the quicker you can test whether the code still works or you made a mistake, and start addressing a few warnings at a time.

1 Like

@abeln: What will this mean for Some(null)?

And if that’s going away, what will it mean for optimizing Option in general? I think there were efforts to do just that, but Some(null) made it impossible.