[WIP] Scala with Explicit Nulls

Dveim · January 20, 2019, 10:45am

What about compatibility with Option?

Ideally, something like type Option[T] = T|Null should be possible, to have instantaneous codebase adaptation and to avoid the question “should I use Option or T|Null now?”. As far as I understand, to achieve that, null should behave as a monad (just as None), e.g. null.map(f: A => B) == null, and T|Null should have the corresponding methods (just as Some[T]), e.g. (x: T|Null).map(f: T => K) == if (x == null) null else f(x).

If I’m not missing anything (some corner case for monad laws?) and introducing dummy methods on null is acceptable (not sure if that can be implemented technically), that would be a perfect drop-in replacement for Option.

Edit:
With the assumptions in my post, for x: Int | Null and
def foo(x: Int): Int | Null = null,
def bar(x: Int | Null): Int = if (x == null) 0 else 42 + x,
x.map(foo).map(bar) != x.map(foo andThen bar): for a non-null x, the left side is null, while the right side is bar(null) == 0. So such a replacement is impossible. That raises further questions:

when should one use Option[T], and when T | Null?
what would chaining like option.map(foo).map(bar).flatMap(baz) look like with T | Null? One possible answer here is to use ?-like syntax. Personally, that seems worse than Option to me, as it doesn’t have the same chaining flexibility, and it promotes the use of nulls (by providing special support for them).
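The counterexample above can be checked in plain Scala. This is only a sketch: mapNullable is a hypothetical helper (not part of any proposal or library), and boxed java.lang.Integer stands in for Int | Null so that no union types or compiler flags are needed.

```scala
object NullAsNone {
  // Hypothetical helper, not a real API: a `map` that treats null like None.
  def mapNullable[A >: Null <: AnyRef, B >: Null <: AnyRef](x: A, f: A => B): B =
    if (x == null) null else f(x)

  // Boxed Integer is an assumed stand-in for Int | Null.
  val foo: Integer => Integer = _ => null
  val bar: Integer => Integer =
    x => Integer.valueOf(if (x == null) 0 else 42 + x.intValue)

  val x: Integer = Integer.valueOf(1)

  // Mapping step by step: foo yields null, so bar is never applied.
  val stepwise: Integer = mapNullable(mapNullable(x, foo), bar)

  // Mapping the composition: bar does see the null produced by foo.
  val composed: Integer = mapNullable(x, foo andThen bar)

  // Option keeps the two in agreement, because Some(null) is still a Some.
  val optionStepwise: Option[Integer] = Option(x).map(foo).map(bar)
  val optionComposed: Option[Integer] = Option(x).map(foo andThen bar)
}
```

Here stepwise is null while composed is 0, whereas the two Option values are equal; that is the composition law Option satisfies and the null-based encoding does not.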

julienrf · January 20, 2019, 1:10pm

Unfortunately, this is not that simple. You should not use T | Null if T is a universally quantified type, because that T type could be instantiated to Option[U], so that the Option[Option[U]] type would expand to U | Null | Null, which would be simplified to U | Null, meaning that Option[Option[U]] and Option[U] would be indistinguishable! (This has bad consequences for parametric code.)
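A small sketch of what the collapse costs in practice, using a Map lookup as the parametric code. The settings map and the lookupNullable helper are made up for illustration; lookupNullable mimics what the lookup would return if Option[T] were T | Null.

```scala
object NestedOptional {
  // The key "proxy" is present, with an explicit "no value"; "timeout" is absent.
  val settings: Map[String, Option[String]] = Map("proxy" -> None)

  // With a real Option, the two situations stay distinguishable.
  val present: Option[Option[String]] = settings.get("proxy")   // Some(None)
  val missing: Option[Option[String]] = settings.get("timeout") // None

  // Hypothetical null-based encoding of the same lookup: both layers
  // of optionality flatten into a single null.
  def lookupNullable(key: String): String =
    settings.get(key) match {
      case Some(Some(value)) => value
      case Some(None)        => null // key present, no value
      case None              => null // key absent
    }
}
```

With Option, present and missing differ; with the null-based lookup, "proxy" and "timeout" both yield null, so the caller can no longer tell an absent key from a key that is explicitly unset.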

Krever · January 20, 2019, 1:59pm

Conflating Null unions and Options is a bad idea, and I believe @sjrd had a longish post somewhere about that. That being said, I think having .toOption on a null union (probably via an extension method) would be quite reasonable.
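For illustration, such an extension could look roughly like the following in today's Scala. This is a sketch only: NullableSyntax and NullableOps are hypothetical names, and the receiver is an ordinary (implicitly nullable) reference type rather than a real T | Null.

```scala
object NullableSyntax {
  // Hypothetical extension: adds .toOption to any reference value.
  // As a value class it adds no wrapper allocation at the call site.
  implicit class NullableOps[T <: AnyRef](private val value: T) extends AnyVal {
    def toOption: Option[T] = Option(value) // Option(null) is None
  }
}

object NullableSyntaxDemo {
  import NullableSyntax._

  // A null reference becomes None instead of failing on the method call.
  val absent: Option[String] = (null: String).toOption

  // A non-null reference becomes Some, after which the usual Option chaining applies.
  val cleaned: Option[String] = "  Hello ".toOption.map(_.trim).map(_.toLowerCase)
}
```

The point of the design is that the conversion is explicit: the null union and Option remain separate types, and the caller opts into Option at the boundary.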

nafg · January 20, 2019, 2:07pm

The answer is, you should never use T|Null unless outside interoperability forces you to, or you need to do non-premature micro-optimization. Option should remain Option. It was never meant to be a safe replacement for null, but a safe construct that avoids the need for null.

LPTK · January 20, 2019, 5:36pm

Without JavaNull, the chaining becomes too cumbersome

val ret = someJavaMethod()
val s2 = if (ret != null) {
  val tmp = ret.trim()
  if (tmp != null) {
    val tmp2 = tmp.substring(2)
    if (tmp2 != null) {
      tmp2.toLowerCase()
    }
  }
}
// Additionally, we need to handle the `else` branches.

That can be written:

(for {
   ret  <- Option(someJavaMethod())
   tmp  <- Option(ret.trim())
   tmp2 <- Option(tmp.substring(2))
 } yield tmp2.toLowerCase()
).get

But @olhotak’s point remains about migration problems.

I also think it would be useful to generalize the flow-sensitive typing, or at least leave the door open to future generalization; that is, implement it generally (which does not seem much harder, as pointed out by @odersky), even if at first it is only enabled for null checks.

sjrd · January 20, 2019, 8:57pm

I think you’re talking about this longish post: SIP Suggestion: Add ?: and ?. syntactic sugar for more convenient Option[T] usage

abeln · January 21, 2019, 11:44pm

Thanks for the detailed and thoughtful reply!

Equality

I had Dotty (as opposed to scalac) in mind when I wrote this. Because of multiverse equality, some equality comparisons are disallowed in Dotty

scala> 1 == "hello"
1 |1 == "hello"
  |^^^^^^^^^^^^
  |Values of types Int and String cannot be compared with == or !=

The current rule for null says: “allow equality comparisons with null if the value compared isn’t an AnyVal” (dotty/compiler/src/dotty/tools/dotc/typer/Implicits.scala at main · lampepfl/dotty · GitHub). What I wanted to communicate was that the rule should remain unchanged, even though reference types are now non-nullable.

I agree with both your points, and I quite like (x: T | Null) == null: it makes it explicit that something’s gone off with the supposedly non-nullable value x.

That said, even if unsound initialization isn’t a good-enough reason for allowing equality comparisons with null, backwards compatibility might be. I searched for places in the Dotty community build where there are equality comparisons involving null, and eq null and ne null seem to be quite common:

== null: 0
!= null: 0
eq null: 582
ne null: 469

Full list of occurrences here: Equality comparisons involving null in Dotty community build · GitHub

Searching all public repos in Github shows many hits as well: Code search results · GitHub

So there seem to be two options here:

Allow both ==/!= and eq/ne on null (both as an argument and receiver). This is backwards compatible, but seems to require the introduction of the magic RefEq trait (magic because it’s erased to Object).
Allow only ==/!=. This has the advantage that we avoid RefEq, but now we need a rewrite rule that converts all the occurrences of eq/ne null above to ==/!=, respectively.

I don’t have a strong opinion either way.

Working with Null

I agree with both of your suggestions:

rename .nn to .!!, which is consistent with Kotlin (https://kotlinlang.org/docs/reference/null-safety.html#the--operator). The only question would be whether !! is already in use by a popular library.
put the array implicit conversions behind an import. This is already the case, it just wasn’t mentioned in the doc: https://github.com/abeln/dotty/blob/explicit-null/library/src-bootstrapped/scala/NonNull.scala

Nullification Function

Like @smarter said, nf(A & B) is needed because it’s used in Java generics. I can’t quite reproduce the example that uses it, but it gets added here dotty/compiler/src/dotty/tools/dotc/core/classfile/ClassfileParser.scala at main · lampepfl/dotty · GitHub

nf(A | B) is not really needed and can be removed.

JavaNull

I agree with @smarter and your suggestion that users should be able to write down JavaNull, so we’ll lift the restriction.

Flow-sensitive Type Inference

As per the usage stats above, the

if (x ne null) {
  // do something with x, access its fields, etc
}

pattern seems quite common. See for example

(these are just arbitrary examples off github search)

Unfortunately, the type inference isn’t able to handle some of the usages: for example, if they involve a non-stable path:

I don’t have a good sense for what percentage of the usages the type inference can handle (and hence how much value we get from it), but from what I’ve seen so far I lean towards saying we do need it. Can you think of a different way to migrate/rewrite that usage pattern?
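To make the non-stable path case concrete, here is a sketch in plain Scala (no explicit-nulls flag). Holder, lengthUnsafe and lengthSafe are invented names; the assumption illustrated is that a null check on a var cannot be trusted later, because the var may be reassigned between the check and the use, while a check on a local val can.

```scala
object StablePathDemo {
  final class Holder(var name: String)

  // h.name is a var, so it is not a stable path: the check and the use
  // read the field separately, and it may have changed in between.
  def lengthUnsafe(h: Holder, interfere: () => Unit): Int =
    if (h.name ne null) { interfere(); h.name.length } else 0

  // Copying into a local val gives a stable path: the checked value is
  // exactly the value that is used afterwards.
  def lengthSafe(h: Holder, interfere: () => Unit): Int = {
    val n = h.name
    if (n ne null) { interfere(); n.length } else 0
  }

  private val h1 = new Holder("abc")
  val safeResult: Int = lengthSafe(h1, () => { h1.name = null })

  private val h2 = new Holder("abc")
  // The unchecked second read hits null and throws a NullPointerException.
  val unsafeFails: Boolean =
    scala.util.Try(lengthUnsafe(h2, () => { h2.name = null })).isFailure
}
```

One possible rewrite for such usages is therefore the lengthSafe shape: bind the value to a val first, then test and use the val.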

abeln · January 21, 2019, 11:45pm

I like that this generalizes well. Will prototype it in the current PR.

abeln · January 22, 2019, 12:05am

One section I’d particularly like to get feedback on is the binary compatibility one: https://gist.github.com/abeln/9f79774bac111d99b3ae2cb9016a33e6#binary-compatibility

To restate our approach here: when loading Scala code compiled with a pre-explicit-null compiler, we leave the types unchanged. That is, we don’t apply the nf function above to Scala types (only Java types).

This has the nice property that you can update your code with minimal changes to the explicit-nulls world, before your dependencies have updated.

Notice that the “unit of update” is whatever sources are in your build. In particular, it’s not possible to have part of a project with explicit nulls and the other part with implicit nulls. It’s also not possible to decide that some dependencies will be imported in “strict” mode (explicit null) while others won’t. I think to the user this makes for a conceptually-simpler model of what types mean, but there’s less granularity/control over the feature.

Do people have any concerns/ideas around binary compatibility?

lihaoyi · January 22, 2019, 1:41am

It’s used by scala.sys.process, which is bundled with the standard library

@ import scala.sys.process._
import scala.sys.process._

@ "ls".!!
res4: String = """LICENSE
build.sc
out
readme.md
requests
"""

Along with the following other operators:

!   !!  !!< !<  ### #&& #<  #>  #>> #|  #|| %   %%

I’d be in favor of deprecating scala.sys.process and/or moving it into a separate optional module. The code is awful, the API is crazy, and it has had approximately 0 progress made since it was merged into scala/scala 9 years ago un-reviewed.

smarter · January 22, 2019, 10:39am

The first step would be to add non-symbolic aliases for all these operators, I think a PR doing that would be accepted (though maybe not with RC1 so close now): https://github.com/scala/bug/issues/11133

liufengyun · January 22, 2019, 8:16pm

While the work is still in progress for better interoperability with Java and easier migration of legacy code, I’m wondering whether it would be good, and possible, to only produce warnings for null-related type errors.

I conjecture that warnings instead of errors will make the system more friendly to programmers and make migration of legacy code easier.

In the compiler, type mismatches are usually reported as errors. However, it seems some small tweak is possible to report warnings for null-related type mismatches.

smarter · January 22, 2019, 8:27pm

Hundreds of warnings are not programmer friendly; what’s needed is a way to turn the checks on and off, perhaps with a language import.

liufengyun · January 22, 2019, 8:52pm

Project-level and source-file level language feature switches are definitely helpful, AFAIK they are planned. Even with all these switches, I still think warnings are better than errors.

sjrd · January 23, 2019, 2:57am

Warnings are not acceptable here, because nullability being part of the type means that it will have to be taken into account for overload resolution and implicit search. Different results can be obtained successfully, rather than one failure which could be reported as a warning.

No, anything that changes the type of things in a way that is visible to ad hoc polymorphism must report errors on failure.

liufengyun · January 23, 2019, 8:28am

We already loosen the typing rule by allowing selection on T | JavaNull for usability. I conjecture there are more places that we could report warnings instead of errors without impacting overload resolution and implicit search.

To some extent, we already embrace the idea that typing errors related to null should be suppressed for friendliness (e.g. selection on T | JavaNull). A natural extension of the idea is to report warnings instead of errors in more places. How to do this in more places seems to be a matter of technical detail.

Adowrath · January 23, 2019, 10:41am

I think the difference here, w.r.t. other cases which would be reported as errors from now on is that these cases stem from three causes:

You’re doing something improper within your own code.
You’re doing something improper while using an explicit-null Scala library.
You’re doing something improper while using a pre-explicit-null Scala library.

In case 1, you have full control over it and you must fix it.
In case 2, you’re using a library that had its nullness laid out explicitly (although maybe not as intended, e.g. it forgot a |Null for a parameter). You can work around that, and maybe file an issue.
In case 3, you’ll wait for the library to move to e-null, and for parameters, (2) may partially apply.

Now with Java, you are on a different level (unless you work in a mixed-source codebase): You use a library that you do not have control over, and that doesn’t support any kind of explicit-null. Yes, the Checker Framework and similar tools exist (and it seems explicit-null might take advantage of that), but they are optional. If and once explicit-null lands, it’s there and it will stay. Scala on that version won’t work without the nullabilities anymore, but for Java and its tools, that won’t ever be the case. Hence relaxing for usability there, and only there, seems to be the right decision for me.

abeln · January 23, 2019, 6:11pm

I think this is something we should consider only if the migration turns out to be too painful/impossible when explicit nulls cause errors. We’ll have a better idea on this as we migrate the standard library and others in the community build (we’re currently working on this).

In my experience (albeit in other languages) warnings tend to be ignored, particularly if there are lots of them (which is the hypothetical that motivates the warnings-instead-of-errors approach in the first place).

Additionally, it looks like many projects use “-Xfatal-warnings” (https://github.com/search?l=Scala&q=-Xfatal-warnings&type=Code), so if indeed that particular flag is popular warnings won’t help.

Blaisorblade · February 2, 2019, 9:25pm

An option to get warnings will still help, especially in big codebases. Those projects can still drop -Xfatal-warnings when migrating to a new Scala version, especially one with so many changes, and eventually get back to -Xfatal-warnings. The quicker you get to something that compiles and runs with the new compiler, the quicker you can test whether the code still works or you made a mistake, and start addressing a few warnings at a time.

amsayk · February 5, 2019, 11:49am

@abeln : What will this mean for Some(null)?

And if that’s going away, what will it mean for optimizing Option in general? I think there were efforts to do just that, but Some(null) made it impossible.
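For readers unfamiliar with the issue, this is how Some(null) behaves in the current standard library (a small sketch; SomeNullDemo is just an illustrative name):

```scala
object SomeNullDemo {
  // Some.apply does not inspect its argument: this is a defined Option holding null.
  val wrapped: Option[String] = Some(null)

  // Option.apply normalizes null to None.
  val normalized: Option[String] = Option(null)

  val wrappedIsDefined: Boolean = wrapped.isDefined  // true
  val normalizedIsEmpty: Boolean = normalized.isEmpty // true
}
```

Since Some(null) and None are different values today, an encoding that represents None as a bare null has no separate representation left for Some(null), which is presumably the obstacle to the optimization mentioned above.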