Explicit nulledness vs. requiring proof-of-property/flow typing

megri · May 10, 2021, 4:24pm

I hang out on reddit quite a bit, probably—or decidedly if I’m being honest—too much. Today a user mused about the nullability of Option, and of how it’s weird that there’s actually three possible states: Some(x), None, and of couse null. Here’s the thread for anyone curious: https://www.reddit.com/r/scala/comments/n8z3lm/scala_noob_question_parameter_of_type_option_why/

Someone brought up the explicit nulls feature of Scala 3 and it got me thinking:
What if, instead of dictating that a variable can not hold null, we declare that a value must fulfil the property of not being null.

They’re quite similar but not the same. Proof-of-property can be used to implement explicit nulls but also more general flow typing. In essence, proof-of-property is less restrictive and more open-ended, but possibly also more inconvenient for the user. Implementation-wise they’re really just a form of tagged erased types.

opaque type Proof[T, P <: Property] = T

def strLen(x: Proof[String, NotNull]): Int = …

def sqrt(x: Proof[Double, NonNegative & NotNull]): Double = …

Obviously this becomes a bit wordy in its raw form without syntactic sugar.

These proofs could be generated automatically by the compiler as the result of an equality-check, much like what (I believe) happens for explicit nulls at the moment. The effective difference then being that instead of stripping away type information after a check, it is being added.

This scheme can work in tandem with explicit nulls by declaring that a type declaration x: String = "hello" is actually x: Proof[String, NotNull] = "hello".

Disclaimer: I’m aware of refinement types and GitHub - fthomas/refined: Simple refinement types for Scala. These thoughts are a play with the idea of how it could function in a more compiler-integrated setting. A lot of important stuff is left out: use-site technicalities and erasure; how a proof actually enables specific operations on a value; if and if so how they could be user-defined; etc. Also I am not formally educated so everything I’ve just said could be the worst idea ever.

martijnhoekstra · May 10, 2021, 4:41pm

Flow typing as compiler generated proofs sound like the obvious scala way.

A non-null proof generated by a null test could maybe also be an implicit conversion, if a: A | Null then if a null test could generate a given Conversion[a.type, A], maybe everything falls in to place automatically.

RubenVerg · May 10, 2021, 5:10pm

And the dreaded Some(None)

ysthakur · May 10, 2021, 7:35pm

But in that case, the option would be known to be of type Option[Option[?]] anyway.

sjrd · May 10, 2021, 9:53pm

Some(null) is just a special case of Some(x) for x == null. And it’s not even actually special. Particularly when your x is of an unconstrained type parameter A, in which case you don’t know what meaning your user assigns to null.

Don’t treat Some(null) as if it’s somehow a special evil thing. It’s normal.

morgen-peschke · May 10, 2021, 10:22pm

I wouldn’t say it’s normal, as it circumvents the expectations of the Option class. Granted, it’s not a particularly special evil thing, but it’s not exactly virtuous to circumvent a class invariant

sjrd · May 10, 2021, 10:24pm

There is no invariant in Option that requires the value in a Some not to be null.

morgen-peschke · May 10, 2021, 10:31pm

Fair enough, an assumed class invariant in this case.

I think it would be a conservative estimate to say that passing Some(null) as an argument would break at least 90% of code written using Option in the wild.

sjrd · May 11, 2021, 3:05am

That’s a strawman argument. I’m sure at least 90% of code manipulating Lists would break if one of their elements was null as well.

You have to consider the local contracts of particular Option[A]s and List[A]s. For each one, you have to ask locally the question: is my A here something that can be null or not? Not just because it’s a reference type in Scala 2. You have to consider a world where we have explicit nulls: would I use String here or String | Null? In the latter case, your code is broken of it doesn’t handle null. And if it’s an unconstrained type parameter T, then that can also be null because your user could instantiate T to String | Null. If your code breaks in that case for null values, whether it’s in an Option or in a List, your code is broken. And that has nothing to do with any assumed invariant of Option.

morgen-peschke · May 11, 2021, 2:40pm

It’s not a strawman so much as I could have been more precise. If you adjust Option#apply to return Some(null) instead of None, 90% of the code which manipulates Option in the wild (rather than the stuff stored inside them) breaks because the class itself breaks.

List doesn’t have that issue, because it doesn’t imply anything values being present or not, List(null, "", null) might not be particularly idiomatic Scala, but it doesn’t fundamentally break expectations about how the class is supposed to behave. Option on the other hand is explicitly about values which may or may not be present, so if Option(null) starts returning Some(null), things are going to start breaking in really unpleasant ways (my guess would be that pattern matching in particular would get really ugly).

An analogous example would be a method which returns a Try[A], but can also throw nonfatal exceptions outside of the Try. It’s not special kind of hell levels of evil, but it’s definitely not normal either - and that’s the crux of my point. If you run into Some(null), the smart money is on “something has gone horribly wrong” rather than “this is normal.”

sjrd · May 11, 2021, 2:55pm

Now you’re talking about the possibility of changing the contract of Option.apply(). That would break people’s code because they rely on the contract of Option.apply(), not because they rely on a particular invariant about the Option class itself. That is of course a big no-no.

(I argue that the problem here is Option.apply should have been named Option.fromNullable all along, but that is another debate, and probably not fixable at this point.)

morgen-peschke · May 11, 2021, 3:22pm

Modifying Option.apply was a useful thought experiment as it would be a quick way to inject a bunch of unexpected Some(null) values, not something I’d suggest actually doing, sorry that wasn’t clear. I found it useful because almost all code that manipulates Option values assumes that’s how they’re made, and the rare times that Some.apply is used, it’s treated as as assertion that the null check provided by Option.apply can be skipped due to out of band information.

Possibly this is a fault of documentation, but the contract of Option.apply is treated as the contract of Option (probably because it’s the de-facto primary entry point for Option), and I would posit that if a PR to the documentation were created that added a “by the way, don’t assume the values inside Option aren’t null”, you’d get a bunch of pushback because the general understanding is that’s not how Option works.

Coming back to this - if you have to ask this every time you use one, then Option serves no practical purpose, and you might as well go back to explicit null checks everywhere. Critically, I don’t think I’ve ever seen code like this, and it would certainly not be normal:

opt.map {
  case x if x == null => ...
  case x => ...
}

RubenVerg · May 11, 2021, 3:39pm

It does make sense in very specific places, like Java interop mostly

mpilquist · May 11, 2021, 3:41pm

For what it’s worth, my experience is completely opposite. In my experience, Some.apply is by far the most common way to introduce values of Option and any use of Option.apply is an explicit signal that someone is worried about nullability.

sjrd · May 11, 2021, 3:46pm

That’s not how they’re made in the standard library. someMap.get(key)? someOption.map(_ => null)? They will happily return Some(null) to you.

Option(x) is only meaningful if x == null means an absent value and that you want to translate it into None. That’s true when you use it on something that you receive from Java, for example. It is true at the very point of doing Option(x) because you know something about x. In most Scala-only code, this is not what you want. It is most certainly never what you want if your x is of an unconstrained type parameter, because that also means you don’t know what null means for that value. That’s why it should be called Option.fromNullable(x).

RubenVerg · May 11, 2021, 3:55pm

If you don’t know what it means, you first wrap it in an Option, or an Either, or whatever. Some(Some(whatever)) or Some(None), that way you are still representing the presence of a value while not considering the bad facets of null

morgen-peschke · May 11, 2021, 3:57pm

I would be very curious for an example where null means something other than “no value,” or where you would actually want Some(null) instead of None.

I guess another way of saying this is that, you are asserting that Some(null) is normal, so please produce some example cases of where Some(null) does not indicate the presence of a bug.

RubenVerg · May 11, 2021, 4:12pm

Potentially, you might argue in some OS APIs return values are null and you need to use a specific method to get an exception that was thrown, in that case, you might want to not do Option(result), avoid the allocation and directly go if (result != null && !os.threw()) result else os.getError()

AMatveev · May 11, 2021, 4:17pm

For example in some cache:

Some(null) - value has not set
None - value has not queried

Of course, It very depends on domain area. But ‘Null’ and ‘not exists’ is just different in my practice.

morgen-peschke · May 11, 2021, 7:20pm

Fair enough, by far the most common case for Option in my day-to-day is from JSON/Protobuf payloads or Java interop. I do see Some.apply (or rather .some) used frequently in tests to mock the above, but by a significant margin it’s stuff coming in from the outside and would generally be modeled as null in Java/Ruby/C++/etc.

I think it is worth noting that, even if the perspective is flipped, it sounds like the code you work with still follows the semantics of Option.apply wrt null, which would be entirely consistent with what I’ve seen on Reddit, StackOverflow, blog posts, etc.

As in your example, that’s more of a place to skip Option than a place where Some(null) would be a sensible value. I tend to use Objects.nonNull(result) to avoid pissing off our linter, but the two approaches are basically equivalent

If I’m understanding correctly, the null appears to be part of the cache’s API? If so, it would seem that a an GADT of Pending|Unset|Cached(_) would be less likely to cause trouble (if I’m understanding the semantics of None|Some(null)|Some(_) correctly).

I agree that the domain does matter, and I’m curious to see what domains/use cases where Some(null) is sensible that are common enough to support the assertion that it is a normal thing we should expect to see - and if it is normal, those should be pretty easy to find.