Union types in Scala 3

jimka2001 · February 16, 2020, 7:02pm

I’m mentioning Scala in the related works section of an article I’m submitting. The article is about the history of function types and subtype relations in Common Lisp.

Can someone tell me whether my statement is correct, or misleading?

Here is the relevant paragraph and citations. Note that I’m missing the journal name for citation [1].

[Three screenshots of the paragraph and its citations; images not reproduced here.]

If anyone cares to look at the article, here it is in preliminary form:
History of CL types

LPTK · February 16, 2020, 8:54pm

I’d say it’s a weird characterization of Either. Indeed, Either is not a feature supported by Scala per se (you say “the Scala language supports [it]”); it’s just a library type. What Scala does support is nominally-typed class hierarchies, which can be used to encode discriminated unions (also called tagged unions or variants); these are different from union types. Either is not special at all in this regard.
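To make the distinction concrete, here is a minimal sketch of a discriminated union encoded as a nominal class hierarchy. The names IntOrString, IsInt and IsString are hypothetical and chosen only for illustration; Either from the standard library follows the same pattern with Left and Right as the tags.

```scala
// A tagged (discriminated) union encoded as a nominal class hierarchy.
// IntOrString, IsInt and IsString are hypothetical names for illustration.
sealed trait IntOrString
final case class IsInt(value: Int) extends IntOrString
final case class IsString(value: String) extends IntOrString

// The wrapper class is the tag: the match is on the case class,
// not on Int or String themselves.
def size(x: IntOrString): Int = x match {
  case IsInt(i)    => i
  case IsString(s) => s.length
}

// Either is one such hierarchy from the library: Left and Right are the tags.
def sizeE(x: Either[Int, String]): Int = x match {
  case Left(i)  => i
  case Right(s) => s.length
}
```

In both versions the caller has to wrap the value (IsInt(3), Right("abc")), which is what separates a tagged union from a union type.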

Scala 3 (note: not written “Scala-3”) supports proper union types. Here, I wouldn’t say “promises to fully support”, as they are already supported by Dotty, Scala 3’s future compiler; it’s not just a promise, but a reality.
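As a small sketch of what that looks like, assuming a Scala 3 (Dotty) compiler, a function can take Int | String directly and callers pass unwrapped values:

```scala
// Scala 3 (Dotty): Int | String is part of the core type system.
def size(t: Int | String): Int = t match {
  case i: Int    => i
  case s: String => s.length
}

// Values are passed as-is; no Left/Right or other wrapper is involved.
val a: Int = size(42)
val b: Int = size("hello")
```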

jimka2001 · February 16, 2020, 9:40pm

Thanks for the comments. Is what I say about complement types correct?

I suppose you are correct in distinguishing the Scala language from the Scala standard library. However, all Scala books I’ve read (about learning Scala) talk about the Either type. Do you think it is realistic to suggest that someone use Scala without the standard library?

On the other hand I have seen quite a few exposés which suggest that Either represents sum types. I’ve always found this claim a bit curious.

tarsa · February 16, 2020, 10:03pm

The standard library is often (?) considered part of a programming language. Here’s what Wikipedia has to say about it:

A language’s standard library is often treated as part of the language by its users, although the designers may have treated it as a separate entity. Many language specifications define a core set that must be made available in all implementations, in addition to other portions which may be optionally implemented. The line between a language and its libraries therefore differs from language to language. Indeed, some languages are designed so that the meanings of certain syntactic constructs cannot even be described without referring to the core library. For example, in Java, a string literal is defined as an instance of the java.lang.String class (…)

Indeed, many parts of the Scala programming language only work when the standard library is present. In addition to Strings, there are:

function values, which are instances of the scala.FunctionN traits from the stdlib
pattern matching, which uses Options, Tuples and Seqs
non-local returns, which throw exceptions from the stdlib
etc.
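The first two items can be illustrated with a short sketch. Even and halfIfEven are hypothetical names used only here; non-local returns are left out of the example.

```scala
// A function literal is an instance of scala.Function1 from the standard library.
val inc: Function1[Int, Int] = (x: Int) => x + 1

// A custom extractor: the pattern Even(h) calls unapply, which returns a
// scala.Option. Even is a hypothetical name used only for this sketch.
object Even {
  def unapply(n: Int): Option[Int] = if (n % 2 == 0) Some(n / 2) else None
}

def halfIfEven(n: Int): String = n match {
  case Even(h) => s"half is $h"
  case _       => "odd"
}
```

Neither definition can be typed or compiled without Function1 and Option being available, which is the sense in which the language depends on its library.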

LPTK · February 16, 2020, 10:39pm

You’re welcome!

What a strange question. How does it follow from anything? Regardless of the answer, it’s a moot point.

It’s a very curious claim indeed. Either is a sum type. There are many other possible sum types. I guess what makes Either slightly interesting is that it’s the simplest sum type that can be used to encode all other sum types. But singling it out as the way the Scala language supports sum types is very misleading IMHO.
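The encoding claim can be sketched as follows: a three-way sum is expressed by nesting Either. Sum3, first, second, third and describe are hypothetical names introduced for this illustration, not library definitions.

```scala
// A three-way sum A + B + C encoded with nested Either.
// Sum3, first, second and third are hypothetical names for this sketch.
type Sum3[A, B, C] = Either[A, Either[B, C]]

def first[A, B, C](a: A): Sum3[A, B, C]  = Left(a)
def second[A, B, C](b: B): Sum3[A, B, C] = Right(Left(b))
def third[A, B, C](c: C): Sum3[A, B, C]  = Right(Right(c))

// Each of the three alternatives is reached by a distinct Left/Right path.
def describe(x: Sum3[Int, String, Boolean]): String = x match {
  case Left(i)         => s"int $i"
  case Right(Left(s))  => s"string $s"
  case Right(Right(b)) => s"boolean $b"
}
```

A dedicated sealed hierarchy with three cases would say the same thing more directly, which is why Either is only one sum type among many.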

morgen-peschke · February 16, 2020, 11:18pm

While that might be a fun weekend project to try and get something working with the absolute minimum subset of the standard library, I don’t know if anyone’s made a realistic go of it in any project at scale (maybe some sort of embedded Scala?).

What’s more interesting is that, at least in my experience, Scala has always been a bit weird in that a distinction is made between stuff in the Scala language, and what happens to be in the standard library at the moment.

The standard library is much more likely to change (like the recent collections overhaul) than stuff that’s part of the compiler itself, and is considerably easier to replace if the community doesn’t like it (like the original actor & parser-combinator libraries).

There are probably other differences, but that’s what comes to mind at the moment.

sjrd · February 17, 2020, 11:10am

In addition to @LPTK’s comments, I’d like to point out:

Sabin [47] discusses user level extensions to the Scala type system to support intersection and union types.

I understand it’s more fun to mention a blog post that talks type theory and the Curry-Howard isomorphism than a practical implementation, especially for an academic paper, but I keep being baffled that, when talking about user-level extensions for union types in Scala 2, people repeatedly ignore Scala.js’ union type. It is a user-level extension for union types in Scala 2 that actually works, does not require extra type parameters and value parameters (which are a bit arcane/magic), and is used in practice and in production in many codebases.

Implementation: scala-js/library/src/main/scala/scala/scalajs/js/Union.scala at main · scala-js/scala-js · GitHub
Tests: scala-js/test-suite/js/src/test/scala/org/scalajs/testsuite/library/UnionTypeTest.scala at main · scala-js/scala-js · GitHub

Miles Sabin’s final example, which is

def size[T: (Int |∨| String)#λ](t: T) =
  t match {
    case i: Int => i
    case s: String => s.length
  }

would be written as follows with Scala.js’ union type:

def size(t: Int | String) =
  (t: Any) match {
    case i: Int => i
    case s: String => s.length
  }

(the upcast to Any silences a warning for Scala 2’s type system).

jimka2001 · February 17, 2020, 11:33am

Can you give me an academic paper citing this scala.js union type? I’m happy to cite it. If there is not one, it looks like the author should publish it.

sjrd · February 17, 2020, 12:20pm

There’s no academic paper. It probably never would be accepted, because the solution is too simple and too practical, and does not advance the state of research. I’m the author, btw, and I’ve been there. I’ve had papers repeatedly rejected for that very reason.

jimka2001 · February 17, 2020, 12:38pm

Bizarre that simple solutions are not publishable.
Ideally they should be preferred to complicated solutions.

LPTK · February 17, 2020, 1:44pm

Simple solutions are often preferred to complicated solutions, but they still have to make a clear scientific contribution. A “neat trick” to encode some feature is usually not a sufficient scientific contribution.

jimka2001 · February 17, 2020, 2:07pm

Depends on the reviewers of course, but a scientific paper could very well describe a problem and present a very simple solution. It could also discuss related work and argue that many other solutions are more complicated, more fragile, or lack certain mathematical properties. The paper could also contrast the usability differences between the proposed solution and other solutions.

sjrd was mentioning that many people cite Sabin, but few mention Scala.js. To me, the fact that many people cite Sabin lends credence to the idea that there is something publishable in Scala.js.

LPTK · February 17, 2020, 2:30pm

Yes, of course. I agree with what you say. And that does not change the fact that if you’re aiming to publish at a good conference, you still need to have good scientific contributions (which will likely be related to these points).

I don’t follow your logic. AFAIK this blog post by Miles has not been cited in research papers. And even if it has been, it’s still just a blog post. People often make references to various online resources that are not research contributions on their own.

jimka2001 · February 17, 2020, 3:26pm

Oh, I completely understand your point now. I was thinking that Sabin was a paper. I was forgetting that it was a post. Is there a similar blog post I can reference for unions in Scala.js?

jimka2001 · February 19, 2020, 10:01am

@sjrd, Hi Sébastien, can you help me understand why the documentation refers to the Scala.js types as pseudo-union rather than simply as union?

sjrd · February 19, 2020, 10:09am

I used the term pseudo-union precisely because it is user-level; it is not added to the core type system. This causes some limitations in terms of usability. The most typical example is that the compiler will report a warning that the following cases in the match are not possible:

val x: Int | String = ???
x match {
  case x: Int => println("int")
  case x: String => println("string")
}

To convince the compiler to accept the code, you need a type ascription to Any:

(x: Any) match {
  ...
}

Even in that case, you do not get exhaustivity checking (obviously).

Other than some limitations like that, | is truly an unboxed union type. A | B accepts values that are either A or B (or both) but no other value. At run-time those values are not wrapped (this can be observed by casting back to A or B if you know which one it is, or type testing with .isInstanceOf[A], both of which will work).

As a consequence, it is not a discriminated union. If a value x is both an A and a B, you cannot tell from which branch it came. In fact, | is commutative: you can assign an A | B to a B | A and conversely.
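The same two properties (untagged and commutative) also hold for Scala 3’s built-in union types. The sketch below uses those rather than Scala.js’ |, purely so that it needs no Scala.js dependency, and contrasts them with Either, which keeps its tag.

```scala
// Assumes Scala 3 built-in unions, not the Scala.js js.| type.
val ab: Int | String = 1
val ba: String | Int = ab // accepted: A | B and B | A are interchangeable

// With identical branches there is nothing left to discriminate:
// a value of type Int | Int is usable directly as an Int.
val collapsed: Int | Int = 1
val asInt: Int = collapsed

// Either is tagged, so the branch a value came from is always recoverable.
val l: Either[Int, Int] = Left(1)
val r: Either[Int, Int] = Right(1)
val distinguishable: Boolean = l != r
```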

Jasper-M · February 19, 2020, 10:39am

I would say that the best argument for using pseudo-union is that this will not compile if you’re using Scala.js union types:

trait Foo { def foo: Int | String }
class Bar extends Foo { def foo: String = "foo" }

Which shows that, while subtype relationships are simulated with implicit conversions, the type system still doesn’t agree that they’re really subtypes.

LPTK · February 19, 2020, 11:10am

Yes, you can also see it with implicitly[Int <:< (Int | String)] which does not compile.

So I would keep the “pseudo-union” characterization. It’s not an actual union if it does not come with the corresponding subtyping relationships, even if they can be emulated to a limited extent.
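For contrast, here is a sketch of the same two checks against Scala 3’s built-in union types, where the subtyping relationships are real. It assumes a Scala 3 compiler and uses summon, the Scala 3 counterpart of implicitly.

```scala
// Scala 3: Int is a genuine subtype of Int | String, so this evidence exists.
val ev: Int <:< (Int | String) = summon[Int <:< (Int | String)]

// The Foo/Bar example compiles with built-in unions: String <: Int | String,
// so narrowing the return type is a legal covariant override.
trait Foo { def foo: Int | String }
class Bar extends Foo { def foo: String = "foo" }

// Calling through the supertype yields the union-typed result.
val viaFoo: Int | String = (new Bar: Foo).foo
```

With a user-level encoding, both the evidence and the override are rejected, since the conversions into the union are implicit views rather than subtyping.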

sjrd · February 19, 2020, 12:03pm

Exactly.

I’ll point out though that none of the user space union solutions provide this true subtyping. Including Miles’. So we should characterize all the user level solutions as pseudo-union.