Proposal to add kind polymorphism to the language

milessabin · March 21, 2019, 12:29pm

Hi Scala Community!

I’d like to open the discussion for the pending SIP on adding kind polymorphism to the language. You can find a brief write up here .

Summary

Scala types are partitioned into kinds. First-level types are the types of values. Higher-kinded types are type constructors such as List or Map. The kind of a type can be characterized by the top type of which it is a subtype. Normal types are subtypes of Any, covariant single argument type constructors such as List are subtypes of [+X] => Any, and the Map type constructor is a subtype of [X, +Y] => Any.

Currently types can only be used exactly as prescribed by their kind. Subtypes of Any cannot be applied to type arguments whereas subtypes of [X] => Any must be applied to a type argument. As type arguments they can only be used to satisfy type parameters of exactly the same kind.

Sometimes we would like to have type parameters that can accept type arguments of more than one kind, for instance to define an implicit value that works for parameters of any kind. This is now possible through a form of kind polymorphism based on the subtype relationships mentioned above. Kind polymorphism relies on the special type scala.AnyKind that can be used as an upper bound of a type.

Given a definition of the form,

def f[T <: AnyKind] = ...

the actual type arguments of f can then be types of arbitrary kinds. So the following would all be legal,

f[Int]
f[List]
f[Map]
f[[X] => String]

The primary use of this construct will be to index types and type classes which are naturally kind polymorphic, such as Type in the Scala 3 quote/splice based metaprogramming facility.

This proposal intentionally doesn’t provide any means for abstracting over kinds intrinsic to method signatures or class definitions (ie. variable numbers of type arguments) or any means of applying an AnyKind bounded type parameters directly to type arguments (although both of these can be encoded using standard Scala type level programming techniques).

Whilst limited in scope, we feel that this proposal is sufficiently useful and open to later extension to warrant its inclusion in Scala 3.

Jasper-M · March 21, 2019, 1:41pm

I’m very curious about the todo: insert good concise example part of the documentation, because I wonder how you could actually do something interesting with this feature.

Also, isn’t AnyKind a bit of a fake type bound, like Singleton is? Which appears to be problematic.

LPTK · March 21, 2019, 2:54pm

It’s great that Dotty is supposed to make foundations come first.

But then it would be greatest if that principle would be applied uniformly over all new features.

This proposal in particular is quite scary, considering that it seems to make Scala’s type language impredicative. In fact, I could easily find a way to ~~make the compiler crash~~ run into recursion limit errors based on that fact:

scala> class C { type F[A[_] <: AnyKind] <: A[F[F]] }
1 |class C { type F[A[_] <: AnyKind] <: A[F[F]] }
  |                                     ^^^^^^^
  |Recursion limit exceeded.
  |Maybe there is an illegal cyclic reference?
  |If that's not the case, you could also try to increase the stacksize using the -Xss JVM option.
  |A recurring operation is (inner to outer):
  |
  |  type parameters of C.this.F[LazyRef(C.this.F[LazyRef(C.this.F)])]
  |  type parameters of LazyRef(C.this.F[LazyRef(C.this.F[LazyRef(C.this.F)])])
  |  type parameters of C.this.F[LazyRef(C.this.F)]
  |  type parameters of LazyRef(C.this.F[LazyRef(C.this.F)])
  |  type parameters of C.this.F[LazyRef(C.this.F[LazyRef(C.this.F)])]
  |  type parameters of LazyRef(C.this.F[LazyRef(C.this.F[LazyRef(C.this.F)])])
  |  type parameters of C.this.F[LazyRef(C.this.F)]
  |  type parameters of LazyRef(C.this.F[LazyRef(C.this.F)])
  |  type parameters of C.this.F[LazyRef(C.this.F[LazyRef(C.this.F)])]
  |  type parameters of LazyRef(C.this.F[LazyRef(C.this.F[LazyRef(C.this.F)])])
  |  ...
  |
  |  type parameters of C.this.F[LazyRef(C.this.F[LazyRef(C.this.F)])]
  |  type parameters of LazyRef(C.this.F[LazyRef(C.this.F[LazyRef(C.this.F)])])
  |  type parameters of C.this.F[LazyRef(C.this.F)]
  |  type parameters of LazyRef(C.this.F[LazyRef(C.this.F)])
  |  type parameters of C.this.F[LazyRef(C.this.F[LazyRef(C.this.F)])]
  |  type parameters of LazyRef(C.this.F[LazyRef(C.this.F[LazyRef(C.this.F)])])
  |  type parameters of C.this.F[LazyRef(C.this.F)]
  |  type parameters of LazyRef(C.this.F[LazyRef(C.this.F)])
  |  type parameters of C.this.F[LazyRef(C.this.F[LazyRef(C.this.F)])]
  |  type parameters of C.this.F[C.this.F]

odersky · March 21, 2019, 3:35pm

That’s not a compiler crash. That’s an error message. You do have a cyclic reference, the compiler points that out, and continues happily afterwards.

If you want to insist that the compiler should always terminate in reasonable space and time no matter what illegal code you throw at it then I must disappoint you. Even much weaker type systems than Scala’s don’t lend themselves to that.

If the compiler had not rejected the program because of a cyclic reference it would have gotten around to rejecting it because you apply an anykinded type A. It simply did not get that far because that check is after Typer so you get to it only if there are no other type errors.

milessabin · March 21, 2019, 3:37pm

it seems to make Scala’s type language impredicative.

It doesn’t change anything.

Kinds are identified with other already existing types, so assuming Scala’s type system is consistent prior to this addition, it’s consistent afterwards. In particular there’s no danger of encountering Girard’s paradox (unless we’ve already fallen foul of it).

LPTK · March 21, 2019, 3:58pm

@odersky fair enough! (I’ve amended my message so it does not suggest it’s a compiler crash!)

@milessabin

I am not sure what you mean by this. Could you provide a way to encode any program that uses AnyKind into a program that does not? Conceptually I understand that we could view a F[A <: AnyKind] as a family of related types. So we could try to expand such definition into a set of non-kind-polymorphic instantiations of F corresponding to those actually used in the program, as in F0[A], F1[A[_]], F2[A[_,_]], etc. But the fact that you can have recursion and that F may appear in itself seems to defeat that scheme. What is your idea on this?

milessabin · March 21, 2019, 4:13pm

Conceptually I understand that we could view a F[A <: AnyKind] as a family of related types.

Kinds are identified with types of the form Any, [_] => Any, [_, _] => Any, etc., all of which are disjoint. AnyKind can be thought of as the union of all of these.

LPTK · March 21, 2019, 4:34pm

Thanks for the clarification. But how about my question on the existence of a mapping from extended language (Scala + AnyKind) into core language (Scala)? If you are willing to make the claim that AnyKind doesn’t change anything, I think you should try to support such claim with some form of semi-formal argument about why it’s the case.

FYI, my remark about impredicativity is mostly inspired by the pure subtype systems paper.
Quoting from that paper:

Top makes System λC fully impredicative, which has important consequences for decidability

System λC suffers from Girard’s paradox because it places no constraints on impredicativity. All terms belong to a single universe, and Top is both a member of the universe, and a type that ranges over all elements of the universe.

I am not sure whether your proposal suffers from this effect, but it does seem like AnyKind has a similar power to that of Top in their system. So convincing me otherwise would probably require slightly more than “AnyKind can be thought of as the union of all of [the existing kinds]”.
(On the other hand, you are right that it’s possible Scala already suffers from such problems anyways.)

milessabin · March 21, 2019, 5:11pm

But how about my question on the existence of a mapping from extended language (Scala + AnyKind ) into core language (Scala)?

Replace any occurrences of AnyKind with the union I mentioned in my last reply.

FYI, my remark about impredicativity is mostly inspired by the pure subtype systems paper.

I don’t have a complete answer to this question, so I’ll defer to @odersky and @Blaisorblade. However, I’ll repeat my observation that we’re not introducing any new types, subtype relationships or quantifiers, and as such I don’t see how there can be an issue with impredicativity which wasn’t there already.

If you can see your way to constructing one it’d be very helpful if you could sketch that out for us.

LPTK · March 21, 2019, 5:22pm

But that doesn’t work (currently, at least), as unions only act on first-order types:

scala> def f[A[_],B[_]]: A | B = ???
1 |def f[A[_],B[_]]: A | B = ???
  |                  ^
  |                  Type argument A has not the same kind as its bound
1 |def f[A[_],B[_]]: A | B = ???
  |                      ^
  |                      Type argument B has not the same kind as its bound

In any case, to know whether type union can be generalized to higher kinds, and whether they can be applied across kinds, we’d need DOT to support reasoning about higher kinds. AFAIK it currently does not (maybe @Blaisorblade can comment on current development), which means this proposal does not appear to be grounded on solid theoretical grounds. Or am I missing something?

eed3si9n · March 21, 2019, 6:56pm

Could we adjust the terminology here a bit please?

Scala types are partitioned into kinds . Proper types are the types of terms (such as 1, and List(1, 2, 3)). Type constructors such as List or Map accept proper types as parameters, and return other proper types (for example List[Int]). Higher-order type constructors often arise as a typeclass that accept other type constructors (for example Functor, which accepts F[_] like List).

tpolecat · March 22, 2019, 12:13am

@milessabin My interpretation of the first paragraph of your summary is that List is shorthand for the type-level function [+A] => List[A] and in this scheme [+A] => Any is the most general such function. Is this correct? Either way I think a slower explanation might be helpful because this is a very new idea for most people. Similarly an example would be very helpful.

Thanks for your work on this, I think it’s going to be really cool.

joshlemer · March 22, 2019, 2:38pm

I agree with @Jasper-M that it seems like it would be better to have a different way to express “kinds” than with a type bound. If I understand correctly, Singleton being expressed as a type bound contributes to the fact that Singleton types cannot participate in union types (1 | 2 | 3 is illegal) (please correct me if I’m wrong)(also see discussion Singleton subtyping, and can we mark scala.Singleton as experimental?). The fact that we now would like to add a second faux-type bound to the language makes me think that we should really be pursuing a different way to express kinds.

yangbo · March 25, 2019, 5:00pm

Singleton is not a well-defined type, but AnyKind can be properly defined.

trait Foo[A] is a shortcut to trait Foo[A <: Any], and AnyKind is a supertype of Any, so trait Bar[A <: AnyKind] makes Bar accept loose type parameter than Foo. AnyKind is a not-default type bound but a more generic type than Any.

The relationship between Any and AnyKind is like the relationship between AnyRef and Any. In a similar comparison def foo(a: {}) vs def bar(a: Any {}), {} implies AnyRef {}, which is the default base type for type definition, though the non-default value Any is more generic.

yangbo · March 25, 2019, 5:03pm

I think the problem for AnyKind is that we did not define the meaning of Nothing properly.

Current implementation of type inference in Scala compiler 2.x internally use Nothing for any kind, which sometimes cause bugs.

We should distinguish Nothing (bottom type for any types that are child types of Any) from NothingKind (bottom type for any types that are child types of AnyKind).

So trait Bar[A <: AnyKind] is a shortcut of trait Bar[A >: NothingKind <: AnyKind].

Blaisorblade · April 2, 2019, 4:45pm

EDIT: the TLDR is that nobody has shown that kind polymorphism is sound, but it indeed does not seem to be more impredicative than current Scala, and we don’t have strong evidence against it, although I initially feared otherwise, and I can articulate a bit what leads me to this current best guess. I’m not certain, but I have thought about this for a while before today.

DOT indeed doesn’t support higher kinds yet; the first Dotty versions actually translated higher kinds into core DOT*, but that encoding turned out to not work well enough. Getting to higher kinds should be the next step in my current efforts, and hopefully AnyKind would also be in scope, but I haven’t looked at it closely enough. The first question, in such an effort, would be whether one can formalize an intuition similar to what @milessabin presents, at least in the semantics.

But at present, kind polymorphism falls (not alone) among the features that haven’t been formalized, and for which we don’t have strong evidence of either soundness or unsoundness — as opposed to e.g. type projections (for which we have evidence of unsoundness and of multiple failed attempts at fixes).

In the absence of formal proofs, what’s our best guess?

Speculating on the chance of paradoxes

First, why not just try encoding them? Because it’s extremely hard, even when those paradoxes arise. Here’s the shortest version I know (in Agda with unsafe options):

github.com

agda/agda/blob/master/test/Succeed/Hurkens.agda

-- Hurkens' paradox [1].

-- [1] A. Hurkens, A simplification of Girard's paradox.

{-# OPTIONS --type-in-type
  #-}

module Hurkens where

⊥ : Set
⊥ = (A : Set) -> A

¬_ : Set -> Set
¬ A = A -> ⊥

P : Set -> Set
P A = A -> Set

U : Set
U = (X : Set) -> (P (P X) -> X) -> P (P X)

This file has been truncated. show original

But in other systems that takes much longer:

github.com

lpw25/girards-paradox/blob/master/girard.ml


(** Define the Type type *)

module type Type = sig module type T end

module Type = struct module type T = Type end

(** Define the absurd type *)

module type Absurd = functor (X : Type) -> X.T

module Absurd = struct module type T = Absurd end

(** Define the type of predicates *)

module Pred (A : Type) = struct
  module type T =
    functor (_ : A.T) -> Type
end

This file has been truncated. show original

Beyond the length, this is extremely dense and infamously difficult code. I still don’t fully understand the Agda version, and I only got halfway in the cited paper (“A simplification of Girard’s paradox”) before running out of time.

And BTW, I must admit that if we only had problems of that sort, it might not be a significant type safety problem. However, the risk is always that we’d get much simpler violations.

How much do we care about paradoxes

First of all, usually Girard’s paradox just lets you write programs that loop. That’s bad for a proof assistant based on Curry-Howard, like Agda or Coq, because looping programs are also looping proofs, which let you prove False; that it, programs that loop are a problem for logical soundness.

However, for Scala we only need type soundness: Scala’s not a proof assistant, and it anyway doesn’t ask users to prove that programs terminate. That only requires only a limited sort of logical soundness: a value v with type member A >: L <: U serves as evidence that L <: U, and this evidence must be logically sound. Otherwise, you could prove that Any <: Nothing and implement unchecked casts. And indeed, @sstucki has managed to adapt existing paradoxes (related to Russell’s) to violate type soundness using this avenue (tho the example doesn’t translate to Scala). Strawman of a DOT calculus with higher-kinded types · GitHub

“X <: AnyKind”, or polykinded bounded quantification

I have also wondered whether “kind polymorphism” is too impredicative, but when I talked @sstucki was relatively confident that this was not a danger. And I think I now understand why.

With or without AnyKind, types can only quantify over other types; notwithstanding the name, AnyKind does not give you “parametric kind polymorphism”: you cannot express a type like ∀ (K : Kind). F K, which would be dangerous because it’s a type quantifying over something bigger (a kind). Haskell’s kind polymorphism allows exactly that, so it might suffer from Girard’s paradox — but in Haskell there’s no known way to go from there to violations of type soundness (and in fact they have proved type soundness).

How a soundness proof could start

In my ongoing soundness proofs for DOT, the key step is defining values that can contain type members, represented as semantic types (that is, predicates on values). Theorem provers forbid doing that naively, because doing that lets you encode Russell’s paradox in the theorem prover; one must use instead a more complicated sort of predicate. However, once values can contain semantic types, higher-kinded semantic types seem a smaller step — and the union of higher-kinded semantic types for different kinds should also work.

It’s trickier to interpret T <: AnyKind: if T and U have “kind *”, then T <: U means that all values in T are also in U; one can extend this definition pointwise to higher kinds, so that T <: U at kind * -> * iff for all X : * then T[X] <: U[X], but the definition depends on the kind. For T <: AnyKind, you can use the actual kind of T for a variant of this definition (or just demand that T is a “reasonable” type in the appropriate sense).

*Implementing Higher-Kinded Types in Dotty, Scala Symposium 2016, https://infoscience.epfl.ch/record/222780/files/p51-odersky.pdf

odersky · April 2, 2019, 7:01pm

Thanks for the detailed assessment, Paolo. Even though it’s not conclusive yet, it’s very helpful.

sstucki · April 4, 2019, 11:47am

Here are my 5 cents after a brief discussion with @Blaisorblade and @LPTK on Twitter.

A practical concern: why would any programmer want to encode kinding through subtype bounds in this way? Even if this encoding is theoretically sound, should it really be used in the syntax? Shouldn’t there are least be some sugar to write proper kinding annotations instead? At least to me, these encoded kinds looks rather opaque.
I don’t know the compiler well enough to predict possible issues, but from studying the metatheory of similar systems, I’d suspect that type normalization and subtype-checking rules that rely on kind information could interact badly with AnyKind. E.g. types like List may require eta-expansion to compare them to encoded “kinds” like [+X] => Any and I assume the compiler uses kind information to do this correctly. I wonder how abstract, bounded types like Y <: AnyKind are compared to (and unified with) types like [+X] => Any. Does [+X] => Any <: Y hold? Can the two be unified? Maybe one of the Dotty devs. can say more about this.
It will be hard to prove this extension type-safe. The idea to express kinding through type bounds has been studied by DeLesley Hutchins in his PhD thesis on “pure subtype systems” (which @LPTK mentioned earlier). The metatheory turns out to be complicated. Type safety was never proved, AFAIK.
There’s also a potential problem with impredicativity (as pointed out by @LPTK) because of the connection to pure subtype systems. I initially thought that this wasn’t an issue, but the sketch given in Hutchins POPL’10 paper does seem applicable here (Hutchins’ Top type seems very similar to AnyKind). However, last I checked, terms and types in Dotty were already non-normalizing (unless someone has fixed Issue 2887), and this does not automatically make the Dotty type system is unsafe (as pointed out by @Blaisorblade) – it just makes Dotty useless as a theorem prover.