Experimental Capture Checking: New Syntax for Explicit Capture Polymorphism

Hello!

As we are gearing up for the next big milestone of the experimental capture-checking feature, we are looking for community feedback on a new syntax for explicit capture polymorphism, which is a complimentary mechanism for the default implicit capture polymorphism.

Syntax Overview

Below is a short summary of the syntax. For more extensive usage examples, the CC language reference and the test cases in this PR are a good starting point. For the grammar, refer to the updated syntax.md.

Capture-set parameters

Capture-set parameters can be defined alongside type parameters within angle brackets,
they are preceded by the soft-keyword cap (not to be confused with the universal capture set caps.cap).

def apply[cap C, T, U](f: T ->{C} U)(x: T): U = f(x)

cap-qualified parameters accept capture sets as arguments, i.e., we can regard those parameters as ranging over the universe of capture sets which are finite sets of references to capabilities written in the usual curly braces:

apply[{io},Int,Int](i => io.println(i); i*i)(7)

Just as for type arguments, capture-set arguments can be often inferred.

Similarly, capture-set parameters can receive upper and/or lower bounds which are capture sets:

def apply[cap C >: {io} <: {io,network}, cap D <: {C}, cap E >: {C,D}, T, U] ...

Capture-set members

We also support capture-set members much in the same way as type members:

trait Foo:
  cap type C >: {io} <: {io,network}
  cap type D = {network}

as well as paths in capture sets:

def poly(f: Foo)[cap C <: {f.D}](g: () ->{C} Unit): Unit = g()

Variations

Moving forward, is this a good syntax?

Capture-set notation: For regularity, capture sets must be always written in curly braces. But should there be exceptions? For example, do we really need to write singletons in braces all the time?

def poly(f: Foo)[cap C <: f.D]

trait Foo:
  cap type C >: io <: {io, network}
  cap type D = network

Consider type refinements:

def useFoo(foo: Foo {cap type C = {io}}): Logger^{foo.C} = ???

Are we overdosing on braces?

Marking capture-parameters and -members:
Declaring them with the soft modifier cap clearly delineates them from
ordinary type variables. If this is too wordy, we could partly switch back to the
old syntax design (a post-fix hat ^) for a more terse syntax:

def poly(f: Foo)[C^ <: f.D]

trait Foo:
  type C^ >: io <: {io, network}
  type D^ = network

def useFoo(foo: Foo {type C^ = io}): Logger^{foo.C} = ???

However, we expect explicit polymorphism for captures to be fairly rare in practice,
so maybe this is not a big deal if we write the cap modifier from time to time?

Discussion topic

The purpose of this thread is discussing the syntax design for explicit capture polymorphism and soliciting community feedback. We are looking forward to thoughts and suggestions about the proposed syntax.

7 Likes

I’m not quite sure how to feel about the syntax as a whole, but thinking about it, putting everything in braces feels off to me. Currently braces are used for grouping several things together, and to refine stuff. Comparatively, requiring using braces for capture sets with one value feels inconsistent. I am not against requiring defining capture sets with delimiters around them, but I am unsure if that delimiter should be braces.

2 Likes

Here are my immediate thoughts:

To me it’s surprising only C is a capability, given:

def apply(using A, B)(using a: Int, b: String)

In both clauses, both parameters are using parameters, not just the first one

This can be fixed by either mandating the capture parameters are towards the end (not a big fan):

def end[A, B, cap C, D]

Or in a separate clause:

def split[cap C][A, B][cap D]

As for the arrow, it looks kinda awkward that it’s “pointing into a wall”
What about something like:

def apply[cap C, T, U](f: T -{C}> U)(x: T): U = f(x)

Or -{C}-> or {C}->

There seems to also be some redundancy between cap and braces, I would say, go all in on only one:

trait Foo:
  type {C} >: {io} <: {io, network}
  type {D} = {network}

Or

trait Foo:
  type cap C >: cap io <: cap io, network
  type cap D = cap network

I don’t think braces are a bad choice of delimiters, as it goes with the refinement theme: Foo{val bar: T} vs Foo {io}

5 Likes

I don’t think it’s worth changing in order to make it more terse – as you say, it’s not likely to be that common, and cap isn’t exactly wordy.

But (100% subjectively) I do find the post-fix hat more visible. My eye skims a bit over the cap keyword, whereas it gets arrested by the ^ with a “wait, this is different”.

Aesthetically I like cap more. But I do wonder if the postfix ^ may be slightly more effective at distinguishing the capture types from normal ones.

4 Likes

Not that it matters a great deal, but is it ^ because it resembles a hat or a… cap?

8 Likes

That’s serendipity, the symbol was not chosen for that reason. But it works very well that way.

6 Likes

I concur with Sporarum that the problem with cap is not that it’s too wordy–using has more characters (and two syllables if you’re speaking it) and it’s fine–but because cap feels redundant given that everything is marked with braces.

I also concur with jdecour that ^ stands out far more than cap, and is a much better visual alert that something weird is going on here. There be dragons, or something pointy.

If braces actually work everywhere syntactically, for certain, then I think universal braces is the way to go. Elements are isomorphic to a single set containing that element, and there’s never a reason to have sets of anything else, so {C} does the job: fully unambiguous, and also visually striking.

However, if braces ever fail to work syntactically, then I would favor abandoning them entirely so you never need to think “oh wait, are these the special braces or the regular braces”? Not overloading braces is itself a plus. In that case, use ^ everywhere and some sort of binding operator (+ would probably be intuitive) when you want to create sets.

So

def apply[C^ >: io^ <: io^ + network^, D^ <: C^; E^ >: C^ + D^, T, U] ...

might be a little hard on the eyes, but there is no possible way you can look at that and go, “Oh, this is some other regular Scala thing”.

Still, if it works, I’d go for one of these:

def apply[{C} >: {io} <: {io, network}, {D} <: {C}; {E} >: {C, D}, T, U] ...
def apply[{C >: io <: io, network}, {D <: C}; {E >: C, D}, T, U] ...

The rule on the latter one is “braces mean you’re in capture-set-land and just stuff everything into braces, it’s all considered as sets”. (If you don’t want to muck with precedence to get , to work like that, use + or something.)

As far as I know, T >: {io} or cap io <: U has no meaning, so we don’t need to pick notation that allows those things to be expressed.

After reading the proposal in more depth, I’m not sure any of the above is actually correct–in fact, I’m quite sure that at least half of it doesn’t even make any sense.

So my actual advice is that it’s really hard to tell. There are so many things that seem unexpected, where we want the syntax to help guide people in useful directions, that I’m not really sure what we should want.

There is something weird about how captures are expressed at all–{cap} is essentially the ability to capture everything, which for non-tracked capabilities is equivalent to being able to do anything, which is Nothing. But Nothing is the universal subtype, not supertype, so writing {a} <: {a, b} seems weird.

This to me suggests more fundamental growing pains are likely than whether to use T^ or cap T to mean T^{cap}.

(Also, I don’t see how tunneling won’t break function-like traits as opposed to storage-like traits, and if tunneling doesn’t work, I’m not sure whether the syntax will need to change.)

Was a completely separate capture set block considered with curly braces instead of mixing capture set parameters and type parameters? This would be more consistent with how Scala did so far for type parameter blocks, regular parameter blocks, and given parameter blocks (while enabling having more than one block of each, as done today).
Additionally, what about using a union operator | to define a capture set?
E.g.

def foo{C1, C2 <: io, C3 >: io <: io | network}[P1, P2, P3](a1: {C1} P1, a2: P2, a3: P1 -> {C3} P3)(using Ctx): P3 = ???

What about a different keyword ctype?

trait Foo:
  ctype C >: io <: io | network
  ctype D = network

And if we go with ctype as a keyword, then we can maybe utilize it for more regularity to declare capture type parameter blocks:

def foo[ctype C1, C2 <: io, C3 >: io <: io | network][P1, P2, P3](a1: {C1} P1, a2: P2, a3: P1 -> {C3} P3)(using Ctx): P3 = ???
1 Like

First of all I like a lot this proposal. I feel it make the language more natural to use and less surprising (which is a good thing !).

My feedback on this is :

  • cap is not very clear about what it represents. Is this capture, capturing, capacity ? I think it is worth typing a bit more chars and have a better word used there.

  • In the same vein this has a whole implicit vibe to me. The same keyword added for two different use case. I would propose capturing for consumption in arglists, and capture type or capacity for definition, inclusion in trait members etc…

This way it will mirror the given / using usage people are already used to.

2 Likes

cap here is short for capability in all cases. In capture checking, this ^{a,b,c} is a capture set, and the items a, b, and c are capabilities.

That being said, cap being a keyword here, and what the universal capture set is named, is confusing. Can the universal capture set be renamed to caps if you want to maintain brevity?

So, does this mean that, from now on, the Capture-Set Kind will be a proper member of the Scala type system?

If yes, I like the

type C^
type D^

It has some similarity with the [] after the type to designate the kind. But it raises a question: how would a parametric type wear capabilities?
like that

type C[A, B]^

Or like that

type C^[A, B]

The former one looks prettier but is a bit less straightforward. The cap type doesn’t raise that question and is the most direct.

Another question is about AnyKind, would it include the Capture-Set Kind now? If yes, we’ll still have situations with

type F[K <: AnyKind]
type A <: AnyKind

Here, abstract types can still be capture-set types without the magic cap word.
So it’s psychologically dangerous to bring such a strong keyword-based distinction when it could be easily dismissed.

The third question has bothered me since the beginning of the ^{} syntax. We already have a symmetric type operator, that works as meet of the type lattice: the glorious & (formerly twitter with). So maybe it’d be more consistent to write something like io^ & network^ instead of {io, network}.
The io^ syntax I used here is again more consistent with io.type - another dependent like variable usage in types.
Generally, I agree that {io, network} is kinda prettier, but can be very surprising for a non-Scala developer that sees “block syntax” within types again, just recovered from the refinement syntax.

2 Likes

Wow, this forum delivers!

I’m not sure this’ll work well with increasing capture-set size. Plus, we might have capturing types in the domain and co-domain:

Async^{foo,bar,baz} {a,b,c,d,e,f,g,h}-> Logger^{io,network}

vs.

Async^{foo,bar,baz} -{a,b,c,d,e,f,g,h}-> Logger^{io,network}

vs.

Async^{foo,bar,baz} ->{a,b,c,d,e,f,g,h} Logger^{io,network}

I kinda like the second, but the current status quo more accurately conveys that the capture set belongs around the function:

(Async^{foo,bar,baz} -> Logger^{io,network})^{a,b,c,d,e,f,g,h}

Which btw is a big difference to traditional effect systems.

Though to be fair, with explicit parameterization, we could abstract long capture sets, at the expense of moving the long capture set to the type parameter list.

The first seems fine, the important bit is that we clearly signal that we have a context where literal capture sets are expected. So then we would define capture-polymorphic functions like this:

def apply[{C}, T, U](f: T ->{C} U)(x: T): U = f(x)

which is symmetric to the proposed application form.

IMO, the second design goes overboard with the cap modifier.

I see what you mean, but I don’t know how else we can design a syntax that is both elegant and fits the semantics

Edit: Added remarks on cap in sets vs. cap as modifier

Separated cap-Parameter Lists

Multiple posters remark that cap feels like using and could be treated like that, also suggest separating capture parameters. Indeed, the first version of the PR went into that direction.

One of the problems with stratified type/cap parameter lists is that they are not supported for classes and traits, e.g., we can’t write

class Scheduler[T][cap C <: {async}]

I’m not very fond of consecutive bracket lists. And sometimes, a type parameter can depend on a cap parameter:

class Scheduler[cap C <: {async}][L <: Process^{C}] ...

The pros of a single list are that they can mutually depend on each other:

class Scheduler[L <: Process^{C}, cap C <: {async}] ...

So maybe, the right way of doing it is putting all the cap params at the end of a single list? But then it is somewhat departing from using clauses.

Other Delimiters for cap-Parameter Lists

Also, other designs were considered to denote cap-parameter lists, like:

def apply[T, U][^ C](f: T ->{C} U)(x: T): U = f(x)
//              ^ begin with `[^`

or

def apply[T, U]{C}(f: T->{C} U)(x: T): U = f(x)
//             ^ ^ delimit with braces

The first one does not look very nice, the second one looks like a block or refinement. I think it is problematic when attached to class types:

class Scheduler{ C <: async }{ val processes: ProcessList^{C} = ??? }

Technical Constraints

Opt-in requirement: I should add that we aim for capture checking being completely opt-in, which entails that cap parameters should have no bearing when it is disabled and be easily removable during pickling. In that regard, having cap parameters separated or grouped at the end of type parameter lists is beneficial.

Limited Type-Level Computations: While capture-sets are morally their own universe and can be regarded as a new kind, we are limited by the opt-in requirements in what can be done: The capture-checker is a recheck phase running after the typer. The typer doesn’t “see” capture sets at all, hence

  • Implicit resolution cannot make selections based on capture sets.
  • Match types cannot match on capture sets, etc.

It’s not easy (impossible perhaps) supporting those without giving up the opt-in requirement. Maybe in the very far future we could depart from that…

Universal Capture-Set cap vs. the Soft-Keyword cap

The potential for confusing the top capture set caps.cap vs. cap the soft keyword is IMO fairly low, since caps.cap rarely needs to be written out by hand, e.g.,

val logger: Logger^ = ... // just `^` is a shorthand for `^{cap}` resp. ^{caps.cap}
val f: A => B = ...       // reads A ->{cap} B
type T <: Logger^
foo[{cap}]                // within braces, it can only be caps.cap

It depends on the perspective. {cap} is like Any: it is a set of capabilities that could contain anything (like an existential), not necessarily everything.
The <: relation on capture sets is subcapturing, which is morally “set inclusion plus plus”. A capture set is an upper bound on the captured capabilities (and hence the operations a computation may potentially invoke). So if I may use capability a, that implies I may use a or b: {a} <: {a,b}
It’s not to be confused with the dual perspective: An object can do things: if I can do a and I can do b, that implies I can do a as well as b: a & b <: a, a & b <: b. Ultimately, I can do nothing (the natural “top” for “can do”), and dually, I may potentially do anything (the natural “top” for “may do”).

I’m not sure, can you elaborate?

Yes, see above. To satisfactorily support dedicated cap-parameter lists, we need to generalize to curried class-parameter lists. I’m not entirely sure why this is not possible in current Scala @odersky

I like that it’s shorter than cap type and that the same keyword is usable for capture-set members and parameters. Alternatively, we could have them in sync on just cap:

trait Foo:
  cap C >: io <: io, network
def foo[cap C1, ....]

But I’m equally fine with ctype.

We had discussions around using type operators like | and & (but not full-blown type expressions) for capture sets. The pipe seems a bit heavy for just listing a sequence of set elements. The ampersand is reserved for set intersections in the future.

Regarding braces for sets: maybe we should make them optional and only necessary for cases where there is ambiguity? So above definition is fine,
but in the context of parameter and argument lists, they’d be necessary:

def foo[cap C1 >: async,io <: {io,network}, C2 >: {async,io}, C2 <: C1]...
//                ^ ok, delimited by >: <:                           ^ ok, closed by ]
foo[io,network] // ambiguous
foo[{io,network}] // ok
def single[cap C]
single[a,b,c] // technically not ambiguous 

Not really (see above).

This is something we should support either way, as it already would have uses now:

class Foo[A <: { type M }](a: A)[B <: a.M]

It of course opens a whole can of worms, but I believe it to be worthwhile

A few meta-level considerations – not necessarily critical-path (and we don’t need to derail onto them), but I’m mulling them as I think about this:

  • What does this look like in documentation to naive users? This is advanced stuff, but it’s going to show up in library signatures. Can we concisely say “if you see this, you don’t need to worry about it much”? I do this routinely for co-/contra-variance markers, which are where this comes up most often for Scala learners currently, and telling the A1/A2 developers to ignore them is almost always right in practice.
  • How do we teach this? That’s extremely meta, but I’m seriously concerned about it. I spent a fair number of years figuring out how to teach FP to new Scala engineers; this feels like a similar-but-harder problem to me. It may or may not be relevant to the syntax design.
  • Are we choosing the right level for the keyword? I’m realizing that cap reminds me of implicit in a very specific way: we’re talking about the mechanism being expressed (“capture”). It took us a number of years to realize that implicit was a mistake, and that we should focus on the usage instead, switching over to given/using. (That said, this may be fine precisely because, while I think “capture” might be the wrong semantic, “capability” might be the right one, so cap still makes sense – it’s overloaded in a useful way.)

Anyway, some food for thought. None are necessarily issues, but worth having in the backs of our minds.

2 Likes

Sure, a capture set is an upper bound, but so is a trait: you can potentially call any of its methods, but you don’t have to call any. The way it’s phrased here the natural parallel–and I think it’s important to form the parallel because if there is one it will help people learn about it!–is:

trait X { def x: String }
trait Y { def y: String }

type W = X | Y
type V = X & Y

def foow(w: W) =
  w.x  // No, I may not use w.x
  w.y  // No, this doesn't work either

def foov(v: V) = 
  v.x  // Yes, this works!
  v.y  // This too!

So from the way you discuss it, it feels like {a, b} <: {a}–if you focus on use.

On the other hand, if you focus on capturing then

// Can store
foow(new X {})  // Works!
foow(new Y {})  // Also works!

// No good
foov(new X {})  // No, where's our .y capability?
foov(new Y {})  // Also fails

So in this case, foov <: fooy <: foow. Why? Because the argument is in contravariant position, of course. (In invariant position, subtyping rules still allow you to pass a subtype because you can widen it first, so it really must be contravariant to have the right feel.)

The key, then, is that capability c is isomorphic to c => Unit in terms of ordinary types. But that mental flip is hard enough to make even with explicit labeling–A <: B means (B => Unit) <: (A => Unit) takes a bit of getting used to, but people are now I hope pretty much used to it.

The notational suggestions so far don’t seem to take advantage of this existing intuition at all.


Regarding tunneling, suppose we have

trait Hi[-A, B, +C] {
  def a1(a: A): Unit
  def a2[AA <: A](): AA
  def b1(b: B): Unit
  def b2(): B
  def c1(): C
  def c2[CC >: C](cc: CC): Unit
  def ab(a: A): B
  def ac(a: A): C
  def bc(b: B): C
}

According to the rules, c1 should pop out captures; maybe b2 should also. But we don’t know from the types whether ab, ac, or bc are using capabilities that are passed in, and Hi doesn’t capture anything itself; or whether it’s passing the capability through. You only know when you instantiate a particular Hi.

So if you instead have:

trait Hey[-A, +C] {
  def ac(a: A): C
}

you can tell that any capabilities that appear in C that aren’t in A must have been stored internally, but you can’t tell whether any capabilities in A were in fact stored for internal use or simply discarded or passed on to C; and you can’t tell whether C capabilities that appear in A were passed on or not.

So the rules for tunneling seem underspecified, especially when working in a generic context. (It seems less confusing when fully reified.)


Anyway, I think the document as a whole starts too much from this as its own thing and not enough from existing concepts.

For instance, it’s a little hard to tell whether this has a sensible mapping onto tuples of givens that are only markers. If we had a marker trait Cap[T <: Tuple], and given conversions from tuples of some arity to a different ordering of the same arity, and from an arity to every lower arity, could we awkwardly but correctly reproduce all the features, assuming we could only capture items i for which we had a Cap[i.type]? Obviously this isn’t a practical solution, but if it is a correct conceptual solution I think it may help guide the syntax.

2 Likes

On the topic of cap being a type modifier, will it be possible to mix it with other type modifiers (e.g. opaque cap type)?