Pre-SIP: Improve Syntax for Context Bounds and Givens

odersky · July 1, 2024, 2:36pm

That’s nothing new. The same conflict arises for

   foo:
     bar

The disambiguation is that in each case a : at the end of a line starts a block. If you want a type ascription on two lines, you need to write

  foo
    : bar
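
A minimal sketch of the two readings in current Scala 3 (3.3 or later, where a colon at the end of a line can open an argument block). The names `twice`, `blockArgument` and `ascribed` are made up for illustration and are not part of the proposal.

```scala
def twice(x: Int): Int = x * 2

// A `:` at the end of the line opens an indented block,
// which is passed here as the argument to `twice`.
val blockArgument = twice:
  20 + 1

// A `:` at the start of the continuation line is an ordinary
// type ascription on the expression from the previous line.
val ascribed = 42
  : Long
```

The first definition is read as `twice(20 + 1)`, the second as `42: Long`.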

Sporarum · July 1, 2024, 2:40pm

Oh, of course. Then is there any ambiguity that would arise from using a colon twice?

I couldn’t find another

bishabosha · July 1, 2024, 2:53pm

I have a small nitpick with context bounds: could they also infer the kind of the type? Would that be helpful or a hindrance?

trait Functor:
  type Self[A]
  extension [A](x: Self[A])
    def map[B](f: A => B): Self[B]

def mapAny[F[_]: Functor, A, B](fa: F[A])(f: A => B): F[B] =
  fa.map(f)

We still have to write F[_]. What if a context bound could also infer the bounds from the Self type?

def mapAny[F: Functor, A, B](fa: F[A])(f: A => B): F[B] =
  fa.map(f)

In this case you see F declared with no parameter, which is analogous to this (legal) definition:

type F >: [T] =>> Nothing <: [T] =>> Any

which is itself identical to

type F[T] >: Nothing <: Any
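
For comparison, here is a sketch of the same method in the classic encoding, where the type class takes the constructor as a parameter (`Functor[F[_]]`) instead of using a `Self` member. This is an assumption made so the snippet compiles without experimental flags; `Box` and `boxFunctor` are hypothetical names. It shows the `F[_]` annotation that the suggestion above would let you drop.

```scala
// Classic encoding: the type class is parameterized by the constructor F.
trait Functor[F[_]]:
  extension [A](x: F[A])
    def map[B](f: A => B): F[B]

// Hypothetical container with no `map` member of its own.
final case class Box[A](value: A)

given boxFunctor: Functor[Box] = new Functor[Box]:
  extension [A](x: Box[A])
    def map[B](f: A => B): Box[B] = Box(f(x.value))

// The kind of F has to be spelled out as F[_] in the context bound.
def mapAny[F[_]: Functor, A, B](fa: F[A])(f: A => B): F[B] =
  fa.map(f)
```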

Ichoran · July 1, 2024, 3:06pm

Just a quick note regarding mathematical with. I don’t find this useful because given is used where Scala uses using. In mathematics, you say, “Given a positive integer b and an integer a, the fully reduced fraction corresponding to a/b is…”. You certainly do not say, “Using a positive integer b” etc… So if I start trying to think as a mathematician, I get immediately confused.

As a keyword for Scala programming, with is not too bad. It’s kind of irregular because with is supposed to mix in a trait, not allow you to provide concrete versions of abstract methods in an anonymous class. But it’s not that much of a stretch.

spamegg1 · July 1, 2024, 3:14pm

“Using” is used quite a lot actually (in a similar “contextual” way to call “background” information or definitions or facts), but let’s not get into a debate (I use Scala’s using to teach correspondence between code and proofs.)

Sporarum · July 2, 2024, 9:16am

I have a very different idea on how to solve the with irregularity problem:

Make it regular

By that I mean, allow it everywhere (in place of a block starting colon):

given Ord[Int] with
  ...

class Foo[T](x: T) with
  ...

extension (x: Foo) with
  ...

This feels weird at first, in the same way if ... then and while ... do do, but I think we would get used to it relatively fast

The one case I would maybe not use is with function parameters

foo.bar with
  1 + 1
// equivalent to
foo.bar{
  1 + 1
}

Of course this means we won’t be able to use with for qualified types, which is a shame, but I’m willing to do that trade

bjornregnell · July 2, 2024, 9:53am

Make it regular
By that I mean, allow it everywhere (in place of a block starting colon):

We discussed all that in the epic thread on optional braces, with an all-time-high number of replies, and I’d not like to go down that route again and tear up the current nice brace inference rules.

If not colon at the end of the given head, then maybe another keyword just for ending given heads is better. But this is a difficult trade-off.

I still think colon is concise and does its job… Only problem is when they get too many on the same row together with other “symbol salad”…

bjornregnell · July 2, 2024, 10:56am

I guess many newcomers to Scala that come across F[_] may find that a bit intimidating, so if we can reduce the clutter by making the compiler smarter it seems like a good idea.

Although sometimes “hidden automatics” can also be confusing if you don’t know the mechanism and semantics… But generalizing over placeholders in unnamed cases seems right to me. Would it work for more than one parameter?

bishabosha · July 2, 2024, 11:51am

Yeah, I’m proposing to copy the bounds of the Self type, so it would work for whatever you can define there.

bjornregnell · July 2, 2024, 12:44pm

OK, good. Then it seems like the proposal could/should get another section “Context-bounds for higher-kinded types” with your suggestion or similar… (depending on what the author thinks @odersky )

bjornregnell · July 2, 2024, 12:59pm

What about allowing both colon-style and as-style?

Downside: introduce two ways of doing the same thing.

Upside: both have their pros and cons, and we need to keep the old colon-syntax anyway for a good while. And it’s not until used by the masses that we know exactly how this will pan out.

If both are available we can see how the idioms evolve in real-world coding over time and if one gets significantly more popular than the other we can deprecate the “loser” eventually in some distant LTS, per “empirical language design”…

Jasper-M · July 2, 2024, 1:24pm

This looks the most wrong and confusing to me. It’s a function call where you expect a type ascription.

bjornregnell · July 2, 2024, 1:43pm

Perhaps it makes sense if you read it as:

given theOptionalName: <<< the specification >>>

and a specification can be a concrete class instance or other variants of givens.

But I agree that this probably reads better in the as-style, in the case of concrete class instance

given Context() as theOptionalName

Another option is to, for named concrete class instances, always require the longer:

given theOptionalName: Context = Context()

but for the unnamed case allow the shorter variant, but still requiring the explicit param marker to indicate that it is an instance :

given Context()

which is short for given Context = Context() to have an explanation of what it means in terms of the regular given Type = instanceOfThatType

Ichoran · July 2, 2024, 6:25pm

I’d like to back up a little bit and propose a more systematic regularization.

The reasoning

The task of givens is to provide a mapping from context (which may be no context) to types, and to supply an instance of the target type, and in general we may wish to name this mapping. The most general form is therefore, a 4-tuple

(mappingName, Context, TargetType, howToCreateTargetInstance)

We have four ways, as far as I know, to introduce a term name that may be visible outside its scope (assuming we count val, var, def as all variants on the same thing):

val term
val Unapplier(term)
val term @ Unapplier(_)  // or val Unapplier(_) as term
extension (...) def term

In the first case, you just name the thing. In the second, the term is bound to the result of a pattern match, and in the third the term is bound to an instance that has succeeded at a pattern match (but you aren’t using the result of the match). In the fourth, you first talk about context (in this case including a specific instance and type that you’re extending) and then introduce the term name.
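
As a concrete sketch of these four forms in current Scala 3, assuming a hypothetical case class `Wrapper` standing in for `Unapplier` (the names `NamingDemo`, `plain`, `extracted`, `whole` and `doubled` are also made up):

```scala
final case class Wrapper(inner: Int)

object NamingDemo:
  // 1. Just name the thing.
  val plain = Wrapper(1)
  // 2. Bind the name to the result of the pattern match (the extracted Int).
  val Wrapper(extracted) = Wrapper(2)
  // 3. Bind the name to the whole instance that matched the pattern.
  val whole @ Wrapper(_) = Wrapper(3)

// 4. State the context first (the receiver), then introduce the name.
extension (w: Wrapper)
  def doubled: Int = w.inner * 2
```

All four names are visible outside the place where they are introduced, for example as `NamingDemo.extracted` or `Wrapper(4).doubled`.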

There are various other ways to name terms (e.g. for comprehensions, method arguments), but these restrict the terms to their defining block. (Method argument names are a partial exception–you can’t refer to them in a general context, though, only while calling the method.)

Because named givens are intended to have their names escape the scope, any use aside from these is irregular.

Because there is no pattern match, and what is being named is the entire mapping, not the result of a search, the unapply-binding versions also are irregular. You’re not, with a given, naming what you found. You’re naming the entire process: using this context, come up with that instance of such-and-so type.

For regularity, therefore, we have only two consistent options:

given name
given [C]()(using Ctx[C]) def name

Because the form of the latter was chosen specifically to allow multiple extensions from the same context, and you rarely need multiple givens from the same context, the latter form seems unnecessary. (But the parallel to extension is very clear.)

We now need to create our mapping between context and result type and instance. This involves declaring the context types, the result types, and defining the mapping from context instances to result instance. We have three ways to create such a mapping in Scala already: methods, context functions, and the extension methods which split the method declaration:

def foo[A: Bar](using Baz[A], quux: Quux[A]): Foo[A]
(ctx: Ctx) ?=> Foo
extension [A: Bar](...)(using Baz[A])
  def foo(using quux: Quux[A]): Foo[A]

Extension methods are a particularly good parallel because–aside from the argument for the instance of the thing you’re going to extend–they perform exactly the same search-for-context that givens perform. Thus, this is the most natural parallel, though the others are worth considering also.
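
A runnable sketch of the three existing mapping forms in current Scala 3. The traits `Bar` and `Baz`, the case class `Foo` and all instance names are hypothetical stand-ins for the placeholders above, chosen only so that the snippet compiles.

```scala
// Hypothetical type classes and target type.
trait Bar[A]:
  def label: String
trait Baz[A]:
  def default: A
final case class Foo[A](value: A, label: String)

given intBar: Bar[Int] = new Bar[Int]:
  def label = "int"
given intBaz: Baz[Int] = new Baz[Int]:
  def default = 0

// 1. Method: context bound plus using clause, result type after the colon.
def fooMethod[A: Bar](using baz: Baz[A]): Foo[A] =
  Foo(baz.default, summon[Bar[A]].label)

// 2. Context function: context in, result out, but no type parameters.
val fooFunction: Baz[Int] ?=> Foo[Int] =
  Foo(summon[Baz[Int]].default, "context function")

// 3. Extension: context declared up front, method declared underneath.
extension [A: Bar](x: A)(using baz: Baz[A])
  def toFoo: Foo[A] = Foo(x, summon[Bar[A]].label)
```

Only the method and the extension can carry the type parameter `A`; the context function is fixed to `Int`, which is the gap the generic function syntax discussed in this post would fill.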

Now, the unnamed case is very important here because, unlike with methods, with givens there usually will be a unique one that depends on type and scope alone, and for which a name isn’t needed. Indeed, needing a name is in every case a failure of context resolution: the point is to not need a name for your context. It’s okay to “fail” in this sense–you can’t always infer what the correct behavior is!–but it absolutely needs to work well for the unnamed case.

We already have exactly this situation with closures, however. You don’t name your closures for the most part. You just give them. So the most natural parallel is context function syntax, except we need full generic types and context information.

Therefore, for the unnamed case the most regular form is

given [A: Bar](using Baz[A], quux: Quux[A]) ?=> Foo[A]

to declare the mapping. Because this goes beyond normal context function syntax to generic context function syntax by using [], one might argue => instead of ?=> is suitable. I will use this from here on.

Now, with (context) function syntax, you have both declaration and definition forms, so we’re already done. The most general and regular form of givens would be

given fooify: [A: Bar](using Baz[A], quux: Quux[A]) => Foo[A] =
  [A: Bar](using Baz[A], quux: Quux[A]) =>
    if quux.isThisThing then new Foo[A]:
      def foo() = quux.getFooFromBarContext
    else new Foo[A](...):
      def foo() = quux.getFooFromBazContext

This is, admittedly, a lot of symbol soup, but it uses entirely concepts we already have, in the exact way they’re used, with only two tweaks: (1) extending context function syntax to 0-explicit-argument generics, and (2) switching val out for given to enable context search.

Now we can start applying sugar in expected ways to see if we can recover convenient syntax for specific less-general cases.

First, we can note that because we have no arguments, the definition and declaration input forms are identical and therefore redundant. “Don’t need input” is already _, so

given fooify: [A: Bar](using Baz[A], quux: Quux[A]) => Foo[A] =
  _ => ...

Having something as important as a given not have a well-defined return type seems fraught with peril, but there is a specific case of return type which is completely unambiguous: all you do is create a new (possibly anonymous) instance of a class, using the name of the class. There’s absolutely no ambiguity in that case. We can promote that to a completely general capability, so these should all be fine too:

def create(i: Int) = Foo(i)
def create(s: String) = Bar(s)

def recurse(i: Int) = new Foo(i):
  def into = recurse(i-1)

given fooify = [A: Bar](using Baz[A], quux: Quux[A]) => new Foo[A]:
  ...

If we don’t have any explicit context at all, or don’t have any type parameters, we simply leave that out

given fooify = [A: Bar] => new Foo[A]:
  ...

given fooify = [](using Baz[Int]) => new Foo[Int]:
  ...

given fooify = [] => new Foo[Int]:
  ...

If you want a particular instance to be your given, you just supply the instance, with or without a name, without type ascription if named-class type inference works, and you’ll get a stable value:

given seafood: String = "shrimp"
given contextual: Contextual = new Contextual():
  def foo: Foo = Foo()

given ctx = Context()

given String = "salmon"

given Context2()

Do we not like the extra new to enable creation of an instance of an anonymous class? Fine, just remove the need for new everywhere as long as you open the block, or use postfix with everywhere–just make it consistent, not magic only-for-givens.

val foo = Foo:
  override def toString = "FOO!!!"

given Foo:
  override def toString = "FOO!!!"

def specific(i: Int) = Foo with
  override def toString = "foo" * i

given Foo with
  override def toString = "um...foo??"

If you have an existing stable identifier, you can simply name it and stay regular, but you can’t ascribe the type of the instance, because that is ambiguous. Type = value is fine, though.

val food = "bass"
given food

val some = Some(3)
given Option[Int] = some
// given (some: Option[Int])  -- abstract or forbidden

Now, I don’t know if this is nice enough, but it’s certainly regular enough. Everything is completely predictable from existing features with a couple of tiny tweaks.

One could also create a def-like parallel for named givens (or use _ for unnamed) if one doesn’t want to lean on function syntax so hard.

given fooify[A: Bar](using Baz[A], quux: Quux[A]) = Foo[A]:
  ...

given _[A: Bar](using Baz[A], quux: Quux[A]) = Foo[A]:
  ...

Indeed, one could entirely discard the 0-arg generic context function idea; you certainly don’t need both. In that case one could do without the _ on the unnamed method version; it’s mostly useful to avoid getting lost about whether you’re in generic context function-land or in method-land.

Examples

This is how it looks on the nine cases, both named and unnamed.

// Simple typeclass
given Ord[Int]:
  def compare(x: Int, y: Int) = ...

given intOrd = Ord[Int]:
  def compare(x: Int, y: Int) = ...

// Parameterized typeclass
given [A: Ord] => Ord[List[A]]:             // function instance form
  def compare(x: List[A], y: List[A]) = ...
given _[A: Ord] = Ord[List[A]]:             // unnamed method form
  def compare(x: List[A], y: List[A]) = ...

given listOrd = [A: Ord] => Ord[List[A]]:   // function assigned to term
  def compare(x: List[A], y: List[A]) = ...
given listOrd[A: Ord] = Ord[List[A]]:       // method form
  def compare(x: List[A], y: List[A]) = ...

// Typeclass with using
given [A](using ord: Ord[A]) => Ord[List[A]]:           // function instance form
  def compare(x: List[A], y: List[A]) = ???
given _[A](using ord: Ord[A]) = Ord[List[A]]:           // unnamed method form
  def compare(x: List[A], y: List[A]) = ???

given listOrd = [A](using ord: Ord[A]) => Ord[List[A]]: // function assigned to term
  def compare(x: List[A], y: List[A]) = ???
given listOrd[A](using ord: Ord[A]) = Ord[List[A]]:     // method form
  def compare(x: List[A], y: List[A]) = ???

// Simple alias
given Ord[Int] = IntOrd()

given intOrd: Ord[Int] = IntOrd()

// Parameterized alias
given [A: Ord] => Ord[List[A]] = _ => ListOrd[A]()          // function decl + def
given _[A: Ord]: Ord[List[A]] = ListOrd[A]()                // unnamed method form

given listOrd: [A: Ord] => Ord[List[A]] = _ => ListOrd[A]() // function decl + def as term
given listOrd[A: Ord]: Ord[List[A]] = ListOrd[A]()          // method form

// Alias with using clause
given [A](using Ord[A]) => Ord[List[A]] = _ => ListOrd[A]()          // function decl + def
given _[A](using Ord[A]): Ord[List[A]] = ListOrd[A]()                // unnamed method form

given listOrd: [A](using Ord[A]) => Ord[List[A]] = _ => ListOrd[A]() // function decl + def as term
given listOrd[A](using Ord[A]): Ord[List[A]] = ListOrd[A]()          // method form

// Concrete class instance
given Context()

given context = Context()

// Abstract or deferred given
given Context = deferred

given context: Context = deferred

// Abstract or deferred given with parameters
given [A: Ord] => Ord[List[A]] = deferred          // function form
given [A: Ord]: Ord[List[A]] = deferred            // method form

given listOrd: [A: Ord] => Ord[List[A]] = deferred  // function form
given listOrd[A: Ord]: Ord[List[A]] = deferred      // method form

// "By-name" given (actually 0-arg here)
given [] => Context()      // function form
given _[] = Context()      // method form

given ctx = [] => Context()  // function form
given ctx[] = Context()      // method form

I don’t personally see enough advantage for the function form to want both, and I see enough disadvantage (in the otherwise insufficiently-explicitly-typed alias function forms) to want the method version.

I don’t want a helter-skelter mix of whatever feels best in each position, however.

tl;dr

If you want to be able to define function-like mappings that take generic type parameters, define that as your goal and do it; don’t create a special snowflake for givens. [T: Tc](using U) => V is a fine declaration, and [T: Tc](using U) => {block} is a fine definition.

You probably don’t need function-like mappings. Method-like definitions are probably enough. There are cases where functions look nicer, however.

If you want an easy way to create anonymous classes, do it in general; don’t make givens a special snowflake. One can treat it as a special case of bulletproof type inference: = MyClass(...) always has type MyClass so we can infer it in every case, even where normally we require explicit types, and MyClass(...): can always, as an expression, be considered the creation of an anonymous class. (Or if : is too wimpy, with. new already works.)

Sporarum · July 2, 2024, 6:59pm

There seems to be one ambiguous case in your proposal:

given a = b()

// can be
given _: a = b()
// or
given a: b = b() // where b is a class name

If for example type a = Ord[Int]

Since we tend to name values in lowercase and types in uppercase, it feels like there is actually no conflict, but this is not true in practice.
In particular, it is widespread to name “object-like” values with an uppercase initial, since objects do, and bound type variables in match types have to begin with a lowercase letter.

I really like the idea of going back to basics however, and you do a really good job at it!

som-snytt · July 2, 2024, 9:10pm

What is the link for the keynote, “The Least Bad Scala”?

aepurniet · July 2, 2024, 9:47pm

This is one step away from reintroducing val and def into named given declarations and having a consistent syntax everywhere.

This may be controversial, but reintroducing those for named givens could regularize what follows. Givens seem to have two functions: to make their declaration available at using sites, and to infer whether the declaration is a lazy val or a def. Removing that second responsibility would make the language feel more regular. Extending this to anonymous given vals / defs would be straightforward, as starting a parameter clause (for defs) without a name would indicate very clearly that it is anonymous. For vals I think this would be unambiguous also.

The two cons would be that it goes against the indicated preference of specifying the type first in givens, and the preference of intention over structure stated when the syntax was first proposed.

Ichoran · July 3, 2024, 12:48am

Oh, good catch! I was too focused on the complex cases and didn’t think through the simple ones. That would need a disambiguation. And it’s not just lower-cased flavors that are ambiguous; those are just the most likely to come up.

I think the constructs could both be preserved with the following disambiguation rules:

If the LHS is capital and a class of that name exists, it is a class.
If the LHS is capital and a class of that name does not exist, say the class is not found and add that if you want a capital name, you must use explicit type ascription.
If the LHS is lower case and a class of that name exists, you report ambiguity and say that you must use an explicitly anonymous given, i.e. given _: a = ...
If the LHS is lower case and a class of that name does not exist, it’s naming a variable.

This would be a new wrinkle, admittedly, but it’s not too different from the ad-hoc ways to deal with the same issue with extractors vs variable binding in case statements.
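
The four rules can be sketched as a small decision function. This is only an illustration of the rule table, not of how the compiler would implement it; `Reading`, `classify` and the `classExists` lookup are hypothetical names.

```scala
// Possible outcomes for the left-hand side of `given lhs = rhs()`.
enum Reading:
  case ClassInstance       // capitalized, class exists: it is a class
  case ErrorClassNotFound  // capitalized, no such class: ask for explicit type ascription
  case ErrorAmbiguous      // lower case, class exists: ask for `given _: a = ...`
  case TermName            // lower case, no such class: it names a variable

// `classExists` stands in for a symbol lookup in the enclosing scope.
def classify(lhs: String, classExists: String => Boolean): Reading =
  val capitalized = lhs.headOption.exists(_.isUpper)
  (capitalized, classExists(lhs)) match
    case (true, true)   => Reading.ClassInstance
    case (true, false)  => Reading.ErrorClassNotFound
    case (false, true)  => Reading.ErrorAmbiguous
    case (false, false) => Reading.TermName
```

For example, with classes `Context` and `a` in scope, `classify("Context", Set("Context", "a"))` gives `ClassInstance` and `classify("a", Set("Context", "a"))` gives `ErrorAmbiguous`.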

Ichoran · July 3, 2024, 1:11am

I don’t think the val and def help clarify matters, though. It’s just extra typing and the distraction of having to think through a little more about implementation details that shouldn’t matter most of the time.

The proposal(s) I gave above are concerned solely with how the feature will act, not with how it’s implemented. (Save the by-name thing, which I only included because it was in the official example.)

Sporarum · July 3, 2024, 11:00am

That would be the only place where capitalization matters, I don’t think that’s a good idea, especially if the goal is to make things more regular ^^’