Make by-name parameters just sugar syntax rather than actual types

BalmungSan · August 18, 2021, 5:39pm

This thread is motivated by a recent discussion in the gitter channel; I recommend reading that discussion for full context, but I will try to provide a summary here.

The TL;DR is that by-name parameters having their own type is confusing for many users (as can be seen in the gitter discussion, and folks may recall different but related discussions over the years). Most people would expect the method signature def foo(bar: => Bar): Baz to be eta-expanded into just Bar => Baz, but it actually expands into (=> Bar) => Baz. That is not only a different type than expected but also a weird one, especially because it seems it is not possible to define a value of such a type using the lambda syntax. Additionally, it is not clear (or at least not to me) whether => Baz alone is a valid type or not.
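To make the eta-expansion point concrete, here is a minimal sketch of today's behaviour. The names (EtaExpansionSketch, twice, expanded, strict) are hypothetical and exist only for illustration; the sketch assumes nothing beyond the standard library.

```scala
object EtaExpansionSketch {
  // A method with a by-name parameter: `x` is re-evaluated at every use.
  def twice(x: => Int): Int = x + x

  // Eta-expansion keeps the by-name marker in the resulting function type.
  val expanded: (=> Int) => Int = twice

  // To get a plain Int => Int today, wrap the call in an explicit lambda;
  // the argument is then evaluated once, before `twice` is entered.
  val strict: Int => Int = x => twice(x)

  // Returns (result, number of times the argument expression ran).
  def viaExpanded(): (Int, Int) = {
    var count = 0
    val result = expanded { count += 1; 21 }
    (result, count)
  }

  def viaStrict(): (Int, Int) = {
    var count = 0
    val result = strict { count += 1; 21 }
    (result, count)
  }
}
```

With the by-name function type the argument expression runs twice; with the strict wrapper it runs once, which is the semantic gap between (=> Bar) => Baz and Bar => Baz.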

The idea of this thread is to propose an alternative encoding of by-name parameters that emerged from the gitter discussion. The proposal attempts to make by-name parameters less weird while retaining their utility, although this proposal alone will not address the previous issue (more on that later).

First, let’s start with three assumptions:

The utility of by-name parameters lies in their usage syntax, which gives the impression that we are using a control structure defined in the language. Of course, this alone would be useless if it weren’t for the laziness of the argument; however, we may just ask for a () => A to have that.
They are actually implemented as a Function0 value under the hood.
Library authors won’t have much problem applying the necessary refactor, especially if the appropriate Scalafix rules and cross-compilation guarantees are given.

With all that, the idea is very simple. An argument of type => A would be seen as just a () => A in the body of the method; the only difference is that at the usage site the user omits the () => and just provides the body of the function (making the usage syntax exactly the same as today). This is similar to how varargs are implemented, where an argument of type A* is just a Seq[A] with sugar syntax at the call site.
Of course, similar to : _* for varargs, we may need to provide something like : => to manually lift a () => A value into a => A argument instead of producing a () => (() => A).
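The varargs analogy can be sketched with code that compiles today. The first half shows real varargs behaviour; the second half is only a stand-in for the proposed encoding, written with an explicit () => A parameter because the proposed call-site sugar does not exist. All names (SugarAnalogySketch, total, orElse) are hypothetical.

```scala
object SugarAnalogySketch {
  // Varargs: inside the body, `xs` is an ordinary Seq[Int].
  def total(xs: Int*): Int = xs.sum

  val numbers: Seq[Int] = Seq(1, 2, 3)
  val direct: Int = total(1, 2, 3)      // sugar at the call site
  val lifted: Int = total(numbers: _*)  // an existing Seq passed explicitly

  // Stand-in for the proposed by-name encoding: the body sees a Function0
  // and has to call it explicitly.
  def orElse[A](opt: Option[A])(default: () => A): A =
    opt match {
      case Some(a) => a
      case None    => default()
    }

  // Under the proposal the caller would omit the `() =>` written below.
  val fallback: Int = orElse(Option.empty[Int])(() => 7)
  val present: Int = orElse(Option(1))(() => sys.error("never evaluated"))
}
```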

The advantage of this is making the feature simpler and more predictable. The disadvantage is that it would be a source-breaking change, since one would now need to manually execute the function to get its value, e.g. thunk() instead of just thunk. However, I think this is actually a good thing since it makes them safer to pass around.

Now, regarding the original issue of eta-expanding a def foo(bar: => Bar): Baz into a Bar => Baz: this proposal doesn’t do that (since it would now eta-expand into a (() => Bar) => Baz), but I think this is a better situation because the type is a regular one, the error would be clearer, and most people would already know what was going to happen.
Nevertheless, it may be possible to just implement an implicit conversion or an extension method to easily go from (() => A) => B to A => B; such a conversion / extension method may even be part of the stdlib if maintainers agree it is useful enough.
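A rough sketch of what such an extension method could look like, assuming the eta-expanded type is (() => A) => B as described. Nothing like this exists in the stdlib; ThunkFunctionOps, strict, and describe are made-up names for illustration.

```scala
object StrictAdapterSketch {
  // Hypothetical extension: adapt a function over thunks into a strict one.
  implicit class ThunkFunctionOps[A, B](f: (() => A) => B) {
    // The value is already computed, so it is wrapped in a trivial thunk.
    def strict: A => B = a => f(() => a)
  }

  // Stand-in for what `def describe(bar: => Int): String` would
  // eta-expand to under the proposal.
  val describe: (() => Int) => String = thunk => s"value: ${thunk()}"

  val describeStrict: Int => String = describe.strict
  val mapped: List[String] = List(1, 2).map(describe.strict)
}
```

Note that the adapter discards laziness: by the time the strict function is called, the argument has already been evaluated.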

Please let me know what you all think about this.

esuntag · August 18, 2021, 7:18pm

I do think Function0 syntax makes more sense than the current ByName syntax. In general, I like being more explicit about what’s going on. Additionally, it makes it harder for a refactor (perhaps by someone less familiar with the differences) to change the semantics, since it would require a function call to access the value instead of direct access. Syntax to auto-lift parameters to a Function0 should make this change seamless at the call site.

I’m a bit more cautious on eta expanding a lazy method to a non-lazy function.

bbarker · August 18, 2021, 10:14pm

It’s certainly a good point, but is it possible this would only happen in contexts where a strict variant is needed? If there is room for error, I too would be cautious about the defaults.

Jasper-M · August 18, 2021, 10:41pm

I don’t really see why the usage in the body of the method has to change. The same syntax sugar as at the use site can apply there: you can omit the (). The only thing that has to change is the eta-expansion. Plus it should also be pretty easy to just allow the => A syntax for function parameters as well.

martijnhoekstra · August 19, 2021, 10:24am

The body in the method is the thing that makes the current feature worthwhile over just having () => A in my opinion. Dropping that functionality would make the feature a lot less compelling. It’s what to me makes the feature worth the messiness of whether => A is a real type or not for all types A (and if it is, whether => => A is a real type or not)

BalmungSan · August 19, 2021, 3:39pm

/cc @Jasper-M

I personally don’t think the value of the feature is being able to use by-name parameters as just defs inside the body of the method that defines them. Rather, I think the value is in the laziness and the clean syntax at the use site.

// I totally agree we don't want to do this:
opt.getOrElse(() => expensiveComputation())
val fut = Future(() => {
  expensiveComputation()
})
val io  = IO(() => {
  expensiveComputation()
})

// Instead of this:
opt.getOrElse(expensiveComputation())
val fut = Future {
  expensiveComputation()
}
val io = IO {
  expensiveComputation()
}

// And that is what this proposal wants to preserve.
// However, I doubt changing the implementations would hurt:
sealed trait Option[+A] {
  def getOrElse[B >: A](default: => B): B =
    this match {
      case Some(a) => a
      case None => default() // Instead of just default
    }
}

object Future {
  def apply[T](body: => T)(implicit executor: ExecutionContext): Future[T] =
    unit.map(_ => body()) // Instead of just body
}

object IO {
  def apply[A](thunk: => A): IO[A] = {
    Delay(thunk, Tracing.calculateTracingEvent(thunk.getClass))
  }

  // Instead of:
  def apply[A](thunk: => A): IO[A] = {
    val fn = () => thunk
    Delay(fn, Tracing.calculateTracingEvent(fn.getClass))
  }
  // Actually the current implementation is different due to a change in the encoding,
  // see this: https://github.com/typelevel/cats-effect/issues/2225
}

The IO example is interesting because they actually want a Function0, to be able to manipulate the thunk as just a value without triggering a computation by accident.
(I think something similar happens in the stdlib LazyList, but I didn’t check all the code to be totally sure.)
In conclusion, I believe that having to manually call the by-name in the implementation is not only not a big deal, but is also something that could be useful.
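The "by accident" part can be illustrated with today's semantics. This is only a sketch with hypothetical names (capture, forcedOnce); it is not how cats-effect or the stdlib implement anything.

```scala
object ThunkAsValueSketch {
  // Today a by-name parameter has to be wrapped explicitly before it can be
  // handled as a value without running it.
  def capture[A](body: => A): () => A = () => body

  // Merely mentioning the by-name parameter evaluates it, so this version
  // runs the body right away and caches the result.
  def forcedOnce[A](body: => A): () => A = {
    val value = body
    () => value
  }

  // Returns (runs right after capturing, runs after calling the thunk twice).
  def lazyDemo(): (Int, Int) = {
    var runs = 0
    val thunk = capture { runs += 1; 10 }
    val afterCapture = runs
    thunk(); thunk()
    (afterCapture, runs)
  }

  def eagerDemo(): (Int, Int) = {
    var runs = 0
    val thunk = forcedOnce { runs += 1; 10 }
    val afterCapture = runs
    thunk(); thunk()
    (afterCapture, runs)
  }
}
```

The two helpers differ by a single reference to body, which is the kind of slip an explicit Function0 parameter would make visible as a call.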

In general, my humble opinion is that it is better to keep things simple and, in that vein, my proposal for by-name parameters wants to be as similar to varargs as possible. Again, just being a () => A plus sugar syntax for callers.
Thus I don’t think it is important (nor good for the sake of regularity) to allow its usage inside the method definition to be different. In the same way that varargs don’t provide any funny syntax (it is just a Seq and you use it like any other Seq), a by-name would be just a Function0 and you would use it like any other Function0.

But if the majority of the community disagrees with me I am happy to change the proposal.

LPTK · August 19, 2021, 6:04pm

It would be nice if the syntax was changed and generalized as follows:

def foo0(def x: Int) = ... x ... x ...
foo0: (=> Int) => R


def foo1(lazy x: Int) = ... x ... x ...
foo1: (=> Int) => R

// ^ semantically equivalent to:

def foo1(def x: Int) = { lazy val tmp = x; ... tmp ... tmp ... }


def foo2(var x: Int) = ... x ... x ...
foo2: (Int) => R

// ^ semantically equivalent to:

def foo2(x: Int) = { var tmp = x; ... tmp ... tmp ... }

I reckon the lazy x: T variant is actually the one that would be most often used, rather than def x: T, which is the only thing we have today (through the weird x: => T syntax).
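The difference between the def and lazy variants can be approximated with today's syntax, following the equivalence given above. The names (ParameterModesSketch, byName, byNeed) are hypothetical, and the def x / lazy x parameter syntax itself is not valid Scala.

```scala
object ParameterModesSketch {
  // Corresponds to the proposed `def x: Int`: evaluated at every use.
  def byName(x: => Int): Int = x + x

  // Corresponds to the proposed `lazy x: Int`: evaluated at most once,
  // and only if it is actually used.
  def byNeed(use: Boolean)(x0: => Int): Int = {
    lazy val x = x0
    if (use) x + x else 0
  }

  // Each helper returns how many times the argument expression ran.
  def byNameRuns(): Int = {
    var n = 0
    byName { n += 1; 1 }
    n
  }

  def byNeedRuns(use: Boolean): Int = {
    var n = 0
    byNeed(use) { n += 1; 1 }
    n
  }
}
```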

I recall reading that a long, long time ago, there were discussions about allowing syntaxes along these lines in Scala.

charpov · August 19, 2021, 6:23pm

I like it a lot, especially the differentiation between def and lazy.

var is more confusing. It looks like a form of “by-reference” argument passing, but it’s not (no possibility to modify the external x inside foo2). What’s a good use case for it?

martijnhoekstra · August 19, 2021, 6:34pm

This is great!

I wonder if the var variety leads people to think that if they pass in a value that’s a var, modifying it within the method will also modify the source var.

var x = 1
def foo(var y: Int) = y += 1
x == 2 //expected? Sometimes desired?

C# has ref that does something like that

charpov · August 19, 2021, 6:46pm

Not if you don’t call foo on x first.

martijnhoekstra · August 19, 2021, 7:19pm

Understandable have a great day

Ichoran · August 19, 2021, 11:45pm

This is a key feature, unfortunately. It means that you can transparently refactor between x being just a value, and x being computed at need. The point is that usually you don’t care in the logic which it is, just like def x: Int can be used the same way as val x: Int even though in one case logically you are accessing a field and in the other case you’re calling a method.

But I fully agree that it’s a weird wrinkle, and I’d much rather have everything to do with => A explicitly be syntactic sugar for () => A. I would just extend that sugar to having x desugar to x() when you say x: => A instead of x: () => A.

And then you’d get around the current inconsistency of not being able to say val x: => Int = sin(0.12581).toInt, despite def foo(x: => Int) working just fine. I also like @LPTK’s idea of lazy x: A being sugar for _x: () => A plus lazy val x = _x(). Not wild about any of the others, though. In retrospect def might be a better way to go, but we already have => A and I don’t think it’s worth the churn to try to switch.
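A small sketch of the desugaring described above, written out by hand since the sugar itself does not exist: the by-name version and the Function0 version have the same body apart from the explicit calls. The names (clamp, clampDesugared) are hypothetical.

```scala
object DesugaringSketch {
  // Today's by-name parameter: `x` reads like a plain value in the body.
  def clamp(x: => Int, max: Int): Int = if (x > max) max else x

  // Hand-written equivalent of "x desugars to x()".
  def clampDesugared(x: () => Int, max: Int): Int = if (x() > max) max else x()

  // Returns how often each argument expression was evaluated.
  def runs(): (Int, Int) = {
    var a = 0
    var b = 0
    clamp({ a += 1; 2 }, 3)
    clampDesugared(() => { b += 1; 2 }, 3)
    (a, b)
  }
}
```

Both versions evaluate the argument the same number of times; only the surface syntax differs.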

LPTK · August 19, 2021, 11:53pm

Yeah, I realize the var version is a terrible idea now, as it looks like something it’s not.

I just remembered Pascal even has the very same syntax to express the “mutate the argument” semantics:

procedure xorSwap(var left, right: integer);
begin
	left := left xor right;
	right := left xor right;
	left := left xor right;
end;

https://wiki.freepascal.org/Variable_parameter

rssh · August 20, 2021, 6:36am

Interestingly, a context function without parameters (currently not allowed) could serve as the proposed new by-name parameter, i.e. param: => B could be a shortcut for () ?=> B

LPTK · August 20, 2021, 6:56am

Actually, today you can already roll your own by-names using a dummy implicit function parameter type such as Unit.

This even allows you to choose whether you want to allow the implicit application syntax at the definition site or not:

// If you don't do anything, you need explicit application via `using`
def foo[R](f: Unit ?=> R) = { val tmp = f(using ()); ... }

// Or you can enable implicit application with a `given`:
given Unit = ()
def foo[R](f: Unit ?=> R) = { val tmp = f; ... }

// In both cases, the usage sites remain implicit:
foo(println("Hello!"))

Jasper-M · August 20, 2021, 7:16am

Yeah I think that’s one of the features we really don’t want in Scala

lihaoyi · August 21, 2021, 5:20am

I like @LPTK’s proposal

odersky · August 21, 2021, 10:24am

Historically, => T was an annotation on a method parameter, not a type. Then it got generalized bit by bit to allow some type-like usages, as in (=> A) => B. I was not too happy about that trend because I felt it muddled the concepts but went along since the use cases made sense.

To get back on firm ground we have two choices: go backward or forward.

Go backward: Treat => T as syntactic sugar on a method parameter type that expands to () => T except that arguments to such parameters don’t have a parameter section. That’s @BalmungSan’s proposal. I like it since it’s clean and quite analogous to repeated arguments T*, which also expand to something else (i.e. Seq[T]) everywhere except that the actual argument is written differently.
In retrospect, I feel this would have been a better way to generalize the parameter annotation to types.
One downside is a lot of code would have to be rewritten to add all the () arguments to call by name parameters. Also the proposal would change the behavior of existing code, for instance when a call-by-name parameter is passed to a function that expects an Any.
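The behaviour change around Any can be sketched as follows. The proposed method is only a stand-in that uses an explicit () => Int parameter, since the proposed desugaring is not implemented; all names are hypothetical.

```scala
object AnyArgumentSketch {
  def describe(a: Any): String = a match {
    case _: Function0[_] => "a thunk"
    case other           => s"a value: $other"
  }

  // Today: referring to the by-name parameter forces it, so `describe`
  // receives the computed Int.
  def today(x: => Int): String = describe(x)

  // Stand-in for the proposal: `x` is a Function0 in the body, so the same
  // expression would pass the unevaluated function object instead.
  def proposed(x: () => Int): String = describe(x)
}
```

The call describe(x) is textually identical in both methods, which is why existing code of this shape would silently change meaning.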

But by now we also have the option to go forward. Make => T a full type by treating it as a shorthand for () ?=> T. Right now the latter is not accepted. We could either tweak the rules to make it accepted or expand => T to something like Dummy ?=> T where a unique given instance of type Dummy is always available. (My preference would be to tweak the rules so that () ?=> T is accepted.) One consequence would be that you could write

List[=> Int](
  { println("1"); 1 },
  { println("2"); 2 }
)

and the argument expressions would be evaluated each time a list element is accessed, instead of when they are passed. So, a lot more power, which is not to say that’s always a good thing.
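Since List[=> Int] is not accepted today, the described evaluation behaviour can only be approximated. The sketch below uses explicit () => Int elements as a stand-in; ByNameElementsSketch and demo are hypothetical names.

```scala
object ByNameElementsSketch {
  // Returns (evaluations after construction, sum, evaluations after access).
  def demo(): (Int, Int, Int) = {
    var evaluations = 0
    val xs: List[() => Int] = List(
      () => { evaluations += 1; 1 },
      () => { evaluations += 1; 2 }
    )
    val afterConstruction = evaluations // nothing has run yet

    // Every access re-runs the element's expression.
    val sum = xs.head() + xs.head() + xs(1)()
    (afterConstruction, sum, evaluations)
  }
}
```

With a real => Int element type the explicit () applications would disappear, so each plain element access would trigger the evaluation.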

But I believe overall going forward would be a lot simpler now than going backward.

chaotic3quilibrium · August 21, 2021, 2:39pm

What are the undesired effects of LPTK’s proposal (post #7 in this thread)?

As far as the desired effects are concerned, it appears LPTK’s solution communicates the intention to the future source code reader/maintainer/client-caller more unambiguously.

charpov · August 23, 2021, 5:43pm

Call by-name seems to be pretty standard (e.g., see the Wikipedia article "Evaluation strategy"). Outside of Scala, “contextless function” means nothing to me. I do agree with a possible mix-up with “named arguments”.

Personally, I like the idea of using NoArg ?=> T and having an implicit NoArg value defined somewhere (and NoArg can even be NoContext).