Better support for optional parameter lists preceding code blocks

A common pattern in many languages is:

foo{
  ...block...
}

foo(bar = blah1, qux = blah2){
  ...block...
}

This is commonly seen in XML:

<foo bar="blah1" qux="blah2">
   ...block...
</foo>

But it is also common in modern languages such as Kotlin, where a function's last parameter, if it is a lambda, can be passed as a trailing block:

fun doSomething(name: String = "default", action: () -> Unit) {
    println("Hello, $name")
    action()
}

doSomething("Kotlin") {
    println("Inside lambda")
}

doSomething{
    println("Inside lambda")
}

Or Swift:

func doSomething(name: String = "World", action: () -> Void) {
    print("Hello, \(name)")
    action()
}

doSomething {
    print("No name passed")  // uses default "World"
}

doSomething(name: "Alice") {
    print("Name passed explicitly")
}

Scala does not have support for this. It is possible to hack around to kinda-sorta make it work, as we have done in the Mill build tool:

def Task[T](t: => T): Task[T]

final class NamedParameterOnlyDummy private[mill] ()

def Task[T](
    t: NamedParameterOnlyDummy = new NamedParameterOnlyDummy,
    persistent: Boolean
): ApplyFactory = new ApplyFactory(persistent)
class ApplyFactory private[mill] (val persistent: Boolean) {
  def apply[T](t: => T): Task[T]
}

def foo = Task{ ... } // hits the first overload
def foo = Task(persistent = true){ ... } // hits the second overload

However, this approach has downsides. Apart from the t: NamedParameterOnlyDummy sentinel value, the fact that this involves overloading also means that target-typing and a bunch of other type inference features don't work. In contrast, the approach taken by Swift and Kotlin does not involve overloading at all, and simply lets you pass the last parameter in any parameter list as a separate curly-brace block.
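For anyone who wants to try the overloading workaround outside Mill, here is a minimal self-contained sketch. TaskSketch, TaskDef, and all method bodies are hypothetical stand-ins, not Mill's actual implementation; only the shape of the two overloads follows the snippet above.

```scala
object TaskSketch {
  // Hypothetical stand-in for Mill's task type: it only records the block and the flag.
  final class TaskDef[T](val run: () => T, val persistent: Boolean)

  // Sentinel with a restricted constructor, so user code cannot pass it positionally.
  final class NamedParameterOnlyDummy private[TaskSketch] ()

  final class ApplyFactory private[TaskSketch] (val persistent: Boolean) {
    def apply[T](t: => T): TaskDef[T] = new TaskDef(() => t, persistent)
  }

  object Task {
    // First overload: a bare block, no configuration.
    def apply[T](t: => T): TaskDef[T] = new TaskDef(() => t, persistent = false)

    // Second overload: configuration only; the block is then passed to ApplyFactory.apply.
    def apply(
        t: NamedParameterOnlyDummy = new NamedParameterOnlyDummy,
        persistent: Boolean
    ): ApplyFactory = new ApplyFactory(persistent)
  }

  def main(args: Array[String]): Unit = {
    val plain = Task { 1 + 1 }                         // first overload
    val configured = Task(persistent = true) { 1 + 1 } // second overload, then ApplyFactory.apply
    println(plain.persistent)
    println(configured.persistent)
  }
}
```

Note that the second overload only resolves because persistent has no default and must be passed by name; that is the part a language-level feature would make unnecessary.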

Of course, it is always possible to spell this out with existing syntax:

def foo = Task{ ... } // hits the first overload
def foo = Task(
  persistent = true,
  block = { ... }
)

Or

def foo = Task{ ... } // hits the first overload
def foo = Task{
   ...
}(given persistent = true)

But this is verbose and unintuitive. In general, the concept of "block of code, with some optional config in the header" seems pretty universal across languages. So although Scala's alternative syntaxes are semantically equivalent, they are syntactically very awkward and don't feel familiar to use.

Is there anything we can do to simplify this use case in Scala?

Would this be OK, or did I miss something..?

scala> object o {
     |   def doSomething(name: String = "default")(action: => Unit): Unit = {println(name); action}
     |   def doSomething(action: => Unit): Unit = doSomething()(action)
     | }
// defined object o

scala> o.doSomething { println(0) }
default
0

scala> o.doSomething("me") { println(0) }
me
0

The issue with that snippet is that it falls down in cases where the block's result has the same type as name, e.g.

Welcome to the Ammonite Repl 3.0.2 (Scala 2.13.16 Java 21.0.8)
@ object o {
    def doSomething(name: String = "default")(action: => Unit): Unit = {println(name); action}
    def doSomething(action: => Unit): Unit = doSomething()(action)
  } 
defined object o

@ o.doSomething{println(1)}
default
1

@ o.doSomething{println(1); "hello"} 
cmd3.sc:1: missing argument list for method doSomething in object o
Unapplied methods are only converted to functions when a function type is expected.
You can make this conversion explicit by writing `doSomething _` or `doSomething(_)(_)` instead of `doSomething`.
val res3 = o.doSomething{println(1); "hello"}
                        ^
Compilation Failed

It's kind of odd that adding a "hello" after the println(1) causes the wrong overload to be selected, resulting in a compiler error. The only way to avoid this ambiguity is via the NamedParameterOnlyDummy sentinel value I used above, and even then having null or ??? at the end of the block could result in the wrong overload being selected.

The Kotlin and Swift snippets do not have this ambiguity because both languages syntactically distinguish that last param via curly braces. I guess I'm wondering if there's some way we could make Scala similarly robust and unambiguous as well.

Furthermore, the Kotlin and Swift snippets do not involve overloading. Overloading causes a bunch of problems in Scala as well (type inference, target-typing changes, implicits, etc.) and I would like to be able to express this pattern in Scala without using overloading if possible.
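For comparison, here is a sketch of what works today without overloading: a single method with a defaulted first parameter list, at the cost of writing an explicit () when the configuration is omitted. The object name and the String-returning action are illustrative choices, picked so that name and the block have the same type.

```scala
object ExplicitParensDemo {
  // One method, no overloads: the first parameter list has a default,
  // and the block result deliberately has the same type as `name`.
  def doSomething(name: String = "default")(action: => String): String =
    s"$name: $action"

  def main(args: Array[String]): Unit = {
    // The empty parentheses are required: without them the block
    // would be passed as `name` rather than as `action`.
    println(doSomething() { "hello" })     // default: hello
    println(doSomething("me") { "hello" }) // me: hello
  }
}
```

This is unambiguous, but the mandatory () is exactly the syntactic noise the original post is trying to get rid of.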

I'm against such a change.
Both Kotlin and Swift work because they don't treat a block as an ordinary expression; instead, braces are treated as a closure.
The case doSomething{println(1); "hello"} is completely valid in Scala, and when reading it I can interpret it as either

doSomething(name = {
 println("I'm invoking do something") 
 "hello" 
})

or

locally:
   println("I'm invoking do something") 
   doSomething(name = "hello")

which is unique to Scala and cannot be represented in other languages. It allows for complex, lazily constructed arguments scoped only to function invocation.

We should not change the status quo. It might require more boilerplate for DSLs, but it's consistent within the language, and it's easier to read and reason about for both developers and tools.
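The first reading can be checked directly in current Scala. This is a small sketch with a hypothetical one-parameter doSomething (not the two-list method from earlier in the thread): the braces form a block expression whose value is its last expression, and that value is what gets passed as the argument.

```scala
object BlockArgDemo {
  // Hypothetical single-parameter method used only for this illustration.
  def doSomething(name: String): String = s"Hello, $name"

  def main(args: Array[String]): Unit = {
    // Braces in argument position are an ordinary block expression.
    val viaBraces = doSomething { println("computing the name"); "hello" }
    // The same call with the block written as a named argument.
    val viaNamed = doSomething(name = { println("computing the name"); "hello" })
    println(viaBraces == viaNamed) // true
  }
}
```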

I think the easiest way to do this is via multi-identifier definitions.

Just like methods can have multiple parameter blocks, and multiple generic type blocks, why not multiple identifiers?

def pick[A](a0: => A) when(p: Boolean) otherwise(a1: => A): A =
  if p then a0 else a1

In particular, if we allow the do identifier in the final position (or if we don't, but have a convention of using some other common word), we get

object o:
  def forSomething(name: String = "default") do(action: => Unit): Unit = ???
  def forSomething do(action: => Unit): Unit = ???

and then there's no ambiguity. forSomething do { println(1); "Hello" } is completely different than forSomething("eel") do { println(1); "Hello" }.

The parsing rules would not, I think, need to change in a particularly hard-to-understand or hard-to-implement fashion, because the lexing is probably identical and the rest of the parsing is pretty straightforward. One would have to think carefully about how to handle dotted notation: is it o.forSomething do or o.forSomething.do or something else? An ambiguity would still be an ambiguity if there was both a foo(x) that returned something with a baz method and a foo(x)baz(y) multi-identifier method.

This would also solve the lack of a braceless style: you rewrite your multi-parameter-block methods with multiple identifiers, too.

def fold[A](left: L => A) into(right: R => A): A = ???

fold: l =>
  foo(l)
into: r =>
  foo(r)

There are some downsides and awkwardness, which I acknowledge but won't explore here, but it does kill two birds with one stone.

Wow, that's fascinating: basically a built-in way to do micro-DSLs for control flow.

No clue how feasible it would be (there are some obvious edge cases in terms of precedence and such that would need to be nailed down), but it feels very "Scala-esque": a general way to implement something that is generally very hard-coded in most languages. I'm really intrigued by the potential there...

That would be great. I often run into naming issues where I rely on the user to use a named parameter, because the method identifier does not make it 100% clear in which order the arguments should be passed. Unfortunately, I don't have a specific method in mind, but I have often been tempted to introduce a Builder for readability, to achieve something like 'pick ... when ... or ...', and have usually decided against it because the boilerplate would not have justified the effort.
Such a syntax would be very useful in such cases.

I once suggested something similar in a thread of old; perhaps the time for this particular beast has now come: Clause Interweaving, allowing `def f[T](x: T)[U](y: U)` - #11 by odd

Mixfix notation is a very intriguing concept. Early on in the design of Dotty I was seriously tempted to add it. But I fear that today it would be too much of a change. There are lots of challenges for tooling: not just compilers, but also IDEs and doc tools. And it's such a fundamentally different (and appealing!) way of writing things that it would feel strange not to rewrite the stdlib and other libraries to use it. Also, it makes the syntactic choices multiply, and we should try to counter-balance that by dropping other features, such as regular named arguments.

So, all in all, this looks like a very intriguing feature for a new language. But to add it to Scala today, it looks like too much of a change.

There's another way we could make the original use case work, if we had keyword-only arguments. Here's a strawman:

def doSomething(named_only name: String = "default")(action: => Unit) 

The rules could be that, if a parameter list consists only of keyword-only arguments, then the whole parameter list can be elided in the call and () be inserted for it. That would happen whenever a function with a keyword-only parameter list is not followed by a keyword-argument list. So a call like

doSomething { myAction }

would be expanded to

doSomething() { myAction }

and then to

doSomething(name = "default") { myAction }

On the other hand,

doSomething(name = "Scala"){ myAction }

would be left as-is. A simple reference to

doSomething

would be expanded to

(body: => Unit) => doSomething(name = "default")(body)

i.e. keyword-only parameters will always be inserted, never abstracted in an eta expansion. That makes sense, since the lambda resulting from an eta expansion could not take keyword-only arguments anyway.

I have no good idea how to specify keyword-only arguments, though. The named_only modifier is a placeholder for a hopefully better alternative to be discovered. Python uses *, which is a bit cryptic.

I'm not sure there's any perfectly non-cryptic way to do it, but here are two ideas.

(1) Use explicit as a soft identifier.

doSomething(explicit name: String = "eel")(f: => Unit): Unit

(2) Use : before the identifier name. It should be completely unambiguous in parsing, gives a bit of a type-like feel to it which puts one in the right mindset of it being required (i.e. it conceptually has the type that it is called name), and when written out looks like the name is highlighted somehow (even though the RHS is the colon for type specification):

doSomething(:name: String = "eel")(f: => Unit): Unit

I'm not entirely sure that the arguments against mixfix are fully compelling. Named tuples are a pretty substantial change, and one could argue that they've burned the substantial change budget, but they're also precedent for making changes that make you reconsider a lot of past library design.

The tooling issue is an important point too, though again that didn't stop named tuples, and also if it's done carefully the compiler could present mixfix calls as a chain of synthetic classes to the IDE, so that perhaps(a) or(b) could resolve as Prefix$perhaps$$or with an or method on it or somesuch. So it would need to be carefully investigated but I don't think it's completely out of the question.

(Edit: the need to support infix notation also makes the perhaps foo(7+2) or "eel" notation less of a stretch for IDEs and the like. But it also raises the potential for symbol chains that are difficult for humans to parse, much like infix notation is.)

However, I do tend to agree that it is a change of sufficient magnitude to probably be inadvisable. I like it a good deal because of the generality and flexibility, but I don't like very much that we then have a million questions about things like why by is a method on Range instead of Int having a to(n)by(m) mixfix method, when the latter is clearly preferable because 1 to 10 by 2 by 3 is perfectly legal and yet perfectly unguessable as to what it does. So maybe that is actually an argument for mixfix, but it also kind of grates.

Regardless, I think the must-be-named argument proposal is a good alternative, a feature that would be nice to have regardless, and solves the how-to-have-effectively-optional-parameter-blocks issue here.
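For the record, the 1 to 10 by 2 by 3 puzzle above can be checked in a REPL. As far as I can tell from the standard library, Range.by keeps the start and end and replaces the step, so the second by simply overrides the first. A small sketch (the object name is arbitrary):

```scala
object RangeByDemo {
  // `by` builds a new range with the same bounds and a new step,
  // so chaining it replaces the earlier step instead of composing with it.
  val chained: List[Int] = (1 to 10 by 2 by 3).toList
  val direct: List[Int] = (1 to 10 by 3).toList

  def main(args: Array[String]): Unit = {
    println(chained)           // List(1, 4, 7, 10)
    println(chained == direct) // true
  }
}
```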

That's a good point that keyword-only params would allow us to avoid the ambiguity, and if we made a keyword-only param-list with full defaults elidable that would satisfy the original use case completely. There's already precedent in that (implicit foo: Foo) param lists are elidable when every param has a default value.

I'd say that from a naming perspective for keyword-only params, it's a choice between named and keyword, since those are the terms used on a regular basis. Of the two, named probably fits better in Scala, since keyword has other connotations (reserved words for the language), and we don't have the **kw or **kwargs naming convention present in some other languages.
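The implicit-parameter precedent mentioned above can be seen in a few lines. This is a sketch with made-up names (Config, greet); it assumes no implicit Config is in scope, so the default value is used when the list is omitted.

```scala
object ImplicitDefaultDemo {
  // Hypothetical configuration type, used only for this illustration.
  final case class Config(name: String)

  // The implicit parameter list can be left out at the call site: with no
  // implicit Config in scope, the compiler falls back to the default value.
  def greet(greeting: String)(implicit config: Config = Config("default")): String =
    s"$greeting, ${config.name}"

  def main(args: Array[String]): Unit = {
    println(greet("Hello"))                  // Hello, default
    println(greet("Hello")(Config("Scala"))) // Hello, Scala
  }
}
```

An elidable keyword-only list would behave much like the first call, except that the omitted list would come before the block rather than after it.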

FWIW, Ruby uses :, and I think some Lisp variants do too, which :name: String would be reminiscent of. (Prefix colon rather than postfix.) Or we could go Python-style with an argument list separator, and use : as the separator. def doSomething(:, name: String = "Hi")(f: => Unit). Arguably = is more indicative of the necessity to use names, however, because you use it like doSomething(name = "eel"). So def doSomething(=, name: String = "Hi")(f: => Unit) might be more intuitive.

I don't think the concept of required name via keyword is common enough for named or keyword to be more obvious than something else. The most common thing by far is a separator symbol in Python, and everything else is obscure. So if we go with a keyword, the keyword should be whatever we deem makes the most sense.

However, logically, the separator symbol fits the use case better. So upon reflection, my favorite is = on its own to mean "everything past here must be passed using =".

Seeing how Mojo has been mentioned somewhere, I thought it'd be interesting to share their design: function arguments may contain, in the following order:

  • Required positional arguments.
  • Optional positional arguments.
  • Variadic arguments.
  • Required keyword-only arguments.
  • Optional keyword-only arguments.
  • Variadic keyword arguments.

By default, an argument can be specified by position or keyword, however

  • arguments are position-only when followed by the / special argument, i.e. fn foo(x: Int, /) may only be invoked as foo(0)
  • arguments are keyword-only when preceded by the * special argument, or are following variadic arguments, so fn foo(*, x: Int) may only be invoked as foo(x=0), or fn foo(*xs: Int, x: Int) as foo(1, 2, 3, 4, x=0)

This design also extends to parameters, which are compile-time terms specified between brackets: fn foo[params...](args...), with the addition that some parameters may only be inferred, in which case they are followed by //. For example, in fn repeat[dt: DType, //, n: Int](x: Scalar[dt]) -> SIMD[dt, n], only the parameter n can be specified, as in repeat[8](Float32(1.1)).