SIP: Curried varargs

LPTK · August 11, 2019, 7:19am

I remember finding out, after some profiling, that my program was spending a huge amount of time just creating arrays wrapped in Seq and tearing them down immediately just because I used List(x) instead of x :: Nil (and the like). Now I’m instinctively afraid of varargs and always use :: even though it’s uglier.

I like this proposal because it could be used to make List.apply not stupidly inefficient, compared to using :: and Nil.

tarsa · August 11, 2019, 9:06am

One solution that already works is just encoding varargs explicitly as curried applications:

object FastList {
  def main(args: Array[String]): Unit = {
    println(FastListBuilder.done())           // List()
    println(FastListBuilder(1).done())        // List(1)
    println(FastListBuilder(1)(2).done())     // List(1, 2)
    println(FastListBuilder(1)(2)(3).done())  // List(1, 2, 3)
  }
}

class FastListBuilder[A] private {
  private val listBuffer = scala.collection.mutable.ListBuffer[Any]()

  def apply[B >: A](nextElem: B): FastListBuilder[B] = {
    listBuffer += nextElem
    this.asInstanceOf[FastListBuilder[B]]
  }

  def done(): List[A] = listBuffer.toList.asInstanceOf[List[A]]
}

object FastListBuilder {
  def done(): List[Nothing] =
    new FastListBuilder().done()

  def apply[A](firstElem: A): FastListBuilder[A] =
    new FastListBuilder()(firstElem)
}

A relatively big drawback of this encoding is somewhat ugly client code. Instead of FastListBuilder(1, 2, 3, 4, 5) we need to write e.g. FastListBuilder(1)(2)(3)(4)(5).done(). Such syntax is ugly enough that probably nobody would use it for the most common use case, which is building collections.

Another drawback is exposed mutability. Each consecutive apply mutates the collection builder and we lose referential transparency.
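
To make the referential transparency point concrete, here is a minimal sketch. MutableBuilder is a hypothetical, stripped-down stand-in (not the FastListBuilder above); it only shows how reusing a partially applied builder leaks elements from one call chain into another.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical stand-in for a mutable curried builder (an assumption for illustration).
final class MutableBuilder[A] {
  private val buffer = ListBuffer.empty[A]

  // Each application mutates the shared buffer and returns the same instance.
  def apply(next: A): MutableBuilder[A] = { buffer += next; this }

  def done(): List[A] = buffer.toList
}

object MutabilityDemo {
  def main(args: Array[String]): Unit = {
    val prefix = new MutableBuilder[Int]()(1) // builder holding 1
    val left   = prefix(2).done()
    val right  = prefix(3).done()             // still sees the 2 added above
    println(left)  // List(1, 2)
    println(right) // List(1, 2, 3)
  }
}
```

With an immutable builder, the second chain would yield List(1, 3) instead.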

yangbo · August 11, 2019, 3:25pm

I would not consider “ugly” as a drawback. Anyway, you could use some pretty symbols for explicit currying, like List < 1 :+ 2 :+ 3 >. The drawbacks I see are:

Explicit currying is not source level compatible with existing Scala code.
Currying in Scala is more verbose than currying in Haskell or OCaml in terms of syntax.
Explicit curried collection initializers lose the symmetry with pattern matching.

szeiger · August 12, 2019, 9:52am

Do you have any performance numbers for this design? In theory it should require fewer allocations but that may not matter if Hotspot can elide the allocations of short-lived wrapper objects entirely.

The proposed translation of : _* is problematic. Currently this is the most efficient way of calling a varargs method. Simply pass an existing Seq and it becomes a straight-forward method call with no extra allocations. With this proposal it would become the least efficient way: It always requires recreating the sequence, plus the external foldLeft call makes it impossible for the curried varargs method to optimize based on the existing size or array-based internals of the sequence.

It would be nice if the proposal also addressed pattern matching. Can/should this be done by currying, or does the new scheme keep the existing varargs pattern matching, thus making it less uniform?

morgen-peschke · August 12, 2019, 3:11pm

An alternate way of translating :_* could be to translate it to a method with a default implementation, so it could be overridden if there’s a more efficient way of handling it known to the programmer.

For example:

f(a, b, c:_*)

Would be translated into this:

f.applyBegin
  .applyNext(a)
  .applyNext(b)
  .applyNextSeq(c)
  .applyEnd

The default implementation of applyNextSeq could be something along these lines:

def applyNextSeq[A](as: Seq[A]): ListBuilder[A] = as.foldLeft(this)(_.applyNext(_))

But, if the programmer knows a better way of doing this, it could be overridden (using the ListBuilder example from above):

class ListBuilder[A] {
  def applyNext[B >: A](b: B): ListBuilder[B] = ???
  def applyNextSeq[B >: A](b: Seq[B]): ListBuilder[B] = ???
  def applyEnd: List[A] = ???
}
object List extends Curried {
  def applyBegin[A]: ListBuilder[A] = ???
}

I could be missing something, but this is basically a simple desugaring, and I think that changing pattern matching to be isomorphic would require deeper changes to the compiler.
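
For what it’s worth, the desugared chain can already be written out by hand. The following is only a sketch under assumptions: the method names follow the ones above, but the bodies (an immutable, reversed List as the accumulator) and the covariant type parameter are made up for illustration, and since Curried does not exist the expansion is spelled out manually.

```scala
// Hypothetical implementation of the builder; only the method names come from the post.
final class ListBuilder[+A] private (private val reversed: List[A]) {
  // Prepend to the reversed accumulator: O(1) per element.
  def applyNext[B >: A](b: B): ListBuilder[B] =
    new ListBuilder[B](b :: reversed)

  // Overridable fast path: the whole sequence is handled in one call,
  // without folding applyNext over it from the outside.
  def applyNextSeq[B >: A](bs: Seq[B]): ListBuilder[B] =
    new ListBuilder[B](bs.toList.reverse ::: reversed)

  def applyEnd: List[A] = reversed.reverse
}

object ListBuilder {
  def applyBegin[A]: ListBuilder[A] = new ListBuilder[A](Nil)
}

object DesugarDemo {
  def main(args: Array[String]): Unit = {
    val c = Seq(3, 4)
    // Hand-written equivalent of what f(1, 2, c: _*) would expand to
    val xs = ListBuilder.applyBegin
      .applyNext(1)
      .applyNext(2)
      .applyNextSeq(c)
      .applyEnd
    println(xs) // List(1, 2, 3, 4)
  }
}
```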

yangbo · August 12, 2019, 10:48pm

I think the correspondence between pattern matching and initializers in Scala is meant to be achieved at the grammar level. I did not see the symmetry at the bytecode level between unapply, unapplySeq and apply.

This proposal does not affect the correspondence at the grammar level. In fact, the original intention of this proposal is to keep the same translation rule for both pattern matching and initializers in name based XML literals when optimizing the performance of that proposal. Explicit currying cannot achieve the goal.

yangbo · August 12, 2019, 10:56pm

I like your idea, and I have updated the proposal for applyNextSeq. applyNextSeq also resolves the type inference problem in Scala 2 due to foldLeft.

yangbo · August 12, 2019, 11:15pm

There is some discussion at https://github.com/scala/scala/pull/6093, indicating how traditional varargs hurt performance. Note that this proposal enables zero cost abstractions, eliminating all wrappers at compile time, with the help of @inline annotations and value classes, achieving similar performance to macro generated code.

morgen-peschke · August 13, 2019, 4:48pm

One thing that really interests me about this proposal is that, combined with shapeless’s HList, it could provide a way to cleanly abstract over arity by using applyNext to build the HList, before doing something with it in applyEnd.

@milessabin could probably come up with something better, but this looks workable:

class ArbitraryArityFunction[Args <: HList, RetVal](f: Args => RetVal) extends Curried {
  def applyBegin: HListBuilder[HNil] = new HListBuilder[HNil](HList())
}
object ArbitraryArityFunction {
  class HListBuilder[In <: HList, Args <: HList, RetVal](val in: In, f: Args => RetVal) {
    def applyNext[S](suffix: S)(implicit P: hlists.Prepend[In, S]) : HListBuilder[P.Out] = 
      new HListBuilder[P.Out](in :+ suffix, f)

    def applyEnd(implicit ev: In =:= Args): RetVal = f(in)
  } 
}

milessabin · August 14, 2019, 9:08am

@morgen-peschke shapeless has several gadgets which are probably relevant to your example. It has type classes which can convert between arbitrary arity functions and single argument functions which take an HList argument. shapeless’s ProductArgs and RecordArgs traits might also be relevant.

I think we can most likely do better in Dotty.

Jasper-M · August 14, 2019, 9:37am

Though if dotty removes all autotupling you lose a tool for arity abstraction.

yangbo · August 24, 2019, 6:30pm

It has been two weeks since the initial post. I have created a pull request for this proposal https://github.com/scala/docs.scala-lang/pull/1487 . @sjrd Shall we move forward?

yangbo · August 26, 2019, 8:24am

Dotty’s name based pattern matching extracts values by _1 to _N methods. If we instead use head and tail to extract values, then pattern matching becomes “curried” automatically. The change will not break unapply methods that return Option[TupleN], because in Dotty a Tuple is just an HList. This head/tail based pattern matching also unifies unapply and unapplySeq, as it is compatible with Seq and enables rest patterns for tuples as well.

The only drawback is that it does not perform well for collections that have a slow tail method (e.g. most mutable collections). Even if an unapply author returns a view of a mutable collection, it still creates more temporary objects than the current Dotty implementation.

ghik · August 28, 2019, 8:09am

I really like this proposal.

Apart from applyNext and applyNextSeq, I would also like to see applyNextNamed which would be used in case of named arguments.

class StringMapBuilder[V] {
  def applyNextNamed[V0 >: V](key: String, value: V0): StringMapBuilder[V0] = ???
  def applyEnd: Map[String,V] = ???
}
object StringMapBuilder extends Curried {
  def applyBegin[V]: StringMapBuilder[V] = ???
}

makeMap(a = 1, b = 2)
would translate to
makeMap.applyBegin.applyNextNamed("a", 1).applyNextNamed("b", 2).applyEnd

Also, I think it would be good to bring this work under some common denominator with scala.Dynamic.
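
A runnable sketch of the applyNextNamed idea, with the translation written out by hand. The bodies (an immutable Map as the accumulator) and the covariant type parameter are assumptions for illustration; only the method names come from the snippet above.

```scala
// Hypothetical implementation; Curried itself does not exist, so the chain is called explicitly.
final class StringMapBuilder[+V] private (private val entries: Map[String, V]) {
  // Each named argument arrives as a (name, value) pair.
  def applyNextNamed[V0 >: V](key: String, value: V0): StringMapBuilder[V0] =
    new StringMapBuilder[V0](entries + (key -> value))

  def applyEnd: Map[String, V] = entries
}

object StringMapBuilder {
  def applyBegin[V]: StringMapBuilder[V] =
    new StringMapBuilder[V](Map.empty[String, V])
}

object NamedArgsDemo {
  def main(args: Array[String]): Unit = {
    // Hand-written equivalent of the proposed expansion of (a = 1, b = 2)
    val m = StringMapBuilder.applyBegin
      .applyNextNamed("a", 1)
      .applyNextNamed("b", 2)
      .applyEnd
    println(m == Map("a" -> 1, "b" -> 2)) // true
  }
}
```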

yangbo · August 28, 2019, 8:54am

If applyNextNamed is added to this proposal, I would suggest deprecating all methods except selectDynamic in Dynamic, making Curried and Dynamic orthogonal.

ghik · August 28, 2019, 9:30am

Technically they are not methods which can be deprecated, because there’s no fixed signature. Also, updateDynamic would have to stay. But we could possibly get rid of applyDynamic and applyDynamicNamed.

yangbo · August 28, 2019, 5:18pm

Currently, a.f = v is translated to a.updateDynamic("f")(v) if a is a Dynamic, and a.f(b, c) = v is converted to a.selectDynamic("f").update(b, c, v). If a.f.update is a Curried, it should be further translated to a.f.update.applyBegin.applyNext(b).applyNext(c).applyNext(v).applyEnd.

The rule is not ideal for two reasons:

Inconsistency between assignment with or without arguments.
As a library author, you cannot distinguish indices from the value to assign. Suppose you have an N-dimensional array of Int where N is determined at runtime; when applyNext is invoked with an Int, it does not know whether it is an index or a value.

I wonder if we could separate indices from the value, like, translating a(b, c) = v to a.update(b, c)(v).
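
As a rough sketch of what the separated form could look like for a library author: Grid below is a hypothetical class, and because today’s compiler does not produce this call shape from g(1, 2) = 42, the curried update is invoked explicitly.

```scala
// Hypothetical 2-D grid; it only illustrates the suggested call shape a.update(b, c)(v).
final class Grid(rows: Int, cols: Int) {
  private val cells = new Array[Int](rows * cols)

  def apply(r: Int, c: Int): Int = cells(r * cols + c)

  // Indices live in the first parameter list and the assigned value in the second,
  // so the two cannot be confused even though both are Int.
  def update(r: Int, c: Int)(value: Int): Unit =
    cells(r * cols + c) = value
}

object GridDemo {
  def main(args: Array[String]): Unit = {
    val g = new Grid(2, 3)
    g.update(1, 2)(42) // the shape g(1, 2) = 42 would take under the suggestion
    println(g(1, 2))   // 42
  }
}
```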

morgen-peschke · August 29, 2019, 12:40am

Separating the indices from the values could be done, if I understand things correctly, though it’d be a bit tedious.

class CurriedDynamic extends Dynamic {
  def updateDynamic: CurriedDynamic.DynamicApply = new CurriedDynamic.DynamicApply(this)
}
object CurriedDynamic {
  class DynamicApply(instance: CurriedDynamic) extends Curried {
    def applyBegin: DynamicApply = this
    def applyEnd: String = s"$instance.updateDynamic called"
    def applyNext(method: String): Dynamic0 = new Dynamic0(instance, method)
  }

  class Dynamic0(instance: CurriedDynamic, method: String) extends Curried {
    def applyEnd: String = s"$instance.$method() called"
    def applyNext(arg0: Int): Dynamic1 = new Dynamic1(instance, method, arg0)
  }

  class Dynamic1(instance: CurriedDynamic, method: String, arg0: Int) extends Curried {
    def applyEnd: String = s"$instance.$method($arg0) called"
    def applyNext(arg1: String): Dynamic2 = new Dynamic2(instance, method, arg0, arg1)
  }

  class Dynamic2(instance: CurriedDynamic, method: String, arg0: Int, arg1: String) extends Curried {
    def applyEnd: String = s"$instance.$method($arg0, $arg1) called"
  }
}

On the plus side, I believe this may limit what can and cannot compile:

val a = new CurriedDynamic()
a()          // "a.apply() called"
a.z()        // "a.z() called"
a.y(1)       // "a.y(1) called"
a.x(1, "Hi") // "a.x(1, \"Hi\") called" 

// I don't think these would compile:
a.f(1, 2)
a.f(1, "Hi", 2)

morgen-peschke · August 29, 2019, 1:05am

The last bit brought up an interesting thought: how would we expect something like this to behave?

object Calculate extends Curried {
  def applyBegin: Root = Root

  object Root extends Curried {
    def applyNext(name: String): Operation = name match {
      case "const" => new Operation.WaitingForConst
      case "sum" => new Operation.Sum(0)      
      case unrecognized => new Const(Left(s"Unrecognized operation: $unrecognized"))
    }
  }

  sealed trait Operation extends Curried
  object Operation {
    class WaitingForConst(val ignore: Boolean = true) extends AnyVal with Operation {
      def applyNext(value: Int): Const = new Const(Right(value))
    }
    class Sum(val accum: Int) extends AnyVal with Operation {
      def applyNext(next: Int): Sum = new Sum(accum + next)
      def applyNextSeq(next: Seq[Int]): Sum = new Sum(accum + next.sum)
      def applyEnd: Either[String, Int] = Right(accum)
    }
  }

  class Const(val value: Either[String,Int]) extends AnyVal with Curried {
    def applyEnd: Either[String, Int] = value
  }
}

While the intent of the code is for something like this to work:

Calculate("const", 4) == Right(4)
Calculate("sum", 1, 2, 3) == Right(6)
Calculate("pi") == Left("Unrecognized operation: pi")

It is unclear how this would actually happen, or what would happen in these cases:

Calculate("const")
Calculate("const", 1, 3)
Calculate("sqrt", 8)

Which leads me to believe that traits extending Curried should probably be disallowed.

Katrix · August 29, 2019, 2:14pm

I don’t think the problem in the above example is the trait part, but instead that you require that the translation be performed at runtime.

More specifically, Curried as I have understood it, like Dynamic, is structural in its behavior (you only implement the parts you want to support, instead of overriding methods from the Curried trait).

If so, then it becomes clear that none of the bad examples you provided would compile, as it can’t find the method to execute.

The trait Operation has no applyEnd or applyNext method, so the curried calls become compile time errors.

You could probably do something like you want above if we had Dotty’s inline defs.