I hope this request/discussion is not misplaced, but I wonder whether there has been any discussion about making the tail-call-optimization more robust in Scala 3? The case that I have run into is dealing with mutually recursive local functions. I can understand why compiling mutually recursive non-local function is not compatible with JVM, but a set of local mutually tail-recursive local functions can be compiled to labels and gotos, right?
Here is an example of some code I recently wrote which results in stack overflow. The programmer can avoid the stack overfly by hand-inlining the local functions, or ad-hoc trampolining, at the expense of human-readability.
I’m interested to hear what the developers think.
val digits = Set('0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '-')
val endOfLine = Set('\r', '\n')
val whiteSpace = Set(' ', '\t') ++ endOfLine
def parseListContent(chars: List[Char], clause:Clause, clauses:CNF): CNF = {
def skipSpace(): CNF = {
parseListContent(chars.dropWhile(whiteSpace.contains(_)), clause, clauses)
}
def skipToEOL(): CNF = {
parseListContent(chars.dropWhile(!endOfLine.contains(_)), clause, clauses)
}
def readNumber(): CNF = {
val (num, tail) = chars.span { c => digits.contains(c) }
val number = num.mkString.toInt
if (number == 0)
parseListContent(tail, Nil, clause.reverse :: clauses)
else
parseListContent(tail, number :: clause, clauses)
}
chars match {
case 'p' :: _ |
'c' :: _ => skipToEOL()
case c :: _ if digits.contains(c) => readNumber()
case c :: _ if whiteSpace.contains(c) => skipSpace()
case Nil if clause.nonEmpty => (clause.reverse :: clauses).reverse
case Nil => clauses.reverse
}
}
Just for the sake of completeness, here is an example of how the programmer can refactor the function to make it compatible with the tail-call-optimization of the scala compiler.
def parseListContent(chars: List[Char], clause:Clause, clauses:CNF): CNF = {
chars match {
case 'p' :: _ |
'c' :: _ => parseListContent(chars.dropWhile(!endOfLine.contains(_)), clause, clauses)
case c :: _ if digits.contains(c) => {
val (num, tail) = chars.span { c => digits.contains(c) }
val number = num.mkString.toInt
if (number == 0)
parseListContent(tail, Nil, clause.reverse :: clauses)
else
parseListContent(tail, number :: clause, clauses)
}
case c :: _ if whiteSpace.contains(c) => parseListContent(chars.dropWhile(whiteSpace.contains(_)), clause, clauses)
case Nil if clause.nonEmpty => (clause.reverse :: clauses).reverse
case Nil => clauses.reverse
}
}
Note also that this programmer-driven manual inlining strategy fails in the case that local functions need to call both themselves and the host functions such as shown below. Nevertheless, the local function nature of the code should still permit it to be tail-call optimized.
I didn’t realize that trampolining was built-in to the std library. But even if it weren’t included in the standard library, the programmer could implement an ad-hoc trampolining API so that rather than calling the local function f1, I return a closure which when called will call f1, but such an API would have to have a way to say "I’m finished, don’t re-call me, just return this value. So the API would be something akin to right-recall/left-return.
Thanks, I’ll have a closer look, but according to the documentation it does not do what I described above. I.e., it optimizes two methods in the same scope which call each other, but it does not (at least according to the documentation) optimize a local function which calls the host function. Although the documentation does say that PRs are welcome.
I don’t know how many people here follow cloJure, which is a lisp dialect on the JVM which allows apparently a very nice functional paradigm. The developer of cloJure didn’t implement tail call elimination, giving the reason at that time that the JVM does not support it. In the mean time the generation of developers who’ve used the language now believe that it should never be added. They have all sorts of reasons to justify it not being in the language.
They have a system of trampolining, which is a bit easier in a language which is not strictly typed. They claim that using trampolining makes tail call tail recursive functions more explicitly so, and that the JVM does an excellent job of optimizing their application level trampolining.