Add future with timeout to scala.concurrent?

justinhj · December 2, 2019, 9:18pm

Is there any interest in adding future-with-timeout functionality to scala.concurrent? I see this asked by beginners often, and the usual answer is to seek out the function in an external library such as Akka. I have a suggested implementation which uses java.util.Timer to manage the timeout on a background thread.

The function signature is:
def futureWithTimeout[T](future: Future[T], timeout: FiniteDuration)(implicit ec: ExecutionContext): Future[T]

It would probably live in the Future companion object. If the timeout occurs before the future completes, the user receives a Failure containing a java.util.concurrent.TimeoutException.

This does not include any cancellation.
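
For readers who cannot open the gist, here is a minimal sketch of what a function with this signature could look like. It is not the code from the gist, whose preview in this thread is truncated; the object name TimeoutSketch, the timer thread name, and the exception message are assumptions made for illustration.

```scala
import java.util.concurrent.TimeoutException
import java.util.{Timer, TimerTask}

import scala.concurrent.duration._
import scala.concurrent.{ExecutionContext, Future, Promise}

object TimeoutSketch {
  // Assumption: one shared daemon timer thread, so pending timeouts never keep the JVM alive.
  private val timer = new Timer("future-timeout", true)

  def futureWithTimeout[T](future: Future[T], timeout: FiniteDuration)(
      implicit ec: ExecutionContext): Future[T] = {
    val promise = Promise[T]()

    val task = new TimerTask {
      // Runs on the timer thread; does nothing if the future already completed the promise.
      def run(): Unit = {
        promise.tryFailure(new TimeoutException(s"Future timed out after $timeout"))
        ()
      }
    }
    timer.schedule(task, timeout.toMillis)

    future.onComplete { result =>
      task.cancel()               // the timeout is no longer needed
      promise.tryComplete(result) // no-op if the timeout fired first
    }

    // Note: the original `future` is never cancelled; only the returned future fails.
    promise.future
  }
}
```

With an ExecutionContext in scope, TimeoutSketch.futureWithTimeout(someFuture, 2.seconds) returns a future that either mirrors someFuture or fails with a TimeoutException, whichever happens first.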

Here’s a gist with the code:

https://gist.github.com/justinhj/e2eb081af1f8a341f957e2f8bc4e9686#file-futuretimeout-scala

FutureTimeout.scala

package justinhj.concurrency

import java.util.concurrent.TimeoutException
import java.util.{Timer, TimerTask}

import scala.concurrent.duration.FiniteDuration
import scala.concurrent.{ExecutionContext, Future, Promise}
import scala.language.postfixOps

object FutureUtil {

(This file has been truncated; see the gist for the full source.)

Cheers,
Justin

Sciss · December 2, 2019, 10:02pm

The problem is that you are not cancelling the future when the timeout is reported. That can lead to all sorts of ill behaviour. What you probably want is a future that can be aborted. Java’s Future has such a cancel method, by the way. I’m not sure what a good API around that would be for Scala, since it means the future’s body must check the cancellation state. I wrote something along these lines in a Future subclass called Processor, but I don’t consider it the best possible formulation or API; it’s a bit ugly.

justinhj · December 3, 2019, 12:21am

Yes, that’s certainly an issue. Since many beginners seem to want this facility, I wonder if it’s worth giving it to them in this simple form, where it is understood that the future is not cancelled. Of course, you need to weigh that against the potential footguns it provides.

As your (neat) example shows, the API and implementation become a bit more complicated when cancellation is provided.
J.

julienrf · December 3, 2019, 8:35am

I agree that this operation is currently missing from the standard library. I also wholeheartedly share the concerns about resource leaks. I’m afraid this would require a major redesign of Future, though:

In the meantime, I encourage you to try monix, cats-effect, or zio, which all provide tasks that can be cancelled after some timeout.

AMatveev · December 3, 2019, 9:50am

Actually, it is not as simple as it seems.
You cannot just call interrupt() in general.
Workers can use shared files and sockets in production. If someone calls interrupt() while a worker is reading from a shared file, the shared file will be closed, and that will crash the server.
This is one reason why we use custom features and thread pools to implement interruptible workers.

Jasper-M · December 3, 2019, 10:51am

In the same family of small footgun utilities is JavaScript’s setTimeout function.

viktorklang · December 3, 2019, 12:14pm

It’s important to distinguish between cancelling a task (i.e. not executing some logic at all, or aborting its execution) and providing a value after a certain amount of time. The two are orthogonal.

The “only” way to add timeout-functionality is to introduce some sort of clock, in which case several design constraints are introduced:

How accurate is this clock?
How high a resolution does this clock have?
How scalable is this clock?

In my experience there is no one-size-fits-all solution for these things, which is also why there is no default global Timer in the JDK.

The LARS (LightArrayRevolverScheduler) in Akka is a pretty decent tradeoff, which is typically why I recommend using that, but if users want very high accuracy or very high resolution, then it is not optimal.

Worth noting is that there have been some interesting research papers on timer implementations in the past couple of years, but they are unfortunately still on my to-read pile since I’ve been working on other things.

szegedi · December 3, 2019, 3:15pm

I have to admit that I like Scala’s design of futures much more than I like Java’s, and one of the big differentiators that I like is exactly that Scala futures aren’t cancellable by any client code receiving a Future object.

Scala very clearly separates the concerns into a responsibility for completing a future (with success or failure, one of which can admittedly be a timeout exception): that’d be Promise. On the other hand, you can safely hand out a Future object to callers because they can’t affect its completion. A future in Scala is simply a monad (yes, I just said that) for a value that might be available now or in the future (including the case of not being available ever.)

If you want a timeout on a future, it’s relatively easy to code that up: make a promise, try to complete it with that future, and separately schedule a timeout that, when it fires, also tries to complete the promise by failing it with a TimeoutException. Finally, return this promise’s future in place of the original future.

Of course, it won’t magically cancel the execution of whatever the task is (or tasks are!) that are working to fulfill the future (even in my above timeout-with-promises scenario now you have at least two tasks: the one working to fulfill the original future, and the timeout task).
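
To make that caveat concrete, here is a small sketch of the promise-based race in which the caller sees a timeout while the original work still runs to completion. The object name, the AtomicBoolean side effect, and the 50 ms and 300 ms durations are assumptions chosen for illustration; a ScheduledExecutorService stands in for whatever clock one prefers.

```scala
import java.util.concurrent.{Executors, TimeUnit, TimeoutException}
import java.util.concurrent.atomic.AtomicBoolean

import scala.concurrent.duration._
import scala.concurrent.{Await, ExecutionContext, Future, Promise}
import scala.util.Try

object TimeoutDoesNotCancel {
  /** Returns (callerSawTimeout, sideEffectStillHappened). */
  def demo(): (Boolean, Boolean) = {
    implicit val ec: ExecutionContext = ExecutionContext.global
    val scheduler = Executors.newSingleThreadScheduledExecutor()
    val sideEffectRan = new AtomicBoolean(false)

    // Task 1: the "real" work, which takes about 300 ms and then performs a side effect.
    val work = Future {
      Thread.sleep(300)
      sideEffectRan.set(true)
      "done"
    }

    // Both the work and the timeout try to complete the same promise; the first one wins.
    val promise = Promise[String]()
    work.onComplete(result => promise.tryComplete(result))

    // Task 2: the timeout, due after 50 ms.
    scheduler.schedule(new Runnable {
      def run(): Unit = {
        promise.tryFailure(new TimeoutException("timed out after 50 ms"))
        ()
      }
    }, 50L, TimeUnit.MILLISECONDS)

    val outcome = Try(Await.result(promise.future, 2.seconds))
    val sawTimeout = outcome.failed.toOption.exists(_.isInstanceOf[TimeoutException])

    // The caller has already been told "timeout", yet the work is still in flight.
    Await.ready(work, 2.seconds)
    scheduler.shutdown()
    (sawTimeout, sideEffectRan.get)
  }
}
```

Under these timings the caller observes the TimeoutException and the side effect happens anyway, which is exactly the resource-leak concern raised earlier in the thread.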

Cancellation is a completely different concern and if it’s necessary it should be something exposed as a separate API by the original producer of the future somehow independently of the future object itself, I think.

The main issue here is that not every Future has a single thread somewhere toiling to produce its result that you can easily cancel. Some Future objects represent the end result of transformations, chaining, and joins (e.g. Future.sequence, Future.foldLeft and friends) over multiple other asynchronous computations. What would “cancelling” the ultimate result future of all these async computations mean? It’d be a pretty big support burden to provide correct and intuitive cancellation semantics as well as ways to implement those semantics.

I think that in the long term, the current Scala design spares a lot of headache by not requiring the burden of a correctly implemented cancellation. I recognize there might be valid reasons to cancel execution of an async scheduled computation, I’m just saying that such functionality should probably exist somewhat orthogonal to the Scala Futures and might often be a bespoke implementation depending on what the computation itself looks like.

At the end of the day, async computation tasks are separate from objects representing their outcomes.

curoli · December 3, 2019, 4:13pm

Why would interrupting a thread close a file?

As far as I know, the JVM will close a file when the file descriptor is no longer reachable.

justinhj · December 3, 2019, 4:42pm

The java.util.Timer documentation states that it “does not offer real-time guarantees”, so we can’t pass any guarantee to the user about resolution or accuracy, but it does claim to scale to thousands of concurrently scheduled tasks under normal conditions.

I take your point, though, that these trade-offs will not work for everyone.

That’s a great solution, but often they are not using Akka and don’t want to bring in a large dependency. For reference, this problem was mentioned to me by a colleague who trains a lot of beginners in Scala, and how to time out a future is a very common first question once they start using some concurrency.

That is the implementation I am suggesting above. My question is, should we tell beginners that it is relatively easy to code, or just point them to the function they are looking for?

AMatveev · December 3, 2019, 5:43pm

I cannot speak about files specifically.
We caught a server crash caused by an interruption inside an Infinispan cache.
Infinispan uses FileChannel (Java Platform SE 7).
FileChannel implements InterruptibleChannel (Java Platform SE 7).
The InterruptibleChannel documentation says:
“If a thread is blocked in an I/O operation on an interruptible channel then another thread may invoke the blocked thread’s interrupt method. This will cause the channel to be closed”

Closing the Infinispan cache breaks the other workers completely.
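
The behaviour described here can be reproduced in a few lines. The sketch below is only an illustration and has nothing to do with Infinispan’s actual code: the object name and the temporary file are assumptions. It relies on the documented rule that a thread whose interrupt status is already set when it starts a blocking operation on an interruptible channel gets a ClosedByInterruptException, and the channel is closed for everyone.

```scala
import java.nio.ByteBuffer
import java.nio.channels.{ClosedByInterruptException, FileChannel}
import java.nio.file.{Files, StandardOpenOption}

object InterruptClosesChannel {
  /** Returns (sawClosedByInterruptException, channelIsStillOpen). */
  def demo(): (Boolean, Boolean) = {
    val path = Files.createTempFile("shared", ".bin")
    Files.write(path, Array[Byte](1, 2, 3))
    val channel = FileChannel.open(path, StandardOpenOption.READ)
    try {
      // Simulate "someone interrupted this worker": set the interrupt flag,
      // then perform an ordinary read on the (potentially shared) channel.
      Thread.currentThread().interrupt()
      val sawException =
        try {
          channel.read(ByteBuffer.allocate(3))
          false
        } catch {
          case _: ClosedByInterruptException => true
        }
      // The channel is now closed for every thread that shares it.
      (sawException, channel.isOpen)
    } finally {
      Thread.interrupted() // clear the flag so later code is unaffected
      channel.close()
      Files.deleteIfExists(path)
    }
  }
}
```

Any other worker still holding a reference to the same channel would fail with a ClosedChannelException on its next operation, which matches the crash scenario described above.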

curoli · December 3, 2019, 6:09pm

I see your point.

On the one side, they say FileChannel is “safe” to be used by multiple threads concurrently, but at the same time, interrupting one thread blocking on it will close the FileChannel, so it seems to me that sharing a FileChannel between threads that can be interrupted may not be so useful. Maybe the lesson is that FileChannels should not be shared between threads after all?

AMatveev · December 3, 2019, 6:33pm

It is a rhetorical question for me.
How can we guarantee that there is no InterruptibleChannel anywhere in the 250 megabytes of libraries that we need?
I am also unsure whether implementing Infinispan without a shared FileChannel is possible at all.
But I am sure that nobody would allow such changes to be merged into the main Infinispan project.

curoli · December 3, 2019, 7:13pm

You are right. There should have been a standard way to deal with this robustly, and it is a shame that there is not.

szegedi · December 3, 2019, 8:03pm

We can provide such a function, as long as users understand the difference between a future timing out versus a computation being cancelled.

morgen-peschke · December 3, 2019, 8:52pm

A straightforward way to do this may not be providing timeout directly, but rather something more along these lines:

Future.first(
  asyncFoo(),
  Future.delay(bar, 20.seconds)
)

This way it may be clearer that the value returned by the first Future to complete is taken, without implying that the longer-running one will be cancelled.
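
Neither Future.first nor Future.delay exists in the standard library, but the idea can be approximated today. In the rough sketch below, FirstOf, first, and delay are hypothetical names; Future.firstCompletedOf is the real standard-library method doing the racing, and a ScheduledExecutorService is assumed as the clock.

```scala
import java.util.concurrent.{Executors, ScheduledExecutorService, TimeUnit}

import scala.concurrent.duration._
import scala.concurrent.{Await, ExecutionContext, Future, Promise}
import scala.util.Try

object FirstOf {
  // Hypothetical helper: a future that completes with `value` once `after` has elapsed.
  def delay[T](value: => T, after: FiniteDuration)(
      implicit scheduler: ScheduledExecutorService): Future[T] = {
    val promise = Promise[T]()
    scheduler.schedule(new Runnable {
      def run(): Unit = {
        promise.complete(Try(value)) // an exception thrown by `value` fails the future
        ()
      }
    }, after.toNanos, TimeUnit.NANOSECONDS)
    promise.future
  }

  // Hypothetical helper: a varargs wrapper around the real Future.firstCompletedOf.
  // The futures that lose the race keep running; nothing is cancelled.
  def first[T](futures: Future[T]*)(implicit ec: ExecutionContext): Future[T] =
    Future.firstCompletedOf(futures)
}
```

With an ExecutionContext and a scheduler in implicit scope, FirstOf.first(asyncFoo(), FirstOf.delay(bar, 20.seconds)) mirrors the shape proposed above. Passing a failed value to the delayed branch turns it into a timeout; passing a default turns it into a fallback.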

viktorklang · December 4, 2019, 8:39am

Not only that, the j.u.Timer executes every TimerTask on the Timer thread, which means that each TimerTask delays the next one, so scaling to thousands of tasks is a highly theoretical limit.
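
A small experiment makes the serial execution visible. This is only a sketch with made-up names and durations: one task occupies the timer thread for 300 ms, and a second task that was due after 50 ms has to wait for it.

```scala
import java.util.concurrent.{CountDownLatch, TimeUnit}
import java.util.concurrent.atomic.{AtomicLong, AtomicReference}
import java.util.{Timer, TimerTask}

object TimerIsSerial {
  /** Returns (bothTasksRanOnSameThread, latenessOfSecondTaskInMillis). */
  def demo(): (Boolean, Long) = {
    val timer = new Timer("demo-timer", true) // daemon timer thread
    val done = new CountDownLatch(2)
    val firstThread = new AtomicReference[Thread]()
    val secondThread = new AtomicReference[Thread]()
    val secondStartedAt = new AtomicLong()
    val start = System.nanoTime()

    // Task 1 is due immediately and occupies the timer thread for 300 ms.
    timer.schedule(new TimerTask {
      def run(): Unit = {
        firstThread.set(Thread.currentThread())
        Thread.sleep(300)
        done.countDown()
      }
    }, 0L)

    // Task 2 is due after 50 ms but cannot start until task 1 returns.
    timer.schedule(new TimerTask {
      def run(): Unit = {
        secondStartedAt.set(System.nanoTime())
        secondThread.set(Thread.currentThread())
        done.countDown()
      }
    }, 50L)

    done.await(5, TimeUnit.SECONDS)
    timer.cancel()
    val latenessMs = (secondStartedAt.get - start) / 1000000L - 50L
    (firstThread.get eq secondThread.get, latenessMs)
  }
}
```

Both tasks run on the same thread, and the second one starts roughly 250 ms late. For timeouts this means the TimerTask should do as little as possible, for example only failing a promise.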

Agreed. In this case, perhaps the best thing would be to consider evolving ExecutionContext to have a delay method. But that also means having some default implementation of a clock in the stdlib.

julienrf · December 4, 2019, 11:49pm

Yes, that’s what I think too. Ideally, I’d like to remove things from Future, which seems to already handle too many concerns. I don’t think it should deal with ExecutionContexts (this should go to Task too, along with cancellation). I’d love to see Future be just a more ergonomic continuation type.

viktorklang · December 5, 2019, 11:58am

It’s highly unlikely that Future will be reduced in scope, especially not anything fundamental.

However, ever since the beginning of SIP-14 I’ve wanted to have a supertype to Future called something like Eventually[+T] which could be the type you’re looking for in the case of wanting to integrate a Task-like construct into the stdlib.

mdedetrich · December 5, 2019, 12:06pm

We just use akka.pattern.after everywhere. Since Akka invariably ends up in our dependency list through transitive dependencies, that hasn’t been a huge problem.

I agree that we should already have something equivalent in Future to do timeouts.