Updates to scala.concurrent.Future wrt breaking changes

Ah yes, I see what you mean (this is what I meant by aggressive change btw)

But it’s true that map isn’t special; however, it’s used a lot. In fact, because of this, frameworks like akka-http had to implement a fast-future variant of Future (with the main difference being that its map function doesn’t use an ExecutionContext).

I do wonder if it’s easy to separate it out cleanly; map is by far what’s used the most (i.e. it’s called in all for comprehensions).

I think for comprehensions call flatMap and map (and withFilter when there’s a guard). And at least some of the other methods are also called often, e.g. recover/recoverWith. It doesn’t make sense to optimize some and not others, if the optimization breaks source compatibility anyway.
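For readers following along, here is roughly what the compiler does with a for comprehension over Futures (a minimal sketch; the value names are made up):

```scala
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

val fa = Future(1)
val fb = Future(2)

// for { a <- fa; b <- fb } yield a + b  desugars into:
val sum: Future[Int] = fa.flatMap(a => fb.map(b => a + b))

// a guard such as `if a > 0` would additionally insert a withFilter call
```

So every extra generator costs a flatMap, and the final yield costs a map, each taking an ExecutionContext.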

Yes, true, although it makes more sense for flatMap to take an ExecutionContext, because this is where an ExecutionContext (by default) makes sense. If you want to merge the results of two Future values, this is where the main use case of ExecutionContext comes in (since these Future values are typically behind HTTP/database calls).

I think map (and maybe recover) are the most common. The point is that map is almost always just updating the plain value inside a Future, where using an ExecutionContext is redundant.
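To illustrate the point, here is a rough sketch of the fast-future idea as a standalone helper, not akka-http’s actual API (fastMap is a made-up name): when the future is already completed successfully, the function is applied synchronously instead of being scheduled on the ExecutionContext.

```scala
import scala.concurrent.{ExecutionContext, Future}
import scala.util.Success
import scala.util.control.NonFatal

// Hypothetical FastFuture-style map: skip the ExecutionContext hop when the
// value is already available.
def fastMap[A, B](fut: Future[A])(f: A => B)(implicit ec: ExecutionContext): Future[B] =
  fut.value match {
    case Some(Success(a)) =>
      try Future.successful(f(a))
      catch { case NonFatal(e) => Future.failed(e) }
    case _ =>
      fut.map(f) // not yet completed (or failed): fall back to the regular map
  }
```

The trade-off is exactly the determinism/fairness concern discussed later in this thread: f now runs on whichever thread happens to observe the completed value.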

Point taken, the idea is we should use the defaults that make sense

FYI, that’s not exactly true — since Scala 2.12 we have another abstract method — transformWith, which is basically flatMap with error handling.
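To illustrate, a hedged sketch of how map can be expressed in terms of transformWith (mapVia is a made-up name, not a stdlib method):

```scala
import scala.concurrent.{ExecutionContext, Future}
import scala.util.{Failure, Success}

// map expressed through transformWith, the abstract method added in 2.12.
// transformWith sees the full Try, so error handling comes for free.
def mapVia[A, B](fut: Future[A])(f: A => B)(implicit ec: ExecutionContext): Future[B] =
  fut.transformWith {
    case Success(a) => Future.successful(f(a))
    case Failure(e) => Future.failed(e)
  }
```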

Also, Future should not be considered an abstract data type that can be inherited, even though I like it the way it is, because it allows for alternative implementations. The problem is that you’re going to break flatMap’s default implementation, which discards intermediate references in long chains and needs to use internals to do that. So by overriding Future, unless you really know what you’re doing, you’re probably going to end up introducing memory leaks in those (tail-recursive) flatMap chains.

Scala’s Future is awesome when used in combination with Monix’s Task, the two being complementary.

A Task works like a lazy Future, being a Future generator if you will.

See my presentation from Scala Days 2017 — https://www.youtube.com/watch?v=bZO-c-yREJ4

What about detecting if the EC passed is in fact the same one the original Future is running on (assuming it was constructed by Future.apply or equivalent), and doing the optimization only then?


Theoretically speaking, the JVM should optimize this automatically (since implicit parameters get desugared to plain method parameters); however, being the JVM, it’s not always apparent when inlining happens and when it doesn’t. Things like method bodies being too large can suddenly trip the JVM into not optimizing something.

Alternatively, whole-program optimization (i.e. the Dotty deep linker, plus maybe also the new optimizer in 2.12.x) could theoretically optimize this away, but I am not really sure that’s the case.

There are also two separate issues here: one is optimization and the other is intent. Having an ExecutionContext in the method parameters implies that you need it (and use it) for the computation, but if your computation is just changing the value inside the Future, you don’t really use the ExecutionContext. In this case, providing the ExecutionContext doesn’t accurately reflect the intent of the typical usage of map.

In any case, if optimisation can elide the ExecutionContext away in this situation then it shouldn’t really be an issue; however, I am doubtful this is going to be reliable on platforms like the JVM.

Yeah, I am up for making Future final/abstract as detailed in the document, because:

  • It does improve performance significantly in a lot of cases
  • I haven’t seen anyone extending Future apart from addressing the performance problems which we are trying to solve in the first place.

There are, however, other issues with this, at least if we make Future final: we have CancelableFuture, which Monix uses (and which extends Future). Maybe it makes sense to put CancelableFuture into scala.concurrent (at least then we would have a stable, concrete version of it), but I suspect such a change may not be popular (especially considering that people want to move stuff out of the Scala stdlib).

Monix extends it for CancelableFuture which is a valid use-case.

I’m not necessarily for having a CancelableFuture in the stdlib, I’m quite happy to have it in Monix.

However I don’t see people moving away from the stdlib, nor should they. Scala is a hybrid language and it needs a Future for imperative code. And I quite like having that Future in the stdlib.

Yeah my point is, if we decide to make Future final for performance reasons (this is a valid reason) then we need to investigate all of the current reasons for extending Future in the current Scala ecosystem.

So far I see 2 legitimate cases

  • Stuff like akka-http’s FastFuture (deliberately implemented for performance reasons, i.e. it has a map which doesn’t require an ExecutionContext). Problems like this should be solved in the first place with performance improvements in Future
  • Stuff like CancelableFuture in Monix. On this note, one of the reasons why Twitter Future still exists is because the Scala Future can’t be cancelled (I think other reasons are also performance related)

If Future is ever made final, we basically need to make sure that we don’t kill the current legitimate cases for extending Future, which means that we may need to add stuff like CancelableFuture into scala.concurrent

I once needed to extend Future. I wanted to represent a set of events using Future values, and one of them was an event that would never happen, i.e. the future would never complete. This Future value became a resource leak from all the continuations linked to it, because there was generic code that operated on any event passed to it. So I extended Future to override onComplete to do nothing.

In a sense, a Future.never is the dual of the always-already-completed Future.unit.
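For the curious, a minimal sketch of such a never-completing Future might look like the following (the stdlib itself gained a built-in Future.never in 2.12, if I recall correctly; Never here is a made-up name):

```scala
import java.util.concurrent.TimeoutException
import scala.concurrent.{CanAwait, ExecutionContext, Future}
import scala.concurrent.duration.Duration
import scala.util.Try

// A Future that never completes; onComplete discards its callback, so no
// continuations can accumulate on it (the resource leak described above).
object Never extends Future[Nothing] {
  def onComplete[U](f: Try[Nothing] => U)(implicit ec: ExecutionContext): Unit = ()
  def isCompleted: Boolean = false
  def value: Option[Try[Nothing]] = None
  def transform[S](f: Try[Nothing] => Try[S])(implicit ec: ExecutionContext): Future[S] = this
  def transformWith[S](f: Try[Nothing] => Future[S])(implicit ec: ExecutionContext): Future[S] = this
  def ready(atMost: Duration)(implicit permit: CanAwait): this.type =
    throw new TimeoutException("this Future never completes")
  def result(atMost: Duration)(implicit permit: CanAwait): Nothing =
    throw new TimeoutException("this Future never completes")
}
```

Because Future[Nothing] conforms to Future[S] for any S, transform and transformWith can simply return the object itself.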

CancelableFuture from Monix already has this: monix/monix-execution/shared/src/main/scala/monix/execution/CancelableFuture.scala (at ec266e1a167cdf956e692725a2b2016e79a71141, in monix/monix on GitHub).

Yet another reason is building already completed values, also from Monix:

```scala
sealed trait Ack extends Future[Ack] { /* ... */ }

case object Continue extends Ack { /* ... */ }

case object Stop extends Ack { /* ... */ }
```

The nice thing about this setup is being able to return a straight Continue when a Future[Ack] is expected.
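A simplified, runnable sketch of this encoding (not Monix’s actual implementation; onNext is a made-up consumer method): each Ack is an already-completed Future whose value is itself.

```scala
import scala.concurrent.{CanAwait, ExecutionContext, Future}
import scala.concurrent.duration.Duration
import scala.util.{Success, Try}

// An Ack is a Future[Ack] that is already completed with itself, so it can
// be returned directly where a Future[Ack] is expected, with no allocation.
sealed trait Ack extends Future[Ack] { self =>
  private def asFuture: Future[Ack] = Future.successful(self)
  def isCompleted: Boolean = true
  def value: Option[Try[Ack]] = Some(Success(self))
  def onComplete[U](f: Try[Ack] => U)(implicit ec: ExecutionContext): Unit =
    asFuture.onComplete(f)
  def transform[S](f: Try[Ack] => Try[S])(implicit ec: ExecutionContext): Future[S] =
    asFuture.transform(f)
  def transformWith[S](f: Try[Ack] => Future[S])(implicit ec: ExecutionContext): Future[S] =
    asFuture.transformWith(f)
  def ready(atMost: Duration)(implicit permit: CanAwait): this.type = this
  def result(atMost: Duration)(implicit permit: CanAwait): Ack = self
}
case object Continue extends Ack
case object Stop extends Ack

// A consumer can answer synchronously with a plain object:
def onNext(elem: Int): Future[Ack] = if (elem >= 0) Continue else Stop
```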

Btw, I am not convinced that making Future a final class will improve performance.

Hi everyone,

Thanks for raising this conversation, Matthew.

Fortunately/disappointingly (depending how you want to view it) I’ve already implemented most, if not all, of the viable optimizations here, which also include JMH benches: https://github.com/viktorklang/scala-futures/tree/wip-optimizations-√

It’s still a work in progress, however I am rather confident that there’ll be some nice, non-breaking, performance improvements coming out of this.

Sorry for the long story below, it’s only relevant if you enjoy some Future/Promise backstory/rationale:

I, personally, have realized that it is very important that, before I suggest what I consider to be improvements, I understand what something is trying to achieve, as I can make something like Future blazing fast if I am willing to compromise on resource-safety, fairness, determinism, memory footprint, extensibility, compatibility, etc.

What Future/Promise has achieved—from my experience using it, and my interactions with users online and offline—is ubiquity. From what I can tell, it is used by practically every Scala developer out there, which is rather cool, but it also means that it must change only in very responsible ways.

For the casual reader of this thread, you may not know why the following things are as they are, so I thought I’d take the time to outline, from memory, what is intended to be achieved by the following decisions:

ExecutionContext: having the piece of code which wants to compute things specify where that computation runs leads to: determinism (no longer racing between completer and invoker), resource-safety (added logic cannot poison the pools which produce the values), fairness (which is up to the ExecutionContext implementation to deliver), extensibility (it’s easy to integrate with most execution engines / thread pools), and compatibility (it has very few methods, so it is easy to keep compatible).

Future/Promise: By having a separation between read-capability and write-capability, it is much easier to reason about what code wants to be able to do, and what code does.

Absence of cancellation: This was very consciously decided, if Future can be cancelled it is no longer read-only, which means that any reader can mess up the other readers’ reads if their Future is shared. This leads to tons of defensive copying, and worse, it is no longer clear in the code what will happen, or which defensive copies are actually needed.
Also, semantically, a Future is a placeholder for a value that might not yet exist, and as such, it doesn’t really make sense to make it cancellable—a Task is something which could be cancelled, or perhaps something like a SubmittedTask, anyway, I digress.

I guess what I’m trying to say is, I think Future will be possible to improve, performance wise, in some cases by quite a lot, and in some cases perhaps rather modestly. All of this without breaking source compatibility. (And I’d be extremely cautious to introduce user-breaking changes, just because it is so widely used.)

Also, I’d think a Task-like abstraction/construct in the stdlib would be a good thing, to provide for that nice bridge between a lazy and a strict construct.

Cheers,


Allowing map to use a Future’s ExecutionContext violates encapsulation. An object might be performing computations using its own, private ExecutionContext, and exposing the results of those computations in Futures. If map can/will use that private ExecutionContext, then it exposes what was meant to be private, and may violate invariants of the ExecutionContext (for example, it may be a java.util.concurrent.Executors.newSingleThreadExecutor(), and allowing outside computations could slow down computations the object is doing).
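A small sketch of the scenario being described (PriceService and latestPrice are made-up names): with today’s API the object runs its internal work on its private executor, while callers’ map continuations run on whatever ExecutionContext the caller supplies, so the private pool is never exposed.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{ExecutionContext, Future}

class PriceService {
  private val pool = Executors.newSingleThreadExecutor()
  // Private, single-threaded EC: an invariant of this object.
  private implicit val internal: ExecutionContext =
    ExecutionContext.fromExecutorService(pool)

  // Internal work runs on the private pool; callers only see the Future.
  def latestPrice(): Future[BigDecimal] =
    Future(BigDecimal(42))

  def shutdown(): Unit = pool.shutdown()
}
```

If map silently reused the Future’s originating ExecutionContext, every caller’s transformation would land on this single thread, competing with the object’s own work.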


Hi @viktorklang,

I’m very happy to hear that Future is receiving a performance upgrade.

Future is indeed ubiquitous and we really should give more credit to its design, a very good design if you take a look at what other platforms have, my favorite pastime of late being to complain about JS’s Promise.

On cancellation, your reasoning is good, which is why I wouldn’t include a CancelableFuture by default in the standard library: in the case of Monix, the CancelableFuture was built for usage with Task or with Observable. In that context you get cancelable future refs at the “edges of the FP program”, when you call runAsync (our equivalent of Haskell’s unsafePerformIO), which ideally happens only once in main (or as few times as possible) within the program.

I do think that cancellability leads to more safety due to having the ability to close resources and avoid leaks on race conditions, which are inevitable with concurrency — the simplest example is being able to describe a timeout operation that does not leak. But cancelling a future does indeed mean that the developer has to encapsulate the usage of that Future and not share it with other consumers without due diligence. So point well taken.

Personally I’m against including a Task-like abstraction in the stdlib. This is because including such an abstraction in the stdlib leads to stagnation and discourages the use of any alternatives.

For Future I’m pretty sure that you had plenty of prior work to analyze and decide on the implementation we have now. For Task, however, we are still deciding on what the best encoding is, while having a continuous debate about responsibilities / boundaries, i.e. should it take care of concurrency concerns or not, which really boils down to whether it should be cancelable or not and whether it should have a notion of some underlying “execution context” or not, and there are very good arguments on each side of the fence.

So as long as the dust doesn’t settle on what the best approach is, including it in the standard library would do more harm than good. Plus such an abstraction really works best within the context of an FP library like Cats or Scalaz. So if we don’t get the Monad type class to go along with it, which would open another can of worms, maybe it’s best to leave it out for now.


On this note, I don’t think that Future had that much prior work behind it (at least not more than where Monix Task is now). In fact, I would argue that Monix Task is probably more developed today than Future was when it was included in the Scala stdlib (as evidenced by a lot of the low-hanging-fruit performance improvements that haven’t been done yet).

I also think it’s quite unquestionable that having a ubiquitous async type (Future) has overall been a tremendous benefit to Scala. Even if the implementation is not perfect, its inclusion has bolstered a massive ecosystem and has also greatly helped Scala.js interop with JavaScript (i.e. Future maps very nicely onto the JavaScript ecosystem, and we can use the same async type on both platforms with almost identical behaviour).

I guess I am of the unpopular opinion around here, but I think there should be a standard lazy async type (a lazy version of Future, i.e. Task) in the standard library. In fact, not having this is what has caused so much chaos and confusion in the Scala ecosystem, particularly for people who want a purely functional solution. There have been many competing types of Task, and entire libraries have had to be redesigned because of this flux (e.g. see Http4s going from scalaz Task to FS2). While it’s true that competition is healthy because we eventually find a good solution, I think we have already passed the point where the solutions are good enough.

I digress, however, because there is a move (with good reason) to put stuff out of the stdlib and into the SP, and Future may end up following suit (in any case my original argument would still apply: a lazy Task would reside in the same place Future ends up). There are, however, technical issues which have been stated, i.e. the Scala stdlib not having typeclasses for many of the purely functional types which the current Task relies on (it now uses Cats for this functionality). Furthermore, the scope would have to be agreed upon; one of the reasons I believe Future was so successful is that it was quite limited in what it did. It didn’t handle streaming, nor did it have interfaces for things like Observable. It was strictly just representing an async computation, with an interface for fairness/execution context and safety regarding stacktraces. We then have other libraries building on top of this (akka-stream/akka-http and Monix Task), which is actually the situation we want.

If a lazy type like Task were included in the SP/stdlib, I think it would have to have a similarly restricted scope.

@mdedetrich you gave the example of FS2 which is undergoing a painful transition to Cats.

It’s not necessarily a good example for the necessity of having a Task in the standard library, because that transition to Cats would have happened anyway and it would have been painful anyway, because the availability of a Task isn’t the only problem that a library like FS2 has.

I think we have a problem of causality — if we have both Cats and Scalaz in the ecosystem with different implementations and community management, that’s because we cannot agree on how to best do things.

By including something in the standard library you’re basically forcing users to agree on one true solution. Sometimes that’s valuable, however that’s also the Microsoft-mentality of blessing implementations and of discouraging community-provided solutions.

And while that might be desirable, if we take a close look at the landscape, we can notice that the metaphorical Bazaar has been winning for quite some time against the Cathedral — think Java / JVM versus C# / .NET, Clojure versus Scala, JavaScript vs everything else.

Yes, Future is a success. The standard collections are a success. But this sampling is biased, because if we take a look at things like JSON / XML parsing, the landscape is filled with the corpses of failed implementations in standard libraries that nobody wants but everyone has to endure due to legacy code.

Speaking of which, I don’t like the idea of the Scala Platform process that much. It is basically about making the standard library more modular and expanding it with what are considered to be the needed batteries, which is good for adoption, but an SP project is stdlib nonetheless, with all the downsides that brings.

Personally I don’t believe in batteries included and pretty much ended up hating it in every language that was advertised as having batteries included, like Python or Java.

A note. The Scala Platform process is compatible with the Bazaar idea you mention. The Scala Platform provides libraries that want to help all Scala developers. There can very well be a Typelevel Platform, or a Scalaz Platform, or Your Organization’s Platform if you want, and they can complement themselves.

The Scala Platform modules are community-driven and can be adapted and evolved over time, unlike the official Scala Standard library (or soon to be known as Scala Core in 2.13). In fact, no developer employed by Lightbend or EPFL provides support for the modules of the Scala Platform. We want to create solid ecosystems around the libraries that join the Platform, and also encourage companies to fund/sponsor them.

I don’t want to gear this discussion to another matter, but a clarification here is important. If you want to continue discussing this, please consider opening a new thread. :smile:

Some issues were technical, but a lot were also political/NIH syndrome. There aren’t, for example, really significant differences between fs2 Task and scalaz7 Task. In any case I don’t really think it’s constructive to go over the whole cats/scalaz debacle.

It has also failed miserably in a lot of other environments; i.e. have a look at Haskell (three different string types, all of which library authors have to target because there is no single blessed string type) or at Scala having ~7 different types of JSON ASTs (i.e. have a look at slick-pg, GitHub: tminglei/slick-pg, and how many JSON libraries it needs to target).

In Scala it’s actually an even worse problem, because we do dependency management via binaries, and because of the tree your dependencies form, it’s really easy to get into dependency hell due to binary compatibility.

These really aren’t good examples.

The JSON library was included as a demo to test parser combinators, it wasn’t really included as a goal of being a “good JSON library”.

WRT XML, I am not sure if you are talking about XML literals or the scala-xml library. If you are talking about the latter: considering how complicated XML is, it actually was quite a good library, especially for its time.

In any case it comes down to how critical the functionality is, and for async primitives (at least in modern programming, where they are very common) I believe there should be some basic implementation that is good for the general use case.

Note that, as @jvican has said, this doesn’t prevent community contributions, nor does it mean that the SPP doesn’t support a bazaar approach. In any case this is getting a bit off topic and should be discussed in another thread.