Spark as a Scala gateway drug and the 2.12 failure


Looking at this on a surface level (note that I don’t really use Spark that much), I can think of some recommendations for a postmortem/retrospective:

  • It sounds like there is an argument for putting the closure cleaner (or a part of it) inside the Scala stdlib somehow (or maybe the compiler, not sure which abstraction level works best). This may put more work onto the Scala compiler team, but it turns out that implementing the closure cleaner properly required effort from the Scala compiler team anyway. If the closure cleaner is part of the official Scala release, it will at least always be available when a new Scala version is released, and it appears that this isn’t as Spark-specific as we think it is (e.g. look at Flink).
  • Alternatively, it may be a good time to revisit Spores, which were deliberately designed to solve this problem. I know there were some technical reasons why they couldn’t be completely finished, but there might be an argument for getting Spores over the hurdle so they can actually be used (or maybe even become part of Scala itself). As far as I understand, if you use Spores instead of Scala’s closures, you don’t really need a closure cleaner at all, since with Spores you have to explicitly declare which variables get captured by a closure.
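To make the capture problem concrete, here is a minimal sketch in plain Scala. It uses neither Spark’s closure cleaner nor the Spores API. The names `Multiplier`, `capturingThis`, `capturingLocal` and `ClosureCaptureDemo` are hypothetical and only for illustration. It assumes Scala 2.12 or later, where function literals are serializable, so the only thing that can break serialization is what the closure captures.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical helper class that is deliberately NOT Serializable.
class Multiplier(val factor: Int) {

  // The lambda reads the field `factor`, so it captures `this`,
  // i.e. the whole non-serializable Multiplier instance.
  def capturingThis: Int => Int = (x: Int) => x * factor

  // Copying the field into a local val first means the lambda captures
  // only an Int. This is the explicit-capture discipline that Spores
  // are meant to enforce, done by hand here.
  def capturingLocal: Int => Int = {
    val localFactor = factor
    (x: Int) => x * localFactor
  }
}

object ClosureCaptureDemo {
  // Returns true if Java serialization accepts the value,
  // false if something reachable from it is not serializable.
  def isSerializable(value: AnyRef): Boolean =
    try {
      val out = new ObjectOutputStream(new ByteArrayOutputStream())
      out.writeObject(value)
      out.close()
      true
    } catch {
      case _: NotSerializableException => false
    }
}
```

Calling `ClosureCaptureDemo.isSerializable` on the two closures shows the difference: the first drags in its enclosing object, the second does not. A closure cleaner tries to repair cases like the first one after the fact, whereas explicit capture avoids them up front.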

As you said, and it’s important to reiterate, there were very real reasons for this delay. It was an incredibly technical problem, so much so that it required Scala compiler engineers to get it over the fence, so I don’t think we should over-dramatize what happened.


Will the closure encoding change dramatically after Scala 2.12? The biggest change in Scala 2.12 was leveraging and integrating with the lambda support in Java 8, i.e. the closure encoding was rewritten from scratch. In Scala 2.13 the biggest change is another redesign of the collections.

IMO we should just wait and see how long it takes to adapt Spark’s closure cleaner to Scala 2.13. If it takes several months, then integrating the closure cleaner with core Scala would be warranted. Otherwise the huge delay in supporting Scala 2.12 could be treated as a rare event.


I don’t think there’s anything that’ll need to be changed in the closure cleaner for 2.13.