Filtering/blocking scope for an inline (possibly even anonymous) function def

chaotic3quilibrium · March 17, 2019, 4:08pm

I picked up a Scala “want” from this video: https://www.youtube.com/watch?v=QM1iUe6IofM

In it, he talks about being able to write an in-place function which is blind to (blocked from) all the scope context surrounding it, only allowing the function to operate on whatever is passed within its parameter list(s). Apparently, this doesn’t exist in any current language, which I found fairly amazing.

My own context for wanting this: I ran into a pernicious bug in my own code where, due to the nature of the self-relation, I needed to selectively swap this and that within a method to vastly reduce the search space, mapping them to left and right. When refactoring for the change, I missed a single implied preceding “this” on a method call (where it now needed to be “left.”). Not knowing about the miss, I was left with a puzzling and unexpected combinatorial explosion in the results. After many hours of head scratching, I finally lifted the entire method out into a completely new, empty “object”, and the offending method call immediately popped out as unresolved.

So a function definition that cancels all scope assumptions and relies only on its passed parameters would have easily caught this. Such a function definition also has the refactoring bonus of encouraging a “referentially pure function”, which might grow into something larger and more useful as the context in which it was written continues to expand.
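To make the failure mode concrete, here is a minimal sketch of that kind of slip. The `Span` type and its methods are invented for illustration and are not the code from the original bug; the sketch only assumes that an unqualified member call resolves to `this`.

```scala
object ScopeLeakSketch {
  // Hypothetical type, invented purely for illustration.
  final case class Span(start: Int, end: Int) {
    def length: Int = end - start

    // Intended to use only `left` and `right`. The bare `length` below
    // silently resolves to `this.length`, so the compiler accepts it.
    def gapBuggy(that: Span, swap: Boolean): Int = {
      val (left, right) = if (swap) (that, this) else (this, that)
      right.start - left.start - length // should have been left.length
    }
  }

  // Lifted into a separate object there is no enclosing Span instance,
  // so a bare `length` here would be rejected as unresolved.
  object Lifted {
    def gap(left: Span, right: Span): Int =
      right.start - left.start - left.length
  }
}
```

With `swap = true` the buggy version quietly uses the wrong span's length, while the lifted version can only see what it was passed.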

Given the complexity of all the many things being “injected” into a function in Scala, having something that reverses the assumptions would be very valuable to me. Additionally, there could be a refactoring where this kind of function is produced from an existing inline def with a “force all parameters explicitly” action, yielding a referentially pure function where everything it depends on can be seen in the parameter list. It would be a nice mechanism for those of us who do maintenance after other advanced coders to see what kinds of dependencies we are working with. This is like the “implicit” problem, writ large.

nafg · March 17, 2019, 8:14pm

Spores?

chaotic3quilibrium · March 17, 2019, 10:45pm

What does “spores” mean? Reference link?

nafg · March 18, 2019, 12:08am

Sorry, was on mobile.

It’s a project that seems to have stagnated and been restarted a few times. The most recently updated site seems to be http://scalacenter.github.io/spores/spores.html which says that it’s not available for Scala 2.12.

However it sounds a lot like what you described.

oscar · March 18, 2019, 1:58am

It would be really great to see spores be a supported part of the language in Dotty/Scala 3. I still think spores can be useful for distributed systems applications (Spark, Scalding, Flink), but also for cases where you want to be sure you don’t accidentally pin some large objects in memory in a service, harming GC.

With the new generic tuples, you could imagine giving a path type to the closure:

trait Spore[-A, +B] extends Function1[A, B] {
  type Closure <: HList
}

Consider:

val x: Int = ...
val greeting: String = ...
spore {
  val xc = x
  val gc = greeting
  { num: Int => s"$gc: $xc + $num == ${xc + num}" }
}

If that had type Spore[Int, String] { type Closure = Int :: String :: HNil } and the Spore companion object had some way to help you write serializers:

object Spore {
  // The serializer parameter is named `ser` so that it does not clash with the spore `s`.
  def serialize(s: Spore[Int, String])(ser: Serializer[s.Closure]): Array[Byte] = ...
  def deserialize[A, B, C](cls: Class[_ <: Spore[A, B] { type Closure = C }], ser: Serializer[C], bytes: Array[Byte]): Try[Spore[A, B]] = ...
}

Of course the Class there is ugly and can perhaps be improved, but the idea would be some kind of handle of what code the spore refers to so the runtime can manufacture a spore instance from that handle plus a C value.

I don’t think you can do this in a library, at least not in any way that I have seen. The way this is handled in Spark/Scalding/Flink/etc. is to use Java serialization or Kryo serialization on the function instances. It very often works, but when it fails it is frustrating. If we had this built in, at the first sign of trouble you would have a clear solution: use a spore for the fn in question.
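For comparison, this is roughly what making the captured environment explicit looks like by hand today. It is only a sketch: `Greeter` is a hypothetical name, it is not the spores API, and it offers no typed `Closure` member or serializer support. It only shows the function's state being limited to what was passed in.

```scala
object ExplicitCaptureSketch {
  // Hypothetical stand-in for a spore: the only state an instance carries
  // is its constructor parameters (the captured copies `gc` and `xc`).
  final case class Greeter(gc: String, xc: Int) extends (Int => String) {
    def apply(num: Int): String = s"$gc: $xc + $num == ${xc + num}"
  }

  val x: Int = 1
  val greeting: String = "sum"

  // The capture happens at this call site and is visible in the argument list.
  val fn: Int => String = Greeter(greeting, x)
}
```

Because case classes are `Serializable`, Java serialization of `fn` would only need to write those two fields, but every such function has to be written out as a named class.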

chaotic3quilibrium · March 20, 2019, 7:30pm

While I find Spores very interesting, they are much more complex than what I was requesting.

I’m looking for a simple way to write a simple function which has no more context than the function’s parameter list(s). IOW, I want it to explicitly NOT include any surrounding context or scope. IOW, the explicit purpose is to eliminate all possible external “injected” context.

case class A(b: Int) {
  val c: Int = b * 2

  def aMethod(x: Int): Int =
    x + 7

  opaque def pf(b: Int = c): Int = {
    // none of this.b, c, this.c, this.aMethod, or aMethod() is available here
    b * 2
  }
}

My goal is to have an oasis of simplicity amongst the stew of complexity.

jducoeur · March 20, 2019, 7:44pm

I’m not understanding the motivation. A function like that normally gets put in the companion object, which seems to have the effects you’re looking for. Why is it a priority to put it inside the class instead?
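A minimal sketch of what I mean, reusing the `A` example from the previous post. The forwarding method on the class is my own addition for illustration.

```scala
object CompanionSketch {
  final case class A(b: Int) {
    val c: Int = b * 2

    def aMethod(x: Int): Int = x + 7

    // Forwards to the companion, passing the dependency explicitly.
    def pf(b: Int = c): Int = A.pf(b)
  }

  object A {
    // No enclosing instance here: a bare `c` or `aMethod(1)` would not compile.
    // Other members of this object and of enclosing scopes remain visible.
    def pf(b: Int): Int = b * 2
  }
}
```

Note that `A(3)` keeps working: the compiler still synthesizes `apply` for a case class whose companion is written out by hand.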

chaotic3quilibrium · March 21, 2019, 10:44pm

Even in a companion object, a function still has things being “made available” breaking the clean “interface” between the function definition and the function implementation. And pushing a nested function into a companion object is a fairly heavy operation requiring the companion object be defined. For a case class, that is a pain as it knocks out the implied companion object requiring all the “automatically provided implementation” be explicitly implemented.

The idea is to be able to “clear a space locally” from all of the surrounding explicit and/or implied context. It not only simplifies reasoning about the contained implementation code, it would also prevent that code from becoming accidentally “contaminated” by up-scope refactorings. While it isn’t a frequent need, having a way to clearly define the interface (function definition) and the implementation (function body) with strong boundaries would definitely be valuable to me in those cases where I need it (as I described in my original post, it would have been a lifesaver, and putting it in the companion object exposed implementation details I didn’t want surfaced at the object level).

nafg · March 22, 2019, 2:24am

The best way to do that in “plain old Scala” perhaps is to use a top level class. You can think of the constructor parameters as being analogous to function parameters. So it’s very clear what a method in it depends on. In the future the plan is to be able to define top level functions. Those would not have much to close over.
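Something along these lines, as a rough sketch. `PfImpl` is a made-up name, and the `A` class is borrowed from the example earlier in the thread.

```scala
// Hypothetical top-level class: its methods can depend on the constructor
// parameters, but there is no enclosing instance to reach into.
final class PfImpl(b: Int) {
  def result: Int = b * 2
}

object TopLevelClassSketch {
  final case class A(b: Int) {
    val c: Int = b * 2

    // Every dependency of the computation is handed over at construction time.
    def pf(b: Int = c): Int = new PfImpl(b).result
  }
}
```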

ashwinbhaskar · March 23, 2019, 2:04pm

I am in support of this feature. It saves me the headache of coming up with an appropriate name for the function and guarantees that there are no side effects. This is the syntax described in the video:

a = use x, y {
  // returns from the use block
  return 3
}

chaotic3quilibrium · March 23, 2019, 11:54pm

While what you are saying works, you do realize there are multiple steps of overhead to do this, just to create a function which has minimal injected external context.

The first is that I must now define a class and instantiate it just to create the context for defining the function to call. It also means that I have to carry around the reference, or reinstantiate the instance every time I’m in the context which needs to call the function. That would be unacceptable overhead if this function happens to be in a multi-nested inner loop.

The second is that now I am dealing with exposing an internal implementation over a cross-class boundary. There are performance and security concerns. While they may appear to be negligible, they remain part of the vulnerability surface.

Next, I now have to come up with TWO names; one for the function and one for the temporary holding class. That’s exactly two more names than I want to be required to come up with while in the flow of working.

Finally, this solution seems like moving back towards Java in that it is boilerplate.

It would be vastly more effective to just allow me to specify a donut hole of “you must be explicit about all dependencies of the implementation in the function’s parameters’ interface”.