Proposal for programmatic structural types

sjrd · February 8, 2019, 4:15pm

Hi Scala Community!

This thread is the SIP Committee’s request for comments on a proposal to change how Structural Types work in the language. You can find all the details here.

Summary

Scala already supports structural types, which look like the following:

type Foo = {
  val x: Int
  def m(a: Int, b: Int): Int
}

and can be used as follows:

class Bar {
  val x: Int = 5
  def m(a: Int, b: Int): Int = a + b
}
val foo: Foo = new Bar()
println(foo.x)
println(foo.m(4, 5))

Behind the scenes, the implementation of the actual field accesses and method calls is hard-coded in the compiler, using platform-dependent techniques (reflection on the JVM, special linker features on Native and JS).

This proposal extends the above structural types mechanism so that the implementation can be programmatically defined in user-space.

The standard library defines a trait Selectable in the package scala, defined as follows:

trait Selectable extends Any {
  def selectDynamic(name: String): Any
  // Default implementation: method calls are unsupported unless overridden.
  def applyDynamic(name: String, paramClasses: ClassTag[_]*)(args: Any*): Any =
    throw new UnsupportedOperationException("applyDynamic")
}

An implementation of Selectable that relies on Java reflection is available in the standard library: scala.reflect.Selectable, and will be special-cased by the compilers for JS and Native to support existing uses of structural types from Scala 2.

selectDynamic takes a field name and returns the value associated with that name in the Selectable. Similarly, applyDynamic takes a method name, ClassTags representing its parameter types, and the arguments to pass to the method. It returns the result of calling that method with the given arguments.

Given a value v of type C { Rs }, where C is a class reference and Rs are refinement declarations, and given v.a of type U, we consider three distinct cases:

If U is a value type (i.e., it’s a val or a def without ()), we map v.a to the equivalent of:

v.a
   --->
(v: Selectable).selectDynamic("a").asInstanceOf[U]

If U is a method type (T11, ..., T1n)...(TN1, ..., TNn) => R and it is not a dependent method type, we map v.a(a11, ..., a1n)...(aN1, ..., aNn) to the equivalent of:

v.a(a11, ..., a1n)...(aN1, ..., aNn)
   --->
(v: Selectable).applyDynamic("a", CT11, ..., CT1n, ..., CTN1, ..., CTNn)
                            (a11, ..., a1n, ..., aN1, ..., aNn)
               .asInstanceOf[R]

If U is neither a value type nor a method type, or if it is a dependent method type, an error is emitted.

We make sure that v conforms to type Selectable with (v: Selectable), potentially introducing an implicit conversion, and then call either selectDynamic or applyDynamic, passing the name of the member to access, along with the class tags of the formal parameters and the arguments in the case of a method call. These parameters could be used to disambiguate one of several overload variants in the future, but overloads are not supported in structural types at the moment.
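To make the field-access case concrete, here is a minimal sketch, assuming a Scala 3 (Dotty) compiler that implements this proposal. The Record class, the Person alias, and the helper names are hypothetical illustrations (Record is modeled on the example in the linked Dotty docs), not part of the proposed standard library. Only selectDynamic is shown; the applyDynamic case is left out.

```scala
// Hypothetical user-space Selectable backed by a Map (illustration only).
class Record(elems: (String, Any)*) extends Selectable {
  private val fields: Map[String, Any] = elems.toMap
  // Invoked by the compiler-generated code for each structural field access.
  def selectDynamic(name: String): Any = fields(name)
}

// A structural refinement C { Rs } with C = Record and Rs = { name, age }.
type Person = Record { val name: String; val age: Int }

// The cast is unchecked: nothing verifies that the Map really holds these keys.
def emma: Person =
  new Record("name" -> "Emma", "age" -> 42).asInstanceOf[Person]

// What the programmer writes ...
def sugared(p: Person): String = p.name

// ... and roughly what it is mapped to. The proposal spells the receiver as
// (p: Selectable); here p already conforms to Selectable through Record.
def desugared(p: Person): String =
  p.selectDynamic("name").asInstanceOf[String]
```

Both sugared(emma) and desugared(emma) go through the same selectDynamic call, so any cost or failure mode of the lookup (for example, a missing key) belongs to the user-defined Record rather than to the compiler.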

Limitations

var fields are not supported.
Dependent methods cannot be called via structural call.
Overloaded methods cannot be called via structural call.
Refinements do not handle polymorphic methods.

Implications

Overall, this proposal is almost a strict extension of the existing cases of structural types, so little code should be impacted. However, as it stands, this proposal drops support for var fields and overloaded methods, compared to Scala 2.

Opening this Proposal for discussion by the community to get a wider perspective and use cases.

oscar · February 8, 2019, 4:38pm

As you mention, currently on the JVM structural types are implemented with reflection. I’ve always thought it should be possible to use InvokeDynamic to implement them and actually give good performance, but I haven’t looked at it super carefully.

I wonder can you comment if there is anything in the new proposal that would hamper that, or if I am mistaken that InvokeDynamic could be used to improve performance of scala structural types.

smarter · February 8, 2019, 5:23pm

This is already the case in 2.12, see “Use invokedynamic for structural calls, symbol literals, lambda ser.” by retronym (Pull Request #4896, scala/scala on GitHub). Dotty doesn’t do this yet, but I don’t think there’s anything fundamental that’ll prevent us from implementing it.

Krever · February 8, 2019, 6:22pm

As this is brought for discussion, I would like to ask two questions:

Does this impact type-checking in any way?
Were record types considered in designing this change? I tried to summarize my understanding of the current state of record types in scala in this post but got no response. AFAIU, structural types and record types are pretty close concepts. Especially I’m interested in how they play together with intersection types. Would the following compile in dotty?

type Foo = {
  val foo: Int
}
type Bar = {
  val bar: String
}
type FooBar = Bar & Foo

case class MyFooBar(foo: Int, bar: String, baz: Long)

val x: FooBar = MyFooBar(1, "a", 2L)
x.foo
x.bar

I expect this may not be in scope of the proposal, but it also feels like a good place to get some answers.

If the above does (or will) work, what would be the current and final performance impact? Could we expect this typescript-like style to be zero-cost?

Ichoran · February 8, 2019, 8:50pm

I don’t like the proposal at all, at least not without a really awesome explanation of why this is a critical feature.

The biggest drawback of structural types has been that they have historically been an unexpected performance landmine. Throwing the implementation back to the user allows for even bigger performance landmines as you get a custom implementation that works well for one thing and clobbers everything else (that you probably didn’t realize was there).

I think there’s cause to remove structural types entirely; if you need the functionality, then instead of silently converting you could explicitly bar.asStructure[Foo]. (This is probably best done with an automatic proxy mechanism, not reflection.)

I also think there’s cause to come up with the fastest general-purpose implementation of structural types, and bake those into the compiler to ensure best-in-class performance.

But the current proposal seems to have only downsides. It doesn’t even work, apparently, to allow customization as it has to be “special-cased by the compilers for JS and Native to support existing uses of structural types from Scala 2”. If the initial number of use-cases for a new general mechanism is exactly 1 (the old specific mechanism, for one of three specific cases), that is a really serious blow against its value.

Again, maybe there is some awesome motivation here that hasn’t been made explicit, but so far I am not a fan.

jducoeur · February 8, 2019, 9:36pm

IMO, this post kind of buries the lede.

For me, the fact that this mechanism is going to underlie old-style structural types is a minor detail – I never used them anyway. I appreciate that what was compiler magic is going to become less magical, but it’s not something I was desperately worried about.

The exciting part to me is the potential as shown in the linked Dotty docs for being able to represent dynamic data with natural Scala syntax. This looks like a really clean and elegant way of dealing with the omnipresent problem of interacting with DB fields and suchlike. People have been hacking ways to do this for ages; having a clear and official approach is a big win, IMO.

I can see where @Ichoran is coming from here, and I agree that from a performance POV this sort of thing is dangerous. That said, structural typing has always been a bad idea for performance-critical code. But for the sort of application-level / business-logic POV I tend to mostly work in, where performance is a relatively minor consideration, it’s great – flexible, clean, easy to use and to build libraries with.

So I’m strongly in favor – but the docs should probably make the point that (like most power features) this should be used with care, and libraries based on it should document their performance…

soronpo · February 8, 2019, 10:43pm

Does this SIP in any way affect the ability to do T {} like in shapeless (if that is still required)?

smarter · February 8, 2019, 11:01pm

Empty refinements, and refinements containing only type declarations keep their current behavior. However, in this case the T {} trick is abusing an implementation detail more than a specified behavior, I’d rather we add an explicit mechanism to the language to do narrowing than try to replicate that (but this is a completely separate discussion).

soronpo · February 8, 2019, 11:24pm

I agree

gkossakowski · February 9, 2019, 2:11am

That PR implements just the lookup of the java.lang.reflect.Method instance via invokedynamic, but the actual invocation still goes through a reflective call.

REPL verification:

Welcome to Scala 2.12.6 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_121).
Type in expressions for evaluation. Or try :help.

scala> :paste
// Entering paste mode (ctrl-D to finish)

class Foo {
  def abc(x: { def xyz(a: Int): Int }): Int = x.xyz(12)
}

// Exiting paste mode, now interpreting.

<console>:12: warning: reflective access of structural type member method xyz should be enabled
by making the implicit value scala.language.reflectiveCalls visible.
This can be achieved by adding the import clause 'import scala.language.reflectiveCalls'
or by setting the compiler option -language:reflectiveCalls.
See the Scaladoc for value scala.language.reflectiveCalls for a discussion
why the feature should be explicitly enabled.
         def abc(x: { def xyz(a: Int): Int }): Int = x.xyz(12)
                                                       ^
defined class Foo

and then:

:javap Foo
public int abc(java.lang.Object);
    descriptor: (Ljava/lang/Object;)I
    flags: ACC_PUBLIC
    Code:
      stack=6, locals=4, args_size=2
         0: aload_1
         1: astore_2
         2: aload_2
         3: invokevirtual #79                 // Method java/lang/Object.getClass:()Ljava/lang/Class;
         6: invokestatic  #81                 // Method reflMethod$Method1:(Ljava/lang/Class;)Ljava/lang/reflect/Method;
         9: aload_2
        10: iconst_1
        11: anewarray     #4                  // class java/lang/Object
        14: dup
        15: iconst_0
        16: bipush        12
        18: invokestatic  #87                 // Method scala/runtime/BoxesRunTime.boxToInteger:(I)Ljava/lang/Integer;
        21: aastore
        22: invokevirtual #91                 // Method java/lang/reflect/Method.invoke:(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;
        25: goto          34
        28: astore_3
        29: aload_3
        30: invokevirtual #95                 // Method java/lang/reflect/InvocationTargetException.getCause:()Ljava/lang/Throwable;
        33: athrow
        34: checkcast     #97                 // class java/lang/Integer
        37: invokestatic  #100                // Method scala/runtime/BoxesRunTime.unboxToInt:(Ljava/lang/Object;)I
        40: ireturn

the invocation itself could be implemented via invokedynamic but that’s a separate task.

gkossakowski · February 9, 2019, 2:20am

Following up: I think fast structural type invocation would be a fun and fairly self-contained project to work on. I don’t have spare cycles to roll up my sleeves. If there’s somebody young reading this thread and looking for a fun VM/compiler project, ping me on Twitter and I could sketch out a plan for how to attack this. High amount of praise from the community guaranteed to whoever ships this.

Ichoran · February 9, 2019, 4:07am

The linked record-type-like-thing is cool! I’d love to have that. But I’m not convinced that this capability needs to go back and impact existing structural types.

I just don’t think the base structural types should have anything to do with that, not unless it works great with (1) not reflection on the JVM; (2) something decent on JS; (3) something decent on native. Right now the plan seems to be to use a poor implementation on the JVM and not even try for JS and native.

Otherwise we’re baking in a feature conceived to benefit low-performance logic into a spot where it can continue to surprise people. At least the way it is now, there is some hope that structural types will someday become super-awesome instead of slow and difficult. (Cf. length vs length(). Which, by the way, will get even worse with the proposal to not even handle overloaded methods.)

odersky · February 9, 2019, 11:39am

I considered removing structural types, but it turned out there were too many use cases to be able to do that.

Improving the current, reflective structural types implementation is a possibility but is beside the point. I believe structural types are used rarely not primarily because they have bad performance - there are lots of other things falling in the poorly performing category which are nevertheless used widely, e.g. monad transformers with trampolining. It’s just that there are not that many use cases where they make sense. So, we could make them much faster, but it would be effort spent on a rather marginal use case. I am happy to accept PRs doing that, but won’t do it myself.

But there is another big usecase that this proposal enables: Library defined, typed access to dynamic data structures. I.e. what people typically want records for. This is not possible with existing structural types but is made possible now.

So the elegance of the proposal is

it repurposes structural types for a much more interesting use case
it opens possibilities that were done in an ad-hoc way using records before
it still supports the existing use of structural types. It’s just that instead of a language import import scala.language.structuralTypes you now achieve the same with a real import import scala.reflect.Selectable.reflectiveSelectable.
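For the last point, a minimal sketch of what old-style usage looks like with the real import, assuming a Scala 3 compiler on the JVM where scala.reflect.Selectable.reflectiveSelectable is available as described above. HasClose, Resource, and closeIt are hypothetical names used only for illustration.

```scala
// The "real import" that brings the reflection-based Selectable conversion into scope.
import scala.reflect.Selectable.reflectiveSelectable

// An old-style structural type: anything with a close() method.
type HasClose = { def close(): Unit }

// A class that matches the structural type without extending any shared trait.
class Resource {
  var closed: Boolean = false
  def close(): Unit = { closed = true }
}

// The call c.close() is routed through the reflective Selectable,
// so it is resolved by Java reflection at run time.
def closeIt(c: HasClose): Unit = c.close()
```

Without the import, the call in closeIt should be rejected, because a plain { def close(): Unit } value does not conform to Selectable on its own.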

Ichoran · February 9, 2019, 11:52pm

Having statically typed access to dynamic data structures sounds great.

Is there a writeup somewhere about why we need old-style structural types at all?

I can’t speak for everyone, but my experience with structural types was poor from the outset, which is why I never used them. There were two related cases where they seemed handy.

Unifying disparate logic in a type-safe but semi-ad-hoc way, as an alternative to boilerplate-heavy typeclasses, especially when I don’t have control over what’s being given to me.

Unfortunately, this immediately ran into problems: differences that were irrelevant when writing code became showstoppers, like

def closeMe(closeable: { def close: Unit }): Unit = closeable.close

class A { def close: Unit = {} }
class B { def close(): Unit = { println("Closing!") } }

closeMe only works on A, not B. This thwarts the use of structural types for this use case. If I have control of all the code, I can unify it. But the point here is to unify cases where I may not, so it’s a failure.

(This was literally the first thing I tried with structural types–I wanted to close things with a close method, but couldn’t reliably.)

Efficient selection of arbitrary subsets of well-defined capability, again as an alternative to typeclasses.

This one is immediately sunk by the terrible performance. For example,

case class Vec2(x: Double, y: Double) { def L2 = x*x + y*y }
case class Vec3(x: Double, y: Double, z: Double) { def L2 = x*x + y*y + z*z }

def len(v: { def L2: Double }) = math.sqrt(v.L2)

This works, but it’s so slow compared to the original operations that it’s a terrible idea to use it. Instead, you should create a trait or typeclass that abstracts out the L2 functionality.
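One possible shape for that typeclass alternative, as a sketch only: the HasL2 trait, its instances, and this version of len are hypothetical names introduced here, and the case classes are redefined without the L2 member to keep the block self-contained.

```scala
case class Vec2(x: Double, y: Double)
case class Vec3(x: Double, y: Double, z: Double)

// Typeclass capturing "has a squared L2 norm".
trait HasL2[A] {
  def l2(a: A): Double
}

object HasL2 {
  // Instances live in the companion so they are found without extra imports.
  implicit val vec2HasL2: HasL2[Vec2] = new HasL2[Vec2] {
    def l2(v: Vec2): Double = v.x * v.x + v.y * v.y
  }
  implicit val vec3HasL2: HasL2[Vec3] = new HasL2[Vec3] {
    def l2(v: Vec3): Double = v.x * v.x + v.y * v.y + v.z * v.z
  }
}

// Resolved statically at compile time: no reflection is involved in the call.
def len[A](v: A)(implicit ev: HasL2[A]): Double = math.sqrt(ev.l2(v))
```

The price is the boilerplate of one instance per type, which is exactly the trade-off against the one-line structural version.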

With two big strikes against them, my decision about structural types was to basically never use them on purpose. Until .reflectiveCalls went in, I occasionally (and very unwelcomely) used them by accident when doing new Foo { def whatever } and then calling whatever. There are lots of cases where structural types for ad-hoc abstraction make more sense than tedious boilerplatey typeclasses, but when structural types are slow and fragile (almost entirely because of implementation details) and typeclasses are fast and robust, typeclasses win every time.

So, anyway, for structural types I see this proposal as digging them yet further into the ground. Now, not only does the first example fail if you don’t get your parens to agree, this one fails too:

class C {
  def close(retry: Boolean): Unit = ???
  def close: Unit = {}
}

And the second use case, which could be rescued by some clever compiler work, is probably pretty much buried by requiring it to be library-level. (Maybe I’m mistaken and there’s a good user-level implementation.)

So I don’t really see the point of keeping old-style structural types around at all. Just make all the old syntax fail, and require the new syntax for a different imagining of structural types. E.g. Reflect { def close: Unit = {} }.

Stuff like

class Foo { def customary(i: Int) = i + 1 }
val x = new Foo { def novel(i: Int) = i*i*i + 2*i*i + 3*i + 4 }

would no longer have anything to do with structural typing, but rather would be an (ordinary) anonymous subclass, with normal rules about how method calls work: either you can’t get at the method at all when it’s anonymous, or it’s just a regular method that you have access to as part of an implicitly-generated and therefore weirdly-named but absolutely-normal class type.

Anyway, my objection to the proposal isn’t about what it enables. I think what it enables is great. I just think it leaves behind a trail of historical crud in the language supporting a feature that causes more problems than it solves, and probably makes it harder to improve that aspect of the feature.

smarter · February 9, 2019, 11:59pm

Compatibility. Unless you can think of an automatic way to rewrite all the existing code that uses structural types?

Ichoran · February 10, 2019, 12:57am

Without an investigation of what structural types are being used for, I can’t suggest a rewrite. The new spec already requires some manual rewriting, so it’s already not taking source compatibility too seriously. Maybe type Foo = { def ... } to type Foo = scala.reflect.Structural { def ... }, where reflect.Structural is a best-effort user-space implementation for each of JVM, JS, and native?

smarter · February 10, 2019, 1:18am

If you’re referring to the import change, I think we can just alias scala.language.structuralTypes to scala.reflect.Selectable.reflectiveSelectable to mitigate that.

Ichoran · February 11, 2019, 4:09am

No, I mean the reduced feature set.

sjrd · February 11, 2019, 10:31am

Ironically, the fact that we had to support structural types in Scala.js forced us to introduce dedicated support in the Scala.js linker. And then we exploited that dedicated support, through structural types, to implement some stuff in user-space that I don’t know how I would implement otherwise.

For example, we can enhance the API of some JDK classes with Scala.js-specific public methods, which are then only accessible if you cast an instance to a structural type that defines that method. We can provide an implicit class in our library that hides this cast away, to provide Scala.js-enhanced APIs.

They’re also used in some cases to port libraries that rely on some reflection on the JVM. To some extent, the same technique can be used to support those libraries.

Whether they are problematic from a performance point of view or not, structural types now fulfill a need that is not otherwise addressed in Scala.js.

That doesn’t mean that we should have designed them in the same way if the requirements had been identified first, of course.

sjrd · February 11, 2019, 10:34am

Regarding the reduced feature set, TBH I am myself not very happy about it. Not that I personally use the stuff that is dropped, but I really don’t see a compelling reason why they shouldn’t be supported, and the compatibility argument alone ought to be enough to keep them.