Proposal for programmatic structural types

I considered removing structural types, but it turned out there were too many use cases to be able to do that.

Improving the current, reflective structural types implementation is a possibility but is beside the point. I believe structural types are used rarely, and not primarily because they have bad performance: there are lots of other things in the poorly performing category which are nevertheless used widely, e.g. monad transformers with trampolining. It’s just that there are not that many use cases where they make sense. So we could make them much faster, but it would be effort spent on a rather marginal use case. I am happy to accept PRs doing that, but won’t do it myself.

But there is another big use case that this proposal enables: library-defined, typed access to dynamic data structures, i.e. what people typically want records for. This is not possible with existing structural types but is made possible now.

So the elegance of the proposal is

  • it repurposes structural types for a much more interesting use case
  • it opens possibilities that were done in an ad-hoc way using records before
  • it still supports the existing use of structural types. It’s just that instead of the language import
    import scala.language.structuralTypes, you now achieve the same with a real import: import scala.reflect.Selectable.reflectiveSelectable.
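
To make "library-defined, typed access to dynamic data structures" concrete, here is a minimal sketch, assuming Scala 3 (Dotty) and its scala.Selectable trait. The names RecordSketch, Record, and Person are hypothetical and chosen only for illustration; the proposal itself does not prescribe a particular record class.

```scala
object RecordSketch {
  // Hypothetical record class: field values live in an untyped Map.
  class Record(elems: (String, Any)*) extends Selectable {
    private val fields = elems.toMap
    // For a receiver that is a Selectable, the compiler turns a selection
    // such as `r.name` on a structural member into
    // `r.selectDynamic("name")`, cast to the declared member type.
    def selectDynamic(name: String): Any = fields(name)
  }

  // A structural refinement gives the dynamic data a static shape.
  type Person = Record { val name: String; val age: Int }

  // The cast is where the library vouches for the shape of the data.
  val person: Person =
    new Record("name" -> "Emma", "age" -> 42).asInstanceOf[Person]
}
```

With this in place, RecordSketch.person.name is typed as String and RecordSketch.person.age as Int, while a selection of an undeclared field is rejected at compile time. The cast itself is unchecked, so a wrong shape only shows up when a field is accessed.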

Having statically typed access to dynamic data structures sounds great.

Is there a writeup somewhere about why we need old-style structural types at all?

I can’t speak for everyone, but my experience with structural types was poor from the outset, which is why I never used them. There were two related cases where they seemed handy.

  1. Unifying disparate logic in a type-safe but semi-ad-hoc way, as an alternative to boilerplate-heavy typeclasses, especially when I don’t have control over what’s being given to me.

Unfortunately, this immediately ran into problems: differences that were irrelevant when writing code became showstoppers, like

def closeMe(closeable: { def close: Unit }): Unit = closeable.close

class A { def close: Unit = {} }
class B { def close(): Unit = { println("Closing!") } }

closeMe only works on A, not B. This thwarts the use of structural types for this use case. If I have control of all the code, I can unify it. But the point here is to unify cases where I may not, so it’s a failure.

(This was literally the first thing I tried with structural types–I wanted to close things with a close method, but couldn’t reliably.)

  2. Efficient selection of arbitrary subsets of well-defined capability, again as an alternative to typeclasses.

This one is immediately sunk by the terrible performance. For example,

case class Vec2(x: Double, y: Double) { def L2 = x*x + y*y }
case class Vec3(x: Double, y: Double, z: Double) { def L2 = x*x + y*y + z*z }

def len(v: { def L2: Double }): Double = math.sqrt(v.L2)

This works, but it’s so slow compared to the original operations that it’s a terrible idea to use it. Instead, you should create a trait or typeclass that abstracts out the L2 functionality.
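
A minimal sketch of the typeclass alternative mentioned above, written so that it compiles on both Scala 2 and Scala 3. The names L2Sketch and HasL2 are hypothetical; the case classes are the ones from the example, and everything is wrapped in an object only to keep the snippet self-contained.

```scala
object L2Sketch {
  case class Vec2(x: Double, y: Double) { def L2: Double = x*x + y*y }
  case class Vec3(x: Double, y: Double, z: Double) { def L2: Double = x*x + y*y + z*z }

  // Hypothetical typeclass abstracting the L2 capability.
  trait HasL2[A] { def l2(a: A): Double }

  object HasL2 {
    // Instances live in the companion so they are found without imports.
    implicit val vec2HasL2: HasL2[Vec2] = new HasL2[Vec2] { def l2(v: Vec2): Double = v.L2 }
    implicit val vec3HasL2: HasL2[Vec3] = new HasL2[Vec3] { def l2(v: Vec3): Double = v.L2 }
  }

  // Resolved statically: an ordinary method call, no reflection involved.
  def len[A](v: A)(implicit ev: HasL2[A]): Double = math.sqrt(ev.l2(v))
}
```

The cost is the boilerplate the post complains about: one instance per type, compared with a single structural signature.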

With two big strikes against them, my decision about structural types was to basically never use them on purpose; until .reflectiveCalls went in, I occasionally, and very unwelcomely, used them by accident when doing new Foo { def whatever } and then calling whatever. There are lots of cases where structural types for ad-hoc abstraction make more sense than tedious boilerplatey typeclasses, but when structural types are slow and fragile (almost entirely because of implementation details) and typeclasses are fast and robust, typeclasses win every time.

So, anyway, for structural types I see this proposal as digging them yet further into the ground. Now, not only does the first example fail if you don’t get your parens to agree, this one fails too:

class C {
  def close(retry: Boolean): Unit = ???
  def close: Unit = {}
}

And the second use case, which could be rescued by some clever compiler work, is probably pretty much buried by requiring it to be library-level. (Maybe I’m mistaken and there’s a good user-level implementation.)

So I don’t really see the point of keeping old-style structural types around at all. Just make all the old syntax fail, and require the new syntax for a different imagining of structural types. E.g. Reflect { def close: Unit = {} }.

Stuff like

class Foo { def customary(i: Int) = i + 1 }
val x = new Foo { def novel(i: Int) = i*i*i + 2*i*i + 3*i + 4 }

would no longer have anything to do with structural typing, but rather would be an (ordinary) anonymous subclass, with normal rules about how method calls work: either you can’t get at the method at all when it’s anonymous, or it’s just a regular method that you have access to as part of an implicitly-generated and therefore weirdly-named but absolutely-normal class type.

Anyway, my objection to the proposal isn’t about what it enables. I think what it enables is great. I just think it leaves behind a trail of historical crud in the language supporting a feature that causes more problems than it solves, and probably makes it harder to improve that aspect of the feature.


Compatibility. Unless you can think of an automatic way to rewrite all the existing code that uses structural types?

Without an investigation of what structural types are being used for, I can’t suggest a rewrite. The new spec already requires some manual rewriting, so it’s already not taking source compatibility too seriously. Maybe type Foo = { def ... } to type Foo = scala.reflect.Structural { def ... }, where reflect.Structural is a best-effort user-space implementation for each of JVM, JS, and native?

If you’re referring to the import change, I think we can just alias scala.language.structuralTypes to scala.reflect.Selectable.reflectiveSelectable to mitigate that.

No, I mean the reduced feature set.

Ironically, the fact that we had to support structural types in Scala.js forced us to introduce dedicated support in the Scala.js linker. And then we exploited that dedicated support, through structural types, to implement some stuff in user-space that I don’t know how I would implement otherwise.

For example, we can enhance the API of some JDK classes with Scala.js-specific public methods, which are then only accessible if you cast an instance to a structural type that defines that method. We can provide an implicit class in our library that hides this cast away, to provide Scala.js-enhanced APIs.

They’re also used in some cases to port libraries that rely on some reflection on the JVM. To some extent, the same technique can be used to support those libraries.

Whether they are problematic from a performance point of view or not, structural types now fulfill a need that is not otherwise addressed in Scala.js.

That doesn’t mean that we should have designed them in the same way if the requirements had been identified first, of course.


Regarding the reduced feature set, TBH I am myself not very happy about it. Not that I personally use the stuff that is dropped, but I really don’t see a compelling reason why they shouldn’t be supported, and the compatibility argument alone ought to be enough to keep them.

Reducing the feature set is to get back to acceptable complexity. This is for spec, implementation, and users trying to understand the library.

There’s also another aspect here. So far, the lion’s share of proposals that are discussed here and that will be discussed in the future were invented, spec’ed, and implemented by myself. I can’t clone myself, so I have to prioritize what I can do. I can say with confidence that re-designing legacy structural types again is not something I will do, nor will I impose this on somebody else as a task.

So, any wish to change structural types from what they are now would have to come with a firm commitment to do the work.

I guess my real question here is: why were these features dropped in the reimplementation in dotty? To the eye of someone who has never looked at the dotty Typer (i.e., my eyes), at least vars and overloaded methods don’t seem like they would impose any additional work (maybe an if here and there, but nothing fundamental). If I’m right, then I suppose I could even implement them myself. If I’m wrong, maybe you have a short enough explanation of what’s problematic?

Refinements in Dotty are name-based. I.e. a refinement is of the form

T { name: S }

Structural types are built from such refinements. It’s awkward to encode vars or overloading in this framework. In Scala-2, refinements were essentially some sort of anonymous class, so it would have been harder to drop these features than to keep them.

I think there are a lot of disadvantages:

  • Records seem to be of very little use without var
  • I think it is very important to have a way to access a record’s data very quickly (like Java’s invokedynamic)
  • I think the real killer feature of dynamic invocation is the absence of binary incompatibility.

So I do not understand why this proposal is only for types; I think it should work for traits as well.
For example

   def newInstance[T](implicit tag: TypeTag[T]):T&Dynamic =  new DynamicProxy(tag)

Such an implementation would be very useful, at least for me.

I would like to express my strong support for the proposal.

Building on the proposal, Olof Karlsson, a former student of mine, implemented extensible records and performed a comprehensive experimental evaluation, comparing with (1) old-style structural types, (2) case classes, (3) trait fields, (4) Shapeless, and (5) Compossible.

You can find the full results in the following paper:

Olof also gave a talk about this work at Scala Symposium '18:

What’s exciting about our records design is that it supports width and depth subtyping as well as type-safe extensibility also in a polymorphic context, using context bounds. Example:

  def center[R <: Record : Ext["x",Int] : Ext["y",Int]](r: R) = r + ("x", 0) + ("y", 0)

Here, the context bound : Ext["x",Int] expresses the requirement that it is safe to extend a record of type R with an “x” field of type Int.

Right now, our implementation requires a small extension of Dotty which synthesizes instances of an Extensible type class, which is used in the above context bound using the Ext type lambda:

  type Ext[L <: String, V] = [R <: Record] => Extensible[R, L, V]

Our records implementation easily supports more than 200 fields (important in some enterprise settings), in a scalable way, outperforming even case classes in some benchmarks (with more than 60 fields). See the above paper for microbenchmarks as well as a case study parsing JSON-encoded commit events from the GitHub API into typed records and processing them.

Implementation of the Dotty extension:

The performance evaluation uses a new benchmarking source code generator, called Wreckage, built on top of the JMH microbenchmarking harness:


I know this isn’t an ideal response, but I have never found a solution that’s best served with structural types. It’s always been either myself doing it wrong, or it’s been a case of a missing typeclass that represents the structure. I have used extensible records and similar mechanisms, and I’ve used Dynamic.

So my genuine question is if we actually need structural types in dotty? I know we’d need them for no-think porting of scala2 code to dotty, but are they actually required for dotty in and of itself?

Of course there are use cases for it.
For example:

   // We have prototyped something like this in macros.
   def doSql[T](sql: String): Result[T] = ???

   def main(): Unit = {
     doSql[{id:String;name:String}]("select '1' id, '2' name").foreach { r =>
       println(s"${r.id},${r.name}")
     }
   }

But with large datasets I prefer to use index access, so I do not use this method.
And for me, dynamic invocation for traits (with var) would be more useful.

def main(): Unit = {
  case class IdName(id: String, name: String)
  doSql[IdName]("select '1' id, '2' name").foreach { r =>
    println(s"${r.id},${r.name}")
  }
}

I’m not saying this is by definition an improvement, but it achieves the same thing without the structural type.

I will not argue about static versus dynamic linking. I would prefer the dynamic one in this case, but that is just a holy war.
I have a question: why don’t we drop anonymous classes, functions, etc.? I am sure the language would still be Turing complete.

Defining a class per query is unnecessary. We can use literal types to write record types. SIP-23 gave this example:

Under this proposal we can express the record type directly,

type Book =
  ("author" ->> String) ::
  ("title"  ->> String) ::
  ("id"     ->> Int) ::
  ("price"  ->> Double) ::
  HNil

The query example then becomes:

doSql["id" ->> String :: "name" ->> String]("select '1' id, '2' name") ...


It seems that such a complicated abstraction for a simple POJO illustrates very well a certain lack of dynamic capabilities.
