On-demand specialization

informarte · June 14, 2020, 10:11am

I am working on Yuck, a solver for combinatorial optimization problems specified using the MiniZinc language.

Obviously, performance is an issue for me, and hence I try avoid the usual pitfalls, like for loops and inadvertent boxing/unboxing in places where it matters. Still, I sometimes run in situations where I feel myself longing for something like C++ templates. I am well aware of the @specialized annotation but it is not good enough:

It only works for primitive types.
AFAIK, the number of specializations grows exponentially with the number of specialized type parameters. However, in my practice, I only need specializations for certain parameter combinations.
Nearly nothing is specialized in the Scala library (I guess to avoid code bloat), so when you use containers in specialized code, there will be a lot of boxing and unboxing, and the code will run way slower than non-specialized code that uses boxed values from the start.
In the case of libraries, not the user decides what to specialize but the provider.

So what about on-demand specialization? What if we allow for @specialized annotations in new statements? For example:

val intList = new List[@specialized Int]

Of course, overuse of this feature would result in code bloat, so it should be used only where performance matters.

So what do you think? Is my idea viable? Has this been tried before? Would it be easy to implement or would it take a lot of time and effort?

odersky · June 14, 2020, 11:21am

I think it’s an interesting idea. As far as I know it has not been tried before for Scala. There are a lot of open questions, however.

1 . What is the type of intList in val intList = new List[@specialized Int]? According to the rules we have, it would be List[Int]. But then nothing is gained, since methods of lists would still take boxed arguments. On the other hand, if one wrote instead

@specialized class IntList extends List[Int]

that would work better. Also it would make it clear that a new class is being defined, which will take up a lot of code space since it has to duplicate everything in List.

2 . Should specialized classes be final? It sure would make things simpler, but I am not sure whether it would be too restrictive.

3 . What is the return type of the :: method in IntList? Is it IntList or List[Int]? I assume the former.

4 . But what about the map method in IntList. It would have type [B](f: Int => B): List[B]. So, if you call it with a function of type Int => Int, it would still give you a List[Int], not an IntList. How can we get an IntList instead?

5 . What about user-defined adaptations? Can I override methods of List in the IntList class? Can I add fields or other methods? What are the rules for this?

6 . Sometimes we need to specialize more than one class together. How do we achieve that/

7 . How to do downstream optimizations? For instance, the sum method on List has this signature:

def sum[B >: A](implicit num: Numeric[B]): B

When specializing IntList we need to realize that the Numeric implicit will always be resolved to the Int instance of Numeric. How is this done? Then, we need to make use of this knowledge to eliminate Numeric and inline the call to plus. So, it seems to do a good job we need to resort to a full-blown optimizer.

I believe the implementation effort required would be quite substantial. The best bet is probably to compile and specialize from Tasty. In summary I think this would be a cool and very challenging project to explore.

tarsa · June 14, 2020, 1:33pm

One of objectives of http://openjdk.java.net/projects/panama/ is support for ad-hoc memory layouts. It’s incubating as a part of already released http://openjdk.java.net/projects/jdk/14/ (https://openjdk.java.net/jeps/370) and a revised implementation will be released as a part of http://openjdk.java.net/projects/jdk/15/ (https://openjdk.java.net/jeps/383).

Jasper-M · June 14, 2020, 5:25pm

I thought that the old dotty-linker/whole-program-optimizer did most of this.

odersky · June 15, 2020, 8:10am

Dotty linker did a whole program analysis to find opportunities of specialization. The optimizers of Scala.js and Scala native do something similar, but with less detailed types. What’s proposed here is different: Instead of a whole program analysis it’s the user who indicates what should be specialized.

RichType · June 15, 2020, 12:58pm

@informarte you may want to look at my work here on specialised Immutable Arrays for both simple value types and compound value types. This allows for unboxed collections of extends AnyVal types as well, allowing type safety with efficiency. You can also map and flatMap between the different specialisations. map and flatMap each using their own implicit Builder typeclass.

If you only need a bit of specialisation for small to medium sized collections though, its fairly straight forward to create your own stand alone IntList trait that doesn’t box.

curoli · June 15, 2020, 1:06pm

List is sealed and has only two sub-classes, :: and Nil. A lot of code relies on this. If I specialize List on demand, would this also specialize its subclasses?

Jasper-M · June 15, 2020, 1:32pm

It feels similar in the sense that the compiler would have to do a full analysis and specialization of all the code that is transitively depended on by the code that the user indicated.

RichType · June 15, 2020, 5:13pm

Just out of interest, does anyone have any idea how much efficiency loss is due to boxing and how much is due to heap allocation and garbage collection? I don’t know how much the JVM is able to avoid unnecessary heap allocation.

informarte · June 15, 2020, 7:16pm

@RichType, regarding the costs of boxing, it depends what you box and how often. Booleans and integers between -128 to 127 are cheap because cached instances will be used, everything else might slow down your program considerably. My rule of thumb is to avoid boxing (including Some) by all means in performance-critical inner loops; to this end I check the source code of the libraries I use and I also look at the resulting byte code.

informarte · June 16, 2020, 7:20pm

Thanks a lot everyone for the feedback on my idea so far!

@odersky raised the question about the type of:

val intList = new List[@specialized Int]

When thinking about specializations, I think it might be helpful to distinguish between the Java POD type int and the box java.lang.Integer.

My intuition is that intList should have type List[int], not List[Int], and hence:

The :: class gets specialized, too, and stores an int.
The :: method returns List[int].
The result type of map depends on the type parameter B of map: If B is Int, then the result has type List[java.lang.Integer]. If B is int, then the result has type List[int].
To use sum, we need an instance of Numeric[int]. If nothing else is available, we fall back to scala.math.IntIsIntegral and specialize it.