Enabling -Yrangepos by default

During a short discussion with @SethTisue regarding Adriaan’s suggestion that -Yrangepos be enabled by default going forward, I’ve begin to enumerate the remaining work that needs to be done:

  • scala/bug#10706, relating to the extension method stabilization PR (scala/scala#5999)
  • scala/scala#6222, and a few similar cases (noted on the PR) relating to the current behavior of Position#equals
  • A bug, which I’ll file when I’ve minimized it, surrounding the implementation of SIP-34 which seems to be mostly the same as the above
  • A few instabilities in error message ordering (probably ignorable)
  • Position validation errors should probably be warnings, and I’d argue that the check itself should be fully suppressable (since it requires two full traversals of the AST, one after parser and one after typer, and doesn’t affect semantics)

I haven’t run a community build with -Yrangepos on, and I haven’t yet audited the positions tag, so there may be more issues lurking.

I’d be glad to work on enabling it by default for 2.13. Seth recommended I create a thread here for suggestions and feedback before getting seriously started.

Thanks!
–harrison

(sorry about the lack of issue links; discourse won’t let me add more than 2 links it seems.)

3 Likes

Has someone benchmarked the impact of enabling -Yrangepos in real-world projects? I think it would be interesting to know this information so that we can better reason how this would affect the ecosystem :smile:

1 Like

That’s reasonable. I suspect that, with the position validation disabled, the performance impact shouldn’t be too great. I’ll run a benchmark with rangepos on and validation off, and see what changes.

I think that position validation shouldn’t be a normal warning, because usually it indicates a compiler bug that the user can’t do anything about. Running a validated (community-)build as a way of ensuring that everything works nicely would probably be better than issuing strange warnings to users.

Can’t we just run the community-build with the option both on and off and compare?

AFAIK the community build is ill-suited to timing, because it compiles each project exactly once and with no warmup. I’m running a comparative benchmark on better-files with compiler-benchmark right now; I’ll see if that gives meaningful results.

Here is a semi-scientific benchmark of turning on rangepos unconditionally. There seems to be a slight performance drop on the order of tens of milliseconds. I haven’t put any effort into checking if there’s optimization opportunities in the rangepos code; I suspect that there are, and I’ll try to optimize it as this progresses.

How do I read this benchmark to find the before and after numbers? In my own benchmarking (which unfortunately I can’t share) I found > 5% performance hit and this may indicate I need to revisit that benchmarking.

Ach, good catch; I pasted the wrong thing. Updated.

The top results are after; the bottom before. The “after” code also removes the position validation pass which -Yrangepos previously inserted after parser and typer, because I expect to put it behind a development flag, as mentioned above.

Am I correct in interpreting this as “the performance impact of the proposed change” then, not “the existing performance impact of enabling -Yrangepos”? If I am reading this correctly another way to interpret “tens of milliseconds” is about 5% based on this benchmark.

another way to interpret “tens of milliseconds” is about 5% based on this benchmark.

yep, that seems right as of that change. I suspect most of that is spent in atPos, which got a bit more complicated. Again, I’ll look into whether it can be sped up (almost certainly it can). If this does impose a significant performance penalty, that’s something that will need to be considered.

5% was enough for it to be rejected by default when I looked at it. I will be curious what you come up with for performance improvements though as something should be possible there. For comparison, Clang generates their version of a semanticdb (positions, symbols, everything to power an IDE) in under 5% performance overhead so for just positions you may find some big wins.

The community build is problematic for measuring compiler performance as the overall runtime is mostly other things (resolving dependencies, compiling sbt build definitions, running tests). You could imagine adding instrumentation to try to isolate out the different components, but that work hasn’t been done.

Our preferred way to get this kind of data is GitHub - scala/compiler-benchmark: Benchmarks for scalac . The included codebases aren’t as numerous or diverse as what we have in the community build, but it’s good enough.