Improving the Scala Debugging Experience

ergys · May 13, 2020, 2:01pm

(The Scala Center team is dedicated to providing regular and transparent community updates about project plans & progress. In this forum we created a new topic category to allow quicker orientation going forward: “Scala Center Updates”. Even though all feedback is welcome, we reserve the right to make executive decisions about projects we lead. An overview of all our activities can always be found at https://scala.epfl.ch/records.html)

Dear Scala Contributors,

Based on the Advisory Board proposal (SCP-022), the Scala Center is working towards improving the Scala Debugging Experience. To that end, we would like to share with you the reasons why this is necessary and the approach that we are about to take.

Motivation

The JVM-based Scala compiler produces JVM class files. Those class files contain debug information in the form of a LineNumberTable, which essentially associates line numbers of the source file with offsets in the bytecode, for the one source file that produced the class file (the original source file). This information is enough if all of the code in the class file comes from that one original source file.
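As a rough illustration, a method's LineNumberTable can be modelled as a sorted list of (start offset, line number) pairs. The sketch below is plain Java and purely illustrative: the class name, the method name and the table values are made up, not taken from scalac or from any real class file.

```java
import java.util.Arrays;

// Hypothetical model of one method's LineNumberTable. Each entry says:
// "the instructions starting at bytecode offset START_PC[i] belong to source line LINE[i]".
final class LineNumberTableSketch {
    // Entries sorted by start offset; the values are invented for illustration.
    static final int[] START_PC = {0, 8, 15, 27};
    static final int[] LINE     = {3, 4, 6, 7};

    // Returns the source line for a bytecode offset: the entry with the
    // greatest start offset that is <= pc, or -1 if pc precedes every entry.
    static int lineFor(int pc) {
        int idx = Arrays.binarySearch(START_PC, pc);
        if (idx < 0) {
            idx = -idx - 2; // (insertion point) - 1
        }
        return idx >= 0 ? LINE[idx] : -1;
    }

    public static void main(String[] args) {
        System.out.println(lineFor(0));  // 3
        System.out.println(lineFor(10)); // 4
        System.out.println(lineFor(30)); // 7
    }
}
```

Note that the lookup can only ever return a line number. There is no slot for a file name, so every line is implicitly a line of the one original source file.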

Unfortunately, the LineNumberTable falls short when the class file also contains code from other source files. One such scenario is inline functions. Inline functions can, and usually do, originate from source files other than the one being compiled. The LineNumberTable was not designed to include information from other source files, because Java does not allow method inlining at the language level. Instead, the JVM and the JIT compiler take care of inlining at runtime. However, in the case of Scala, inlining is possible at the language level and, in some cases, is important for performance.

The result is that, when a function is inlined, the debugger has no way of determining its source position during a debugging session. This leads to a poor debugging experience. To remedy this situation, scalac has to generate extra debug information, which must be stored in the produced class files. Debuggers need to be aware of this information in order to exploit it and provide a complete debugging experience.

The Kotlin approach

The Kotlin language faces the same challenge with inline functions, and works around it as follows. Kotlin exploits the JSR-45 specification to embed extra debug information in the produced class files. JSR-45 information is stored in a special class attribute called SourceDebugExtension, which can hold information from multiple source files, exactly what inline functions need. However, Kotlin does not follow the specification to the letter, because JSR-45 was designed for languages that are first translated to Java and only then compiled to JVM bytecode. Neither Scala nor Kotlin follows this approach.

Therefore, Kotlin uses the JSR-45 information encoding format and placeholder in the class file, but the strata that it defines do not correspond to Java output. Instead, it defines two correlated strata: Kotlin and KotlinDebug.

The Kotlin stratum contains the mapping from (a) the original source file and (b) the other source files referenced because of inlining, to a virtual output in which the extra lines due to inlining are appended after the original source file's lines. This allows Kotlin to embed the extra lines in the output class file.
The KotlinDebug stratum maps the inlined positions back to the original source, so that the debugger can trace through them.

In other words, the Kotlin stratum is an indirection that points to the other source files from which functions were inlined, while the KotlinDebug stratum indicates the positions where those functions are inlined.
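To make the "virtual output" idea more concrete, here is a minimal sketch of how a single JSR-45 line entry can be resolved. In JSR-45, each entry in a stratum's line section has the form `InputStartLine#LineFileID,RepeatCount:OutputStartLine,OutputLineIncrement`, where everything except the two start lines is optional. The Java code below is an illustration only, not code from Kotlin, scalac or any debugger: the class and method names are hypothetical and the sample entries are made up. They assume a 10-line original file (file 1) and three lines of an inlined function from another file (file 2, lines 5 to 7) appended as virtual lines 11 to 13.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that resolves one JSR-45 line entry.
final class SmapLineInfoSketch {
    // InputStartLine [#LineFileID] [,RepeatCount] : OutputStartLine [,OutputLineIncrement]
    private static final Pattern LINE_INFO =
        Pattern.compile("(\\d+)(?:#(\\d+))?(?:,(\\d+))?:(\\d+)(?:,(\\d+))?");

    private static int intOr(String s, int fallback) {
        return s == null ? fallback : Integer.parseInt(s);
    }

    // Returns {fileId, inputLine} if the entry covers outputLine, otherwise null.
    static int[] resolve(String lineInfo, int outputLine) {
        Matcher m = LINE_INFO.matcher(lineInfo);
        if (!m.matches()) {
            throw new IllegalArgumentException("not a line entry: " + lineInfo);
        }
        int inputStart = Integer.parseInt(m.group(1));
        // Simplification: JSR-45 reuses the file ID of the preceding entry when
        // it is omitted; this sketch sees only one entry, so it falls back to 1.
        int fileId = intOr(m.group(2), 1);
        int repeat = intOr(m.group(3), 1);
        int outputStart = Integer.parseInt(m.group(4));
        // Guard against a zero increment so the division below is safe.
        int increment = Math.max(1, intOr(m.group(5), 1));

        int offset = outputLine - outputStart;
        if (offset < 0 || offset >= repeat * increment) {
            return null; // outputLine is outside the range this entry covers
        }
        return new int[] {fileId, inputStart + offset / increment};
    }

    public static void main(String[] args) {
        // Made-up entries: the original file's own lines, then the inlined lines.
        int[] own = resolve("1#1,10:1", 4);
        int[] inlined = resolve("5#2,3:11", 12);
        System.out.println("file " + own[0] + ", line " + own[1]);         // file 1, line 4
        System.out.println("file " + inlined[0] + ", line " + inlined[1]); // file 2, line 6
    }
}
```

A debugger that understands the stratum can therefore turn a virtual line number past the end of the original file back into a (file, line) pair in the file the inlined function came from.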

What about Scala?

The approach adopted by Kotlin will serve as the basis for the approach that we are going to implement for Scala. The arguments in favour of this are the following:

The JSR-45 format is well-defined and is already implemented in various debuggers. Therefore, this existing code could be reused as a base for implementing Scala support in debuggers.
Even if a debugger does not understand the special meaning of the extra strata in the SourceDebugExtension attribute, it will still be able to extract some information and use that, i.e. existing debuggers will require no modification.
The solution is elegant and easy to implement.

Limitations

The JSR-45 format only allows storing line information. Storing column information is not supported by the format. If column information is deemed important, and since we are not planning to follow the specification to the letter anyway, we could modify the format to include it. This requires careful design, however, so that we remain compatible (to the largest extent possible) with existing debuggers that can read JSR-45 information.

Roadmap

We are currently in the “research and design” phase. We wanted to share this information with you, so that we can get your feedback and opinions. Based on that, we will move on to the implementation phase, which we expect to commence next week and complete by the end of June 2020.

Looking forward to your suggestions for making the Scala debugging experience better!

- The Scala Center team

rkrzewski · May 13, 2020, 3:01pm

I’m wondering if it is possible to provide source locations for code generated by macros? Because that would be amazing!

hepin1989 · May 13, 2020, 3:24pm

Will this improve the scala-async’s debug experience too?

povder · May 14, 2020, 7:58am

I think it would be a game changer for Scala.js if debugging experience was improved on that platform to be on par with the JVM one.

will-sargent-eero · May 18, 2020, 10:41pm

There’s a need for logging and debugging that goes beyond the information added to the class files: the macro-generated implicits from https://github.com/lihaoyi/sourcecode#overview, for example, turn out to be extremely useful for adding the appropriate entry/exit logging statements and argument values. Can JSR-45 information be made available to the compiler generally?

jrudolph · June 10, 2020, 9:59am

One consideration would be that column information will amount to a lot of data, because you would include it for every subexpression in a code file. Also, that data might not compress easily (basically because the code length of expressions has quite some entropy). I guess people might find a <10% increase in jar sizes for that feature acceptable, but not a >50% size increase (just making up numbers).

ergys · July 3, 2020, 1:27pm

This should be possible, but it is out of the scope of the current enhancement, which basically aims to provide a way to trace through inline locations. It could make for an interesting future enhancement, though.

If scala-async does inlining, then yes!

In that case, I guess that the scala-async classes/artifacts will have to be rebuilt and republished when the feature is complete.

At the moment, there are no plans for this. But I think that the information contained in JSR-45 strata is anyway limited when compared to the information provided by lihaoyi/sourcecode.

ergys · July 20, 2020, 1:49pm

Quick update: there is now a draft PR, which adds JSR-45 information generation capability to the Scala compiler (version 2.13.x).

You can find an overview of the implementation and the generated info on the PR itself. In a nutshell, the generated information is nearly identical to what the Kotlin compiler generates.

We would love for all parties interested in the feature (debugger engineers, code coverage engineers, etc.) to consult the PR and the generated information, try it out, and let us know about any issues or comments.

ergys · July 21, 2020, 9:11am

That is a good point.

With the current implementation, the effect on scala-library and scala-reflect is the following:

JAR	before (bytes)	after (bytes)	change
`scala-library`	5866464	6164668	+5%
`scala-reflect`	3608830	3828346	+6%