Towards better error messages in Scalac

+1 for decoupling them

Paul, that’s very interesting. Do you have an example of any language and compiler that do that?

It’s a great initiative. I really like the idea of having unique IDs, especially for warnings. This would allow warning levels and individual on/off switches. For an example of how to document error messages, have a look at the C# compiler error reference. They’ve been doing it for ages for all their compilers.

Regarding range positions, I think it’s a distraction. In terms of real value for users they add very little. Once you see the point where the error starts, knowing where it ends won’t make you suddenly understand the error you had no idea about. Given their complexity and memory penalty I think they’re not worth pursuing in this context. The place where they might help is IDEs, but highlighting the current token is usually good enough.

I have some concerns about a hard schema for error messages: it might be too painful to evolve. You’d probably do that by stuffing additional information in a “details” string field, leading back to clients parsing that string (or break the schema and clients built on top of it). On the flip side, the compiler is mature enough to have a relatively stable error message structure by now, so who knows.

However, I think syntactic improvements in how errors are presented miss the biggest problem with errors: those that exhibit “spooky action at a distance”, such as a top-level implicit not found due to a failure deep inside the implicit search tree. You’d actually want to see the innermost error (or errors), but you’re presented with something like this:

Cannot materialize pickler for non-case class: List[model.Command]. If this is a collection, the error can refer to the class inside.
[error]     AutowireClient[Api].exec(cmds).call().toRx.map {
[error]                                        ^
[error] one error found

So, besides better syntax, I think semantic improvements (provide more meaningful context) will be the real win. There’s an entire PhD thesis on that subject, though the overhead of the additional tracking made it prohibitive to include in regular Scala. Maybe a simplified version of -Xlog-implicits trace limited to the current error would be a good start.
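To make the “innermost error” idea concrete, here is a minimal sketch, assuming the compiler could hand us a failed implicit search as a tree. The `Failure` type and the `innermost` function are made up for illustration; nothing like this is exposed by scalac today.

```scala
object ImplicitTrace {
  // Hypothetical model of a failed implicit search: each node is a type that
  // could not be resolved, plus the nested searches that failed beneath it.
  final case class Failure(tpe: String, reason: String, causes: List[Failure] = Nil)

  // Collect the innermost failures (the leaves), each paired with the chain
  // of searches that led to it, outermost first.
  def innermost(f: Failure, path: List[String] = Nil): List[(List[String], Failure)] = {
    val here = path :+ f.tpe
    if (f.causes.isEmpty) List(here -> f)
    else f.causes.flatMap(innermost(_, here))
  }
}
```

For the pickler error above, a reporter built on this would print the leaf (the class that is not a case class) and the chain leading to it, instead of only the top-level failure.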


I agree with you, these tasks are better split up. That’s what I meant with:

I think the best way to move this forward is that someone works on the compiler infrastructure, and then other contributors improve actual error messages (one per PR).

The idea is that contributors work on the Error => String part while one developer focuses on the ADT design. I guess that your comment emphasizes the need for pluggable message interpreters rather than only having the stock one, am I right? (specifically for different uses like IDE consumption, I suppose)

I think protobuf would make it easier to evolve — not sure how painful that would be but in Zinc we use protobuf for essentially the same task and we find it pretty stable. It’s a part of the schema that hasn’t been changed in ages too.

The error schema should decouple the actual details (as you mentioned) and all the metadata, so that clients can reconstruct messages from it. But I strongly encourage clients not to parse the details: they should just show the details as they are.
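As a rough sketch of the Error => String split with pluggable interpreters (all names here are hypothetical, this is not an existing scalac API), the ADT carries data only and each renderer decides how to present it:

```scala
object ErrorAdtSketch {
  final case class Pos(file: String, line: Int, column: Int)

  // Hypothetical error ADT: one case per kind of error, carrying data, not text.
  sealed trait CompilerError { def pos: Pos }
  final case class NotFound(pos: Pos, kind: String, name: String) extends CompilerError
  final case class TypeMismatch(pos: Pos, found: String, required: String) extends CompilerError

  // An interpreter is just a function from the ADT to text.
  trait Renderer { def render(e: CompilerError): String }

  // Mimics the classic one-line "<file>:<line>: error: <message>" layout.
  object Terse extends Renderer {
    def render(e: CompilerError): String = e match {
      case NotFound(p, kind, name) =>
        s"${p.file}:${p.line}: error: not found: $kind $name"
      case TypeMismatch(p, found, required) =>
        s"${p.file}:${p.line}: error: type mismatch; found: $found, required: $required"
    }
  }

  // Adds a hint line for the cases where we have something useful to say.
  object Verbose extends Renderer {
    def render(e: CompilerError): String = e match {
      case NotFound(p, _, name) =>
        s"${Terse.render(e)}\n  hint: is $name defined or imported at ${p.file}:${p.line}:${p.column}?"
      case _ => Terse.render(e)
    }
  }
}
```

Clients that want something else (an IDE, a linter) would write their own `Renderer` against the ADT rather than parse either of these strings.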

I agree. It’s already been linked before, but tek/splain (better implicit errors for Scala) shows implicit resolution chains when an implicit fails. I’ve never used it myself, but I see the value in doing some work to make the errors easier to follow.

I think it would be cool to also suggest which imports are missing for the use of an extension method. Imagine I use the extension method map but it’s not in scope because I forgot to import it. I’d like the compiler to suggest the imports I need to add to my file for the code to compile (technically, this may prove quite challenging because it requires the compiler to know where all these extension methods are and what their signatures are).

Yes, Rust has many error messages that benefit from these semantic improvements. I personally don’t mind the syntax, and that’s something that can be worked on after the core reporting abstractions are redesigned.

I think @psp’s idea is extremely important. It would make it possible to let people choose the verbosity of error messages with a flag, and to experiment with several renderings (and use a plugin that better fits their personal taste).

You can even think of one more level of indirection, so that there is a user-actionable Error ADT => Error ADT step between the compiler’s error analysis and its internal error management. It would finally make it possible to selectively ignore warning messages such as deprecation warnings ( https://github.com/scala/bug/issues/7934 ), or even to selectively change the error level of a class of errors (“I want non-exhaustive pattern matches to ALWAYS be errors, because why the hell aren’t they?”)
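A minimal sketch of what such an Error ADT => Error ADT step could look like, assuming every diagnostic carries a stable ID. The IDs, the `Policy` type and `applyPolicy` are all invented for the example:

```scala
object DiagnosticPolicy {
  sealed trait Level
  case object WarningLevel extends Level
  case object ErrorLevel extends Level

  // Hypothetical diagnostic with a stable ID such as "deprecation".
  final case class Diagnostic(id: String, level: Level, message: String)

  // User-facing policy: IDs to silence and IDs to escalate to errors.
  final case class Policy(silenced: Set[String], escalated: Set[String])

  // The ADT => ADT step: errors always pass through unchanged, silenced
  // warnings are dropped, escalated warnings become errors.
  def applyPolicy(policy: Policy)(ds: List[Diagnostic]): List[Diagnostic] =
    ds.flatMap {
      case d if d.level == ErrorLevel  => List(d)
      case d if policy.silenced(d.id)  => Nil
      case d if policy.escalated(d.id) => List(d.copy(level = ErrorLevel))
      case d                           => List(d)
    }
}
```

With `Policy(silenced = Set("deprecation"), escalated = Set("non-exhaustive-match"))`, deprecation warnings disappear and non-exhaustive matches fail the build, while real errors are never hidden.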

I’d like to add a few questions to help Scala Contributors give us feedback. Without Community feedback, we only have partial ideas and not actionable items. :slight_smile:

  1. What are the improvements in error reporting that you can imagine?
  2. Can you name a few of them and tell us how the compiler would suggest you solutions?
  3. What is a good error message for you?

I want this discussion to shed some light on the best way to see this initiative through, and how this change would be welcome by every single developer in our Community. After that, we can create a plan. :rocket:

Your propositions are, for me, very good. I also prefer the Rust style to the Elm style, though I’m not sure why.
There were some nice discussions on Twitter (at least) when @felixmulder talked about the subject for Dotty around the end of 2016. See for example https://twitter.com/FelixMulder/status/776828995232989184 (but there was a lot more, IIRC). Perhaps he could bring some insights about what he gathered, too?

For me, one important point of error reporting is being able to quickly see the difference between the expected and the actual thing. In fact, whatever helps me diff at a glance :slight_smile:

Typically, when using a type-intensive lib (Shapeless, Freek, etc.), you don’t care about the 15 identical types; you want to see the actual difference between what was expected and what was provided. @psp had a lot of tweets / resources on the subject, but unfortunately those got deleted.
For that, it is also generally preferable to keep type aliases rather than fully resolved types in the summary (and only show the fully resolved types in a detailed message).
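To illustrate the “diff at a glance” idea, here is a small sketch that elides the identical parts of two types and keeps only what differs. The `Tpe` model and the output notation are assumptions made up for the example, not how the compiler represents or prints types:

```scala
object TypeDiffSketch {
  // Hypothetical model: a type is a constructor name plus type arguments.
  final case class Tpe(name: String, args: List[Tpe] = Nil) {
    override def toString: String =
      if (args.isEmpty) name else args.mkString(s"$name[", ", ", "]")
  }

  // Render `found` relative to `required`, replacing identical subtrees with
  // "..." and spelling out only the positions where the two disagree.
  def diff(found: Tpe, required: Tpe): String =
    if (found == required) "..."
    else if (found.name == required.name && found.args.length == required.args.length)
      found.args.zip(required.args)
        .map { case (f, r) => diff(f, r) }
        .mkString(s"${found.name}[", ", ", "]")
    else s"<found $found, required $required>"
}
```

Comparing `Tuple3[Int, List[Int], String]` against `Tuple3[Int, List[String], String]` gives `Tuple3[..., List[<found Int, required String>], ...]`, which is the part you actually need to read.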

Another example is when you have a case class and you missed a parameter: the compiler should be able to point to the missing one, something like this (not the actual syntax/presentation I would like :):

foo(a,    c) 
Error: missing b?

Whatever you do, please do not break the parsable format that sbt already uses or you will break Emacs and I do not have the bandwidth to fix that.

Also, bear in mind colour-blind and blind developers. Do not rely on colour to replace text content. You may want a trial group (btw these are the sorts of things that existing legacy systems have spent years optimising, and a rewrite / redesign will need to do the same again)

I am all for better messages, especially for implicits, but I see no reason to change the layout by default. For opt in, do whatever you want.


The error format will be broken. The point is to have this by default for every Scala developer, not to force them to remember a flag they can use to get more readable errors.

When you parse error messages, you’re relying on an implementation detail of the compiler; you cannot expect that the format the compiler uses will be the same forever. It’s like parsing the output of scalac -Y to know which flags the compiler supports. It probably works short-term, but it’s a terrible idea.

As I’ve said before, the idea is to provide cross-language readers/writers via schema files or protos. You could use brown/protobuf (a Common Lisp implementation of Google's protocol buffers) for a fast migration, or ask other Emacs users to do it. We’re talking long-term: there’s still time before 2.13 is released, and remember that making sure external tools can consume error messages is part of the plan.


I’m not sure what you have in mind as far as tooling integration, but maybe
one possibility is to have a flag that will cause scalac to output only
json or something. That might make it easier to migrate tools that parse
stdout.
Also if the LSP (e.g. in sbt 1.1.0) would use the json version rather than
the human-friendly version, perhaps GUIs could have a better way of
presenting errors than is possible on the console (without having to do
special parsing).
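Just to illustrate what a JSON-only mode might emit, here is a dependency-free sketch. The field names are an assumption, not an agreed schema, and no such flag exists in scalac as far as this thread establishes:

```scala
object JsonDiagnostics {
  // Hypothetical flat diagnostic record; the field names are illustrative only.
  final case class Diagnostic(file: String, line: Int, column: Int, severity: String, message: String)

  // Minimal JSON string escaping: quotes, backslashes and control characters.
  private def quote(s: String): String =
    "\"" + s.flatMap {
      case '"'          => "\\\""
      case '\\'         => "\\\\"
      case '\n'         => "\\n"
      case c if c < ' ' => "\\u%04x".format(c.toInt)
      case c            => c.toString
    } + "\""

  // One JSON object per diagnostic, so tools can read the output line by line.
  def toJson(d: Diagnostic): String =
    List(
      "file"     -> quote(d.file),
      "line"     -> d.line.toString,
      "column"   -> d.column.toString,
      "severity" -> quote(d.severity),
      "message"  -> quote(d.message)
    ).map { case (k, v) => s"${quote(k)}:$v" }.mkString("{", ",", "}")
}
```

A tool would then read fields by name instead of running regexes over stdout, and the human-friendly layout could change without breaking it.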

Well, I guess 2.12 will just have to be my last version of Scala, unless somebody contributes support for 2.13.

No other language has broken their output format so mercilessly. It is NOT an implementation detail, it is documented output. I would appreciate if updating downstream tooling was part of the funded scope, since the rest of us are being forced to donate significant time to play catchup at no benefit to us.

I tend to agree with @fommil that compatibility is important and we should not waste precious time playing catch-up. Of course, there might be very important improvements that are impossible without breaking existing clients, but I’m optimistic about that not being the case here.

What’s the bare minimum that needs to be kept in order for emacs and other tools to keep working? Would something like <filename>:line: message be compatible enough, where message is “unparsed” and includes the code snippet? This is certainly enough for Eclipse and VS Code to keep working until more advanced info is available via json or whatever it is.

$ scalac -d /tmp private.scala 
private.scala:15: error: not found: value X1
    case Some(y @ X1) =>
                  ^
one error found

Yes something like that would be good. The regexes we’re using could be added to the tests to make sure the new format does not break downstream. https://github.com/ensime/emacs-sbt-mode/blob/master/sbt-mode.el#L210-L229

It needs to be the full file name to avoid ambiguity, and on the same line, because Emacs has trouble with multi-line error messages. Emacs can render the short version to the user if anybody cares enough about aesthetics.
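A compatibility test along those lines could be as small as the sketch below. The pattern here is a stand-in written for this example; the real regexes are the ones in the linked sbt-mode.el and may well differ:

```scala
object LegacyFormatCheck {
  // Hypothetical pattern for "<file>:<line>: <severity>: <message>" on one line.
  // This is NOT copied from sbt-mode; it only illustrates the kind of check.
  private val Line = """^(.+?):(\d+): (error|warning): (.*)$""".r

  final case class Parsed(file: String, line: Int, severity: String, message: String)

  // Returns the parsed header line, or None for snippet/caret/summary lines.
  def parse(s: String): Option[Parsed] = s match {
    case Line(file, line, severity, message) => Some(Parsed(file, line.toInt, severity, message))
    case _                                   => None
  }
}
```

A test suite for the new reporter would feed each rendered header line through `parse` and fail if a line that used to match no longer does.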

There was an interesting talk at Off the Beaten Track 2018 (co-located with POPL), named Explaining Type Errors.

The gist of the idea was that we should be moving away from “static error messages” and towards “interactive error explanations.” Different people from different backgrounds look for different things in error messages, and displaying too much information would make the errors too verbose. Instead, we should be able to click and expand dependent error explanations (to see the potential sources of the main error), query the types of the terms involved, display the implicits in scope, access the documentation on certain features, etc.

I think this would work particularly well with implicits, which are currently pretty painful to debug, but for which you typically don’t want to display all available information (the entire resolution trace) to the users.

This is one more reason to use error ADTs as the core representation of errors, and then let interpreters do the job of presenting them to the users. It seems that it would be feasible to keep a “backward compatibility” interpreter that generates error messages the old way, so as to avoid breakage from legacy tools.

Two notable reactions to the OBT’18 talk were:

  • Martin mentioned that keeping track of precise error sources (as presented in the talk, which was about a Hindley-Milner-style system) may not scale, as the compiler is typically already under some memory pressure; re-running a more precise type-checking algorithm only if an error is encountered is an option, but it means users have to wait even longer to get feedback (also, I think it may be harder to maintain both the fast and precise checkers in parallel).

  • Someone involved in the implementation of Rust mentioned that having precise errors, and in particular precise position spans, was far from trivial – it involved a great deal of adaptation to the compiler and the addition of dedicated runtime structures.


It should be possible to supply a Reporter that emits legacy messages, as mentioned in the previous comment. That should be a design goal.

Another easy goal is to support filtering of messages, as also mentioned previously.

Another possible goal is support of downstream linters. I’d like my pluggable linter to work with my pluggable message suppression system and other reporting components so that clippy can offer advice about lints.

An example wart in the current API is the “re-run with different options for more information” message, which has to be customized for different tool environments. It’s easy to slip into vendor lock-in.


Most important information for me is filename, line and column of the error.

Another productivity boost is to place the error message on a new line. I do not always have the whole screen devoted to reading error messages, and as soon as I have to scroll or wrap lines, productivity rapidly degrades.

[error] src/main/scala/com/foldright/LineReader.scala:6:12:
[error] not found: value line
[error]     while (line != null) {
[error]            ^

Disclaimer: I do use Emacs + highly customised sbt-errors-summary


I am looking forward to better error messages in scala but please do consider emacs users and maintain compatibility.


It’s not bad at all, I already do that in imclipitly using scalameta to find possible enrichments in the codebase and generate scala-clippy advice for missing enrichments. As long as we agree on a format then library authors can include a file in resources of enrichments they offer, then when a missing method is found the compiler can read the resources directory and display enrichments offered by the libraries on the classpath.

The version using scala-clippy can automatically generate advice based on the code in your project to catch messages like

val foo = Option(5).valueOr(0)

value valueOr is not a member of Option[Int]
and augment the message with:
Clippy advises: You may need to import com.helpful.OptionEnrichments._

If a nice ADT was exposed then it would be easy to augment the advice from an IDE to just add the import.
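As a sketch of how that lookup could work against an ADT rather than a message string: the `NotAMember` case and the index format below are assumptions for illustration, not what imclipitly or scala-clippy actually use.

```scala
object EnrichmentAdvice {
  // Hypothetical ADT case for "value <method> is not a member of <receiver>".
  final case class NotAMember(method: String, receiver: String)

  // Hypothetical index a library could ship in its resources:
  // (receiver type constructor, method name) -> import to suggest.
  type Index = Map[(String, String), String]

  // Look up advice for a missing member; None when no library offers it.
  def advise(index: Index)(e: NotAMember): Option[String] =
    index.get((e.receiver, e.method)).map(imp => s"You may need to import $imp")
}
```

Because the advice comes back as data keyed by the error, an IDE could turn it into an “add import” quick fix instead of just printing it.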

btw, hiding full pathnames in sbt in emacs is now documented in the ensime docs.

http://ensime.github.io/editors/emacs/sbt-mode/#related-customizations


My 5 cents: fine-grained warning silencing

This should be easy to implement but would help a lot.