Tips on Fixing Scala2 Bug 12820

TLDR …

  1. I’m looking for feedback on my work to fix scala/bug#12820 in the scala2 scaladoc tool.
  2. The gist is when an external class belongs to a named java module then we must add it as prefix to the javadoc URL.
  3. This is something we already do for jdk URLs. The method by which we figure out the module name needs a fix.
Expand for the rest of the fact-finding I've done

0. Implementation

There are two things I’m unsure about. I’m a first time contributor to this project.

  • Is there an “owner” of the scaladoc tool (no table entry)?
  • Is there a good way to get the java module of a Symbol? I use reflection on the javaClassName. And this doesn’t appear to work if I generalize to external library javadocs.

1. Java Modules and Javadocs

I’ve found that if java classes belong to a named module then generated javadocs use that module’s name as a prefix.

For example if in a sandbox …

mkdir java; mkdir java/org; mkdir java/org/alice; mkdir java/org/alice/bob;
echo "module org.xavier.zach { exports org.alice.bob; }" > java/module-info.java
echo "package org.alice.bob;" >> java/org/alice/bob/Interface.java
echo "public interface Interface {}" >> java/org/alice/bob/Interface.java
javac java/module-info.java java/org/alice/bob/Interface.java
javadoc -d javadocs -sourcepath java -subpackages org
tree javadocs

We get output like …

javadocs
[...]
├── org.xavier.zach
│   ├── module-summary.html
│   └── org
│       └── alice
│           └── bob
│               ├── Interface.html
│               ├── package-summary.html
│               └── package-tree.html
[...]
└── type-search-index.js

This example illustrates that …

  • The module name doesn’t have to cover the names of the java classes it contains (this surprised me).
  • The module name is a prefix in the path to the class’s page.

2. Method for Module Name

We currently fetch the module name from the class path entry that loaded our symbol. We can add logging to MemberLookup.scala to see what these class path entries look like on the example Scala repro from the ticket.

override def findExternalLink(sym: Symbol, name: String): Option[LinkTo] = {
  [...]
  classpathEntryFor(sym1) flatMap { path =>
    /* + LOGS HERE */ println(s"MATHIAS ${sym1.javaClassName} ${path}")
    if (isJDK(sym1)) {
echo "/** Example: [[java.util.concurrent.atomic.AtomicReference]]. */" >> Example.scala
echo "class Example" >> Example.scala
mkdir scaladocs
../build/pack/bin/scaladoc Example.scala -d scaladocs | grep "MATHIAS" | awk '{print $2,$3}' | sort | uniq

Here we can see what symbols we looked up external links to,
and what entries on the classpath they came from.

java.lang.Class /usr/local/Cellar/openjdk@17/17.0.12/libexec/openjdk.jdk/Contents/Home/jmods/java.base.jmod
java.lang.String /usr/local/Cellar/openjdk@17/17.0.12/libexec/openjdk.jdk/Contents/Home/jmods/java.base.jmod
java.util.concurrent.atomic.AtomicReference /usr/local/Cellar/openjdk@17/17.0.12/libexec/openjdk.jdk/Contents/Home/jmods/java.base.jmod
scala.Boolean /Users/math-ias/Desktop/scala/build/pack/lib/scala-library.jar
scala.Int /Users/math-ias/scala/build/pack/lib/scala-library.jar
scala.Long /Users/math-ias/scala/build/pack/lib/scala-library.jar
scala.Unit /Users/math-ias/scala/build/pack/lib/scala-library.jar
scala.package$ /Users/math-ias/scala/build/pack/lib/scala-library.jar

Since java9 stdlib classes have been put in a number of named java modules.
We can see there’s a naming convention to name the JMOD file after the module name, but I don’t think this is binding.

Furthermore, if we generate docs with the release flag …

../build/pack/bin/scaladoc Example.scala -d scaladocs -release 16 | grep "MATHIAS" | awk '{print $2,$3}' | sort | uniq

And we zoom in on the stdlib classes we can see they are loaded from a ct.sym file.

java.lang.Class /usr/local/Cellar/openjdk@17/17.0.12/libexec/openjdk.jdk/Contents/Home/lib/ct.sym
java.lang.String /usr/local/Cellar/openjdk@17/17.0.12/libexec/openjdk.jdk/Contents/Home/lib/ct.sym
java.util.concurrent.atomic.AtomicReference /usr/local/Cellar/openjdk@17/17.0.12/libexec/openjdk.jdk/Contents/Home/lib/ct.sym

I found an article on this special file The Anatomy of ct.sym — How javac Ensures Backwards Compatibility - Gunnar Morling.
My understanding is that ct.sym is a space efficient way to port newer java standard libraries to older java runtimes.

3. Addendum

That bit about older java runtimes makes me think it’s not a bug to link to the current JDK javadocs rather than the one specified with the release flag. I think even if you are running an older JVM, you are getting the latest JDK version of that stdlib class.

3 Likes

Good question, maybe @SethTisue can help clarify that? Maybe the same issue exists in Scala 3 and a similar fix can work for both?

External libraries won’t be present on the classpath of the JVM running the compiler, so this won’t work indeed.
The associatedFile of the Symbol would be your best bet, but when loading a class with -classpath in either javac or scalac, it won’t be treated as coming from a module, so this wouldn’t be enough. javac uses --module-path for this, but Scala (either 2 or 3) doesn’t support --module-path currently: Support JEP-261 Module System · Issue #529 · scala/scala-dev · GitHub.
In the example you have in Scaladoc generates incorrect link to JDK API docs with `-release` option · Issue #12820 · scala/bug · GitHub I guess you could change the url passed to -doc-external-doc to include the module name?

1 Like

There isn’t, I’m afraid. Some of us Scala 2 committers have at least a little knowledge/experience with it, but nobody still working on the project is actually expert on the Scaladoc tool.

Regardless, it’s clear you’ve made good headway on your investigation, so hopefully we can help get you unstuck. I’ll reply further on the scala/bug ticket and/or your draft PR.

1 Like

Thanks for the replies.

but Scala (either 2 or 3) doesn’t support --module-path currently

I’ve been evaluating creating a small piece of support for reading modules from the class file format since I’ve found that ct.sym is effectively a JAR (zip archive) with effectively class files (sig files). Maybe one day it could be a part of greater module support. :sweat:

I’ll reply further on the scala/bug ticket and/or your draft PR.

Thanks, I think your comment was helpful and we might be able to move a small fix forward.

1 Like