Tips on Fixing Scala2 Bug 12820

TLDR …

  1. I’m looking for feedback on my work to fix scala/bug#12820 in the scala2 scaladoc tool.
  2. The gist is when an external class belongs to a named java module then we must add it as prefix to the javadoc URL.
  3. This is something we already do for jdk URLs. The method by which we figure out the module name needs a fix.
Expand for the rest of the fact-finding I've done

0. Implementation

There are two things I’m unsure about. I’m a first time contributor to this project.

  • Is there an “owner” of the scaladoc tool (no table entry)?
  • Is there a good way to get the java module of a Symbol? I use reflection on the javaClassName. And this doesn’t appear to work if I generalize to external library javadocs.

1. Java Modules and Javadocs

I’ve found that if java classes belong to a named module then generated javadocs use that module’s name as a prefix.

For example if in a sandbox …

mkdir java; mkdir java/org; mkdir java/org/alice; mkdir java/org/alice/bob;
echo "module org.xavier.zach { exports org.alice.bob; }" > java/module-info.java
echo "package org.alice.bob;" >> java/org/alice/bob/Interface.java
echo "public interface Interface {}" >> java/org/alice/bob/Interface.java
javac java/module-info.java java/org/alice/bob/Interface.java
javadoc -d javadocs -sourcepath java -subpackages org
tree javadocs

We get output like …

javadocs
[...]
├── org.xavier.zach
│   ├── module-summary.html
│   └── org
│       └── alice
│           └── bob
│               ├── Interface.html
│               ├── package-summary.html
│               └── package-tree.html
[...]
└── type-search-index.js

This example illustrates that …

  • The module name doesn’t have to cover the names of the java classes it contains (this surprised me).
  • The module name is a prefix in the path to the class’s page.

2. Method for Module Name

We currently fetch the module name from the class path entry that loaded our symbol. We can add logging to MemberLookup.scala to see what these class path entries look like on the example Scala repro from the ticket.

override def findExternalLink(sym: Symbol, name: String): Option[LinkTo] = {
  [...]
  classpathEntryFor(sym1) flatMap { path =>
    /* + LOGS HERE */ println(s"MATHIAS ${sym1.javaClassName} ${path}")
    if (isJDK(sym1)) {
echo "/** Example: [[java.util.concurrent.atomic.AtomicReference]]. */" >> Example.scala
echo "class Example" >> Example.scala
mkdir scaladocs
../build/pack/bin/scaladoc Example.scala -d scaladocs | grep "MATHIAS" | awk '{print $2,$3}' | sort | uniq

Here we can see what symbols we looked up external links to,
and what entries on the classpath they came from.

java.lang.Class /usr/local/Cellar/openjdk@17/17.0.12/libexec/openjdk.jdk/Contents/Home/jmods/java.base.jmod
java.lang.String /usr/local/Cellar/openjdk@17/17.0.12/libexec/openjdk.jdk/Contents/Home/jmods/java.base.jmod
java.util.concurrent.atomic.AtomicReference /usr/local/Cellar/openjdk@17/17.0.12/libexec/openjdk.jdk/Contents/Home/jmods/java.base.jmod
scala.Boolean /Users/math-ias/Desktop/scala/build/pack/lib/scala-library.jar
scala.Int /Users/math-ias/scala/build/pack/lib/scala-library.jar
scala.Long /Users/math-ias/scala/build/pack/lib/scala-library.jar
scala.Unit /Users/math-ias/scala/build/pack/lib/scala-library.jar
scala.package$ /Users/math-ias/scala/build/pack/lib/scala-library.jar

Since java9 stdlib classes have been put in a number of named java modules.
We can see there’s a naming convention to name the JMOD file after the module name, but I don’t think this is binding.

Furthermore, if we generate docs with the release flag …

../build/pack/bin/scaladoc Example.scala -d scaladocs -release 16 | grep "MATHIAS" | awk '{print $2,$3}' | sort | uniq

And we zoom in on the stdlib classes we can see they are loaded from a ct.sym file.

java.lang.Class /usr/local/Cellar/openjdk@17/17.0.12/libexec/openjdk.jdk/Contents/Home/lib/ct.sym
java.lang.String /usr/local/Cellar/openjdk@17/17.0.12/libexec/openjdk.jdk/Contents/Home/lib/ct.sym
java.util.concurrent.atomic.AtomicReference /usr/local/Cellar/openjdk@17/17.0.12/libexec/openjdk.jdk/Contents/Home/lib/ct.sym

I found an article on this special file The Anatomy of ct.sym — How javac Ensures Backwards Compatibility - Gunnar Morling.
My understanding is that ct.sym is a space efficient way to port newer java standard libraries to older java runtimes.

3. Addendum

That bit about older java runtimes makes me think it’s not a bug to link to the current JDK javadocs rather than the one specified with the release flag. I think even if you are running an older JVM, you are getting the latest JDK version of that stdlib class.

2 Likes