Make Scala-platform API independent of Java namespaces


#1

Outline

Users of Scala today depend on a multitude of libraries in the java namespace. This makes it unclear to user and library author alike whether or not a compiler backend (JVM / JS / Native) implements the desired JDK artifact.

Scala.js has done a great job of porting a subset of the java.* namespace, like java.time.LocalDate, but the support is not complete and there is no way to discern coverage or “completion” of the port.

By moving select parts into the scala.* namespace, a logical boundary between what’s supported and not is defined. It also provides a means to reimplement often used packages or features in a Scala-idiomatic fashion.

Long term goals

The intent of this suggestion is to increase predictability of library support among target platforms. Ideally, 100% of the scala namespace should be available on all platforms and cover common functionality like date/time, i18n, maths, networking, concurrency, host platform interop and so on. Tooling like scalac and sbt should, if possible, rely only on namespaces covered by the Scala platform.

Perceived benefits

Providing a single namespace that covers 99 % of common use-cases allows the Scala platform to evolve at its own pace. It gives non-JVM platforms a coverage metric and clearly separates things that are undesired or hard to support (like java.lang.reflect) from the official platform.

Note that this suggestion does not intend to imply a separation from the JVM in any way; applications targeting the JVM would still be free to use whatever parts they like.


#2

Well, it would make more sense to have a module-like structure à la Java 9:
something you need to explicitly enable to get it.
For example: my program requires scala.time.
Without it I would not pull the module in, so I’m still tied to Java as-is. The current scala-library would then be scala.base.


#3

Modularisation of the platform is not at odds with this suggestion. In fact, as the platform is modularised it makes even less sense to have dependencies on the Java libraries: If pulling in a scala.io module would add a transitive dependency on artifacts in java.io—which is today’s status quo—those transitive dependencies would also have to be covered by separate façades for other platforms, making successful cross compilation less predictable.


#4

Thanks for the proposal!
I will however say that I’m against this proposal because of its oversimplified view of the problem and its lack of specific, actionable this-is-what-that-would-gain-at-that-and-that-cost points.

Providing a single namespace that covers 99 % of common use-cases allows the Scala platform to evolve at its own pace.

As someone participating in a standardisation ( reactivestreams.io ) process over the last years, I can tell you even agreeing on an API that has a total of 7 methods, and a specific goal, by definition, took years to reach agreement and mind-share. The “99% use-cases” phrasing makes me very nervous as it shows that while the proposal has a lot of good intentions, it does not seem to be backed by investigation in what realizing such effort would take – by which I mean, it’s a huge undertaking, with questionable benefits, and some things would end up leaking through anyway.

Coming back to the problem statement though:

This makes it unclear to user and library author alike whether or not a compiler backend (JVM / JS / Native) implements the desired JDK artifact.

This might be true; however, this information would in any case have to be relayed to users on a project’s website or repository, as in “we work on … runtime”. Which isn’t very different from now.

I also don’t believe that libraries should be encouraged to claim “we run on X” without ever testing against that runtime. Yes, building a project for multiple runtimes does carry a cost for maintainers of testing against those runtimes, and I don’t think it should be encouraged to “yeah you don’t have to test because you use the … namespace!”.

As a thought experiment, we can also challenge my above statement that the “safe namespace” does not provide real safety, since runtimes differ. In such a thought experiment, we can say that we only use “the safe namespace”, and are therefore proven to work on “all” runtimes… This sounds fine, but… then we need a way to guarantee that some project did not accidentally use the “not safe namespace”, and how would we do that? Either we need to implement strong encapsulation (basically reinvent the JDK’s modules, and enforce them in Scala), or library authors need to watch out and document things. Regarding the first option: such a module system exists, and support for it should/will happen in Scala in any case, but it is JDK-specific, since the runtime enforces it as well. With the second option we have gained nothing again, as it’s back to documenting things.

So IF we were to pursue what this proposal calls namespaces, it would have to align and play well with JDK modules, and enforce the same rules in other runtimes… But the other runtimes do not have such mechanisms, and there would be a semantic mismatch there again, so I don’t think this is a good idea to force others to implement the JDK module system – it’s complex, and other runtimes simply don’t have that nor should we force the JDK things on them.

Which comes back to the point of – yes, the runtimes do differ, and just blessing names, without a strong encapsulation tooling provides no more value than active testing on runtimes that a library claims to be compatible with. Rather even, without strong enforcing of such modules, it would even give a false perception of safety, which is more dangerous than no such (false) perception and reliance on testing (which always should be in place anyway).

My writeup has gotten a bit long, but I hope I’ve exemplified some of the issues that I see with the proposal.
Many of the goals of this proposal are achievable without the “namespaces”, and testing is required in any case to pull such things off in the real world.

Hope this helps,


Konrad
[ Akka, Scala Platform Process ] Member


#5

I agree.

So there’s an action item:

Since sbt is specifically mentioned, I’ll expand on this a bit. The above proposal can be read as rewriting various bits and pieces of sbt such that it’s free of Java libraries, and/or moving the features needed to do so into the Scala Platform. That effort alone sounds like a multi-person, multi-year effort. Given that we are barely getting started with being able to talk about files (and mostly as a wrapper of NIO), at least the tooling idea sounds far-fetched, and more importantly the upside is unclear for any party involved in such an effort.

Going back to the original problem statement:

Is this really true? There’s scala-js-java-time, but do we actually have the confusion in the community of people saying “oh noes. I don’t know which Java libraries are currently supported in Scala.js!”


#6

In practice: yes, but not terribly severely. This is essentially a FAQ that many Scala.js newbies trip themselves up on before they understand how things work, but I don’t often hear this from people who are more experienced than that.

(That said, it can be a mild nuisance to figure out, when you’re talking about entry points that are plausible on the Scala.js platform – I’ve found that it occasionally takes a little digging to discern whether something is actually available.)


#7

Hello again,

Just now saw that this was discussed in the latest SPP meeting! I have for some reason not received any notifications on the discussions taking place in this thread; I apologise for my lack of response.

I am excited to see so much feedback on the proposal and will try to address it properly.

@ktoso
Thank you for your input! The proposal is indeed a bit weak around timelines and details of implementation. Primarily, I wanted to open a discussion where feedback and thoughts could be gathered.

As someone participating in a standardisation ( reactivestreams.io ) process over the last years, I can tell you even agreeing on an API that has a total of 7 methods, and a specific goal, by definition, took years to reach agreement and mind-share.

Even a simple interface that abstracts exactly the APIs that java.io provides would be a win. java.io is the de-facto standard today, but as it lives in the java namespace there is no official support by the Scala platform, and it cannot evolve to meet specific needs without adding implicit wrappers around these types.

Moving even a subset of these APIs into the scala namespace—basically a copy/paste of the interfaces—would give library authors more confidence in working against them while maintaining compatibility across platforms, while allowing new features or changes to these APIs to be road-mapped and implemented in a timely manner.

The “99% use-cases” phrasing makes me very nervous as it shows that while the proposal has a lot of good intentions, it does not seem to be backed by investigation in what realizing such effort would take – by which I mean, it’s a huge undertaking, with questionable benefits, and some things would end up leaking through anyway.

It was perhaps a poor and valueless approximation on my part. Investigating value gained or efforts required is impossible without agreeing on a base level. I could only speak of what the perceived benefits would be at the point of writing the proposal.

I also don’t believe that libraries should be encouraged to claim “we run on X” without ever testing against that runtime.

I didn’t mean to imply that authors should claim compatibility without running their code—I (perhaps naïvely) hope that developers test their code before releasing it to the public.

The value lies in having some base level of reliability when writing the code in the first place and then working out eventual problems. The current message to developers feels like “Hey, just use the Java libraries! Maybe they’re supported for whatever platform you’re compiling against; maybe they’re not.”

Addendum after watching the stream

Thank you all for a very interesting discussion, both here and in the YouTube-stream!

I didn’t realise I came off as vague as I obviously did. Bill Venners pretty much got what I was getting at and I believe Aleksandar Prokopec understood my intention as well.

To doubly underline, the basic idea of this proposal is twofold:

  1. give platform implementers—that is, those implementing the compilers/linkers for different runtimes like JS/Native/other future frontiers—a target for implementation.
  2. let users of Scala work against one API for common tasks.

An example of point 1. could be to create a façade scala.io.disk that contains definition-stubs for common operations like reading/writing files. Target platforms would then link to the appropriate implementation during compilation. For the example scala.io.disk, JVM would link to the existing Java libraries, JS would link to node.js (I imagine) and so forth.

The gain here is that the implementation effort is known: the definitions of scala.io.disk. Moreover, users of scala.io.disk can be certain that if the linking fails for some definition(s) this is known to the platform implementors and most likely scheduled for implementation at some point.
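A minimal sketch of what such a façade could look like, assuming a trait-based design. The names Disk, JvmDisk, SharedLogic, readText and writeText are hypothetical and only illustrate the idea; no scala.io.disk package exists today, and a real design might bind backends at link time rather than by passing an instance around.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Paths}

// Hypothetical façade: these names are illustrative assumptions,
// not an existing scala.io.disk API.
trait Disk {
  def readText(path: String): String
  def writeText(path: String, content: String): Unit
}

// JVM backend: delegates to the existing Java libraries (java.nio.file).
// A JS backend would provide the same trait on top of, say, Node.js.
object JvmDisk extends Disk {
  def readText(path: String): String =
    new String(Files.readAllBytes(Paths.get(path)), StandardCharsets.UTF_8)

  def writeText(path: String, content: String): Unit = {
    Files.write(Paths.get(path), content.getBytes(StandardCharsets.UTF_8))
    ()
  }
}

// Shared code depends only on the façade, never on java.* directly.
object SharedLogic {
  def copyUpperCased(disk: Disk, from: String, to: String): Unit =
    disk.writeText(to, disk.readText(from).toUpperCase)
}
```

In this shape the implementation effort per platform is exactly the set of abstract members of Disk, which is the "known effort" argument made above.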


#8

Following my example in the previous post, the effort of migrating the JVM would consist of:

  1. making sure that scala.io.Foo is linked to the corresponding Java library;
  2. refactoring code in sbt that targets java.io.Foo to target scala.io.Foo instead.

The first step is obviously the most significant and debatable one, but step 2 should follow quite easily after that using automated code rewrites and then cleaning up the odd error.


One thing that was mentioned on the stream was code sharing. This is not a first consideration of the proposal, but having it in place makes it more achievable.

The difference between the proposal and status quo is basically having to implement scala.xyz to be platform independent versus having to implement an intangible percentage of the Java standard library to be platform independent. It is easier to build abstractions—and thus have code sharing—on tangible foundations.


#9

To be fair, in this proposal you would still have to decide on an intangible percentage of the Java standard library to create scala.xyz interfaces for.


#10

The decision of what to support in the Scala standard library will very much be tangible, even if it is a subset of some unknown percentage of the Java standard library. Having it to begin with then allows for further adaptation and expansion: changes that will be possible to roadmap and discuss.


#11

I think what would make sense for this proposal is to start with a Test Compatibility Kit for Scala. This initially could just look at code coverage in the java namespace that represents the minimum sensible classes and methods required for the current Scala platforms. Scala.js even implements deprecated Java methods. Next the TCK could look at making sure that the Scala and Java APIs perform as they should. This is obviously a big job.

The problem is that Scala on the JVM can do anything and use any Java API, so this could only be a core platform that works cross-platform for Scala JVM, Scala.js and Scala Native. This essentially is happening today informally, but there is no formal Test Kit. Removing the Java namespaces would be really tough. You can look at the level of effort to recreate Threads and IO/NIO on the Scala Native platform, or just the amount of work for core Java in Scala.js, to see the size of the job.
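As a rough illustration of the TCK idea, the sketch below lists a few behavioural checks against java.lang APIs that every backend would be expected to pass. MiniTck and Check are hypothetical names, and the three checks are arbitrary examples rather than a proposed minimum set.

```scala
object MiniTck {
  // One named behavioural check; `run` returns true when the platform conforms.
  final case class Check(name: String, run: () => Boolean)

  // The same list would be compiled and executed on every target platform.
  val checks: List[Check] = List(
    Check("String.toUpperCase", () => "scala".toUpperCase == "SCALA"),
    Check("Integer.parseInt", () => Integer.parseInt("42") == 42),
    Check("Math.floorMod", () => Math.floorMod(-1, 3) == 2)
  )

  // Names of the checks that the current platform does not satisfy.
  def failures: List[String] = checks.filterNot(_.run()).map(_.name)
}
```

A platform would count as conforming to this tiny kit when MiniTck.failures is empty; coverage is then simply the number of checks in the list.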

Creating an API is hard as mentioned and requires a good consensus. Without this the adoption would falter. A good example gone wrong is logging in Java. This has to be my pet peeve of all time in relation to APIs.


#12

The Scala.js test suite already contains a pretty large subset of what such a TCK would look like (see the shared/ subdirectory). A very large portion of our test suite cross-compiles on the JVM and on JS, to ensure that Scala.js and Scala/JVM behave the same. This includes virtually all the tests of our implementation of the JDK APIs.

Moreover, the Scala.js build and CI also run all of the (applicable) tests from the scala/scala repo (so-called partest + JUnit tests) recompiled with Scala.js, once again to ensure that Scala.js behaves in the same way as Scala/JVM.

And in addition to all that, cross-compiling libraries in the wild also cross-compile their tests (obviously), so they indirectly test that Scala.js behaves the same as Scala/JVM as well. We sometimes receive bug reports found by some ScalaCheck tests that discovered a bug deep down inside the Scala.js JDK implementations.


#13

My concern is that the Scala platform isn’t limited by the limitations of the browser. But then I would also like to see Scala not limited by the limitations of the JVM.


#14

So here is my 2 cents on the issue.

I think that Scala generally should work towards a proper Scala cross-platform namespace solution, i.e. something analogous to the java namespace but for Scala. I say this as an actual developer who develops cross-platform applications for Scala/Scala.js/Scala Native; one of the major difficulties in this regard is that there is no conformity between the platforms, i.e. I have had to do the following to get things to work.

For example I had to create manual type aliases for https://github.com/cquiroz/scala-java-time, forked from https://github.com/soc/scala-java-time because the author’s original work was never fully utilized. The scala-java-time library uses the org.threeten.bp package name whereas Java uses the java.time package. Ideally I would rather see an effort so that this library ends up using the scala namespace and is supported on all platforms, seeing as how originally it was a clean-room port of java.time. This kind of work needs effort from the Scala Center/Lightbend though; it’s not all in the hands of the community.

Generally I would have actually rather seen some of the effort put into porting the Java namespaces go into creating our own Scala namespace instead, with conversion libraries for people that want to use the Java namespace. That is, ideally scala-native (and scala.js) shouldn’t have anything in the Java namespace, and assuming we have something like scala.time.DateTime on all platforms, if someone wants to use java.time.DateTime then there should be a library that provides these conversions (just like we have libraries that provide conversions between Scala and Java collections).
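A minimal sketch of such an opt-in conversion library, modelled on the asJava/asScala style of the collection converters. Date stands in for a hypothetical platform-neutral type (nothing like scala.time exists today), and JavaTimeConverters is likewise an assumed name; only java.time.LocalDate is a real API here.

```scala
import java.time.{LocalDate => JLocalDate}

// Hypothetical platform-neutral date type; name and fields are assumptions.
final case class Date(year: Int, month: Int, day: Int)

// Opt-in conversions that would live in a JVM-only artifact,
// so shared code never has to mention java.time.
object JavaTimeConverters {
  implicit class DateOps(private val d: Date) {
    def asJava: JLocalDate = JLocalDate.of(d.year, d.month, d.day)
  }

  implicit class JLocalDateOps(private val d: JLocalDate) {
    def asScala: Date = Date(d.getYear, d.getMonthValue, d.getDayOfMonth)
  }
}
```

Code that needs to talk to a Java API would import JavaTimeConverters._ at the boundary and call asJava or asScala there, keeping the rest of the code base on the neutral type.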

I disagree here with @ktoso: the idea of such a namespace is not to provide a completely ideal solution on all platforms, it’s to provide a generic solution that works well enough on all platforms, i.e. something according to the Pareto principle. We also have reference implementations that are already known to work (i.e. Java time). The other point of using a Scala platform API is to provide a Scala-idiomatic implementation for things like IO and date. The other main point of doing something like this is to help prevent the diamond dependency problem.

I do however digress here. As we found out in the Scala Platform with regard to better-files, we actually have to come up with a general set of guidelines; in the context of better-files, that means when we should throw exceptions vs. when we should use values. In Java, everyone pretty much used exceptions for bad input, so they had a much clearer guideline on what to do.
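To make the exceptions-versus-values question concrete, here is a small sketch of the same operation written in both styles. ParsePort and its methods are hypothetical and unrelated to the better-files API; they only show the two shapes a guideline would have to choose between.

```scala
import scala.util.Try

object ParsePort {
  // Exception style, as is common in Java APIs: bad input throws.
  def parseOrThrow(s: String): Int = {
    val n = s.toInt // throws NumberFormatException on non-numeric input
    require(n >= 0 && n <= 65535, s"port out of range: $n")
    n
  }

  // Value style: failure is part of the return type and must be handled.
  def parse(s: String): Either[String, Int] =
    Try(s.toInt).toOption match {
      case Some(n) if n >= 0 && n <= 65535 => Right(n)
      case Some(n)                         => Left(s"port out of range: $n")
      case None                            => Left(s"not a number: $s")
    }
}
```

The value style makes the failure cases visible in the signature, at the cost of more ceremony at every call site; a platform-wide guideline would need to say when each is appropriate.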


#15

Or … just change the package names inside that library to use java.time. I have no idea why no one has ever done this (even as a trivial fork), given the extremely low cost/benefit ratio.


#16

It does have the java.time package. It could be that’s done by the build
rather than in the source code, but you can definitely use java.time
everywhere while only adding scala-java-time to scala.js subprojects.


#17

Could this issue be addressed by a simple static analysis tool? There are
already tools out there that analyze all possible paths between
classes/packages, so just make a list of ‘safe-for-js’ classes, and see if
the list of all-possible-paths-from-my-app is a subset. If it is, you know
your app is 100% safe for JS. If not, you need to write some tests to find
out… and you even know what namespaces your tests should try to hit in
order to try to get to the unsafe namespaces. It would be like a bloom
filter, definite miss vs possible hit.
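The check described above can be sketched as a reachability computation followed by a set difference. This is only an illustration under stated assumptions: the dependency graph is given as a plain map of class names (a real tool would extract it from bytecode or compiler output), and ReachabilityCheck, reachable and unsupported are hypothetical names.

```scala
import scala.annotation.tailrec

object ReachabilityCheck {
  // deps maps a class name to the class names it references directly.
  // Returns every class reachable from the given roots, roots included.
  def reachable(roots: Set[String], deps: Map[String, Set[String]]): Set[String] = {
    @tailrec
    def loop(frontier: Set[String], seen: Set[String]): Set[String] =
      if (frontier.isEmpty) seen
      else {
        val next = frontier.flatMap(c => deps.getOrElse(c, Set.empty[String])).diff(seen)
        loop(next, seen ++ next)
      }
    loop(roots, roots)
  }

  // Reachable java.* classes that are not on the allow-list.
  // Empty result: definitely safe. Non-empty: possibly unsafe, go write tests.
  def unsupported(
      roots: Set[String],
      deps: Map[String, Set[String]],
      safe: Set[String]
  ): Set[String] =
    reachable(roots, deps).filter(_.startsWith("java.")).diff(safe)
}
```

The non-empty result doubles as the list of namespaces that tests should try to exercise.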


#18

The Scala.js and Scala Native linkers already do that. They won’t let you link your application if they cannot prove that everything reachable is available. That’s at link time, though, not at compile time. But from

see if the list of all-possible-paths-from-my-app is a subset

I guess you mean when there is an “app” to compute that, and that’s precisely what we have at link time: an app (as opposed to a library).


#19

I think we should make Scala independent of not only the Java namespace, but also the Java API design, considering Oracle sued Google for 8.8 billion dollars over the re-implementation of the Java API in Android.

In particular, Scala.js and Scala Native reimplemented the Java API just like Google did.

I think all cross-backend Scala libraries should not depend on java.* API in the future.


#20

In principle, I agree. I don’t think it’s likely that we would get sued (I suspect the motivations behind the Google lawsuit don’t much apply to us), but we (for a vague sense of “we”) do appear to be a bit legally vulnerable. This has been a niggling concern of mine for several years now.

In practice, it’s a huge project. It may be a good idea, but there seems little chance of it happening without somebody passionately spearheading it, and gathering a lot of community support to make it happen. (And even then, it would take a very long deprecation cycle to wean the community over.)