2.13.x migration and source compatibility

mosesn · June 19, 2018, 6:40pm

We’ve been trying out 2.13.0-M4 for twitter util, and we have some feedback for scala implementors.

We have historically supported more than one scala version, which is useful for users who are trying to migrate from one version to another, so that they don’t have to both change their library version AND upgrade scala at the same time. However, scala 2.13.x seems to make it really hard to do this. Here are some of the issues we’ve run into when trying to migrate our first module:

CanBuildFrom removed without a replacement is an issue for compatible source code between 2.11 and 2.13 (where type inference is enough)
CanBuildFrom removed without a replacement is an issue for compatible source code between 2.12 and 2.13 (where type inference is not enough) e.g. Future.collect
It’s risky to replace breakOut with views in 2.11 or 2.12 because we have low confidence in the correctness of views pre-2.13.
ArrayBuffer no longer implements Seq, and no longer provides a result method. Nor does mutable.Buffer. We can call toSeq in every case, but we would rather not entail the extra copy, since this is sometimes on the hot path.
Very clunky to convert a java.util.List to a Seq from java. JavaConverters.asScalaBuffer(list).iterator().toSeq())

Our understanding is that folks are writing scalafix tools to simplify migrating from 2.12 to 2.13, which is great! However, it would be even better if there was a happy medium where we didn’t need to maintain two versions of our source code. Is there any chance you can give us the tools we need to maintain source compatibility between 2.12 and 2.13?

julienrf · June 19, 2018, 6:50pm

Hey @mosesn, thanks a lot for your feedback! Have you had a look at scala-collection-compat? It’s a library that provides a uniform syntax for collections usage on 2.11, 2.12 and 2.13.

It’s risky to replace breakOut with views in 2.11 or 2.12 because we have low confidence in the correctness of views pre-2.13.

This is a good point. The only solution I can think of is to replace xs.map(f)(breakOut): CC[A] with xs.iterator.map(f).to(CC). What do you think?

ArrayBuffer no longer implements Seq, and no longer provides a result method

It does implement Seq, what do you mean? There is no result method, but this method was previously returning this, so is this really blocking you?

Very clunky to convert a java.util.List to a Seq from java. JavaConverters.asScalaBuffer(list).iterator().toSeq())

It should be possible to just write list.asScala.toSeq. And if you don’t need an immutable.Seq (which is what the .toSeq method returns), you can just go with list.asScala, no?

scottcarey · June 19, 2018, 8:40pm

This is a good point. The only solution I can think of is to replace xs.map(f)(breakOut): CC[A] with xs.iterator.map(f).to(CC) . What do you think?

FWIW, all my pre 2.13 code used this idiom for anything lazy or any large stream of collection transformations. Views were broken, breakOut confuses those not familiar with it, and iterator just works, is lazy, and is fast.
I’m honestly still not sure what the purpose of views is when Iterator just works. It can’t be used more than once, but its usually simple to solve that problem client side. I’ve never ran into a use case where I actually needed a view.

2.13 views are just things that create iterators and produce collections now, so it can save me a few characters of typing every now and then.

mosesn · June 19, 2018, 8:49pm

This is a good point. The only solution I can think of is to replace xs.map(f)(breakOut): CC[A] with xs.iterator.map(f).to(CC) . What do you think?

Yeah, that should work.

It does implement Seq , what do you mean? There is no result method, but this method was previously returning this , so is this really blocking you?

Yes, I guess result is not a big issue. However, I have needed to change a lot of code because of ArrayBuffer being different in 2.13. I might be confused about this, but here’s a diff that shows the kind of changes I had to make to get it to compile

diff --git a/util/util-core/src/main/scala/com/twitter/concurrent/AsyncStream.scala b/util/util-core/src/main/scala/com/twitter/concurrent/AsyncStream.scala
index a6d4b48..c6e52e7 100644
--- a/util/util-core/src/main/scala/com/twitter/concurrent/AsyncStream.scala
+++ b/util/util-core/src/main/scala/com/twitter/concurrent/AsyncStream.scala
@@ -508,15 +508,15 @@ sealed abstract class AsyncStream[+A] {
     def fillBuffer(
       sizeRemaining: Int
     )(s: => AsyncStream[A]): Future[(Seq[A], () => AsyncStream[A])] =
-      if (sizeRemaining < 1) Future.value((buffer, () => s))
+      if (sizeRemaining < 1) Future.value((buffer.toSeq, () => s))
       else
         s match {
-          case Empty => Future.value((buffer, () => s))
+          case Empty => Future.value((buffer.toSeq, () => s))

           case FromFuture(fa) =>
             fa.flatMap { a =>
               buffer += a
-              Future.value((buffer, () => empty))
+              Future.value((buffer.toSeq, () => empty))
             }

           case Cons(fa, more) =>

I wonder if I’m having issues around whether the default Seq is immutable.Seq vs collection.Seq? Regardless, it no longer works.

It should be possible to just write list.asScala.toSeq . And if you don’t need an immutable.Seq (which is what the .toSeq method returns), you can just go with list.asScala , no?

This doesn’t work for me from java. I think somehow mutable.Buffer loses toSeq when converting to java? I’m not sure exactly. Here’s an example of the kind of diff I’m thinking about

diff --git a/util/util-core/src/main/java/com/twitter/concurrent/Offers.java b/util/util-core/src/main/java/com/twitter/concurrent/Offers.java
index a26569b3..5917024 100644
--- a/util/util-core/src/main/java/com/twitter/concurrent/Offers.java
+++ b/util/util-core/src/main/java/com/twitter/concurrent/Offers.java
@@ -4,7 +4,7 @@ import com.twitter.util.Duration;
 import com.twitter.util.Function0;
 import com.twitter.util.Future;
 import com.twitter.util.Timer;
-import scala.collection.JavaConversions;
+import scala.collection.JavaConverters;
 import scala.runtime.BoxedUnit;

 import java.util.ArrayList;
@@ -60,7 +60,8 @@ public final class Offers {
    * @see Offer$#choose(scala.collection.Seq)
    */
   public static <T> Offer<T> choose(Collection<Offer<T>> offers) {
-    return Offer$.MODULE$.choose(JavaConversions.asScalaBuffer(new ArrayList<Offer<T>>(offers)));
+    return Offer$.MODULE$.choose(
+      JavaConverters.asScalaBuffer(new ArrayList<Offer<T>>(offers)).iterator().toSeq());
   }

   /**

nafg · June 19, 2018, 10:51pm

If your scala class is meant to be called from java, why not at have at least an overload that takes a java collection or array, and do the conversion on the scala side?

julienrf · June 20, 2018, 12:03am

Would that work for you to return a Future[(collection.Seq[A]), () => AsyncStream[A]] instead?

This would remove all the .toSeq calls.

Interesting. I think you are not the first one reporting that calling the Java converters from Java does not work seamlessly. This might be something we can improve…

mosesn · June 20, 2018, 12:59pm

Java users often have trouble calling scala from java (exactly this kind of issue is a good example) so we typically make java shims when we think java users might run into issues. In fact, that’s exactly what we’re doing here: https://github.com/twitter/util/blob/develop/util-core/src/main/java/com/twitter/concurrent/Offers.java#L62. I think the specific reason why was because Offer is a trait, so it’s compiled down to being an interface in java, and interfaces and scala won’t compile static forwarders for interfaces. But we don’t always anticipate when java users will want to use an API, and frankly it seems like overkill to have to both make a java shim and make a java-friendly method in scala. I don’t think scala’s intent is for normal scala code to be so difficult to use from java anyway.

mosesn · June 20, 2018, 1:13pm

Ah, I see. The default Seq in scala 2.13 changes from collection.Seq to collection.immutable.Seq? I think it’s unlikely that we’ll want to change our APIs to use non-default Seq, since it will be hard to remember that we need to special-case Seq. It’ll be nice to get the symmetry of all of the default collections being imutable.Seq–I didn’t even realize that Seq was collection.Seq.

However, this means that I don’t know what the efficient way of building up an immutable.Seq is. For Maps, we can use MapBuilder, is there a canonical SeqBuilder?

julienrf · June 20, 2018, 1:45pm

It depends on the way you can build your Seq. We have a collection.immutable.ArraySeq that is very efficient in building a Seq if you know its size in advance (using ArraySeq.tabulate). Otherwise, the rule of thumb is to use Seq.newBuilder, which defaults to a List builder.

MasseGuillaume · June 22, 2018, 8:53am

Hi @mosesn,

I’m helping @julienrf with the migration by creating Scalafix rules. I would like to study the twitter-util diff. Can you push to a branch what you have so far? Thanks.

mosesn · June 22, 2018, 2:53pm

Here’s what we did:

mpilquist · July 23, 2018, 4:54pm

Some additional feedback from the perspective of maintaining a cross-built library. Scodec is currently cross-built for 2.10, 2.11, and 2.12. Scodec uses CanBuildFrom in a few places in order to abstract over the type of collection that’s built.

The scala-collection-compat library lets us maintain a single code base and cross-build for 2.11, 2.12, and 2.13, but doing so breaks binary compatibility with previously released versions. Here’s what this looks like for scodec: https://github.com/scodec/scodec/pull/119/files

Keeping binary compatibility can be accomplished by maintaining separate source files for 2.13 like Kenji did here: https://github.com/xuwei-k/scodec/commit/84e5f307bd1f36a2d0f3e4690a3331e8356a6091 This is pretty tricky though, and results in the 2.13 version of scodec having it’s own CanBuildFrom.

Any advice appreciated.

nafg · July 23, 2018, 6:28pm

Couldn’t you just release a new version?

sjrd · July 23, 2018, 6:36pm

That’s not how it works.

When you release a binary breaking version of your library, you break the ecosystem, which is bad. It’s worse if you do not bump the major version to signal that to your users; but bumping the version doesn’t make the binary breakage good.

mpilquist · July 23, 2018, 6:52pm

Yep, exactly. The last scodec-core release that broke binary compatibility was May 2016. With that said, we’ll likely release 1.11 and drop support for 2.10 unless someone comes up with another solution, as maintaining the 2.13 tricks linked in Kenji’s branch is not something I want to do.

jvican · July 23, 2018, 9:33pm

I personally think that dropping 2.10 support (for your concrete scenario) sounds like the right thing to do.

nafg · July 23, 2018, 10:56pm

I didn’t say it’s “good.” I said it’s an option.

Releasing binary-breaking versions is an old problem, not specific to scala. The fact is that it’s done a lot, although I agree it has unpleasant consequences.

mpilquist · July 24, 2018, 1:07am

I’m fine with dropping 2.10 support but it’s unfortunate that I have to break binary compatibility for 2.11/2.12 users just to be source compatible with 2.13 (modulo a reasonable amount of maintenance burden on library maintainers).

smarter · July 24, 2018, 1:32am

Maybe the compat library can be tweaked to preserve binary compatibility in more cases? E.g. in https://github.com/scala/scala-collection-compat/blob/master/compat/src/main/scala-2.11_2.12/scala/collection/compat/Factory.scala does Factory really needs to be a trait? Could it be a type alias or abstract type bounded by CanBuildFrom[Nothing, A, C] with extension methods defined on top of it?

smarter · July 24, 2018, 2:18pm

Here’s an attempt at implementing that: https://github.com/scala/scala-collection-compat/pull/115