Standard Library: Now open for improvements and suggestions!

Worth adding a “guilt-free” -> for tupling.

Currently,

        res0 =
          {
            val ev$1: Integer = ArrowAssoc(Int.box(2)).asInstanceOf[Integer]
            ArrowAssoc.->$extension(ev$1, Int.box(3))
          }

instead of

        res1 = new Tuple2$mcII$sp(3, 4)

I’m not sure what the current thinking is on auto-tupling in

Some(x, y)

but I prefer the clarity of

Some(x -> y)

especially for complex subexpressions.

Edit: forthcoming


I would like to see Fixpoint types in the standard library.

opaque type Fix[F[_]] = F[Any]
// should be F[Fix[F]], but opaque types can't be recursive

object Fix:
  def cast[F[_]]: Fix[F] =:= F[Fix[F]] = <:<.refl.asInstanceOf

  def apply[F[_]](f: F[Fix[F]]): Fix[F] = cast[F].flip(f)
  extension[F[_]](fix: Fix[F]) def unfix: F[Fix[F]] = cast[F](fix)

It’s a very widely applicable concept, and it would be nice if everybody could agree on the same definition to enable interoperability between libraries. Fixpoint types should also make it easier to derive typeclass instances for recursive types.


Another thing that should definitely be added is CanEqual instances for a lot more types. java.util.UUID, java.time.*, and probably others. Note that this not only benefits users of language:strictEquality: "foo" == UUID.randomUUID() doesn’t compile even without strictEquality because String has a CanEqual instance. Instant.now() == UUID.randomUUID() on the other hand does compile because neither of those types has a CanEqual instance.
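As a rough illustration of the effect, here is a minimal sketch in which the instance is defined locally. The `given` below is a stand-in I am assuming for the example; it is not something the standard library provides today.

```scala
import java.time.Instant
import java.util.UUID

// Hypothetical local instance; the proposal is for the standard library to ship it.
given CanEqual[UUID, UUID] = CanEqual.derived

@main def canEqualDemo(): Unit =
  val id = UUID.randomUUID()
  // UUID-to-UUID comparison is still allowed.
  println(id == id)
  // With the given in scope, UUID has a reflexive CanEqual instance, so a
  // comparison such as `Instant.now() == id` should now be rejected at
  // compile time, just as `"foo" == id` already is.
```

The point of the sketch is that one instance per type is enough to turn accidental cross-type comparisons into compile errors, even without strictEquality.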


Two other things

  1. a utility to check whether a tuple contains only subtypes of X (something like IsMappedBy, but not for F[_])
  2. easier conversion to specialised Arrays (currently you have to map over the Array of Objects or implement your own toArray), which would also improve performance

import scala.reflect.ClassTag

type ContainsOnly[Tup <: Tuple, X] = Tup match
  case EmptyTuple => true
  case X *: tail => ContainsOnly[tail, X]
  case _ => false

extension [Tup <: Tuple](tuple: Tup)
  inline def toArrayOf[T: ClassTag](using ContainsOnly[Tup, T] =:= true): Array[T] =
    // probably some typed array specialization would be better, but this is just an example
    tuple.toArray.map(_.asInstanceOf[T])

I’d like a Tuple.of[T] (or other name), the type of tuples containing only T
In other words, the supertype of (,) | (T,) | (T, T) | (T, T, T) | ... (using the WIP syntax for tuple types)

ContainsOnly then boils down to a subtyping check:
using ContainsOnly[Tup, T] =:= true is the same as using Tup <:< Tuple.of[T]
(and IMO it looks more intuitive)


Along the same lines, I’d like a Tuple.IndexOf[Tup, T], since every time I have to do something at the type level with tuples, I have to define this match type manually.
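For reference, the hand-written version usually looks something like the sketch below. The name `IndexOf` and its exact shape are my assumption of what such a helper would be, not an existing standard-library definition, and it relies on the element types being provably disjoint (as final classes like Int and String are).

```scala
import scala.compiletime.constValue
import scala.compiletime.ops.int.S

// Hypothetical helper: index of the first element of Tup whose type matches T.
// The match type gets stuck (and so cannot be used as a constant) if T is absent.
type IndexOf[Tup <: Tuple, T] <: Int = Tup match
  case T *: _    => 0
  case _ *: tail => S[IndexOf[tail, T]]

@main def indexOfDemo(): Unit =
  // String is the second element, so the type reduces to the literal type 1.
  val i: Int = constValue[IndexOf[(Int, String, Boolean), String]]
  println(i) // prints 1
```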


How could this subtyping be implemented? As a new intrinsic type?

Honestly not sure, but any solution would be worthwhile

Basically yes. Maybe, similar to NamedTuple, you could have a Tuple.Mono[T] opaque type whose only possible subtypes are EmptyTuple or T *: Tuple.Mono[T]?
If you don’t want to add a new intrinsic every time a case comes up that needs a custom subtyping rule for what is effectively an opaque type over tuples, maybe the intrinsic could be encoded as scala.compiletime.ops.Tuple.Subtype[...] or something similar, so that both the named-tuple case and this new one could be retrofitted to use that single compile-time op.

For what it’s worth, I still find the collection hierarchy/inheritance chain overly complicated and would love to see it simplified. The Traversable → Iterable → Seq chain could perhaps be condensed, and I think View could be reformulated, since it is unclear to me when to use it over the simpler and more predictable Iterator. I’d like to see an accessible API for creating immutable collections by way of a locally mutable construct that can be terminated into the desired result type, rather than having to select between a multitude of different builders and factories.


How would that be different from what we already have?

scala> Vector.newBuilder[String]
val res0: scala.collection.mutable.ReusableBuilder[String, Vector[String]] = VectorBuilder(len1=0, lenRest=0, offset=0, depth=1)
                                                                                                    
scala> res0 += "foo"; res0 += "bar"
val res1: scala.collection.mutable.ReusableBuilder[String, Vector[String]] = VectorBuilder(len1=2, lenRest=0, offset=0, depth=1)
val res2: scala.collection.mutable.ReusableBuilder[String, Vector[String]] = VectorBuilder(len1=2, lenRest=0, offset=0, depth=1)
                                                                                                    
scala> res0.result
val res3: Vector[String] = Vector("foo", "bar")

Two small improvements that would help me:

  • When I need to pass the builder to a helper function, I’d like a simpler type than mutable.Builder[Int, Vector[Int]].

  • I’ve occasionally used (or been tempted to use) this pattern:

    val buffer = List.newBuilder[Int]
    buffer += 1 += 2
    val l1 = buffer.result()
    buffer += 3 += 4
    val l2 = buffer.result()
    

    It works, but it’s not clear from the documentation that it’s guaranteed to work; the documentation seems to require calling clear after result before a builder can be reused.

A problem in general with the current collections is that their methods are very type-specific and heavy. This makes it quite difficult to add general methods—as extensions—to something like Iterable. The implementor needs to summon implicit factories for these types, which can be quite difficult to work out. Same thing when trying to implement a new subtype of for instance Seq.

The issue I take with builders in particular is that the builder is concretely typed, to Vector in your example. This affects how it can be passed around and again adds a lot of type-complexity.

Instead of predetermining the end result type, perhaps something like:

val builder = CollectionBuilder.newBuilder[String]
builder += "hello"
builder += "world"
builder.finalizeAs(Vector) // I realise finalizeX is a contended method name..

It still doesn’t prevent the builder from being used after it’s been finalised, but perhaps capabilities could solve that.

As for transforming contents of collections, I’ve been thinking of an immutable construct which isolates the transformation from the collection on which it is used. So something like a base `Transformer[From, To]` with functions like map, filter, groupBy etc., which enrich the transformation. Then, when the transformation steps are ready, it can be applied to a collection.
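To make the idea a bit more concrete, here is one way it might be sketched. Everything in it is hypothetical (the `Transformer` class, `of`, and `applyTo` are names invented for this illustration), and it only covers map and filter, implemented by composing functions over Iterator.

```scala
import scala.collection.Factory

// Hypothetical sketch: an immutable, reusable description of a transformation.
final class Transformer[From, To] private (run: Iterator[From] => Iterator[To]):
  // Each step returns a new Transformer; nothing runs until applyTo is called.
  def map[B](f: To => B): Transformer[From, B] =
    new Transformer[From, B](in => run(in).map(f))
  def filter(p: To => Boolean): Transformer[From, To] =
    new Transformer[From, To](in => run(in).filter(p))
  // Run the recorded steps over `source` and build the requested collection type.
  def applyTo[C](source: IterableOnce[From], factory: Factory[To, C]): C =
    factory.fromSpecific(run(source.iterator))

object Transformer:
  // Starting point: a transformation that changes nothing.
  def of[A]: Transformer[A, A] = new Transformer[A, A](in => in)

@main def transformerDemo(): Unit =
  val tensOfEvens = Transformer.of[Int].filter(_ % 2 == 0).map(_ * 10)
  println(tensOfEvens.applyTo(List(1, 2, 3, 4), Vector)) // Vector(20, 40)
  println(tensOfEvens.applyTo(Set(6), List))             // List(60)
```

Because the transformation is a plain value, the same pipeline can be applied to different source collections and materialised into different result types.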

Yeah, it’s undefined behavior if you don’t follow that rule.

Builders are collection-specific by design, because sometimes the building can be done more efficiently if you know in advance what collection you want out at the end. List, with its hidden mutable tail pointer, is the best known example.

If you don’t want or need all that, I’d suggest using collection.mutable.Buffer as your type for building arbitrary collections. When you’re done adding things you can .to(Vector) or any other collection type you want.

(And as a bonus, since you know .to will always copy, that also avoids the problem charpov points out with .result, that it can only be called once.)
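A minimal sketch of that suggestion, using only existing standard-library calls:

```scala
import scala.collection.mutable

@main def bufferDemo(): Unit =
  val buf = mutable.Buffer.empty[String]
  buf += "hello"
  buf += "world"
  // .to copies, so the buffer can keep growing afterwards.
  val asVector = buf.to(Vector)
  buf += "again"
  val asList = buf.to(List)
  println(asVector) // Vector(hello, world)
  println(asList)   // List(hello, world, again)
```

The target collection type is chosen only at the end, and earlier snapshots are unaffected by later additions.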


That’s what View is, what it does. It’s unclear to me why you want something else?


This is out of scope for this thread, as the collections hierarchy can’t be redesigned without breaking binary compatibility. (The backwards bincompat constraint is something we should probably have highlighted from the start.)

I don’t want to entirely discourage discussion of different collections designs. Such discussions have been ongoing throughout Scala’s entire history. It’s just not for this thread. (And it’s highly unclear on what timeline a redesign could even happen, as we expect the backwards bincompat constraints to remain in place for a long time.)


ReusableBuilder explains that “In general no method other than clear() may be called after result(). It is up to subclasses to implement and to document other allowed sequences of operations (e.g. calling other methods after result() in order to obtain different snapshots of a collection under construction).”

But ListBuffer itself, despite going to considerable trouble to work after result(), doesn’t update the documentation to explain that it’s fine. I think it’s fine.

However, you wouldn’t see the documentation off the type of List.newBuilder, because it’s typed as a mutable.Builder, not a ListBuffer (which it actually is under the hood, as you can see from how it prints out).

It’s worth updating the documentation, and also looking at the other subclasses of ReusableBuilder to figure out whether the only-clear-after-result restriction actually holds in each case.

For instance, ArrayBuilder can usually be extended safely, but not if you happen to call result() when the buffer is full:

scala> val ab = Array.newBuilder[String]
val ab: scala.collection.mutable.ArrayBuilder[String] = ArrayBuilder.ofRef
                                                                                
scala> for c <- "abcdefghijklmnop" do ab += c.toString
                                                                                
scala> ab.result()
val res37: Array[String] = Array(a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p)
                                                                                
scala> ab += "cod"
val res38: scala.collection.mutable.ArrayBuilder[String] = ArrayBuilder.ofRef
                                                                                
scala> ab.result()
val res39: Array[String] = Array(null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, cod)

So the documentation is correct: undefined behavior unless you call clear().
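For contrast, here is the same sequence with the documented clear() call inserted after result(). It is a small sketch of the supported usage pattern rather than a fix for the behaviour above.

```scala
@main def clearDemo(): Unit =
  val ab = Array.newBuilder[String]
  for c <- "abcdefghijklmnop" do ab += c.toString
  val first = ab.result()
  // Calling clear() after result() is the documented way to reuse the builder.
  ab.clear()
  ab += "cod"
  val second = ab.result()
  println(first.mkString(","))  // a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p
  println(second.mkString(",")) // cod
```

Note that this starts a fresh collection; it does not give the "continue after a snapshot" behaviour that the reuse pattern was after.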

Anyway, this could stand a new look, but it is the kind of change that we can make without having to change the binary compatibility of the library. It’s all documentation, bincompat bug fixes, and slightly changed behavior.

If we did want to modify something, perhaps we’d add a snapshot() method (or somesuch) that was guaranteed to return existing work and allow continuation.


With capture checking, reusable builders can be made safer:

trait Builder[A, C] extends SharedCapability:
  infix def `+=`(a: A): Unit
  // no result() in the api

trait ReusableBuilder[A, C] extends SharedCapability:
  def build(body: Builder[A, C] ?-> Unit): C

val builder: ReusableBuilder[Int, Vector[Int]] = ...
val first = builder.build: b =>
  b += 5
  b += 20
val second = builder.build: b =>
  b.addAll(List(...))
  b += 15

Since these patterns are safer under capture checking, they may become more widely used. However, this design would be incompatible with the current builder API, which includes .result().

Views confuse me: it is not clear when a method is eager (I think grouping is eager?), intermediary views are hard to pass around, and they are inefficient if used multiple times. I almost always reach for Iterator when I need to do multiple steps of transformation; even though it’s mutable at least the surface area is simple to comprehend.

What I’m describing is not views but a free-standing type which can be applied to collections, but it’s just something that’s been brewing in the back of my mind so I’m not sure how it would work in practice.