Standard Library: Now open for improvements and suggestions!

Other wishes from my side (maybe there are other constructs I missed or that have been introduced in the meantime):

  1. have a way to get the value of a singleton type (if doable via Macro):
    type T = (1,2,3)
    val t = singletonValue[T] // t = (1, 2, 3)
    
  2. Something more for the compiler I guess, but I don’t know what can be done with macros by now. Infer the literal types instead of widening (a bit like const assertions in TypeScript):
    val i = const(1) // i is of type 1 and not Int
    val t = const((1,2,3)) // t is of type (1, 2, 3) and not (Int, Int, Int)
    // ideally const would also work when using higher kinded types
    val l = const(List(1,2,3)) // l is of type List[1 | 2 | 3] and not List[Int]
    
  3. a function which makes type transformation from tuple to union easy (without the need to repeat it), something like:
    type u = UnionOf[Key]
    
    where UnionOf could be defined similar to the following. This results in (x | (y | ...)), which I would like to see flattened to (x | y | ...) for better readability:
    type UnionOf[T <: Tuple] = T match
      case EmptyTuple => Nothing
      case h *: t     => h | UnionOf[t]
    
  4. ideally UnionOf would be overloaded and would also work for enum types, so that I can define:
    enum Z:
      A, B, C, D
    type E = UnionOf[Z] // E is (Z.A | Z.B | Z.C | Z.D)
    
  5. a way to decompose a union in a match type (we don’t distribute over union types, so that is still a possibility I guess). Currently I mainly want this so that I can abstract over enum entry types in a way similar to what Z.values offers at the value level (I want to stay on the type level). Maybe there are already other means to achieve this for enums? Nevertheless, I guess being able to decompose a union via a match type would allow defining other type utilities, such as TypeScript’s Exclude.

Oh one big improvement:
Make a lot more things inline!

Here are a couple examples:

  1. s"Error: Was $inlineableValue, should have been $otherInlineable"
  2. .ordinal on enums (see “How to inline an enum properly?” on Scala Users)
  3. Things like tuple.length
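
On the third item, a hedged note: the size of a tuple type can already be recovered at compile time with Tuple.Size and scala.compiletime.constValue. A minimal sketch, where the helper name sizeOf is made up for illustration and is not part of the standard library:

```scala
import scala.compiletime.constValue

// Hypothetical helper: resolves the size of a tuple type at compile time.
// Tuple.Size is a type-level function; constValue turns the resulting
// literal type into a value while inlining.
inline def sizeOf[T <: Tuple]: Int = constValue[Tuple.Size[T]]

val three: Int = sizeOf[(Int, String, Boolean)]
val zero: Int  = sizeOf[EmptyTuple]
```

This only works when the tuple type is statically known at the call site, which is the same restriction any inline variant of length would have.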

Also related:
A way to get varargs as tuples so we can manipulate them at compile time without using macros

  • PrioritySearchQueue
  • immutable version of LinkedHashMap

Strict zipping

Some form of strict zipping that enforces the things being zipped have the same length. (e.g. Python added a strict argument in 3.10, I often use a custom .zipStrict extension method).
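
A minimal sketch of what such an extension method might look like. The name zipStrict comes from the post above; the choice to fail with require (an IllegalArgumentException) is an assumption, and a real design might prefer returning an Option or Either instead:

```scala
// Sketch of a strict zip: refuses to silently truncate to the shorter input.
extension [A](xs: Seq[A])
  def zipStrict[B](ys: Seq[B]): Seq[(A, B)] =
    // Assumption: a length mismatch is a programming error, so fail fast.
    require(
      xs.length == ys.length,
      s"zipStrict: lengths differ (${xs.length} vs ${ys.length})"
    )
    xs.zip(ys)

val pairs = Seq(1, 2, 3).zipStrict(Seq("a", "b", "c"))
```

Note that length is O(n) on a List, so this sketch traverses each input once before zipping.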

Map.unionWith

.unionWith exists on IntMap and LongMap for some reason but nothing else. I think Swift got this right by requiring you to provide the value collision resolution function to ever “combine” Maps. Unexpected key collisions are so pernicious.

mapAccumulate

A combination of a map and fold is frequently useful.


One of the biggest questions to me is: what kind of compatibility guarantee should we aim for? I think the standard library could be massively improved if we could rename stuff like Seq#apply, Option#get, Seq#head etc. to unsafeGet etc. It should be possible to do this in a binary compatible way using the targetName annotation, and we can write automatic migrations with scalafix. Cross compilation (if required) can be addressed with compat libraries (think scala-collection-compat). So I think it’s feasible without much breakage, but I’d like to hear others’ opinions.
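
To illustrate the targetName idea, here is a sketch on a made-up container. Maybe is a hypothetical stand-in, not a proposal for how Option would actually be changed. The method is called unsafeGet in source, while the annotation keeps the emitted name as get, so previously compiled callers of get would still link:

```scala
import scala.annotation.targetName

// Hypothetical container used only for illustration.
final class Maybe[A](private val value: A, val isDefined: Boolean):
  // Source-level name is unsafeGet; the name in generated code stays "get".
  @targetName("get")
  def unsafeGet: A =
    if isDefined then value
    else throw new NoSuchElementException("Maybe.unsafeGet on an empty value")

val present = Maybe(42, true)
val answer  = present.unsafeGet
```

Source code that still calls get would no longer compile against this class, which is where the scalafix migration mentioned above would come in.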


Agreed, but how do you know how to combine values? In libraries like cats there’s a Monoid typeclass for this, and perhaps we should just include some of the more frequently used cats functionality in the standard library. For example, the standard library needs non-empty collection types so that groupBy’s type can be properly expressed (the values of the returned Map are always non-empty).
Another one I really like is alignCombine from cats. Basically it takes two Maps and combines them: if a key occurs in only one of the inputs, its value is taken from that input, and if it occurs in both, the values are combined.
E.g.

val a = Map(1 -> 1, 2 -> 2)
val b = Map(2 -> 2, 3 -> 3)
a.alignCombine(b) // Map(1 -> 1, 2 -> 4, 3 -> 3)

This once again requires a Monoid typeclass to know how to combine them.

EDIT: I think you quoted the wrong thing for what you were responding to :joy:

I think the answer is that the caller just has to provide the combine function, like in LongMap the type definition is:

def unionWith[S >: T](that: LongMap[S], f: (Long, S, S) => S): LongMap[S]

f should be called combine in my opinion.

Also I’m not sure that the key is actually useful in combine, I would define it as:

class Map[K, V] {
  def unionWith(that: Map[K, V], combine: (V, V) => V): Map[K, V]
}

I think the Cats approach with typeclasses is great for Cats and I prefer that actually, but for the standard library I think just having the user provide it is more common. Maybe we’ll get lucky and have both, if you had a Monoid typeclass you could also have:

  def unionWith(that: Map[K, V])(using Monoid[V]): Map[K, V]
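
A rough sketch of the caller-supplied variant as an extension method on immutable Map. This is only one possible implementation; placing combine in its own parameter list is an assumption made here for nicer lambda syntax, not something the signatures above prescribe:

```scala
// Sketch: union of two maps where the caller decides how to merge
// values that share a key.
extension [K, V](self: Map[K, V])
  def unionWith(that: Map[K, V])(combine: (V, V) => V): Map[K, V] =
    that.foldLeft(self) { case (acc, (k, v)) =>
      acc.updatedWith(k) {
        case Some(existing) => Some(combine(existing, v)) // key collision
        case None           => Some(v)                    // key only in `that`
      }
    }

val a = Map(1 -> 1, 2 -> 2)
val b = Map(2 -> 2, 3 -> 3)
val merged = a.unionWith(b)(_ + _)
```

With addition as the combine function this reproduces the alignCombine result from the earlier example.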

Original response:

I think for something like the standard library, you just make the caller provide it at the callsite, e.g.

class List[A] {
  def mapAccumulate[S, B](z: S)(f: (S, A) => (S, B)): (S, List[B]) = ...
}
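
For illustration, one way the body could be filled in, written as an extension method so it can be tried today. This is a sketch rather than a proposed implementation:

```scala
// Sketch: thread a state through the list while producing one output per element.
extension [A](xs: List[A])
  def mapAccumulate[S, B](z: S)(f: (S, A) => (S, B)): (S, List[B]) =
    val out   = List.newBuilder[B]
    var state = z
    for a <- xs do
      val (next, b) = f(state, a)
      state = next
      out += b
    (state, out.result())

// Running sums: the state is the total so far, the output is the same total.
val (total, runningSums) =
  List(1, 2, 3).mapAccumulate(0)((sum, a) => (sum + a, sum + a))
```

The same result can be had from foldLeft by carrying the output list in the accumulator; the dedicated method mainly saves that bookkeeping.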

I understand the motivation, but I suspect that changing to, e.g., unsafeGet would be, to say the least, controversial. It makes sense, but it’s fighting twenty years of what’s in people’s fingers, and for all that one can make a solid case that it’s usually wiser to avoid it, it’s still really common. So the blast radius mentally feels pretty large, even if we include Scalafix rules to adjust the code.


Perhaps a big ask here: Could we make Vector the default Seq, rather than List?

In the past, Vector was known for having decent asymptotic performance O(log n) on most operations, decent locality due to the packed arrays it uses for the leaves, but terrible constant factors with many operations being 1-2 orders of magnitude slower than List or Array (link). With that context, IMO it made sense for it to be an opt-in thing that you only reached for once in a while, but not something that you used by default. List as the default has its own problems - prepend/tail are not common operations people typically perform on Seqs - but Vector wasn’t an appealing alternative.

But since szeiger’s Pull Request #8534 to scala/scala (Rewrite Vector (now "radix-balanced finger tree vectors"), for performance) landed, the constant factors have improved significantly, and I think it would make sense to revisit this decision. Could we take a few representative Scala programs, run them with Seq.apply creating Vectors instead of Lists, and see whether they slow down or speed up? In the past they would likely have slowed down due to the poor constant factors, but since Stefan’s work landed we may well find that Seq.apply defaulting to Vector results in a “free” speedup across the Scala ecosystem.
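
To make the proposal concrete, a small sketch of the current behaviour on Scala 2.13 and 3: Seq.apply delegates to List, so code written against Seq gets List's performance profile unless it asks for Vector explicitly. It makes no performance claims; measuring that is exactly the experiment suggested above.

```scala
// Today the default Seq factory builds a List.
val default: Seq[Int] = Seq(1, 2, 3)
val defaultIsList: Boolean = default.isInstanceOf[List[?]]

// Typical Seq usage that is linear on List but effectively constant on Vector:
val vec: Seq[Int] = Vector(1, 2, 3)
val appended      = vec :+ 4 // append at the end
val last          = vec(2)   // indexed access
```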


@lihaoyi
Your benchmarks were run on Scala 2.11.8. Did you do similar benchmarks for the 2.13.x collections?


Very happy to hear that the road to additions and improvements is now open!

scala.os Module

While Scala has a solid standard library with a rich collections API, it’s still somehow lacking for some basic tasks. In particular, I think Scala would benefit from having an idiomatic standard module for working with files (along the lines of Python’s pathlib) and performing basic OS operations (akin to Python’s os module).

On one side, this would clearly benefit a majority (or at least a very large portion) of users. On an individual level I know I almost always need to work with files in some way or another. But there’s also data indicating that on Python’s side, the os module is by far the most widely used one.

On another side, while we have a few great external libraries for those operations (with os-lib currently being amongst the top 50 Scala libraries), the use-case is important enough to become standardised.
Also, being part of the standard library would make scripting workflows much more accessible, workflows for which os and path operations are usually central.
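
For context, a sketch of what a basic write, read, and list round trip looks like today when only the JDK is available from Scala. The helper name writeAndList and the file name are made up for illustration; the java.nio.file calls are the standard JDK ones:

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}
import scala.jdk.CollectionConverters.*

// Hypothetical helper: writes a file into dir, reads it back, lists the directory.
def writeAndList(dir: Path): (String, List[String]) =
  val file = dir.resolve("hello.txt")
  Files.write(file, "hello".getBytes(StandardCharsets.UTF_8))
  val content = new String(Files.readAllBytes(file), StandardCharsets.UTF_8)
  // Files.list returns a Java Stream that has to be closed explicitly.
  val stream = Files.list(dir)
  val names =
    try stream.iterator().asScala.map(_.getFileName.toString).toList
    finally stream.close()
  (content, names)
```

The charset handling, the Java stream conversion, and the manual close are the kind of ceremony an idiomatic standard module could hide.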

Rich Named Tuples API

Named Tuples are great and they attract new use-cases by the day within the community, but their API is currently quite limited. chanterelle already provides some nice additions but I think such an important feature deserves to come packed with useful functionality and common operations. I’m thinking of operations for mapping values, updating / removing fields, etc…

Better Option

In the aforementioned post we explored the possibility of empowering nullable types as a better alternative to Option. While the consensus seemed to be that Option would indeed deserve improvements, some valid concerns regarding the widespread use of T | Null were raised.

Regardless of the fate of nullables, I would then like to advocate for better Options! Here are some preliminary ideas (already suggested in the above thread):

  • auto upgrade T values to Option[T] when the latter is expected (i.e. just passing t instead of Some(t) when a value is present), thus reducing the syntax overhead and making the code easier to write and parse
  • provide an unboxed runtime representation (when possible), similar to scala-unboxed-option or kyo’s Maybe type.
  • consider providing a short alias for Option[T] as T?
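
The first idea can be approximated today with a user-defined implicit conversion. The sketch below is only an emulation: liftToOption and describe are names invented for this example, and a built-in mechanism would presumably not rely on a general Conversion instance.

```scala
import scala.language.implicitConversions

// Assumption: an opt-in conversion stands in for the proposed auto upgrade.
given liftToOption[T]: Conversion[T, Option[T]] with
  def apply(t: T): Option[T] = Some(t)

// Hypothetical API that expects an Option.
def describe(x: Option[Int]): String = x match
  case Some(n) => s"got $n"
  case None    => "nothing"

val lifted   = describe(42)       // 42 is lifted to Some(42)
val explicit = describe(Some(42)) // still works as before
val empty    = describe(None)
```

One caveat of this emulation is that a null argument would become Some(null), which a language-level design would need to decide on.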

We should add something like Stream#gather to the collections API.


This, I think, is quite difficult to make ergonomic without improvements to compiletime.ops.int, but it is worth exploring (i.e. unless every size is statically known, generic operations are useless)

possibly an intrinsic type Sort[T <: Tuple] <: Tuple for tuples of literals in compiletime.ops.any package

I do not believe there is any desire from the core team to fundamentally redesign the collection framework until at least capture checking is at V1.0 - although with LLMs a prototype is perhaps not so much effort

  1. see scala.valueOf or compiletime.constValue
  2. there is a prototype for a Precise type class, however I find it not so useful (it can’t handle complex expressions); feedback would be welcome in the scala3 repo
  3. see Tuple.Union
  4. noted - there is no Split type - I also had a similar idea (explode any sealed type to its descendants as a union)
  5. for now it is impossible to decompose a union type in a match type (i.e. remove X from X|Y|Z, leaving Y|Z) as union types have no discriminator (it may be possible to change the type system)
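
A short sketch of the existing pieces mentioned in points 1 and 3, using the tuple type from the original question. It uses scala.valueOf, compiletime.constValue, compiletime.constValueTuple (a related helper for tuples of literal types), and Tuple.Union:

```scala
import scala.compiletime.{constValue, constValueTuple}

type T = (1, 2, 3)

// Point 1: materialise the value of a singleton type.
val one: 1 = valueOf[1]
val two: 2 = constValue[2]
// For a tuple of literal types, constValueTuple builds the whole tuple.
val t: T = constValueTuple[T]

// Point 3: Tuple.Union turns a tuple type into the union of its element types.
type U = Tuple.Union[T]
val u: U = 2 // accepted because 2 is a member of 1 | 2 | 3
```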

So on an inline method with e.g. an (inline xs: String*) varargs param, some sort of compiletime.asTuple(xs) (tricky unless xs is an inline param). It gets trickier if there are spreads involved (that could be a compile-time error, e.g. "expected no spreads").


fold already lets you “map” a value so could you be more specific?

We are already using the targetName technique to replace some methods with inline versions (with the same name, mind, so source compatibility is somewhat preserved)

so to do the same for renaming will probably need more eyes and discussion once the process is live


++ from me
