Vector is in general a much more forgiving data structure than List: while its head prepend/removal is not quite as fast, it has generally decent O(log n) performance across a wide range of operations, and it doesn't have List's footguns such as slow indexed loops and slow tail appends. That makes it more suitable as the default for Seq, whose callers typically do not have the same head-biased usage patterns that List encourages.
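
To make the two footguns concrete, here is a minimal sketch that runs an indexed loop and a tail-append loop against both implementations through the `Seq` interface. The names `SeqFootguns`, `indexedSum`, `buildByAppend` and `timed`, and the size `n = 20000`, are made up for illustration; this is a crude one-shot timing and not a proper benchmark, so treat any numbers it prints as indicative only.

```scala
object SeqFootguns {
  // Sums elements via xs(i). Each apply(i) walks i nodes on a List,
  // so the whole loop is quadratic there; Vector indexes into a shallow tree.
  def indexedSum(xs: Seq[Int]): Long = {
    val n = xs.length
    var total = 0L
    var i = 0
    while (i < n) { total += xs(i); i += 1 }
    total
  }

  // Builds a collection of n elements by repeated tail append (:+).
  // On a List every append copies the whole list built so far.
  def buildByAppend(empty: Seq[Int], n: Int): Seq[Int] = {
    var acc = empty
    var i = 0
    while (i < n) { acc = acc :+ i; i += 1 }
    acc
  }

  // Evaluates body once and returns its result with the elapsed milliseconds.
  def timed[A](body: => A): (A, Long) = {
    val start = System.nanoTime()
    val result = body
    (result, (System.nanoTime() - start) / 1000000)
  }

  def main(args: Array[String]): Unit = {
    val n = 20000 // arbitrary size chosen for illustration
    val list: Seq[Int] = List.range(0, n)
    val vector: Seq[Int] = Vector.range(0, n)
    println(s"indexed loop, List:   ${timed(indexedSum(list))._2} ms")
    println(s"indexed loop, Vector: ${timed(indexedSum(vector))._2} ms")
    println(s"tail append, List:    ${timed(buildByAppend(List.empty, n))._2} ms")
    println(s"tail append, Vector:  ${timed(buildByAppend(Vector.empty, n))._2} ms")
  }
}
```

Both helpers only see a `Seq[Int]`, which is the point: code written against `Seq` cannot tell which implementation it was handed, so the default decides which cost profile it gets.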
While @odersky mentions that previous attempts to change the default of Seq() from List to Vector resulted in performance slowdowns, the Scala 2.13 Vector class implemented by @szeiger is an order of magnitude faster than the previous 2.12 Vector class, so it's worth a second look.
I ran an experiment in scala/scala3 pull request #26386, "[WIP] Make `Vector` the default implementation of `Seq`", which makes Seq construct a Vector instead of a List, and then built a compiler from it and used that to compile mill-libs-javalib. At least on this benchmark (10 runs warmup, 10 runs measurement), the difference appears to be negligible:
| Case | Run 1 (ms) | Run 2 (ms) | Run 3 (ms) | Mean (ms) | SD (ms) |
|---|---|---|---|---|---|
| main (control, Seq=List) | 8852 | 8612 | 8960 | 8808 | 145 |
| seq-as-vector (PR) | 8707 | 8768 | 8767 | 8747 | 29 |
| seq-as-vector + 2M spin | 10977 | 11345 | 11236 | 11186 | 154 |
To rule out measurement error, and to confirm that touching the Seq() constructor can in fact affect compiler performance, I also did a run with Seq() intentionally slowed down by a spin-loop. That run showed a clearly measurable slowdown (last row of the table), which indicates that the lack of change between the main (control, Seq=List) case and the seq-as-vector (PR) case is meaningful.
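
For anyone who wants to check the summary columns, here is a small sketch that recomputes them from the three runs per case. It assumes the SD column is the population standard deviation (dividing by n rather than n - 1), which is the variant that reproduces the reported values; the object and method names are made up for illustration.

```scala
object BenchStats {
  def mean(xs: Seq[Double]): Double = xs.sum / xs.length

  // Population standard deviation: divides by n, not n - 1 (an assumption
  // about how the SD column was computed).
  def populationSd(xs: Seq[Double]): Double = {
    val m = mean(xs)
    math.sqrt(xs.map(x => (x - m) * (x - m)).sum / xs.length)
  }

  // Run times in milliseconds, copied from the results table.
  val runs: Seq[(String, Seq[Double])] = Seq(
    "main (control, Seq=List)" -> Seq(8852.0, 8612.0, 8960.0),
    "seq-as-vector (PR)"       -> Seq(8707.0, 8768.0, 8767.0),
    "seq-as-vector + 2M spin"  -> Seq(10977.0, 11345.0, 11236.0)
  )

  def main(args: Array[String]): Unit =
    for ((name, ms) <- runs)
      println(f"$name%-26s mean=${mean(ms)}%.0f ms  sd=${populationSd(ms)}%.0f ms")
}
```

The gap between the control and PR means is about 61 ms, well inside the control's own spread of 145 ms, while the spin-loop case sits roughly 2.4 s above the PR case.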
The primary Seq() -> Vector case implemented in the PR replaces the existing Seq(foo, bar, baz) → foo :: bar :: baz :: Nil optimization with an equivalent Vector.fromArrayUnsafe optimization, which statically detects Seq literals of length <= 32 and directly constructs a new Vector1 for them (which should cover most literals).
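
As a rough illustration of what changes for user code, the sketch below builds the same three-element literal both ways. It uses only the public `Vector(...)` constructor as a stand-in; the `Vector.fromArrayUnsafe` fast path described above lives in the PR and is not called here, and `SeqLiteralSketch` and its members are hypothetical names.

```scala
object SeqLiteralSketch {
  val (foo, bar, baz) = (1, 2, 3)

  // What the existing optimization turns Seq(foo, bar, baz) into.
  val asList: Seq[Int] = foo :: bar :: baz :: Nil

  // Stand-in for the PR's rewrite, using the public constructor instead of
  // the PR's internal fast path.
  val asVector: Seq[Int] = Vector(foo, bar, baz)

  def main(args: Array[String]): Unit = {
    // Seq equality is element-wise, so code that only compares or iterates
    // sees no difference between the two.
    println(asList == asVector)
    // The runtime classes do differ, which matters for code that pattern
    // matches on :: or Nil.
    println(asList.getClass.getName)
    println(asVector.getClass.getName)
  }
}
```

Code that only relies on the `Seq` interface behaves the same either way; code that matches on `::` or `Nil` against a value produced by `Seq(...)` is where a change of default would be visible.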
Given that the difference in performance from changing the default from List to Vector seems negligible, I think we should seriously consider changing Seq to return Vector in 3.10 once the feature freeze and library freeze are lifted. This would make Seq behave “reasonably” across most usage patterns, without the performance cliffs that List has as an implementation.
Notably, I only benchmarked the new Seq() -> Vector implementation as used in the Scala compiler. Most of my other com-lihaoyi libraries where I have paid attention to performance have already had most hot Seq() paths replaced by specialized data structures, so I wouldn’t expect to see significant changes there. We probably need to benchmark this implementation on a few more projects to see how it affects usage patterns across the board.











