Sources of implementation for polymorphic tuples

If I wanted to look into how tuples are implemented and, potentially, propose an improvement, what file(s) should I be looking at? Are tuples more or less regular types after desugaring, or is their implementation hard-coded in the compiler?

Hi! To put it briefly: the implementation touches many areas of the compiler. You can see the spread of its impact by searching the compiler sources for certain keywords, such as:

- `defn.NonEmptyTupleClass` (i.e. a pointer to `class *:`)
- `defn.EmptyTupleModule` (i.e. `object EmptyTuple`)
- `defn.TupleXXLClass` (i.e. `class TupleXXL`)
- `nestedPairs` (a function that computes generic tuple types)

Some dedicated files include TupleOptimizations.scala in the lampepfl/dotty repository on GitHub.


If you're interested in performance, see also issue #11114 (Support tuple specialization) in lampepfl/dotty on GitHub.

Thanks, that’s what I was afraid of. What prompted this question was noticing that tuple extension (i.e., the method `*:`) is O(n), while it can be made amortised O(1) per prepend. Basically, if all you need your n-tuple for is to immediately create a single (n+1)-tuple, then the final result can be built in O(n) total rather than O(n²). It is done by defining

    case class TupleX(contents: Array[Any], size: Int, @volatile var isOwner: Boolean)
and extending as follows:

1. If `isOwner` is false, or `contents.length == size`:
   1. allocate a new array `2 * size` elements long (at least `size + 1`),
   2. `arraycopy` the contents into it and write the new element at index `size`,
   3. return a `TupleX(newArray, size + 1, true)`.
2. Otherwise, clear `this.isOwner`, write the new element at `contents(size)`, and create a new instance `TupleX(contents, size + 1, true)` sharing the same array.
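Put together, the steps above might look something like this runnable sketch (the class name follows the post; this is purely illustrative, not the actual stdlib or compiler code):

```scala
/** Illustrative sketch of the shared-array tuple described above.
  * Elements live in contents(0 until size); `isOwner` marks whether this
  * instance may still fill the next free slot of the array in place.
  */
final class TupleX(private val contents: Array[Any],
                   val size: Int,
                   @volatile private var isOwner: Boolean) {

  def apply(i: Int): Any = contents(i)

  /** Amortised O(1) extension: reuses the backing array when owned. */
  def extended(elem: Any): TupleX =
    if (!isOwner || contents.length == size) {
      // Either another instance may still grow `contents`, or it is full:
      // copy everything into a doubled array and append there.
      val newArray = new Array[Any](2 * (size + 1))
      System.arraycopy(contents, 0, newArray, 0, size)
      newArray(size) = elem
      new TupleX(newArray, size + 1, true)
    } else {
      // We own the spare capacity: hand ownership to the new instance and
      // write in place. Note this check-then-act is racy if two threads
      // extend the same instance at once; a real version would need a CAS.
      isOwner = false
      contents(size) = elem
      new TupleX(contents, size + 1, true)
    }
}

object TupleX {
  val empty = new TupleX(new Array[Any](0), 0, isOwner = false)
}
```

Extending the freshest instance repeatedly reuses the array and only copies on growth, so building an n-tuple element by element does O(n) total copying; extending a stale instance (one that already handed off ownership) safely falls back to a copy.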

If you really wished to, you could reduce it further to a newtype over an `Array`, with the first cell reserved for a combined size/ownership word, but that invites erasure conflicts you probably don’t want.

All this is of course moot if your tuple is under 8 elements or so, as object-creation cost dwarfs the cost of `arraycopy`. But tuples are not used only as ad-hoc multiple return values from a function: when used in parser combinators for XML, or in Slick for tabular data, they can easily reach dozens of elements, which is when it starts to matter. The volatile bit can potentially be problematic, at least if a JIT-compiled loop of the standard O(n²) implementation can execute without waiting for memory loads and stores (I have no idea if it can, or could in the future).

Does this seem doable without a huge impact (for a complete newcomer to the Scala compiler/stdlib implementation)? Does it even seem worthwhile, or is the current solution deemed good enough for its intended usage? (Granted, the trick was originally developed for my homebrew immutable ArraySeq, to avoid going through a view or a builder.)