It’s very easy to implement - just copy from SortedSet. The following is a mutable example for 2.11.8, the version I’m currently working on. I will post a complete example when I finish building it.
package scala.collection.mutable
import scala.collection.generic.MutableSortedSetFactory
import scala.collection.immutable.RedBlackTree.Tree
import scala.collection.immutable.{RedBlackTree => RB}
import scala.collection.{GenSeq, SortedSetLike}
import scala.runtime.ObjectRef
/**
* A mutable SortedList heavily based on SortedSet which "is not designed to enable meaningful subclassing".
*
* Based on: https://github.com/scala/scala/blob/3cc99d7/src/library/scala/collection/mutable/TreeSet.scala
*
* @author Mygod
*/
//noinspection ReferenceMustBePrefixed
class SortedList[A] private(treeRef: ObjectRef[RB.Tree[A, Null]], from: Option[A], until: Option[A])
(implicit val ordering: Ordering[A])
extends AbstractSeq[A] with LinearSeq[A] with IndexedSeq[A] with GenSeq[A] with BufferLike[A, SortedList[A]]
with SortedSet[A] with SetLike[A, SortedList[A]] with SortedSetLike[A, SortedList[A]] with Set[A] with Serializable {
def insert(elem: A): Boolean = {
val old = size
this += elem
old < size
}
// AbstractSeq
override def apply(i: Int): A = RB.nth(treeRef.elem, i).key // Should i out of range result in NullReferenceException?
override def length: Int = size
// IndexedSeqLike
override def update(idx: Int, elem: A) {
val x = apply(idx)
if (!ordering.equiv(x, elem)) {
remove(x)
this.add(elem)
}
}
// BufferLike
override def +=:(elem: A): SortedList.this.type = +=(elem) // prepending is the same as the list is sorted
override def insertAll(n: Int, elems: collection.Traversable[A]): Unit = appendAll(elems)
override def remove(n: Int): A = {
val result = apply(n)
remove(result)
result
}
// Optimized methods
override def clear(): Unit = treeRef.elem = null
override def indexOf[B >: A](elem: B, from: Int): Int = lookupIndex(treeRef.elem, elem.asInstanceOf[A]) match {
case result if result < from => -1
case result => result
}
override def last: A = RB.greatest(treeRef.elem).key
// Tree helper methods, TODO: expand to non-recursive form?
private def lookupIndex(tree: Tree[A, Null], x: A): Int =
if (tree eq null) -1 else ordering.compare(x, tree.key) match {
case 0 => RB.count(tree.left)
case cmp if cmp < 0 => lookupIndex(tree.left, x)
case _ => lookupIndex(tree.right, x) match {
case i if i < 0 => i
case i => RB.count(tree.left) + 1 + i
}
}
// Copy everything else from TreeSet
}
Apparently one class can’t be a Set and Buffer at the same time (thanks to the implementation of SetLike[A, Container[A]], etc.) so I have to make it into a SortedSet that supports Buffer-like operation.
@jvican I haven’t benchmarked it but theoretically everything should run in O(log n).
Yes, I’m very sorry for the delay, I was on vacation!
Some benchmarks would be nice, but this is more of a long-term goal so it can wait for now. It would be good if you could synchronize with @julienrf to see potential ways we could integrate this into the scala collections. Though the ongoing efforts in the collections front is to redesign and simplify the hierarchy, adding this collection would be really useful for beginners that find it in other languages’ standard libraries.
Pinging @odersky in case he has some thoughts to share with us!
Given compatibility constraints, it would ship with 2.13 and the new collections library. @julienrf, @szeiger, and @odersky are collaborating on the development of this new library.
Since one of the authors of the new collections library seems open to the inclusion of a SortedList, I would guess that this is a green light for you to contribute an implementation to the collections-strawman repository, pending some benchmarks etc. Am I right @julienrf?
On a more serious note, I think an efficient SortedList would be a useful addition to the standard library. We already have sorted maps and sets. I"m not sure it should extend any Set base classes though because it’s not a Set (but I can’t think of a specific contract that it violates off the top of my head). I would expect it to extend Iterable.
It would indeed be useful to validate the compatibility of such a non-Set/Map/Seq collection with the new collections design. For this purpose it doesn’t have to be a complete or optimized implementation, we don’t even need to merge it at this point, but a proof-of-concept PR could demonstrate how it fits into the design.
Sorting is not a feature inherent to sets. I think that such a collection would be really handy, I’ve required it some times in the past!
I think so too. I agree with the sentiment of benchmarking all the collections redesign to make progress on large strides and with strong evidence of performance improvements. It would be good if you guys can create an issue in the repo to keep track of it, this is a good candidate to outsource it to the community.
I’m not sure this is the best way to approach the implementation, though. I agree with @Jasper-M that conceptually it’s not very intuitive and the less classes/traits that collections inherit the best for performance (AFAIK).
Oh, I didn’t pay attention that your implementation extends Set! So, what’s the result of e.g. foreach? Do you skip the duplicates or not?
A collection that can contain duplicates can not be a Set.
That being said, the idea of having a SortedList (as well as a SortedVector, that should have better insertion performance) makes sense to me. I would be happy to see how we can include them in the new design.
Ah, OK, so we are just having a naming issue here It seems that your SortedList is more a Set supporting index access. Maybe we should name it SortedIndexedSet.
I’m not sure we should support all the possible combinations of Sorted, Indexed and {List, Set, etc.} in the standard library but for sure the design of the collections should make it easy to implement a custom combination without having to copy/paste dozens of lines of code of an existing implementation.
Conceptually, SortedList allows duplicate elements. The C# SortedList is only useful when you want the same capabilities than C# SortedSet but you also want to access elements by index. I don’t think this is something as popular to include it in the collection. Wouldn’t a list that allows duplicate elements make more sense?
Having a look at your implemention, I see that you’re not following the C# SortedList interface (that adds a (key, value) pair instead of just an element to the collection).