Clarification on dependencies in SP (scalajson) - VectorMap

While developing https://github.com/mdedetrich/scalajson I have realised that I need to create a new type of collection to satisfy the requires of a JSON Object, mainly I require an immutable map that has effectively constant lookup on all operations as well as maintaining key order.

Unfortunately there is no scala collection which satisfies this requirement which means I would have to make a new collection type (the only known current way to make such a collection is for it to be backed internally by a map + vector).

There is currently a branch in the project which implements this collection, however I realised that it would be far better design plus easier to maintain if I put this collection into its own library which I wish to call vector-map. The library would have the exact same design goals of ScalaJson (in the sense that it would have zero dependencies, be ultra stable with minimal releases and conform to all of the binary compatibility guarantees that scalajson itself has).

The implications of this would be that scalajson is no longer truely zero dependency (it would depend on vector-map). Also there are questions about whether to include VectorMap in the 2.13 collections redesigned.

An initial implementation of the library can be found here https://github.com/mdedetrich/ordered-map

2 Likes

Hi @mdedetrich,

I’d be curious to know more about the performance differences with the existing HashMap.

I think (but I might be wrong) nobody actually uses ListMap. So, if VectorMap is always a better choice than ListMap we should just replace ListMap with VectorMap in 2.13.

If you mean existing collection.immutable.HashMap, I can’t use it because it doesn’t maintain key insertion order (well it does, but only up to 4 keys and thats only because its specialized, i.e. see Map1/Map2/Map3/Map4)

EDIT: I have clarified original post to reflect this

To maintain insertion order we have LinkedHashMap, but only mutable. We should probably have an immutable implementation, you are not the first one to ask about.

Well if you are talking about mutable collections, you can use a standard collection.mutable.HashMap to maintain insertion order while having effectively constant lookup on index/key (scala.collection.immutable.ListMap is based off a linked list so it will have terrible lookup by key/index). In fact this what all JSON ast representations use for their mutable representations (including Javascript).

EDIT: I incorrectly mentioned LinkedHashMap when I was talking about scala.collection.immutable.ListMap. The post has been edited

If we are talking about immutable collections, the only known implementation (which has effectively constant lookup by key/index as well as maintaining insertion order) is a Map that is backed by a immutable HashMap/immutable Vector. Scala has scala.collection.immutable.LinkMap, but its backed by a linked list which doesn’t have the effectively constant lookup time (it does however have ordering)

Its pretty standard, other libraries like https://facebook.github.io/immutable-js/ have it (its called OrderedMap, see https://github.com/facebook/immutable-js/blob/d3bce8d9baacac1bf9c233cec1c84e25e8db4083/src/OrderedMap.js)

I have posted my initial implementation, for people to see here https://github.com/mdedetrich/ordered-map

I still need to add tests/benchmarks as well as publishing the library

mutable.LinkedHashMap is backed by a hash table, so it has constant-time lookup. It behaves similarly to a linked list because its entries store references to prior and subsequent insertions; however, it is not backed by a linked list, and only shares a linked list’s behavior with regard to traversal of elements.

Sorry I got my collections confused, I was talking about scala.collection.immutable.ListMap, will clarify the original post

@julienrf Have made an issue on the collections strawman about this https://github.com/scala/collection-strawman/issues/293

I have also renamed OrderedMap to LinkedMap as noted by @pathikrit (in Scala, ordered implies collections that are ordered by some key/index where as linked implies collections which maintain key insertion order)

Repo has been renamed to https://github.com/mdedetrich/linked-map

Is making this lib a dependency of something included in the Scala Platform just making it part of the Scala Platform without a vote? Is it meant to be shared by others using the Scala Platform, and if not, should it be shaded or just included in the ScalaJson source?

There is a document detailing ScalaJSON’s current issues here https://docs.google.com/document/d/1Xcaj7Alvmhch8nUf31L0UzELd5Pf8tzf2x6XbwQCeUQ/edit?usp=sharing