Optimizing BigInt for small integers

#1

Currently, BigInt is a wrapper around java.lang.BigInteger.

The Spire library has a SafeLong type with identical semantics, which specializes into two cases:

  • an instance when the underlying number fits in a Long,
  • another instance that wraps BigInteger.

That would help tremendously the BigInt performance profile when the underlying value is reasonable.

Spire is considering dropping SafeLong in the future due to the induced triplication (BigInteger, BigInt, SafeLong); however, the optimizations SafeLong provide could be ported to the Scala standard library.

It seems that source and binary compatibility could be completely preserved.

Would that contribution be appreciated? What would be the criteria used to decide whether the PR is merged?

2 Likes
#2

Excuse my ignorance, but I’m wondering: once you wrapped a Long and added logic to detect any overflow, aren’t you already more than half-way towards re-implementing BigInteger? In other words, are we sure that something that switches between Long and BigInteger will perform better than just using a BigInteger straight?

It appears BigInteger is based on an Array[Byte], which makes me wonder whether it would perform better if it was based on an Array[Int] or Array[Long]?

#3

I’m not considering reimplementing BigInteger. As per OpenJDK code, it is already based on Array[Int], and some operations are candidates for intrinsics.

The logic is here: Scala BigInt wraps BigInteger which wraps Array[Int]. Due to people possibly using the class type pattern matching, I doubt “BigInt” will be made an opaque type in future releases.

I makes total sense to optimize the small path of BigInt, which can then be used as a general purpose integer type with minimal overhead and sensible semantics.

However, I don’t have the original benchmark to back up my claim and the speed up would be workload dependent.

(I wouldn’t reimplement BigInteger anyway; the machinery for huge integers is positively scary – Montgomery multiplication etc. The Long/BigInteger split however, can be implemented and tested quite straightforwardly).

1 Like
#4

If this is really a win and if it doesn’t introduce pathological edge cases, wouldn’t it be better to implement it in the upstream OpenJDK BigInteger type so that everyone can benefit?

#5

The JVM BigInteger type is not exactly performant compared to other languages. I have no idea why it is not optimized further (see http://www.wilfred.me.uk/blog/2014/10/20/the-fastest-bigint-in-the-west/ )

However, this is not my immediate concern. What I have on my hand is well performing, thoroughly Scala code that could replace the stdlib wrapper without breaking compatibility.

2 Likes