Optimizing BigInt for small integers

Currently, BigInt is a wrapper around java.math.BigInteger.

The Spire library has a SafeLong type with identical semantics, which specializes into two cases:

  • an instance when the underlying number fits in a Long,
  • another instance that wraps BigInteger.

That would tremendously improve BigInt's performance profile when the underlying value is reasonably small, i.e. fits in a Long.

Spire is considering dropping SafeLong in the future due to the induced triplication (BigInteger, BigInt, SafeLong); however, the optimizations SafeLong provides could be ported to the Scala standard library.

It seems that source and binary compatibility could be completely preserved.

Would that contribution be appreciated? What would be the criteria used to decide whether the PR is merged?

Excuse my ignorance, but I'm wondering: once you've wrapped a Long and added logic to detect any overflow, aren't you already more than half-way towards re-implementing BigInteger? In other words, are we sure that something that switches between Long and BigInteger will perform better than just using a BigInteger straight?

It appears BigInteger is based on an Array[Byte], which makes me wonder whether it would perform better if it was based on an Array[Int] or Array[Long]?

I'm not considering reimplementing BigInteger. As per the OpenJDK sources, it is already based on an Array[Int], and some operations are candidates for intrinsics.

The layering is: Scala's BigInt wraps BigInteger, which wraps an Array[Int]. Since people may be pattern matching on the class type, I doubt "BigInt" will be made an opaque type in future releases.

It makes total sense to optimize the small-value path of BigInt, which can then be used as a general-purpose integer type with minimal overhead and sensible semantics.

However, I don't have the original benchmark to back up my claim, and the speed-up would be workload-dependent.

(I wouldn't reimplement BigInteger anyway; the machinery for huge integers is positively scary -- Montgomery multiplication etc. The Long/BigInteger split, however, can be implemented and tested quite straightforwardly.)
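
To illustrate why the split is cheap: detecting Long overflow is essentially a one-liner on the JVM. A REPL-style sketch of the fallback idea (not Spire's actual code; the `add` helper is purely illustrative):

import java.math.BigInteger

// Adds two Longs, falling back to BigInteger only when the result overflows.
def add(x: Long, y: Long): Either[Long, BigInteger] =
  try Left(Math.addExact(x, y))
  catch {
    case _: ArithmeticException =>
      Right(BigInteger.valueOf(x).add(BigInteger.valueOf(y)))
  }

The common case never allocates a BigInteger; only the overflow path pays for one.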

If this is really a win and if it doesn't introduce pathological edge cases, wouldn't it be better to implement it in the upstream OpenJDK BigInteger type so that everyone can benefit?

The JVM BigInteger type is not exactly performant compared to implementations in other languages. I have no idea why it is not optimized further (see http://www.wilfred.me.uk/blog/2014/10/20/the-fastest-bigint-in-the-west/).

However, this is not my immediate concern. What I have in hand is well-performing Scala code that could replace the stdlib wrapper without breaking compatibility.

Hey Denis,

Thanks for your message, yes I think such a contribution would be highly appreciated :slight_smile:. If your implementation of BigInt is an order of magnitude faster than the current one for values that fit in a Long, I have no doubt that your PR will be merged! Do you need any help?

Hi Julien,
thanks for your message! I haven't hacked the stdlib before (apart maybe from bug reports and documentation), so I welcome help in setting up a good process.

First and foremost, we can't introduce bugs during the change. The Long/BigInteger split will introduce additional logic, and the interesting cases are when the numbers are just at the boundary.

I think the change should be introduced in a few steps (a rough sketch of the resulting hierarchy follows the list):

  1. Change BigInt into an abstract class and a BigIntegerBigInt implementation class. Verify that binary compatibility is preserved, and that tests pass.

  2. Introduce a LongBigInt implementation class; still perform the computations using BigInteger, but allocate the result as a LongBigInt if it fits in a Long. Be sure that existing tests cater for the boundary cases.

  3. Migrate the logic from Spire piece by piece to avoid the detour through BigInteger.
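
Here is a rough sketch of what the hierarchy could look like after steps 1 and 2 (the names come from the plan above; the method bodies are placeholders rather than the actual patch, and the ScalaNumber / ScalaNumericConversions plumbing is omitted for brevity):

import java.math.BigInteger

abstract class BigInt extends Ordered[BigInt] with Serializable {
  def bigInteger: BigInteger
  // Step 2: computations still go through BigInteger; only the *result* is re-wrapped.
  def +(that: BigInt): BigInt = BigInt(this.bigInteger.add(that.bigInteger))
  def compare(that: BigInt): Int = this.bigInteger.compareTo(that.bigInteger)
}

object BigInt {
  def apply(l: Long): BigInt = new LongBigInt(l)
  def apply(b: BigInteger): BigInt =
    if (b.bitLength <= 63) new LongBigInt(b.longValue) // fits in a Long
    else new BigIntegerBigInt(b)
}

// In the real patch these would have private constructors (or be private themselves).
final class BigIntegerBigInt(val bigInteger: BigInteger) extends BigInt
final class LongBigInt(val value: Long) extends BigInt {
  def bigInteger: BigInteger = BigInteger.valueOf(value)
}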

I'm currently working with Ubuntu and Scala on the JVM. I need help setting up a minimal process that verifies that binary compatibility is preserved and that tests pass. The testing process should also include parts of the community build after each step above, before we merge the PR into the stdlib.

In particular, I've never run any of the stdlib tests on my own computer; I'm familiar with testing on the JVM, but not with Scala.js or Scala Native. I may want to use test procedures with different levels of exhaustiveness to get feedback in a shorter amount of time (hacking on Scala is sadly not my day-to-day job).

BTW, I'm sure of the positive performance impact this would have. Is there a standard framework to benchmark improvements in the stdlib, and where is it documented?

MiMa does this for you. I don't know how to run it manually in sbt (I'm sure someone here or on Gitter can help you with that, or Google it), but Jenkins runs it as part of the build regardless.

JMH is used for benchmarking; you can find a README and code under test/benchmarks.
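
For reference, a JMH benchmark comparing the small and large paths could look roughly like this (a generic sketch; the actual benchmarks under test/benchmarks may follow different conventions):

import java.util.concurrent.TimeUnit
import org.openjdk.jmh.annotations._

@State(Scope.Thread)
@BenchmarkMode(Array(Mode.AverageTime))
@OutputTimeUnit(TimeUnit.NANOSECONDS)
class BigIntAddBenchmark {
  var small: BigInt = _ // fits in a Long
  var large: BigInt = _ // well outside the Long range

  @Setup
  def setup(): Unit = {
    small = BigInt(123456789L)
    large = BigInt("9" * 50)
  }

  @Benchmark def addSmall: BigInt = small + small
  @Benchmark def addLarge: BigInt = large + large
}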

Ok... binary compatibility will not be preserved, as the constructor of BigInt is public, and BigInt will become an abstract class.

However, source compatibility should be preserved as long as user code calls the BigInt.apply factory method.
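
Concretely, this is the difference in user code (REPL-style, with the current 2.13-era API; the first line is what would break if BigInt became abstract, the second would keep working):

import java.math.BigInteger

val viaConstructor = new BigInt(new BigInteger("42")) // binds directly to the public constructor
val viaFactory     = BigInt(42)                       // goes through the BigInt.apply factory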

Questions:

  1. Is breaking binary compatibility a deal-breaker for that optimization? There may be a way around that, by making LongBigInt a subclass of a non-abstract BigInt.

  2. If we accept breaking binary compatibility, then we have to assess the amount of source-compatibility breakage. How many libraries call the constructor of BigInt directly, rather than using BigInt.apply? Is the Scala community build actually rebuilding packages, or just running tests against an updated standard library?

It's a deal-breaker for 2.13 but not 2.14.

The community build rebuilds everything from source; that's why it takes ~24 hours to run :slight_smile:

I think I have identified a good way to proceed.

A. I will first clean up the current code to prepare the LongBigInt / BigIntegerBigInt split. In particular: channel direct constructor calls through BigInt.apply, optimize a few of the current methods, and port some of the Spire tests. This would be done preserving binary and source compatibility. I'd PR this against 2.13 so it could be merged ASAP.

B. Perform the breaking change with a minimal diff: make BigInt abstract and introduce a final BigIntegerBigInt implementation class with a private constructor.

C. Preserving source/binary compatibility with the previous step, add the LongBigInt implementation class and migrate code from Spire's SafeLong.


Questions:

  1. Is there such a thing as "too much test code" for BigInt? I'll err on the cautious side due to the amount of change this introduces.

  2. What would be the roadblocks to have this merged? Does it introduce risk for too small a benefit? Do I need to demonstrate performance benefits to have the merge approved in 2.14?

  3. I have one question about the design. At the final step, BigInt will have two implementations, one holding a Long, one holding a BigInteger. Currently, Spire's implementation has the invariant that the BigInteger instance must not hold an integer that's a valid Long. That enables a few additional optimizations, but it forces all instance creations to go through a size test and adds a bit of mental overhead when reading the code (see the sketch after these questions). What do you think?

  4. After step B), the modifications would be merged against 2.14. There is no branch named 2.14 yet, right?
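
To make question 3 concrete, here is roughly what the canonical-representation invariant would buy, building on the hypothetical LongBigInt / BigIntegerBigInt classes sketched after the step list above (REPL-style; in practice these would live in the BigInt companion):

// All construction funnels through a single size test...
def fromBigInteger(b: java.math.BigInteger): BigInt =
  if (b.bitLength <= 63) new LongBigInt(b.longValue) // canonical: always the Long form
  else new BigIntegerBigInt(b)

// ...so later code can rely on the representation alone:
def isValidLong(x: BigInt): Boolean = x.isInstanceOf[LongBigInt]

def sameValue(x: BigInt, y: BigInt): Boolean = (x, y) match {
  case (a: LongBigInt, b: LongBigInt)             => a.value == b.value
  case (a: BigIntegerBigInt, b: BigIntegerBigInt) => a.bigInteger == b.bigInteger
  case _                                          => false // mixed representations can never hold the same value
}

Without the invariant, the last case would have to convert and compare instead of returning false immediately.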

The more tests the better :).

If the motivation for the change is performance then I think having benchmarks demonstrating the improvements is a reasonable requirement to get it merged.

I think that's reasonable; benchmarks demonstrating these additional optimizations would be nice.

Yes, for now you can create your PR against 2.13.x, you can then retarget it to 2.14.x once that branch has been created.

In Scala.js, we would eventually like to have a third subclass that would use a JavaScript bigint internally. We could do that on our own if the subclasses that you introduce not only have a private constructor, but are private themselves.

Ok, I'll get a comprehensive test suite for BigInt running first, then I'll think about the next steps.

How do you usually run benchmarks on changes to the Scala library?

Happy to make the subclasses private to help performance in Scala.js; that'll be an additional argument to get this merged.

My main motivation in this: Spire has a SafeLong class that was proven faster than BigInt or BigInteger ages ago, and I'd like to integrate these improvements into the stdlib. Then the Spire typeclasses would no longer need to depend on Spire's concrete data types, which makes Spire more modular.

I may start to uncover things. See https://github.com/scala/bug/issues/11792 for one possible issue, and https://github.com/scala/scala/pull/8526 for the ongoing work.

Can I get feedback on the PR https://github.com/scala/scala/pull/8526, and on why it's failing?

Earlier commits failed because I thought I had uncovered a bug, thus I added a failing test. Now all tests are passing.

The scala repo requires that every commit in a PR passes CI; this makes it easier to bisect, to partially revert a PR, and generally leads to a better commit history.

Actually I found a way to preserve binary compatibility in 2.13, at the price of slightly increased memory usage.

I'm changing the BigInt class definition to:


/** An arbitrary integer type; wraps `java.math.BigInteger`, with optimization for small values that can
 * be encoded in a `Long`.
 */
final class BigInt private (_bigInteger: BigInteger, _long: Long)
  extends ScalaNumber
    with ScalaNumericConversions
    with Serializable
    with Ordered[BigInt]
{
  // Class invariant: if the number fits in a Long, then _bigInteger is null and the number is stored in _long
  //                  otherwise, _bigInteger stores the number and _long = 0L

  def this(_bigInteger: BigInteger) = this(
    if (_bigInteger.bitLength <= 63) null else _bigInteger,
    if (_bigInteger.bitLength <= 63) _bigInteger.longValue else 0L
  )

Then BigInt stays final, and we can have optimized logic for small values.
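
For illustration, a fast path on this representation could look roughly like this (a sketch only, not the PR's actual code; it assumes `_bigInteger` and `_long` are kept as private fields and that `bigInteger` rebuilds a BigInteger from `_long` when the number is in the Long encoding):

// Hypothetical helper inside the class shown above:
private def longEncoding: Boolean = _bigInteger eq null

def +(that: BigInt): BigInt =
  if (this.longEncoding && that.longEncoding)
    try BigInt(Math.addExact(this._long, that._long))   // fast path: no BigInteger allocated
    catch { case _: ArithmeticException =>
      new BigInt(this.bigInteger.add(that.bigInteger))  // result overflowed a Long
    }
  else
    new BigInt(this.bigInteger.add(that.bigInteger))    // at least one operand is big

The common case stays entirely on the Long fast path; a BigInteger is only allocated when a result actually overflows.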

Compared to the subclassing proposal:

  • in the case of a long-sized BigInt, we save the price of the BigInteger instance, so the 4/8 bytes of the reference are not an issue,

  • in the case of a larger BigInt, we now store an unneeded Long field, which implies a maximal memory overhead of around 10%; that maximal overhead occurs for a BigInt that only barely doesn't fit in a Long.

I conjecture that BigInt has two major use cases in the Scala world:

  1. use as a generic integer where most of the instances fit in a Long, but BigInt is used to avoid problems related to overflow; that's how we use BigInt (or the optimized SafeLong equivalent in Spire)

  2. use for cryptographic purposes.

In use case 1, we'll have a net win. In use case 2, the overhead is likely to be small, as the numbers will be much larger than a Long.

I propose to optimize BigInt while preserving binary compatibility in 2.13, running it against the community build as an additional safety measure.

Then split the logic using subclasses in 2.14.

What do you think?

The modification has been submitted as a PR for discussion.
