Toward a brighter SBT future

andreaTP · September 18, 2018, 3:01pm

At ScalaItaly I met with @dotta and we talked a bit about SBT. In the end we decided it would be worth bringing the topic to the community's attention, so here I am.

The recent transition from 0.13 to 1.x was a great success: many things improved dramatically with minimal breakage of the end-user API.
As a result, most of the ecosystem followed along smoothly with little effort and guidance.

Now I would like to think ahead and focus on the key pain points that everyone faces while using Sbt (especially Sbt plugin maintainers), perhaps to pave the road to Sbt 2.

We identified two major issue categories:

Performance
We still face long waits for sbt to start, and caching doesn’t help in many cases.
API simplification
So far, the most common reaction after teaching Sbt basics to people has been: “Now I understand why I did X in the past”. I do believe we can do better by analyzing the current situation and finding a minimal subset that makes builds easier to understand (we now also have the experience of cbt and mill to learn from).

Within these categories I would first like to collect specific items from the community, in no particular order. I'll start:

Performance
Dependency lookups seem to happen more often than expected, even with fully populated caches
ivy2 is used as the default, but tools like ‘coursier’ have demonstrated the tech gap
API
Scopes on 3 axes: it’s extremely hard to identify a Key in a 3-axis space
Configurations / Projects: both are “folders” for isolating Keys, files, etc. (i.e. the root concepts overlap); if we apply Occam's razor, only one should survive (I’m in favor of Projects, since Configurations seem to be a direct translation of Ivy2 concepts)
TaskKey is both an axis and a value, which is extremely confusing
Static lookup instead of dynamic dispatch of ‘.value’: this is antithetical to what we normally use in a program workflow
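
For readers less familiar with these last points, here is a minimal build.sbt sketch of a key scoped on three axes and of the static `.value` lookup. It assumes sbt 1.1 or later (for the slash syntax); the keys `verbose`, `report` and `maybeReport` are hypothetical and exist only for this illustration.

```scala
// build.sbt (sketch; verbose, report and maybeReport are hypothetical keys)
lazy val verbose     = settingKey[Boolean]("Hypothetical flag, for illustration only")
lazy val report      = taskKey[Unit]("Hypothetical task, for illustration only")
lazy val maybeReport = taskKey[Unit]("Runs report only when verbose is true")

lazy val core = (project in file("core"))
  .settings(
    // One key addressed on three axes:
    //   project = core (implicit inside .settings), configuration = Test, task = compile
    Test / compile / scalacOptions += "-deprecation",

    verbose := false,
    report  := println("reporting"),

    // `.value` on a task is resolved statically by the macro: a dependency
    // written inside a plain `if` is wired in regardless of the condition.
    // Def.taskDyn is the stock way to choose the dependency at run time.
    maybeReport := Def.taskDyn {
      if (verbose.value) Def.task(report.value)
      else Def.task(())
    }.value
  )
```

From the sbt shell, the fully qualified form of the first setting would be `show core / Test / compile / scalacOptions`, which spells out all three axes before the key name.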

I’m not claiming to have answers to the points I listed, and I respect and love the work done, especially by @eed3si9n, @dwijnand, @jvican (and many more). I’m just pushing for an even more awesome future: Sbt is still the first tool anyone coming into Scala has to touch, and we should take special care of it, IMO.

sadhen · September 18, 2018, 3:29pm

Performance is important.

eed3si9n · September 18, 2018, 4:32pm

Thanks. I am glad this transition went well, in no small part due to the community effort to migrate plugins and builds over to sbt 1.

performance

I think we can improve different aspects of the performance. I think a good place to start would be to come up with various scenarios of performance that we care about.

I’ve written a profiling guide, in case a contributor wants to investigate a particular issue.

idea: better UX on what’s going on

What I find kind of interesting is that performance can sometimes be subjective. There’s usually a demand to reduce clutter on screen, and reduce logging. But when a task takes too long, sbt looks like it’s pausing forever. Some UI improvement to let the user know what’s happening could both improve the user experience, and also quickly identify the culprit of slowdown in the build. See “Log is either too noisy or unhelpful”.

idea: thin client

To mitigate loading time, we could investigate the possibility of having an instance of sbt running in the background at all times. That’s the approach the thin client takes.

API simplification

I think spatial representation (using the subproject axis) is promising.

One area I started looking into is cross building. One thing I would recommend is to identify a specific idea, post it on the discussion board, and try it out in a build or a plugin.

idea: spatial-representation-of-test plugin

For instance, if people want to stop using the Test configuration and use the subproject axis instead, you can probably start doing that by injecting some test-related tasks into the Compile configuration, which can be done in a plugin.
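
As a rough illustration of the subproject-axis layout (not the plugin itself), the sketch below moves tests into their own subproject using only stock sbt. The project names, the directory and the ScalaTest dependency are assumptions made for the example. The subproject still relies on the Test configuration internally; removing that last step is what injecting test-related tasks into Compile would have to cover.

```scala
// build.sbt (sketch; project names, paths and the test library are illustrative)
lazy val core = (project in file("core"))

// Tests for `core` live on the subproject axis rather than in core's own
// Test configuration. Sources still go under core-tests/src/test/scala,
// because the stock `test` task only looks at the Test configuration.
lazy val coreTests = (project in file("core-tests"))
  .dependsOn(core)
  .settings(
    libraryDependencies += "org.scalatest" %% "scalatest" % "3.0.5" % Test
  )
```

With this layout, `coreTests / test` runs the tests, while `core` itself carries no test sources.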

librarymanagement API

When we shipped sbt 1, one of the things we did was to remove the direct dependency on Apache Ivy and abstract it behind the librarymanagement API. What we need next is a Coursier implementation of that API. There’s a pull request for that: https://github.com/sbt/librarymanagement/pull/190. Last I checked, it wasn’t running the scripted tests when it was hooked up to sbt, but I think this is a challenge that can be worked on.

AMatveev · September 18, 2018, 5:14pm

It would be great if there were something like the Gradle build cache.
The Gradle build cache is a cache mechanism that aims to save time by reusing outputs produced by other builds.

lihaoyi · September 19, 2018, 9:55am

You really should try using Mill. Virtually everything you mention is fixed in Mill™…

Performance
We still face long waits for sbt to start, and caching doesn’t help in many cases.

Mill uses a thin client + daemon by default so starts in ~200-300ms. Even “cold” non-daemon invocations start in 1-2s: much less time than SBT takes.

API simplification

Mill’s API is a lot simpler…

Dependency lookups seem to happen more often than expected, even with fully populated caches

Mill doesn’t do that

Scopes on 3 axes: it’s extremely hard to identify a Key in a 3-axis space

Configurations / Projects: both are “folders” for isolating Keys, files, etc. (i.e. the root concepts overlap); if we apply Occam's razor, only one should survive (I’m in favor of Projects, since Configurations seem to be a direct translation of Ivy2 concepts)

TaskKey is both an axis and a value, which is extremely confusing

Mill has no scopes; tests are just separate modules, and it works great. If you want some duplication (e.g. “every Foo module also has Bar and Baz modules included”) you simply define a trait and inherit from it.
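
A minimal build.sc sketch of that trait-and-inherit pattern follows. It assumes the Mill API as of this thread (a `ScalaModule` with a nested `Tests` trait), which may differ in later Mill releases; the module names and version numbers are purely illustrative.

```scala
// build.sc (sketch; module names and versions are illustrative)
import mill._, scalalib._

// Shared definition: every module that extends this trait gets the same
// Scala version and its own nested `test` module.
trait CommonModule extends ScalaModule {
  def scalaVersion = "2.12.6"

  object test extends Tests {
    def ivyDeps = Agg(ivy"com.lihaoyi::utest:0.6.0")
    def testFrameworks = Seq("utest.runner.Framework")
  }
}

object foo extends CommonModule

object bar extends CommonModule {
  // bar depends on foo; its tests are simply the module bar.test
  def moduleDeps = Seq(foo)
}
```

Here `mill foo.test` and `mill bar.test` would run each module's tests: the test code is addressed as an ordinary nested module rather than through a separate configuration axis.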

Static lookup instead of dynamic dispatch of ‘.value’: this is antithetical to what we normally use in a program workflow

Mill does that

better UX on what’s going on

Mill has pretty good debuggability of the build dependency graph, with mill path, mill plan, mill show, mill inspect, and mill visualize.

Some UI improvement to let the user know what’s happening could both improve the user experience

Mill prints the currently executing task to the terminal in blue, so you never have a case where it’s unclear what’s happening “now” from Mill’s point of view.

and also quickly identify the culprit of slowdown in the build

Mill dumps an out/mill-profile.json file automatically, showing how much time is spent in the various build steps:

$ cat out/mill-profile.json
[
    {
        "label": "mill.scalalib.ZincWorkerModule.classpath",
        "millis": 1312,
        "cached": false
    },
    ...
    {
        "label": "main.client.javacOptions",
        "millis": 5,
        "cached": false
    },
    {
        "label": "main.client.compile",
        "millis": 3876,
        "cached": false
    }
]

We already have this information, so dumping it to JSON is trivial to implement and very useful for users.

thin client

Mill’s client is much thinner than SBT’s thin client; it’s only 500 lines of dependency-free Java code, and boots in 200-300ms, rather than 3-5s.

If you haven’t tried Mill, you should: it’d be a lot easier than overhauling the SBT codebase. Even if you decide you do want to go back and invest time moving SBT forward, Mill can give you a taste of what things can be like. It’s a whole new world out here.

andreaTP · September 20, 2018, 8:37am

@eed3si9n thanks for your hints.

I personally like the spatial-representation and thin-client ideas; I will give these topics a spin in the coming days!

Performance
It looks like this is the most important point. I have the feeling that Sbt is not constrained CPU- or memory-wise (even if a faster startup would be appreciated), but IO is problematic since caches are often “re-checked” online (or at least that is the impression). Has any action plan already been started? Is the coursier integration going to help here?

Overall, I’m not familiar with how the Lightbend Tooling team works. For the Akka Team we can follow along at https://github.com/akka/akka-meta; is there anything similar for Sbt and related projects?

@lihaoyi I perfectly understand your statements and I truly appreciate your work on Mill.
Still, a lot of effort has already been spent on Sbt; the ecosystem is already huge, and most of the “boring” and “time-consuming” integrations have already been done there.
My personal position is to learn as much as I can from Mill, to understand what is actionable back in Sbt to move it forward.