Asking for your feedback on sbt + Scala Center announcement

One more thing, :stuck_out_tongue:

  • some attention to multi-module builds would be nice. Running tasks/commands over the root project doesn't have the effect you expect when, for example, a sub-project fails (it really ought to fail the whole run, but often it doesn't). Often I find that plugins work around these weirdnesses by redefining everything themselves, e.g.,
    – sbt-sonatype appears to override publish for every sub-project
    – sbt-doge redefines world+dog (and actually does a better job than vanilla sbt).

There should be some simple, well-documented rules about how running tasks on a root project works, so that plugin authors don't need to figure out work-arounds for them.

sbt-doge …

Yes, making sub-projects respect crossScalaVersions natively instead of requiring this weird plugin would be a good concrete improvement.

I'm using SBT (and Scala) for small projects, so I have my own perspective on its problems.

Performance is not an issue. If I need to recompile the entire project, scalac eats much more time than sbt does. If I work incrementally, then SBT is already running. A fast start is always better than a slow one, but the current speed is bearable. The same goes for Scala itself, which takes plenty of time just to run a hello-world script.

Using SBT is not so difficult, compared to other widely adopted Scala madness like typeclasses full of implicits generated by macros. SBT uses a dependency-graph concept that is slightly harder than the usual batch-script approach, but data-driven programming is quite familiar to the functional programmers who are attracted to Scala. And it is easy enough (compared to Free monads) to be used even by an imperative programmer. Syntactic sugar is a very common thing in the Scala world too, and it is used moderately in SBT.

SBT is confusing, especially for a newbie, in several ways. And I cannot suggest a way out of it.

The first issue is the build.sbt versus project/build.scala duality. SBT cannot refuse either, because it aims for scalability. A .scala file is surely overkill for a small test project, since it can take more lines to initialize the project than the project itself has, while a single line in the .sbt file is enough. For a large project, .scala is irreplaceable, since it can implement complicated build logic. Removing either would split the community: some would refuse to update SBT, some would go for a fork or alternatives.

Keeping both raises questions. Where should I put bits of my configuration? What is the .sbt equivalent of the .scala code, and vice versa? What is the order of evaluation? The latter was a problem for me when I started to use SBT. The confusion could be mitigated with good documentation describing all the confusing points. Just take into account that users may have previous experience with other build tools, possibly make/ant/maven, and that they bring the corresponding expectations.

So you need to teach the make user that although there is an order in which configuration files are read, they are evaluated later, in a lazy manner. The Maven user expects a special syntax for static configuration without any dynamic scripts; tell them that .sbt is generally used as configuration with the := syntax, and that project/build.scala is like an in-place plugin that describes all the non-trivial behaviour.

The second source of confusion is the half-way DSL. It sits somewhere between a completely new language and orthodox Scala. If you treat it as plain Scala, it will give you some surprises when compiling the project. If you treat it as a complete DSL, you may forget some courtesy like calling .value. But the DSL cannot be enhanced further without breaking the connection to the Scala world, and it cannot be given up either, because that would significantly increase the amount of code for small projects. The same scalability problem.
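A tiny build.sbt sketch of that half-way feel (the keys here are hypothetical, invented purely for illustration):

```scala
// build.sbt sketch with hypothetical keys, illustrating the ".value courtesy"
val greeting = settingKey[String]("a demo setting")
val shout    = taskKey[String]("upper-cases the greeting")

greeting := "hello"

// Inside := you must remember to call .value; plain-Scala intuition fails here:
shout := greeting.value.toUpperCase
// shout := greeting.toUpperCase  // does not compile: greeting is a SettingKey[String], not a String
```

It looks like ordinary Scala assignment, but := is a macro and .value is only meaningful inside it, which is exactly the kind of surprise described above.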

SBT emits confusing errors when it gets something it cannot chew. That is not a problem for seasoned Scala programmers; they have seen more complex and arcane compiler errors. But SBT should suit beginners. How else would a beginner start learning Scala, if not with SBT? And at that stage they cannot process build errors naturally. So you need some template/graphical configuration for newbies.

SBT derives its strength from the Turing completeness of Scala used as a configuration language. And therein lies great danger for a newbie. To guard them from the perils, some extra wrapper is needed that takes away the Turing completeness and gives confidence.

SBT is poorly documented for advanced users. The documentation for beginners may lack a few things, but you can spend an hour, another hour, at most an entire day if you are persistent enough, and you will get enough information to use SBT consciously. But the documentation has only two pages related to plugin development, and when you are writing a non-trivial plugin, you have to interact with SBT internals a lot. That is not documented anywhere. I was forced to look into the SBT sources and found them hard to read. Probably I'm a weak Scala programmer, but there were almost no comments.

So I spent a week fighting SBT, and won. But eventually 0.13.5 emerged and my code broke. I find it easier to freeze the SBT version than to take another dive into the SBT sources to fix my code. So I'm waiting either for SBT to stabilize, so I can take another dive and then forget about it for years, or for it to finally become well documented.

SBT forgets to export a number of essential pieces of functionality to its users. SBT uses them internally, but users have to reimplement them, which sometimes leads to very cumbersome code.

The first feature that comes to mind is dependency caching. Internal SBT tasks check whether their dependencies have changed since the last run and recompile the target only if some changes were made. But when you try to write your own tasks, you discover that no such functionality is provided.

Let's take a simple example: we have documentation in a folder of markdown files and would like to compile an HTML site from them. SBT will recompile the HTML every time, even if there is not the slightest change in the markdown files, even if their modification times are unchanged. And the user has no means to prevent it unless they are willing to write a bunch of code. Even the primitive make utility can compare file modification dates and avoid worthless work. That shortage becomes a pain every time I generate some files manually.
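As a sketch of the missing piece, here is a plain-Scala modification-time check (the helper below is invented for illustration and is not sbt API; sbt 0.13 does ship FileFunction.cached, which covers part of this, but it is easy to miss in the docs):

```scala
import java.io.File

// Out-of-date check in the spirit of make: the targets need rebuilding if any
// output is missing, or the newest source is newer than the oldest output.
def outOfDate(sources: Seq[File], outputs: Seq[File]): Boolean =
  outputs.isEmpty ||
    outputs.exists(!_.exists) ||
    sources.map(_.lastModified).foldLeft(0L)(math.max) >
      outputs.map(_.lastModified).foldLeft(Long.MaxValue)(math.min)
```

A task generating the HTML site could call this guard first and skip the work when nothing changed, which is all that the markdown example above needs.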

Another example is the behaviour of the test and run commands. They scan the entire code base, find classes with specific signatures, make a list of them and let the user select from it. Such filtering is not available to plugin writers. I have a use case where it is needed: my own testing and benchmarking utility, which needs to find all benchmark classes and execute them.

One more thing about the previous case. The first problem applies here too: with a naive approach to finding the candidates, a full scan would be performed each time the user executes the command. So you need a way to cache the scan results.

I've looked into the SBT source code, and it has a lot of black magic for such use cases. Carefully deployed magic, so users rarely wonder how their beloved runMain and testOnly work. But if you carefully read all the provided documentation, you should become curious how config keys, task keys and commands can be composed to reproduce the complicated behaviour of those commands.

So I think the biggest drawback of SBT in its current state is that you cannot reproduce its basic behaviour with the provided public API. Hence an indicator of SBT's maturity: when its Scala build implementation becomes decoupled from the SBT dependency-graph solver (together with the cache provider, logger and other utilities), so that all the Scala-related build mechanics are delivered as a separate plugin and SBT can be used as a generic build tool without it.

SBT needs call-by-name semantics for keys. That is a highly subjective topic, but I find the current implementation very inconsistent. Suppose you have keys A and B, where B depends on A's value, e.g. B := A.value * 2. They can be accessed through an Inner and an Outer scope, and A gets a reasonable default value of 1 in the Outer scope. How is Inner/B resolved? It takes Outer/B and inherits it, so it is 2. And what if Inner/A changes to 10? Inner/B remains the same: it was already assigned 2, and it was never reassigned. Call-by-name would mean that B follows A in every scope.

End users pay little attention to the issue, since they get their environment ready to use. Mid-level users (those who use the SBT API to write plugins) have already developed a workaround: abstract the configuration into a variable, and apply it to each scope separately. A piece of boilerplate, and an endless source of strange errors when someone forgets to apply the variable to a particular scope.
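The workaround looks roughly like this (the keys a and b are hypothetical, invented for illustration; inConfig is real sbt API):

```scala
// Hypothetical keys, with b derived from a
val a = settingKey[Int]("base value")
val b = settingKey[Int]("derived value")

// Abstract the derivation into a variable...
lazy val derived = Seq(b := a.value * 2)

// ...and remember to re-apply it in every scope where a may be overridden;
// forget one scope and that scope silently keeps the inherited b.
inConfig(Compile)(derived) ++ inConfig(Test)(derived)
```

With call-by-name semantics the single `b := a.value * 2` would suffice, because b would be re-evaluated against whatever a is in each scope.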

3 Likes

I think having a small(er) number of canonical build templates would help.

A great starting point would be the Scala equivalent of

"stack new my-project"
with options of 'library', 'application', 'multimodule'

The output would be an empty project with default settings and plugins;
e.g. 'library' would lay out a project (e.g. as per the Typelevel cats/dogs libraries) with core/, docs/, tests/, and default scalac options.
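For what it's worth, sbt 0.13.13 added an `sbt new` command backed by Giter8 templates, which could serve as the foundation; what is missing is the small curated set of official templates:

```shell
# Scaffold a minimal project from a template (requires sbt 0.13.13+ and network access)
sbt new scala/scala-seed.g8
```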

Some libraries, such as ScalaCheck and ScalaTest, should be included by default.

Not only would it make for a much easier build (essentially, most people would just fill in the gaps),
it would also encourage good practice. It would ensure that Scala builds are the same across most projects, and that would help adoption greatly.

I would go further and suggest that things like 'tut', 'microsites' and 'pgp' be standard defaults.

I think these are relatively small changes, but they would stop 'sbt' from being a hindrance to Scala adoption, which I think (rightly or wrongly) it is for some people.

It would be nice if scala-center could do something to improve the publishing experience. If we want more people to contribute Scala projects, it would be nice if we asked less of them to get going. I took a crack at it today and ended up in failure. I am sure I will get it going tomorrow, but I found the variety of options and configuration made it far more difficult than publishing a package to cargo, brew, rubygems, npm etc.

Ivy, Maven, search for a tutorial, find a snarky choose-your-own-adventure page explaining (somewhat accurately) how hard publishing is in Scala. I asked on Gitter/scala whether to use Sonatype or Bintray, and got an answer that they were both equally terrible. Then I tried following the documentation at http://www.scala-sbt.org/0.13/docs/Bintray-For-Plugins.html, which said to name my repo "sbt-plugins" when using the sbt-bintray plugin, then tried publishing with sbt-bintray, where it threw an error looking for a repo named "maven".

I know someone can jump in and give me historical reasons why it's harder than most environments, but what would really be helpful is if we could come up with concrete steps to make it just as easy as it is on the better competing platforms. I am trying to learn it better myself so I can provide more useful feedback; unfortunately, this is just an experience report from today, when I tried to publish my first sbt plugin. I will say I have easily spent 4 times as long on sbt as I have on the actual code I want to publish, which does little to encourage people to share their code.

2 Likes

To achieve the goal of moving to Scala 2.12 (IMHO the most important thing to be done), it would be good if the Scala Center helped migrate the sbt plugin ecosystem to sbt 1.0. The new features of 1.0 are small compared to the benefit of a modern Scala (allowing FLOSS volunteers to stop wasting time cross-building for 2.10).

1 Like

I release to Sonatype and this, to me, was not terrible at all. I followed http://www.scala-sbt.org/0.13/docs/Using-Sonatype.html
In case you want to have a look at a working example, see https://github.com/unic/ScalaWebTest for examples and usage documentation in the readme. Feel free to send a direct message in case I can help you.

PS: I feel that releasing is a bit more complicated than one would initially expect, because the problem is harder than expected. Artifacts need to contain certain metadata (author, license, …), artifacts have to be signed, and the release process needs a sign-off/staging step to allow redoing a release in case it is incomplete. Hope this helped.

the recursive build thing is cute, but it's basically just more boilerplate to maintain that I would rather keep in the build.sbt or supporting .scala files

I have yet to find a non-contrived use case for it.

We use the third level of recursion in sbt builds in the Scala.js build. Or rather in the sbt-plugin-test project of the Scala.js build. In that project, we have a project/project/build.sbt containing:

sources in Compile += baseDirectory.value / "../../../ir/src/main/scala/org/scalajs/core/ir/ScalaJSVersions.scala"

This makes it so that, in project/build.sbt, we can use ScalaJSVersions.current:

addSbtPlugin("org.scala-js" % "sbt-scalajs" %
  org.scalajs.core.ir.ScalaJSVersions.current)

to add the current version of the sbt plugin to the plugins of the sbt-plugin-test build. This in turn allows build.sbt to reference enablePlugins(ScalaJSPlugin).

In general, this pattern can be reused by any build that defines an sbt plugin and has a subdirectory containing a test for that version of the sbt plugin.

The minimal valid (and usually complete) .gitignore for an sbt build is:

target/

This is also applicable to multi-project builds, as that gitignore will also ignore foo/target/, as per the rules of .gitignore files.

Ignoring project/project/ (as I've seen done in some .gitignore files, including one linked somewhere in this thread) is incorrect, as project/project/ can contain checked-in files (e.g., the Scala.js build has some). What should be ignored is project/project/target/, but again, that is already covered by target/.

4 Likes

Hello,

If you are using IntelliJ IDEA, you want to add .idea to .gitignore.
There's a corresponding folder for Eclipse (I think it's .eclipse).

 Best, Oliver

No you don't. You want to add .idea to your global gitignore file. Because even if you use IntelliJ IDEA, I might want to use Eclipse on your codebase. And you never know what crazy editor someone will use on your codebase that will create spurious IDE-specific files.
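The one-time setup for that looks like this (the ~/.gitignore_global path is just a common convention, not required by git):

```shell
# Point git at a machine-wide ignore file, then add IDE noise to it
git config --global core.excludesFile ~/.gitignore_global
printf '.idea/\n*.iml\n' >> ~/.gitignore_global
```

After this, .idea is ignored in every repository on your machine without touching any project's checked-in .gitignore.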

3 Likes

I wanted to add a vote for reproducibility (also called out here: Asking for your feedback on sbt + Scala Center announcement )

At Stripe, we use bazel for scala builds. In bazel, the goal is 100% bit-for-bit reproducibility, and as far as we know (and we have tests for it) we meet that goal. We never have to run clean. We can refactor a large file moving classes from one file to another and have zero issues (no need to run clean). Lastly, we check in the sha hashes of all the jars we are using, so we know exactly which jars each build ran with. Sbt could also compete in this space!

I would really love it if sbt allowed me to commit a lock file with the transitive resolution of all jars (including sha checksums). This could also be faster, since we would not need to walk the pom graph when such a lock file is present.

Secondly, I would love to see a test harness for sbt to make sure we have bit-for-bit reproducibility even after moving file contents around (a common refactoring issue). Maybe such tests exist, but I still see bugs like this with sbt (in 0.13.15 most recently), and usually I just run clean and nuke the existing state, which is a shame and slows down the next build.

As a functional programming community, I hope we can make more of the build a pure function, and make the entire build fully reproducible by default.

For more information:
https://reproducible-builds.org/
https://bazel.build/ ("{fast, correct} - choose two").

5 Likes

Add support for building libraries from source via Github URL with efficient library caching.

It helps make forking of open source projects easier and eliminates the need for cross-compiling as well as the associated binary compatibility issues.

I'm not familiar with Scala.js, and I'm sure I may be missing some detail, but I wonder whether a plain-text resource file (version.properties?) with version numbers might be enough, in a hypothetical SBT version lacking recursive builds.

Oh, we would find a solution. I'm not saying it's critical. I'm pointing out that there are non-contrived use cases for the recursive nature of sbt.

2 Likes

I gave a talk about this topic last week at ScalaDays. Here are the slides (as part of the sample project): https://github.com/szeiger/leftpad-template/blob/master/The%20Last%2010%20Percent.pdf

The particular problem of sbt-bintray looking for the "maven" repository would be caused by not setting bintrayRepository := "sbt-plugins" as part of the plugin project.

1 Like

To 1): I found an ugly solution for adding an additional Artifactory (Maven repo) URL for the first start of sbt. :unamused:
You have to extend a configuration file in the sbt jar.
If someone is interested, I can document it in more detail.
I used it with sbt 1.0.0-M5.

We use the third level of recursion in sbt builds in the Scala.js build. Or rather in the sbt-plugin-test project of the Scala.js build.

The SBT docs suggest using a system property for this. That is the approach that I have used myself when writing tests for SBT plugins, i.e.

scriptedLaunchOpts := scriptedLaunchOpts.value ++ Seq(
  "-Xmx1024M", "-Dplugin.version=" + version.value
)

and

addSbtPlugin("uk.co.jhc" % "jhc-sbt-plugins" % System.getProperty("plugin.version"))

That seems easier to me than adding part of your project's sources to the build's sources. The ScalaJSVersions.current could be provided by sbt-buildinfo.

1 Like

I find myself thinking about SBT and Scala Center again this weekend and am thinking release cadence is a place the Scala Center may be able to help.

Simply put, I propose it is a good thing for the community that the build tool release cadence matches the language release cadence.

I frequently see being on Scala 2.10 touted as an advantage, with the term "long binary compatibility" spoken of as a good thing. I am failing to see the benefit of Scala being on a roughly 18-month release cycle while SBT is on a 36-month cycle. If it's a one-time failed experiment, then let's leave it at that; but given that just this month I am continuing to hear how the 36-month cycle was great for SBT, I am worried it may be repeated, and I feel that the threat of it being repeated is alone enough to discourage contributions in this space.

Can we just have a discussion of whether the separate cycles are preferred by tooling developers and users? Or maybe a poll?

2 Likes

you need https://github.com/fommil/class-monkey

1 Like