Proposal to remove procedure syntax

nafg · August 15, 2018, 12:43am

There have been discussions about improving for comprehensions over the years. I agree Dotty is the right time to push hard for them.

quiray · August 15, 2018, 4:52am

No, it is not. ) { vs ): Unit = { for public methods is not 1 character and, to be frank, I want my side-effecting methods/functions to be clear that they return nothing. So one cannot use the variant without a type, ) = {, because the last expression in a function might return something (surprisingly common). The type would get inferred as non-Unit, and then you have a type signature you never wanted, and the returned value is usually very confusing, because it is often a random useless value. And of course if you mistakenly call this non-Unit-returning function as the last expression in another function which was supposed to be a procedure, the confusing type and value spread.
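To make the inference concern concrete, here is a minimal sketch. The object and method names are made up for illustration and do not come from any real code base. The last expression in each body is a call to add on a mutable Set, which returns a Boolean.

```scala
import scala.collection.mutable

object InferenceSketch {
  private val seen = mutable.Set.empty[String]

  // No result type: it is inferred from the last expression, so this is Boolean.
  def rememberInferred(name: String) = {
    seen.add(name)
  }

  // Explicit Unit: the Boolean from add is discarded and () is returned.
  def rememberUnit(name: String): Unit = {
    seen.add(name)
  }

  // No annotation here either, so the inferred Boolean spreads to this caller.
  def rememberAll(names: List[String]) = {
    names.foreach(rememberUnit)
    rememberInferred("done")
  }
}
```

With these definitions, a caller that writes val ok: Boolean = InferenceSketch.rememberAll(List("a")) compiles, which is probably not what the author of a supposed procedure intended.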

jvican · August 15, 2018, 6:58am

Yes, this is the right moment to propose improvements in this area! I’ll respond in the other ticket.

scottcarey · August 15, 2018, 7:49am

Depends on the project. It’s usually better (and easier to test) to write stuff that returns values, so unit returning effects should be in fairly well contained places in the code-base. It’s a code smell if there are too many of them. A very, very bad code smell. “Too many” is very context dependent of course. If you’re truly working with low level I/O effects it’s quite a bit different than high level business logic. At higher levels of the program, tools like monix.Task are great, but this means types are Task[Unit] and not just Unit.

I guess that brings up one more negative for procedure syntax:
It can’t be used for ‘unit like clones’ whether that is Runnable closures, Task[_] or other user defined types that are basically the same thing.

If I were to have 5+ developers start at once, there would be more formal training. But in a work environment training of that sort is IMO a waste of time without enough scale. It is a far more effective use of my time to have one-on-one sessions with real code and real goals. We locate some projects from the backlog that are not on the critical path and are not too complicated. The process of learning a new language with an actual task to do, with people who can help you when you get stuck working in the same room, is a significantly more rapid way to learn a language than any abstract set of classroom sessions.

For those I am teaching, these issues don’t bog them down too long, but sometimes after a half day they are stuck on something that can be illuminated by adding a single type ascription or informing them of some fun quirk in sbt or specs2. This learning process is much more effective as they ‘learn the hard way’ but without being left to toil away in frustration too long.

So it’s not about the loss of productivity for my new hires, as we have experts around to answer why the compiler is giving them some obtuse error, do code review and enforce best practices, and let beginners trip over some warts, get up, and quickly learn from their mistakes.

But for others, who don’t have such support, I can only imagine that while one of my beginners has a 2 hour delay before asking me for help, somebody is trying to teach themselves Scala all alone late at night, getting frustrated and wondering whether they should just switch to Kotlin instead.

The key part of my statement is ‘if possible’. I should have been more specific – if the operation naturally has any way whatsoever to communicate any variance in what the side effect did, it should be returned.
Hence, SQL UPDATE and DELETE return the number of rows affected, you don’t have to go query the DB again to find out what happened. ConcurrentHashMap returns enough info on putIfAbsent to know whether the item was inserted and what the previous value was. TrieMap likewise has atomic operations that return info. Set.add returns boolean so that you don’t have to call Set.contains right after an add to see if it added.
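As a rough sketch of what those return values look like from Scala (the wrapper object and its method names are invented for this example; the underlying calls are the standard putIfAbsent and add methods):

```scala
import java.util.concurrent.ConcurrentHashMap
import scala.collection.concurrent.TrieMap

object InformativeReturns {
  private val javaMap = new ConcurrentHashMap[String, String]()
  private val trie = TrieMap.empty[String, String]
  private val javaSet = new java.util.HashSet[String]()

  // ConcurrentHashMap.putIfAbsent returns the previous value, or null if the
  // key was absent; Option(...) turns that null into None.
  def claimJava(key: String, owner: String): Option[String] =
    Option(javaMap.putIfAbsent(key, owner))

  // TrieMap.putIfAbsent already returns Option: None means we inserted.
  def claimTrie(key: String, owner: String): Option[String] =
    trie.putIfAbsent(key, owner)

  // java.util.Set.add returns true only if the element was not already present.
  def markSeen(item: String): Boolean = javaSet.add(item)
}
```

In each case a single atomic call tells the caller whether it won, so there is no need for a second lookup.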

I primarily work with mutable data structures and concurrent data structures for performance reasons, and these informative return values are invaluable.

For my own code, Unit returning methods are few and far between unless they are private implementation details. Effects are pushed to the edges, so that most logic is written and tested without state. Encapsulated mutable state ends up looking a lot like the data structures above – returning values related to the state change when relevant.
Effects on the edges often are I/O and, if asynchronous, would be Futures or Tasks and not raw Unit, unless the app is simpler and it’s OK for main() to synchronously call the main effect.

Being easy to spot is not a good or bad thing. It’s just a thing. You know what else would really jump out at me when reading? If the language dictated that all keywords be In AlTeRnAtInG cAsE.

I don’t find it any harder to locate unit returning methods when using : Unit than procedure syntax. It’s really easy to spot. Thus, I don’t ascribe much value to procedure syntax other than the lost horizontal space.

Can you clarify? Just what needs to be re-learned? Anyone who knows Scala today knows what:

def foo(): Unit = println("foo!")

means so I’m not sure I understand what needs to be re-learned if procedure syntax is removed.

I actually started with this as an alternate proposal but did not get far. My #1 motivation is that there are two ways for doing the same thing and we should have one. The other solution to that problem is to disallow : Unit = { and require that all unit returning methods use procedure syntax.

But this just makes the language even less consistent. It makes def a unique snowflake among def/val/var/lazy val.

def x {
  println("x")
}

Ok, quick, change that to val, var, or lazy val. Oh. Ok.

val x: Unit = println("x")

So I couldn’t just replace def with val? Nope. And writing this would be forbidden:

def x: Unit = println("x")

So any refactoring that changed types (perhaps from Unit to Task[Unit], which would often mean changing some defs to vals as well) would be more difficult.

Restricting unit returning methods to procedure syntax might imply we have to allow

val x {
  println("x")
}

for the sake of consistency, but… eeew.

And I kept coming up with more and more ways that the lack of uniformity between the forms would be awful and stopped.

RichType · August 15, 2018, 7:53am

In my view removing those abstract method forms is a big plus, especially as the base trait now defines the method return type for overriding sub traits.

gabro · August 15, 2018, 9:08am

FWIW, Scalafix has a rewrite for that case too. It’s a fairly trivial fix which shouldn’t require manual adjustments.

Jasper-M · August 15, 2018, 10:14am

If I didn’t read most of the rest of your post I would think you are advocating for keeping procedure syntax. Because side effecting methods simply cannot be replaced with vals, unlike pure (parameterless) methods.

buzden · August 15, 2018, 12:45pm

Hahaha, that’s funny because of this discussion with the following statements:

jducoeur · August 15, 2018, 1:33pm

… and that conversation was started in response to my comment you’re quoting, I believe…

Ichoran · August 15, 2018, 5:57pm

That’s a good point. Solo-learnability is important. The Rust compiler’s exquisitely helpful error messages are a perfect example of this. The errors you get when accidentally using procedure syntax are not one of scalac’s strong points.

Ah, yes, that is very important. I still maintain that having operations where you cannot count on the result and therefore have to query some return value (which the compiler will let you ignore!!) is not ideal, but often it’s unavoidable.

This isn’t true for me, though. I have a lot of Unit-returning methods because there are many operations that are guaranteed to succeed. For instance, I often have objects that encapsulate state, but am able to arrange the types so that the method calls always succeed if they can be called. For instance, if the state is some data and I want to smooth it, I’d usually much rather have a smoothing algorithm that works on anything than one that occasionally might fail or require further action on my part.

I’m still left wondering if anyone who actually does use a non-negligible number of Unit-returning methods prefers : Unit = over procedure syntax.

Don’t just type def foo(i: Int) { println(i) } and think you’re okay. It takes a while to smooth over those cognitive glitches. “Okay, I’m going to print this. Oh, wait, I have to use : Unit. Now, what was I thinking? Right, printing. So…”

It was! But I have been troubled by for-foreach vs. for/yield-map/flatMap/withFilter for a while. If the principle of language design one is following is to provide flexibility for expert users, it’s okayish. But for simplicity of language design and teachability in a small number of consistent concepts, it’s not so hot.

So you just prompted me to actually write something about it.

scottcarey · August 15, 2018, 6:35pm

Sure they can, if the val is a lifted function. val x(): Unit could be syntactic sugar for val x: () => Unit. I’m not advocating for that, but side effecting methods can most certainly be replaced with vals.

Let’s talk a bit about methods, functions, syntax, and consistency.

object Methods {
  def lunch: Unit = println("mmm, method sandwich") // bad style, because...
  def launch(): Unit = println("pressed the big red button")
  def handleOrders(orders: String): Unit = orders match {
    case "lunchtime"  => lunch
    case "launchtime" => launch()
    case "hammertime" => println("do do do do, do do, do do")
  }
}

Methods.handleOrders("hammertime")
Methods.handleOrders("lunchtime")

output:

do do do do, do do, do do
mmm, method sandwich

lunch is missing a parameter list, which means that it is source compatible with other forms that would break the contract. launch is not. Change them to vals / functions, and:

object Functions {
  val lunch: Unit = println("mmm, function sandwich") // OOPS! eager execution
  val launch: () => Unit = () => println("pressed the big red button")
  val handleOrders: String => Unit = _ match {
    case "lunchtime"  => lunch
    case "launchtime" => launch()
    case "hammertime" => println("do do do do, do do, do do")
  }
}

Functions.handleOrders("launchtime")
Functions.handleOrders("lunchtime")

output:

mmm, function sandwich
pressed the big red button

Maybe the meaning of val lunch: Unit should not be an eagerly executed unit, but instead be treated as if it was type => Unit? Of course we have lazy val, but that executes only once. Or allow the syntax val lunch: => Unit? Or even val lunch(): Unit as sugar for val lunch: () => Unit.
That is a bit off topic so I don’t want to go into detail there. Also there might be dragons, I haven’t looked.
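For reference, the evaluation rules that exist today can be sketched with counters. This is only an illustration with made-up names, not the hypothetical val lunch: => Unit form, which does not exist:

```scala
object EvaluationSketch {
  var eagerRuns = 0
  var lazyRuns = 0
  var methodRuns = 0
  var functionRuns = 0

  // val: the body runs exactly once, when the object is initialized.
  val eager: Unit = { eagerRuns += 1 }

  // lazy val: the body runs once, on first access, and never again.
  lazy val deferred: Unit = { lazyRuns += 1 }

  // def: the body runs on every call.
  def method(): Unit = { methodRuns += 1 }

  // function value: creating it runs nothing; the body runs on every apply.
  val function: () => Unit = () => { functionRuns += 1 }
}
```

Only the def and the function value re-run their effect on each use, which is why they are the forms that can stand in for each other at a call site like launch().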

Anyway, this is one motivator for the best practice that side effecting things should have parameter lists.

But in the case of launch, the change from def to val is source compatible at the call site and also does not change behavior. launch() means the same thing as a def or a function val. What is unfortunate is that we had to change from method to function syntax. It sure would be strange, but if val launch(): Unit was in fact short-hand for val launch: () => Unit then there would be more uniformity in the language in this case.

This is where my crazy example of trying to use procedure syntax for a val came from. It would be possible to allow val procedures, compiled as function values. That would make procedure syntax less niche and more consistent, but would not simplify the language.

Ok, but sometimes it is better to use some other type:

object Sammy {
  val lunch: Runnable = () => println("mmm, Sammy sandwich")
  val launch: Runnable = () => println("pressed the big red button")
  val handleOrders: String => Runnable = _ match {
    case "lunchtime"  => lunch
    case "launchtime" => launch
    case "hammertime" => () => println("do do do do, do do, do do")
  }
}

scala.concurrent.ExecutionContext.global.execute(Sammy.handleOrders("lunchtime"))

In some ways this is the cleanest of them all. But there is no way it can be made consistent with procedure syntax, even with the crazy val procedure idea. Procedure syntax only works with Unit returns and is incompatible with SAM typing.

This is also a case where it might have been nice (and consistent with def) to type:
val launch(): Runnable = println("pressed the big red button") as short-hand for val launch: Runnable = () => println("pressed the big red button"). Dragons are probably hungrier and larger here, though.

Lastly we get to:

// a very simplified Task variant like monix or cats effect
class Task[A] private (f: () => A) {
  def run(): A = f()
  def flatMap[B](fm: A => Task[B]): Task[B] = Task(fm(f()).run())
}

object Task {
  def apply[A](f: => A): Task[A] = new Task(() => f)
}

object Monads {
  val lunch: Task[Unit] = Task(println("mmm, sandwich within a sandwich"))
  val launch: Task[Unit] = Task(println("pressed the big red button"))
  def handleOrders(orders: Task[String]): Task[Unit] = orders flatMap {
    case "lunchtime"  => lunch
    case "launchtime" => launch
    case "hammertime" => Task(println("do do do do, do do, do do"))
  }
}
val task = Monads.handleOrders(Task { println("issuing order"); "lunchtime" })
println("about to issue order")
task.run()

prints

about to issue order
issuing order
mmm, sandwich within a sandwich

I guess the summary is that lifting from method to function has a lot of syntactical overhead, but SAM types, functions, and more complicated effect frameworks are all very similar. With some arm twisting it would be possible to make procedure syntax more consistent with lifted function values. But that is as far as it could go.

Methods are the odd one out even if we remove procedure syntax, in a couple ways. Perhaps the gap could be narrowed.

Procedure syntax only works with def, and only works with one specific return type. Of all the common ways to work with side-effects or signal side effects, it only works for one of them.

scottcarey · August 15, 2018, 6:45pm

Ah, thanks. I was thinking about the reading of code aspect since so much discussion had been around the ability to know if something had effects or not, which is something a reader cares about. A writer… well, they know that answer. You were talking about the writing of code aspect.

scottcarey · August 15, 2018, 6:50pm

Well, I have them, down in the ‘fiddly bits’ of various things that deal with state. And of course there is the ubiquitous foreach or similar.

But once I escalate it to a high level application ‘operation’ like a background task that compacts state or optimizes structures, those end up using other frameworks. They end up in Actors, Runnables, or Tasks that represent the same thing – but cannot use procedure syntax because that doesn’t work with any type other than Unit and I’m using other types to represent effects for high level application stuff.

NthPortal · August 17, 2018, 4:15am

That was exactly my experience when I first started learning Scala. def foo(p: Param) { is extremely visually similar to def foo(p: Param) = { (a single character of difference, excluding spaces), and the fact that the two have completely different meanings caused me quite a bit of confusion. It is also very similar to Java’s syntax (type method(params) {), but in Java, that syntax is how you return a value.
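A small sketch of that one-character difference, with hypothetical method names. The procedure form itself is left in a comment, since it is deprecated in Scala 2.13 and no longer accepted by Scala 3:

```scala
object LookAlikes {
  // Procedure syntax, shown only as a comment:
  //   def twiceProcedure(n: Int) { n * 2 }
  // It means the same as the explicit form below: the product is computed
  // and then thrown away, and the method returns ().
  def twiceProcedure(n: Int): Unit = { n * 2 }

  // Add a single '=' instead and the method returns the Int.
  def twice(n: Int) = { n * 2 }
}
```

Someone coming from Java would most likely expect both spellings to behave like the second one.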

I also find the syntax quite irregular and inconsistent. Why should Unit as a return type get special syntax, but not any other return type? Even the frequent sources of these ideas, C and Java, require you to specify the void return type.

quiray · August 17, 2018, 4:52am

Because it doesn’t return anything? After all it is in other places in the language, e.g. for comprehensions (missing yield for side-effects).
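For anyone unfamiliar with the for comprehension analogy, a minimal sketch (names invented for the example): a for without yield is translated to foreach and has type Unit, while a for with yield is translated to map and returns a collection.

```scala
import scala.collection.mutable.ListBuffer

object ForForms {
  val orders = List("lunchtime", "launchtime")
  val log = ListBuffer.empty[String]

  // No yield: translated to foreach, so the whole expression has type Unit.
  val logged: Unit = for (o <- orders) log += o

  // The same loop written out by hand.
  val loggedAgain: Unit = orders.foreach(o => log += o)

  // With yield: translated to map, so a new List comes back.
  val shouted: List[String] = for (o <- orders) yield o.toUpperCase
}
```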

As I wrote earlier, it seems to me that this is an attempt to make the language more verbose to help beginners. How long is one a beginner? A week, two, a month? What about advanced users heavily using side-effecting methods? Since the ) = { form isn’t guaranteed to return Unit, we would have to use the mouthy ): Unit = everywhere. I think all of these “beginner issues” could be solved by an IDE (e.g. a different color for a method name when using procedure syntax, or a special color for = to make it stand out) and better compiler messages or even compiler options for beginners (e.g. disallow/warn about procedure syntax); there is no need to push the language in the Java direction.

Jasper-M · August 17, 2018, 12:23pm

Strictly speaking it returns ().

dhoepelman · August 17, 2018, 1:25pm

Strongly in favor of this change, it makes the language more consistent and favors clearly denoting side-effects with : Unit.

quiray · August 17, 2018, 5:43pm

Yes, you are technically correct. It returns a value which the programmer never cares about, because the return value is known at compile time. Just because it can be made more regular doesn’t mean it improves the language. For example, would you say that dropping for or match (both can be expressed by other, more “regular”, constructs, thus removing irregularity; both are non-trivial from the perspective of the compiler) is a good direction?

If I keep using Scala and this proposal passes (it seems it is already decided), it will be quite a downgrade for me. I won’t be using the mouthy : Unit = {, but the shorter version = {, which will lead to worse type checking, since I am no longer guaranteed a method will return Unit (a random type from the last expression of a method can propagate and lead to confusing types and worse error messages, and I would also expect compilation times to worsen slightly).

I don’t really like the approach “beginner/compiler friendly at the cost of everybody else”. A simple default warning when using procedure syntax (e.g. “You are using an advanced language feature…” describing what procedure syntax is) with a compiler flag to disable it would IMO suffice to help beginners (and people who don’t want to use it).

The language direction seems incoherent from my perspective. In another thread, Mr. Odersky is against val in for comprehensions, even though it would lead to a more regular language; brevity is obviously preferred there. Yet here, keeping code concise for intermediate/advanced users is suddenly no longer the preferred direction; being regular wins, even though it makes the language a bit worse for them.

tpolecat · August 17, 2018, 8:19pm

I am in favor of this change because it removes a special case. Unit should be treated like any other type.

som-snytt · August 18, 2018, 3:17am

Apparently, any type with literal support in the language must be treated with suspicion.

(), true|false, 42, "hello, world", null

There are magic numbers and magic strings. We should start referring to the magic unit value.

Issue to lint inferred Unit.

I didn’t feel strongly about this topic when teachers of beginners began to report a few years ago that procedure syntax was confusing. But the pro-procedurists have persuaded me, and I also wish it had been relegated to a SIP-18 feature flag.

My biggest pain point is writing unit tests with : Unit =. But now, resigned to the new syntax, I realize that my test framework should not require me to write unit tests that way.

Has anyone checked whether Unit could be omitted? Then the Ur-assignment operator could be resurrected:

def f() := println("hello, world")

to mean omitted Unit.

I can’t wait for class syntax to evolve to class C = { } because I’ll miss having this conversation.