Making `for` simpler and more regular

As an educator who works with beginners a lot, I feel that the difference between the yield version (expression) and non-yield version (statement) is significant. Of course, if there is really only one version (expression), then the true simplification is to get rid of the yield. This does require that the compiler be able to optimize down to a call to foreach when the result isn’t used for anything.

I have to say that there are other parts of this proposal that I dislike. While it might simplify the overall syntax to force {} if there is more than one element in the loop, that forces introducing the {} syntax, which I can currently avoid. For the novice it is important to distinguish simplification of the whole vs. simplification of what the beginner must be taught.

Instead of removing the foreach version, would it make sense to make it require a keyword as well?

Like: for (thing <- things) do println(thing)

Here the compiler would check that the statement or block after do returns Unit, and give a warning otherwise.

You can have a deprecation period where keywordless for comprehensions give a warning telling you to add do.

I would find for yield translating to either map or foreach based on the context to be strictly worse than the status quo. Especially since there is nothing enforcing foreach and map to both be available and to have exactly the same semantics except that foreach throws away the result.
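
To make the status quo concrete, here is a minimal sketch of which method each form of for calls today. Traced is a hypothetical container invented for this illustration (it is not from any library); all it does is record which method the compiler picked.

```scala
object ForDesugaring {
  // Hypothetical container: records which method the compiler chose.
  final class Traced[A](items: List[A]) {
    var calls: List[String] = Nil
    def map[B](f: A => B): List[B] = { calls = "map" :: calls; items.map(f) }
    def foreach(f: A => Unit): Unit = { calls = "foreach" :: calls; items.foreach(f) }
  }

  val traced = new Traced(List(1, 2, 3))

  // With yield: rewritten to traced.map(x => x * 2)
  val doubled: List[Int] = for (x <- traced) yield x * 2

  // Without yield: rewritten to traced.foreach(x => sum += x)
  var sum = 0
  for (x <- traced) sum += x
}
```

Nothing ties the two methods together: Traced could just as well define only one of them, or give them unrelated behaviour, and the compiler would not object.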

I don’t see how you could possibly teach for comprehensions before covering block expressions anyway.
For me personally one of the most confusing things about for comprehensions has always been that () and {} work differently than in every other part of Scala’s syntax.

Intriguing idea. I suspect it would help mitigate the problem (by requiring that brief moment of thinking about which one I want), and sort of puts the foreach and map variants on more level ground…

It also has unintuitive evaluation order. You can write things like

for {
  x <- xs
  y <- f(x) if x > 3
} yield y
and be mystified by why f(x) gets called on an invalid value.
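
A small sketch of why that happens, assuming a made-up f that logs every argument it receives (the object name and the log buffers are also just for illustration). The postfix guard attaches to the generator it follows, so it filters the ys after f(x) has already been evaluated.

```scala
import scala.collection.mutable.ListBuffer

object PostfixGuard {
  val xs = List(1, 5)

  // Hypothetical f: records each x it is called with.
  def f(log: ListBuffer[Int])(x: Int): List[Int] = { log += x; List(x * 10) }

  // Postfix guard: roughly xs.flatMap(x => f(x).withFilter(y => x > 3).map(y => y))
  val postfixLog = ListBuffer.empty[Int]
  val postfix = for {
    x <- xs
    y <- f(postfixLog)(x) if x > 3
  } yield y

  // Guard on its own line before the second generator:
  // roughly xs.withFilter(x => x > 3).flatMap(x => f(x).map(y => y))
  val guardFirstLog = ListBuffer.empty[Int]
  val guardFirst = for {
    x <- xs
    if x > 3
    y <- f(guardFirstLog)(x)
  } yield y
}
```

Both versions yield the same list, but only the second one avoids calling f on the value that fails the guard.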

Oh my god that’s disgusting. I’ve never seen it in any code anywhere until now and whatever syntax rule allows that should be incinerated. The if should have to be on its own line, or after a semicolon.

My issues with for are a bit different.

  • (... ; ...) vs {... ; ...}: somewhat sympathetic but overall meh. I believe arguments were brought forward for both styles.
  • Explicit vals: we had that and got rid of it. No way going back.
  • require semicolon before if: ditto. Outlawing “postfix” ifs where alignment is otherwise vertical could make a good rule for a linter, though.
  • yield with a type parameter: would complicate things further.
  • foreach: I would hate to have to work without it.

For me, the more pressing issues are:

  • Destructuring vs filtering. We should investigate whether there is a good way to distinguish irrefutable pattern matching from the rest. It has to work uniformly for val bindings and for generators. Maybe use ?=, ?<- for bindings that can fail? A failing match in a for expression would have to filter.
  • Leaving out yield accidentally. I would actually like to revive SIP 12. That would force you to write either yield or do and would also make the parens vs braces debate redundant.
  • Contorted code in the translation of = bindings, as described in this Dotty issue. This one is in my mind very important because it currently leads to bad generated code. To fix this, we’d need to change the rules so that an if in a for expression requires a “zero” or “empty” getter in the generated-over structure, instead of a withFilter method as of now. That’s a bigger change that affects libraries so migration is harder.
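
For readers who have not seen it, the = binding translation from the last bullet looks roughly like this when written out by hand, following the translation scheme in the Scala 2 language specification. The object and value names are made up for illustration, and actual compiler output may differ in detail.

```scala
object ValueBindingDesugaring {
  val xs = List(1, 2, 3)

  val viaFor = for {
    x <- xs
    y = x + 1
    if y > 2
  } yield y

  // Hand-written approximation: the binding is packed into a tuple so that
  // the guard and the final map can both see x and y.
  val byHand = xs
    .map { x => val y = x + 1; (x, y) }
    .withFilter { case (x, y) => y > 2 }
    .map { case (x, y) => y }
}
```

The intermediate tuples and the extra map pass are the overhead being referred to.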

for without parens or braces suddenly starts to look so like Haskell’s do-notation, especially if you indent yield the same as other lines:

for
  x <- 0 until N
  y <- 0 until N
  if isPrime(x + y)
  yield (x, y)

It looks so nice and clean!

Do you mean writing several consecutive a ?= expr or a ?<- expr lines, with the meaning that the first successfully matched line wins? If so, are you going to treat all consecutive ?=s (or ?<-s, respectively) as belonging to the same matching block? That would mean you cannot have two different matches one after another (unless you put something like () = () in between).

No I just meant ?<- would behave like <- now does. <- would be reserved for destructuring matches, where it is a compile-time error if the right-hand-side does not conform to the pattern’s type.
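
A hand-written sketch of the distinction, using only method calls so that it does not depend on any proposed syntax. The expansion shown for the refutable case is an approximation of the filtering behaviour described in this thread, and the names are made up for illustration.

```scala
object RefutableGenerator {
  val pairs = List((1, 1), (2, 0), (3, 1))

  // Approximately what `for ((x, 1) <- pairs) yield x` has meant so far:
  // pairs that do not match are dropped silently instead of throwing.
  val filtering = pairs
    .withFilter { case (_, 1) => true; case _ => false }
    .map { case (x, _) => x }

  // A destructuring-only generator, as proposed for plain <-, would need
  // no filtering pass because the pattern cannot fail for this type.
  val destructuring = pairs.map { case (x, y) => x + y }
}
```

Under the proposal, the first form would be spelled with ?<- and the second with plain <-.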

I am in favour of requiring do (as well as much of the rest of SIP 12). Though it’s a big change in the short term, in the long term I think it regularizes the language more.

+1 for reviving SIP 12’s do. I was surprised it wasn’t brought up earlier in this topic!

As for using pattern matching with generators, could someone explain the use case? I don’t … see a use in them if they throw a MatchError when they don’t match? All your case statements would just be case _ if predicate <- gen.

I’ve always found for comprehension guards to be useful and to do what I expect them to do (filter). This “postfix if” is rather confusing; I’m not sure why @odersky is so opposed to removing a potentially confusing syntax.

x <- gen if predicate

can be rewritten as

if predicate
x <- gen

or

x <- gen
if predicate

depending on whether the predicate utilizes x or not.

As a completely separate discussion point from the above, something that always bugged me about for comprehensions was the irregularity of what is and isn’t allowed inline in the for comprehension, such as why generalized statements aren’t allowed but why value assignment is, which just makes for workarounds like _ = doMyStatement().

What issues are there with just allowing arbitrary code inside for comprehensions between generators?

Yes, it’d mean using val x = ... instead of just x = ..., but I found assignment syntax to be confusing when I was learning for comprehensions. Assignment without val or var isn’t allowed anywhere else except case classes with default parameters, which is something of a corner case in my mind (and don’t forget that specifically allowing value assignment in for comprehensions is itself odd).
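
For anyone who hasn’t run into it, the workaround looks like this. It is a minimal sketch: the log buffer stands in for whatever statement you actually want to run, and the names are made up.

```scala
import scala.collection.mutable.ListBuffer

object SideEffectInFor {
  val log = ListBuffer.empty[String]

  val result = for {
    x <- List(1, 2)
    _ = (log += s"saw $x") // a statement smuggled in as a throwaway binding
    y <- List(x, x * 10)
  } yield y
}
```

The `_ =` is there only to satisfy the grammar; nothing is actually bound.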

This sounds extremely reasonable!

A compile-time needs-to-match would be great in other contexts, too. Especially when applying case (key, value) to Maps.

scala> Map(1 -> 2, 3 -> 4).map { case (key, value) => s"$key maps to $value."}
res0: scala.collection.immutable.Iterable[String] = List(1 maps to 2., 3 maps to 4.)

I would also be in favour of reviving SIP-12’s do, however, I’m in favor of that part alone. If it were full SIP-12 or nothing, I’d choose nothing.

Could you elaborate on that? Why is there no way of going back, and what were the oppositions against requiring val back then?

This is good for clarity but not for language simplicity. We already have case for bindings that can fail.

Okay, but then wouldn’t the same argument also apply to the first part of that (i.e. that guarding against missing yield by requiring everything to evaluate to unit is going to require quite a few ())?

When we have the same problem in two different places, one would expect that the solution would be the same in the two cases. Now, some details do differ, but that doesn’t mean that we should skip presenting the logic behind why the solutions should differ.

You’re right, I didn’t think enough about the first part of that.
To that I’d say, forcing an explicit () in some foreach-for-loops would actually be a higher burden in some cases than a missing procedure syntax.

That is, if the body of the loop was a one-liner before, then to suppress the possible warning you’d have to change that one line to either a { action; () } line, or into 4 lines if you want to avoid semicolons. It also introduces (if you’re using multiline syntax) an ugly } { (which, among other things, makes me want the do).

Following that train of thought, I agree with your second part. But my conclusion is different: while you want a warning in those cases, I’d vote for the opposite, namely do as the foreach alternative of yield, as @odersky brought up, and : Unit = for procedures.

It’s of course not the same syntax, but I’d say it follows a similar train of logic.

In both cases, for Unit-returning/side-effecting code, you insert a “meaningful” and explicit mention of that between the definition and the execution parts.

For for loops, that is what’s being looped over, the do keyword, and the body.
For methods, that’s the definition (name, modifiers, parameters), the : Unit type annotation, and again, the body.

I am not so convinced about that. case labels a “case”, i.e. an alternative. I feel it is a misuse of meaning to use it for something else. Also, it would introduce a modifier (case) in front of <- and = bindings where no modifier was allowed before. That’s a major syntactic irregularity/complication.

I think Ichoran might mean something more along the lines of this:

for {
  case (x, 1) <- pairs
  case (y, 0) = foo(x)
} yield x + y

Would that carry the same complications?

Yes, that’s precisely what I meant:

  • there’s now a potential modifier for generators, where there was none before
  • the lined-up cases suggest visually that these are alternatives, which is a lie of course.

Ah, now I understand it. I read your concern as case <- and case =, which had confused me. Sorry!

I wonder why it is that all of you got bitten by the fact that pattern matching in for-comprehensions filters instead of throwing match errors, and I really wonder if it is due to how you got introduced to the language. Personally, my first interaction with the language was reading Programming in Scala from cover to cover; I learnt what it meant there, and I never used it incorrectly.

I’ve had to teach Scala many times in teams I led, and none of the people used the filtering wrong, because they did not know you could do pattern matching in for comprehensions until told, and when they learned it, they were explicitly told what it did. Is the surprise you all get because you started with Scala by looking at code and guessing what it does? I frankly don’t understand this disparity in expectations.

Everyone seems to agree on

  • there should be irrefutable patterns that don’t expand to withFilter calls (and hence avoid that overhead)
  • for yields should be more clearly distinguishable from foreachs (I also like the for do alternative for this)
  • it should be more friendly to imperative fors for when they are needed, such as mixing side effects (obviously, without expanding to more filters, like today).