Ergonomics of end markers

odersky · June 21, 2022, 4:19pm

End markers were one of the new things that came with indentation syntax in Scala 3. They were largely unknown territory, so until now there have been no hard rules about where to put them. Here’s what the docs said:

An end marker makes sense if

the construct contains blank lines, or
the construct is long, say 15-20 lines or more,
the construct ends heavily indented, say 4 indentation levels or more.

Now, with one year of experience, I find the first point is by far the most important. An end marker “feels right” if the definition it closes contains blank lines. In that case, leaving out the end marker almost always seems awkward. So I would be ready to recommend making that a rule that can be enforced by linters and style tools. By contrast, the second and third points feel much more situational.

This brings me to another point. An end marker for definitions has to repeat the definition name. That helps to sync the programmer’s and the compiler’s understanding of what is closed, and it is a great reading aid for long classes and methods that span a page or more. But it can get in the way for shorter definitions. Example:

def longwindedlyNamedMethod = 

  // an inner method
  def recur(x: Int, y: Int): Int =
    if x == 0 then y else recur(x - 1, x * y)

  recur(1, 1)
end longwindedlyNamedMethod

Here, I think the second use of longwindedlyNamedMethod is overkill; it is obvious what is closed, so the name following the end just adds clutter. I have found myself hesitating over whether to put an end marker in situations like this. Sometimes I decided to drop the blank lines instead. So, maybe we need a third option. We could be a bit more flexible and allow end markers closing definitions to come without repeating the defined name, i.e. like this:

def longwindedlyNamedMethod = 

  // an inner method
  def recur(x: Int, y: Int): Int =
    if x == 0 then y else recur(x - 1, x * y)

  recur(1, 1)
end

An end marker that is not followed by anything must close a definition that starts on the same column; it cannot close an expression.

If that choice is available, I think I could reformulate the recommendations for when to use end markers as follows:

always use an end marker if the definition it closes contains blank lines
the end marker should be followed by an identifier if the definition it closes is longer than ~20 lines.

What do people think about these choices?

bjornregnell · June 21, 2022, 5:00pm

One option, similar to how you can already write end for, end while, etc., would be to allow end def. The current rules could then be kept largely as is, just adding “def” to the list of options in the EndMarkerTag syntax rule. That would allow:

def longname =
  "much code with blank lines"
end def

Seems more consistent with the current scheme IMHO (I have actually found myself trying to write end def while in a coding flow…), although a bit longer than just end.

(And perhaps allow end class etc as well for consistency.)
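For reference, here is a minimal sketch of end markers that Scala 3 already accepts: keyword tags such as end if, end for and end while close control expressions, and an identifier closes a named definition. The object and method names are made up for illustration; end def is not among the accepted tags and is deliberately absent.

```scala
// Illustrative only: EndMarkerDemo, sumOfEvens and countdown are hypothetical names.
object EndMarkerDemo:

  def sumOfEvens(limit: Int): Int =
    var total = 0

    for i <- 1 to limit do
      if i % 2 == 0 then
        total += i
      end if   // keyword tag closing the `if`
    end for    // keyword tag closing the `for`

    total
  end sumOfEvens   // identifier tag: must repeat the method name

  def countdown(from: Int): List[Int] =
    var n = from
    var acc = List.empty[Int]

    while n > 0 do
      acc = n :: acc
      n -= 1
    end while  // keyword tag closing the `while`

    acc.reverse
  end countdown

end EndMarkerDemo
```

Each marker sits in the same column as the start of the construct it closes, which is what lets the compiler check it against the indentation.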

soronpo · June 21, 2022, 5:11pm

I’m a heavy user of end markers, but only thanks to scalafmt that adds them automatically, and without them I don’t think I would have migrated to indentation-based syntax.

For the proposal, I’m mainly worried about adding yet another option. Since the end token is added automatically, it does not bother me much that it has a name in short defs, and it is a must for long ones. So we’re mainly discussing if the readability for short defs is enough to justify adding this option. Not sure. One could argue that consistency is better.

bjornregnell · June 21, 2022, 5:16pm

Another worry with allowing just end is that people might get sloppy and use just that without end something even if that something is short, and the “sync” between compiler and coder on what is ended is lost.

After thinking more it seems consistent and safer to always require an EndMarkerTag. Your example with code clutter is solved by allowing end def.

tgodzik · June 21, 2022, 5:18pm

I agree that consistency is better and I would not allow a single end, since that might be problematic if anyone used end as a name. Not sure how much work that would require from the Scalameta parser though.

end def seems a better solution and would also not impact the parsing too much.

A question comes to mind, would we be able to have an automatic rule that would use def instead of a full name and what would the rule be? Or just leave scalafmt as is currently?

Ichoran · June 21, 2022, 6:33pm

This is my favorite option due to increased consistency.

I would also suggest allowing end def foo, end class Bar. It’s entirely possible, though atrocious naming, to have def foo inside class foo, which would render the disambiguation helpful (also useful in case of similar rather than identical names).

Then the general rule is: if you have a (braceless) block, you can end it with end; if you have a block of type b, you can end it with end b; if you have a named block with name n, you can end it with end n, and if you have both, you can end it with end b n. In every case, you can add any or all identifying information to the end marker.
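The rule above can be read as "bare end plus any combination of block type and name". A small sketch that enumerates the spellings it would permit, assuming a hypothetical helper proposedEndMarkers; this models the proposal only, not the current compiler, which accepts just the identifier form for definitions and keyword tags such as end if for control expressions.

```scala
// Hypothetical model of the proposed rule, not of Scala 3 as it stands.
// kind: the block type ("def", "class", ...), if the block has one.
// name: the defined name, if the block is a named definition.
def proposedEndMarkers(kind: Option[String], name: Option[String]): List[String] =
  val bare   = List("end")
  val byKind = kind.map(k => s"end $k").toList
  val byName = name.map(n => s"end $n").toList
  val both   = kind.flatMap(k => name.map(n => s"end $k $n")).toList
  bare ++ byKind ++ byName ++ both

@main def showProposedMarkers(): Unit =
  // Spellings the proposal would allow for `def foo`
  proposedEndMarkers(Some("def"), Some("foo")).foreach(println)
```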

One caveat about my recommendation, in the interests of transparency: although Modula-2 was my first serious programming language, and I adored it (and still kind of miss it in some ways), I don’t personally like end markers in Scala. 100% of the time if something is long enough to need end markers, I would rather have braces: they are both clearer and less obtrusive to my eye. So I’m arguing about end statements from general principle, not from personal usage.

Also, I would hope that all automatic format or linting tools would have a setting to respect a lack of end markers for people (like me) who don’t use them.

lihaoyi · June 23, 2022, 2:04am

I personally think they’re too aggressive. In particular, both rules penalize programmers for breaking up chunks of code with blank lines. Scala code already has a reputation for being too dense, rather than too verbose, and we should encourage blank lines to mitigate that issue.

The “default” choice for users should be what we want to encourage: python-like syntax, with blank lines to aid readability. The two rules above mean either (a) the default is a ruby-like syntax, or (b) the default is to discourage blank lines, both of which I don’t think I can get on board with.

rgwilton · June 23, 2022, 7:42am

I often use blank lines within larger methods to group sections of related logic and make them more readable. E.g., if case statements had more than one line of trivial logic, then I would naturally put blank lines between them to make the separate case statements more visually obvious.

I like Scala 3’s ability to leave out braces for short methods and type definitions, i.e., the places where the braces don’t really aid visual readability of the code. However, I’m less convinced that end markers are objectively better than braces, even more so now that VS Code has started using colours to match pairs of braces.

Hence, I suspect that I will end up being inconsistent and using a mix of styles:

end markers for large top level classes/objects (e.g., 50+ lines)
brackets for medium sized methods
indentation only for small methods, blocks of control logic, etc.

davesmith00000 · June 28, 2022, 11:56am

I agree.

One of my code bases is 33k lines of pure Scala 3. I never use end markers except to signal the end of a long class/object, or occasionally in a file with lots of objects to help readability, e.g. in a file that holds a chunky ADT based on a sealed trait.

My own experience:

I’m very happy with end markers as they are.
I personally find end markers for defs weird/surprising/heavy/a code smell, and don’t use them.
I like using blank lines to break things up as @lihaoyi suggests, and I personally think it’s totally readable just with good indenting.
I don’t miss braces in the slightest.

I would be upset if we started enforcing a convention of end markers all over the place. Scala 3 is lovely and clean, let’s keep it that way!

TheInnerLight · June 28, 2022, 12:52pm

My view is that end markers should never have been introduced. I guess the practical alternative to that now is to recommend against their use in all circumstances.

Whitespace sensitive syntax is not revolutionary, it exists in lots of other languages such as Haskell, F# and Python. People are able to read and understand code perfectly well in those languages without littering their code with end markers. I’m not aware that the lack of end markers is considered a notable issue in any of those language communities.

I fully accept that there is an adjustment period in moving from braces to whitespace sensitive syntax but let’s embrace a convention that encourages full adoption rather than every code base being a different take on a slightly weird half-way house.

bplommer · June 28, 2022, 2:14pm

Strongly agree. Significant indentation is a nice way to reduce visual noise for small blocks of code, but when blocks reach a size where you want to explicitly delimit them - well, that’s a solved problem, and the solution is braces:

def longwindedlyNamedMethod = {

  // an inner method
  def recur(x: Int, y: Int): Int =
    if x == 0 then y else recur(x - 1, x * y)

  recur(1, 1)
}

TRoland · June 28, 2022, 2:30pm

If we’re making the method name optional at the end marker, I propose we shorten it down further to just } - and just to keep consistency, we could also indicate that people should be on the lookout for the end marker by inserting an opening marker - if we insert (or replace the colon) a {, I think that would be ideal!

charpov · June 28, 2022, 2:58pm

I like end markers. Maybe too much Ada and Modula in my younger days. I still find them preferable to:

        }
      }
    }
  }
}

or

        } // end of loop
      } // end of function foo
    } // end of method bar
  } // end of class
} // end of package

megri · June 28, 2022, 7:25pm

I agree with this. I also wonder if end markers are a band aid for poor visual guidance; essentially that the more indented something is—like 4 spaces—the less need to explicitly mark where things end.

odersky · June 29, 2022, 7:16am

An example of a language with end markers and indentation is Lean.

https://leanprover.github.io/lean4/doc/namespaces.html

End markers are mandatory for namespaces, which are analogous to objects in Scala.

bplommer · June 29, 2022, 8:27am

charpov:

I like end markers. Maybe too much Ada and Modula in my younger days. I still find them preferable to:
        }
      }
    }
  }
}
or
        } // end of loop
      } // end of function foo
    } // end of method bar
  } // end of class
} // end of package

That’s fair, but unlabelled end markers lose any advantage in this respect.

charpov · June 29, 2022, 12:15pm

Except for the fact that they’re optional. But I agree that labeled markers work better for me. I use them once loops/methods/classes have blank lines, and I don’t use unlabeled markers at all.

bplommer · June 29, 2022, 1:01pm

Yes, but my point is that exercising the option removes the advantage end markers have over braces, so any use case for unlabelled end markers is already served just as well by braces.

aepurniet · June 29, 2022, 5:02pm

You can delete an optional end marker and still have a valid program. However you cannot do the same thing with a brace.
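A minimal sketch of that point, using two hypothetical methods that differ only in the end marker; both forms are valid Scala 3 and behave identically.

```scala
// withMarker and withoutMarker are illustrative names.
def withMarker(n: Int): Int =
  val doubled = n * 2

  doubled + 1
end withMarker

// Same body with the marker deleted: still well-formed,
// because indentation alone delimits the block.
def withoutMarker(n: Int): Int =
  val doubled = n * 2

  doubled + 1
```

Deleting only the closing brace from a braced version of the same method would leave an unbalanced opening brace and a program that no longer parses.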

unkarjedy · July 2, 2022, 4:27pm

aepurniet:

You can delete an optional end marker and still have a valid program. However you cannot do the same thing with a brace.

It’s not an issue with proper IDE support.
Removing a matching closing/opening brace is a trivial action.