Pre-SIP: A Syntax for Collection Literals

Ichoran · January 16, 2025, 7:17pm

Despite protesting, I’m in the camp of thinking that we do have a problem. I just don’t like this solution as presented.

It’s the same problem, fundamentally, as x.plus(y) instead of x + y. Some operations are just so incredibly common that you want to have specialized syntax for them to reduce cognitive load and ease interpretation. Braceless syntax is another (successful) example of this.

In some cases, one writes a lot of collection literals. In those cases where it’s necessary, I absolutely do notice that it’s more awkward in Scala than in Rust or Python or whatever, if I can use the basic datatype. (If I have to use something else, everyone is in the same boat.)

The question is whether this can be done so that it (1) works with tooling and (2) is pretty much a strict improvement to everything. I don’t know about (1) but I’m pretty sure as designed (2) isn’t going to be true: the prospect for confusion over simplicity seems high.

dos65 · January 16, 2025, 7:21pm

I really like the proposed syntax. It seems like a small change that has a big positive impact.

I opened my current project and searched for List.apply. I haven’t found any place where I would be disappointed if it were replaced by [...].

For beginners, in addition to being well-known syntax from other languages, it may also contribute to a better starting experience, since they can skip some details about collections: whether to choose Seq or List, what the difference is, and so on.

I didn’t expect to see a tooling concern about it. Compared with previous changes like fewer braces, this one doesn’t feel problematic at all.

fanf · January 16, 2025, 8:32pm

About the concerns of learnability: It’s evidently not a problem in a dozen other languages.

The problem isn’t so much learnability in isolation as the fact that it creates one more schism between different groups of Scala devs, further eroding the idea of “a” language, or even a core set of shared conventions.

I’m an old Scala dev (almost 2 decades), and I often have difficulty understanding the syntax I see at conferences or in other code bases. Perhaps I’m especially dumb, but between Scala 2, Scala 3, with or without brackets, given or not, changes in “_”, etc., I feel extremely confused as soon as I need to share my experience with other Scala devs.
Adding one more voice to the babel tower doesn’t feel to me like the best way toward simplicity. Nor easiness, for that matter. Not for me, at least.

mberndt · January 16, 2025, 10:06pm

I agree, but collection literals don’t cross that threshold. They’re common but not that common, and, more importantly, the syntactic overhead can be brought down to a single character today without any language changes. Shout-out to @JD557 for his library straw man from the other thread (do look at the Scastie!)

What is an extremely common operation is to create a new object of the expected type by calling apply on that type’s companion object, hence the original “companion object apply” proposal. Limiting it to collections makes it both much less useful and easier to replace with a library. And at that point it simply doesn’t provide enough value.

eed3si9n · January 16, 2025, 10:15pm

I didn’t mean to imply that [0, 1, 0] syntax is especially going to be more difficult than any of the previous syntactical changes made in Scala 3.x lifetime. My meta-ask/reminder for the Scala Improvement Process is to perform environmental impact assessment prior to the acceptance of the proposals, as specified in Process Specification as follows:

Once the implementation is deemed stable, including appropriate tests and sufficient support by the tooling ecosystem, the implementers and reviewers can schedule the SIP to the next Committee meeting for final approval.

I just wanted to remind the stakeholders (including myself as tree-sitter-scala contributor) that this hasn’t been the case for recent SIPs, and this SIP would be a good opportunity to get on track.

tarsa · January 16, 2025, 10:33pm

between

and

there could be:

val v = Vector // or alternatively you could do import with renaming
val matrix = v(
  v(1, 0, 0),
  v(0, 1, 0),
  v(0, 0, 1)
)

It already works in Scala 2 and Scala 3 (see Scastie), and it has most of the conciseness advantages.

Sporarum · January 16, 2025, 10:34pm

For me the one thing that would make me think “We really need new syntax for this” would be nix-style collection literals:

val people = [
  Person("John") // no trailing comma !
  Person("Paula")
  Person("Rain")
  ...
]

val settings = [
  "brightness" -> "bright"
  "fov" -> 90
  "subtitles" -> true
]

With potentially even:

val matrix = [
  [1 0 0]
  [0 1 0]
  [0 0 1]
]

val settings = [
  brightness -> "bright" // lhs is a string literal or magic enum value
  fov -> 90
  subtitles -> true
]

Again, I don’t think we should do this, but that would justify a syntax change, as the above is absolutely impossible at the library level (discounting custom string interpolators)

scalway · January 16, 2025, 10:37pm

val min = 4
[3 min 12 33]
//[3.min(12) 33] or [3 4 12 33]

How will you parse this?

eed3si9n · January 16, 2025, 11:03pm

Python has built-in List, Dictionary, Set, and tuples, and there’s a special literal for each of the datatypes. It’s One Way of doing things, and WYSIWYG (what you see is what you get).

Scala 3 will have:

Three Ways of defining a list: List(0, 1, 0), 0 :: 1 :: 0 :: Nil, and [0, 1, 0].
What you see isn’t what you get

A good mid term exam question would be: Explain the following function (1000 words):

val makeList = [a] => (a: a) => [a]
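
For readers decoding that line, here is a sketch of how it would read in today's Scala 3, assuming the proposed literal [a] desugars to List(a) (an assumption: target typing could pick another collection). The leading [a] is a type parameter clause, (a: a) is a value named a of type a, and only the trailing [a] would be a collection literal.

```scala
// Scala 3 polymorphic function value, written with today's syntax.
// ASSUMPTION: List(a) stands in for the proposed literal [a].
val makeList: [a] => a => List[a] =
  [a] => (a: a) => List(a)

@main def makeListDemo(): Unit =
  println(makeList(1))   // type argument inferred as Int
  println(makeList("x")) // type argument inferred as String
```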

mberndt · January 16, 2025, 11:06pm

If Python is what those people want, then that’s what they should use.

Sporarum · January 16, 2025, 11:10pm

I won’t get more into details here, but it can only work if it parses as:

List(3, min, 12, 33)

For the other option, use:

[(3 min 12) 33]
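
In today's syntax the two readings correspond to the sketch below. The whitespace-separated literal itself is hypothetical, so both lists are written with List(...).

```scala
val min = 4

// Reading 1: every whitespace-separated token is its own element.
val asElements = List(3, min, 12, 33)

// Reading 2: `3 min 12` is an infix method call, which would need explicit grouping.
val asInfixCall = List((3 min 12), 33)
```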

Ichoran · January 16, 2025, 11:19pm

Having comma inference would be a big plus. But we already can:

val settings = (
  brightness = "bright",
  fov = 90,
  subtitles = true,
)

so it’s only the matrix version which we can’t really do now, and just like with semicolon inference, we would need to use the separators within the same line, so

val m = [
  [1, 2, 3]
  [4, 5, 0]
  [6, 0, 0]
]

is the best you could realistically do (but honestly, that’s pretty nice).

jeremyrsmith · January 16, 2025, 11:20pm

What grinds my gears about this, is that the other thread had lots of discussion and the overwhelming conclusion (from what I read – including agreement by the person who originally proposed it) was that it’s not worth adding complexity to the language (syntactically and for type inference) just to save a few keystrokes on defining collections. Yet here’s the idea moving forward anyway (and the reasoning smells a lot like “Well JavaScript does it”).

First off, there are several reasons why using square brackets for this is a bad idea: even if it doesn’t introduce syntactic ambiguity for the parser (is that assured?), it does add a big ol’ kink to the rules when you’re moving from C-ish languages to Scala – “square brackets are only for type parameters and type arguments; we don’t use them for indexing or collection-related things” gets an addendum of “except for collection literals – those are also square brackets”. What’s gained is not sufficient for the added confusion, IMHO.

More generally, special syntax for this or that isn’t going to move the needle for Scala. Being able to make examples look a little prettier isn’t going to help. Saving a few keystrokes in rare cases isn’t going to make a difference when people choose a language.

What’s going to make a difference are things like:

Performance. As fast as Java (faster is better, of course!). Lower-cost abstractions.
Tooling for the language. IDE support. Notebook support. Pleasant-to-use build tooling (and build tool integrations)
Useful/understandable/actionable error messages. Out of the box, but also (especially) in weird type-astronaut API situations.
Metaprogramming (Scala 3’s principled metaprogramming is really awesome, and I would bet that the shared mechanisms between compile-time macros and staging capability are probably Scala 3’s “killer app”, if I were a gambler. Staging has a huge amount of potential in particular)
Stability. Bincompat problems have held Scala back in the past. TASTy promised to solve this, but tools that really demonstrate the promise haven’t materialized.

IMHO, we need more discussion and focus on things like that and less discussion and focus on how syntax would make some cases look nicer or take fewer keystrokes. Keystrokes aren’t the problem, and syntactic changes that only save keystrokes by eliding explicitness are a distraction and an anti-improvement.

Folks who want a concise syntax for collections can just add Tuple conversions. This adds nothing to the language, prevents overloading square brackets, and allows concise syntax for creating collections inline.
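
A minimal sketch of what such an opt-in conversion could look like in Scala 3. The name tuple3ToList is hypothetical, and it only covers homogeneous 3-tuples; a general version would need one conversion per arity or some type-level machinery.

```scala
import scala.language.implicitConversions

// Hypothetical opt-in conversion: a homogeneous 3-tuple becomes a List
// wherever a List is the expected type.
given tuple3ToList[A]: Conversion[(A, A, A), List[A]] =
  t => List(t._1, t._2, t._3)

// The expected type List[Int] triggers the conversion on the tuple.
val xs: List[Int] = (0, 1, 0)
```

Nested literals would need more than this single given, since conversions are not chained automatically.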

lihaoyi · January 16, 2025, 11:47pm

A lot of complaints here are general complaints about tooling. I agree that’s a problem, but it doesn’t seem like a blocker: if the folks working on a change need to collaborate with Metals or IntelliJ to get a patch out before final acceptance, that’s probably doable.

One thing that everyone seems to miss is that this sort of “target-typed literal” syntax is already present in Scala:

SAM-conversion: we already have () => blah syntax that can mean different things depending on whether it is run standalone or whether it has a target type!
new{} syntax: this can become new Foo if the target type is Foo, and otherwise becomes an anonymous subclass of Object
Even arbitrary method calls like foo(bar) can mean different things due to type inference and implicits depending on the target type, v.s. called in a no-target-type context
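
The SAM-conversion case can be seen in a few lines. This sketch uses only java.lang.Runnable and plain function values; the names f, r and r2 are just for illustration.

```scala
// The same lambda syntax means different things depending on the expected type.
val f = () => "hi"                     // no target type: a Function0[String]
val r: Runnable = () => println("hi")  // target type Runnable: SAM-converted

// The explicit form that SAM conversion saves you from writing.
val r2: Runnable = new Runnable { def run(): Unit = println("hi") }
```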

Imagine if Scala didn’t have SAM-conversion, and you had to write out SAMs explicitly as new Runnable{ def run() = blah } whenever passing parameters. Or if it didn’t have type-inference and implicits, which would mean foo(bar) has the same result regardless of target-typing or not. That would definitely make the language simpler, but clearly there is a benefit to this kind of target-typed shorthand, and people appreciate the ones that we already have.

Arguably these target-typed shorthands are much of what makes Scala not Java. Which is why the cases where Scala is more verbose than Java really stick out like a sore thumb, with collection construction (like the one below) being more verbose than basically ~every other programming language in existence.

val matrix: Array[Array[Int]] = Array(
  Array(1, 0, 0),
  Array(0, 1, 0),
  Array(0, 0, 1)
)

There is definitely a lisp/scheme-like simplicity to Scala’s insistence on using the collection name to explicitly construct collection literals. It’s simple, it’s elegant, it’s easily parseable. And nobody else does it! Even Clojure has decided that despite its lisp heritage, it is worth special casing some collections syntactically with square brackets. The simplicity of lisp/scheme is worth something, but the rest of the programming community in every other language has decided that some things in life are worth special casing, and I don’t think they are all wrong.

djspiewak · January 16, 2025, 11:57pm

Respectfully, your argument is basically “Scala already has a lot of syntax sugar, so there’s precedent for adding more” combined with “many common languages have this specific sugar”. The problem with the first argument is it’s ignoring the fact that this proliferation of different syntaxes in composition is exactly what everyone finds hard about Scala! When teams and companies and even individuals choose to not adopt Scala, even Scala 2 (which is much simpler, syntactically), this is far and away the most cited reason in my experience. It doesn’t seem prudent to lean into this problem by adding to the pile.

As for the matrix example, it’s easy to get something much more concise like Matrix((1, 2, 3), (4, 5, 6), (7, 8, 9)) even today. If you want to be super fancy and rely on careful type inference, you could even drop the leading six characters. I just don’t see the burning need here.
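
A rough sketch of the idea in Scala 3, assuming a hypothetical Matrix helper fixed to three Int columns (not an existing library API):

```scala
// Hypothetical helper: rows are written as plain tuples.
object Matrix:
  def apply(rows: (Int, Int, Int)*): Vector[Vector[Int]] =
    rows.map((a, b, c) => Vector(a, b, c)).toVector

val m = Matrix((1, 2, 3), (4, 5, 6), (7, 8, 9))
```

Dropping the leading Matrix as well would additionally need an implicit conversion from the outer tuple, which is where the careful type inference comes in.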

mberndt · January 17, 2025, 12:13am

Java doesn’t have collection literals, not even Array literals, all it has is Array initialization with {}. What Java does have is List.of(), which is more verbose than Scala’s List().

Because other languages usually don’t have companion objects with a variadic apply method.

You keep coming back to this “everybody else does it too” rhetoric because at this point it’s impossible to justify this feature in any other way. Again, @JD557 has demonstrated that this can be done as a library. So where exactly is the point of this feature over simply adding something like that to the standard library? As far as I’m concerned, that, together with the overwhelming opposition we’ve seen in this thread really should be the last nail in the coffin for this thing.

satorg · January 17, 2025, 12:19am

I’d like to add my five cents on this, if I may… Personally, in my entire Scala career I have rarely suffered from the lack of collection literals, because in real-world apps and services collections are usually read from somewhere (configs, databases, files, etc.) and then processed. And if I do decide to cut some corners and hardcode collections in the code, I’ll most likely regret it later on. So from my perspective, the collection literals syntax looks like a minor improvement, if any, so I’m not sure it’s worth the turmoil.

Yes, languages like Python do have them, but I’m a little concerned about the trend of “pythonizing Scala” – Python is a very different PL that shines in different areas and struggles where Scala shines.

Yet Scala doesn’t have a clear, unambiguous syntax for tuples: the regular parentheses-based one that Scala uses is in no way clear, since it tends to conflict with regular grouping, method calls and so forth. Perhaps if Scala wants a clear collection syntax, it makes sense to start with tuples. Tuples (a) are collections too, albeit heterogeneous ones, and (b) are more often constructed inline in production-level code than regular collections.

jeremyrsmith · January 17, 2025, 12:26am

This is actually a pretty good comparison, because it’s highly analogous to what’s being proposed. However, SAM conversion doesn’t introduce any new syntax – it only allows existing lambda syntax to be used for a new purpose (seamlessly specifying things that are essentially functions anyhow). The closest analogous solution would be to allow using Tuple syntax to seamlessly specify things that are essentially Tuples anyhow – and this can already be done (in opt-in fashion) with conversions.

The other two examples are reaching, I think.

lihaoyi · January 17, 2025, 1:31am

Sure, but what are the alternatives? We can’t start removing things from the pile, because that would break backwards compatibility, which is also a big problem! Should the pile just stay exactly where it is, frozen forever, and anyone who wants to improve the Scala language should just close up shop and go home?

If you say “I don’t think this feature is worth it” I can accept that, but the surrounding reasoning around “not adding things to the language” doesn’t really make sense. There already is a language that is fixed without new features: Scala 3.6. It doesn’t make sense to limit Scala 3.7 and above to the same constraints when Scala 3.6 already exists and does the job.

jeremyrsmith · January 17, 2025, 3:18am

I don’t think this feature is worth it