Pre-SIP: A Syntax for Collection Literals

I would like to reiterate … the primary point of pain, by a huge margin, that I hear from everyone I work with is tooling, despite the progress with Metals and IntelliJ. It simply isn’t good enough anymore. Effort should be spent in that area, not this.

Many people agree that Scala is a fantastic language, but a great language must be supported by great tooling to succeed. People in the commercial world are leaving Scala because the developer experience is exhausting and frustrating. They are more productive in poorer languages like Kotlin or Java because the tooling more than compensates for the inadequacies of the language.

Features like this are nice, but I fear the elephant in the room is being ignored.

4 Likes

If we’re going to discuss tooling, then we’re going to need something more specific than a generic “tooling isn’t good enough” complaint.

Apologies, I should clarify that my issue is one of prioritisation and where energy should be spent, and to my mind it shouldn’t be on features like this.

2 Likes

This has been discussed at length both in this thread and the other one.

I did see the follow-up case class literal proposal; I thought that was an elegant way to handle this, though I’m still a little uncomfortable that they aren’t really named tuples anymore while sharing the syntax.

I agree the former is bad for multiple reasons, but I found it fun to illustrate this by taking it to an absurd conclusion: data defined as what is essentially unlabelled s-expressions. It’s because these are difficult to work with that lisps add records and structs.

1 Like

While I like the symmetry of also using a parentheses-based syntax for collection literals, given that we’re using one for case class literals, I don’t think it aligns as consistently as you might hope by merely removing the constructor name.

With the case class literals, when we remove the constructor name, we also want to add the labels, or constructor parameter names, to avoid the case of ambiguous unlabelled tuple ASTs.

But List.apply(elems: A*) has a repeated parameter. I don’t see a nice way to use elems in your examples. So there’s no visual distinction between the collection and an arbitrary unlabelled tuple, which is even worse than the unnamed-tuple-to-case-class situation. Furthermore, what if there were non-repeated parameters beforehand? Do we add a label to the first one? What about multiple apply methods?

Maybe this would work if (...)* were valid syntax whenever an applicable constructor exists that ends in a repeated parameter? I’m not sure this works as a consistent syntactic rule between the two cases (just omit Object.apply and name the arguments). The collection literal is definitely still a distinct case.
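
To make the ambiguity concrete, here is a small illustration of my own (plain current Scala, nothing from the proposal):

  // With the constructor name these are clearly distinct; dropping `List`
  // would leave nothing to visually tell the collection apart from a tuple.
  val xs = List(1, 2, 3)   // List[Int], via List.apply(elems: Int*)
  val t  = (1, 2, 3)       // (Int, Int, Int), an unlabelled Tuple3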

Joking: I file a priority claim on the idea of using tuples for literals in Pre-SIP: a syntax for aggregate literals - #213 by OndrejSpanel

I am afraid we are starting to move in circles, repeating the same arguments over and over with just slight variation. The thread is becoming very long and it is hard to read and remember what was already said, but it is still easy to post a new post. If this is to be constructive, some moderation effort (keeping track of arguments and content) would be necessary, but I do not see where that would come from.

4 Likes

I feel that this point is underappreciated. An improved type inference algorithm that can propagate this information inwards is probably a less intrusive and more general solution.

And actually, there already is a case in Scala where type information is propagated inwards: lambda expressions.

val x: String => Int => String = a => b => a * b

Here the compiler can infer that the type of a is String and the type of b is Int. Maybe instead of having more and more syntactic forms that allow this kind of inward propagation, we can somehow generalize it?
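
For what it’s worth, SAM conversion is another existing case where the expected type shapes the expression; a small illustration of my own:

  // The expected type Runnable converts the lambda via SAM conversion,
  // just as the expected function type above fixes `a: String` and `b: Int`
  // (so `a * b` repeats the string `a` exactly `b` times).
  val r: Runnable = () => println("hi")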

3 Likes

As there are already long posts above, let me just point out that we already have string templates (interpolators). For example, the JSON library circe provides a json interpolator to create Json objects from a triple-quoted string, such as the following one:

  val mat = json"""
    [ [1, 0, 0]
    , [0, 1, 0]
    , [0, 0, 1] ]"""

Likewise, one could define string templates to parse collection literals, and even use whitespace for a separator:

  val oneTwoThree = seq"1 2 3"
  val anotherLit  = seq"Pi  cos(2.0)  E*3.0"
  val diag: Seq[Seq[Int]] = mat" [1 0 0] [0 1 0] [0 0 1]"
  // Or even cleaner 
  val diag2 = mat"""
        1 0 0 
        0 1 0
        0 0 1
     """
  // https://en.wikipedia.org/wiki/Rotation_matrix  
  def rotate(a: Double) = realmat"""
      cos($a)   -sin($a)
      sin($a)    cos($a)
  """
  // We can even write complex numbers 
  // https://en.wikipedia.org/wiki/Pauli_matrices
  val pauli2: Seq[Seq[Complex]] = 
    complexMat"""
      0   -i
      i    0 
    """
  val empty = seq" "
  val mapHor = map""" 1: "one"    2 : "two"   3 : "three" """
  val mapVert = map""" 
      1 : "one" 
      2 : "two"   
      3 : "three" 
   """
   // how about combining list parsers and date parsers? 
  val boeMpcs: Seq[Date] = dates"""
      6 Feb 2025   20 Mar 2025   8 May 2025   19 Jun 2025
      7 Aug 2025   18 Sep 2025   6 Nov 2025   18 Dec 2025
    ""

This would have some advantages over the proposal.

  • It does not extend the syntax, so it has no impact on tooling. From skimming the comments, it seems the language is already a struggle for tooling developers to support.
  • The quotes create a boundary with the Scala language, which removes the need to fit with the rest of the language. Every literal template could implement its own little language, with a cleaner syntax than that of square brackets, commas, and braces.
  • It is extensible and adaptable to many types beyond those in the collections library. This avoids a division between core collections, with ad-hoc literal syntax in the compiler, and other types.
  • It is implemented as library code, for existing codebases to opt into, which prevents any accidental incompatibility. Moreover, each literal syntax can be implemented apart from the others, whereas the compiler parser is a shared monolith.
  • There are many state-of-the-art parser libraries to build upon, without having to fit them into the parser of the compiler.
  • It better handles deprecation. Any syntax proposed above, or any other that may be added, may later turn out not to be a good idea or fall out of fashion, such as the old XML literals. If and when that happens, it would be easier to deprecate library code than language syntax.
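
For concreteness, here is a minimal sketch of how such a seq interpolator could be written as ordinary library code today. The name seq and the restriction to whitespace-separated Int elements are my simplifying assumptions, not an existing library:

  // Hypothetical `seq` interpolator: split a whitespace-separated literal
  // into a Seq[Int]. A real version would be generic in the element type.
  extension (sc: StringContext)
    def seq(args: Any*): Seq[Int] =
      sc.s(args*).trim
        .split("\\s+").toSeq
        .filter(_.nonEmpty)
        .map(_.toInt)

  val oneTwoThree = seq"1 2 3"   // Seq(1, 2, 3)
  val empty       = seq" "       // Seq()

None of this needs compiler support, which is the point of the comparison above.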
5 Likes

Personally I would hold back here. Due to work reasons I have ended up spending my day job in other languages (primarily Kotlin right now), and there is a severe case of “the grass is greener on the other side” going on here.

Especially given the complexity of Scala (being a strongly, statically typed language), I would actually argue Scala has some of the best tooling out there. There are of course issues, e.g. sbt basically being its own sub-DSL, which doesn’t help approachability, along with Scala having to inherit all of the pros and cons of the JVM ecosystem. But the primary reason other languages get away with “better tooling” is not that the tooling is strictly speaking better (in fact it’s almost always worse); it’s that the language is simpler, and because the language is simpler a lot of things that would otherwise be part of the language have been migrated to tools.

There definitely was an issue with Scala’s tooling in the early days, particularly with features like implicits, where it was incredibly difficult in non-trivial codebases to figure out how implicit values were being summoned/propagated, but this is a solved problem now.

6 Likes

I was told offline that my comments on this thread have been perceived as snarky, and ignoring positive arguments while focusing entirely on small negative details to drill down against.

Being snarky was never my intention. I have a hard time identifying the snark in my comments, but ultimately what matters is that they were perceived that way. Of drilling down on the negative details, I am definitely guilty. This is often my state of mind when “reviewing” things (in the broad sense). Without constant effort on my part to highlight the positive things, it doesn’t happen; and I have clearly not put in that effort while answering in this thread. For all those things, I would like to apologize.

Trying to make amends, I do think there are a number of good things going for this proposal. Some highlights:

  • It’s been shown that unrelated visual noise prevents the brain from efficiently chunking “code text” while reading. When writing a significant amount of data in code, any visual chunk that is not the data itself adds strain on the amount of things our brain can process. The symbolic delimiters at the beginning and end of the collection are enough. Reducing or removing the non-symbolic syntax helps our brain chunk the data out. So in these situations, collection/map literals definitely help.
  • The target typing approach to adapt the literals definitely fits in Scala. The SAMs are a good example of that.
  • It does not impact TASTy nor binary compatibility (in either direction), which is always a good thing. Changes that only affect source meaning are a lot easier to keep track of in the long run, where compatibility is a major concern.
11 Likes

Well, I have worked in other languages too. I’ve been working with Scala since 2.7 and I see how much things have improved.

My current experience in Scala 3 is that compilation can take up to 2 minutes for 1000 classes (on an M4 Max Pro) in some cases. IntelliJ frequently fails to import projects, so people are running sbt on the command line, which causes its own issues. Metals also breaks in similar ways.
This kind of thing puts people off; I know many people who left Scala because of it.

1 Like

True, I have also been using Scala since 2.7 (which is around 15 years ago; wow, time flies).

I am also on an M4 Max Pro, and yes, Scala does take a while to compile. But so does any strongly typed language; Rust/C++/Haskell (with enough features) also take a long time. In fact I think that C++ is strictly worse than Scala in this sense, and Rust might be as well.

Obviously when compared against Go or C it’s much slower, but at the same time those languages don’t actually do as much as Scala does (a lot of the logic in those languages is deferred to runtime, which has its own issues).

There is also incremental compilation to help with this; a fresh compile isn’t done that often.

You can set IntelliJ to use sbt to import a project, which helps. Also, as a comparison, right now I am forced to use a year-old version of IntelliJ with my Kotlin project because otherwise IntelliJ runs out of memory and hogs all of the CPU (just an example of the grass always being greener on the other side).

True, but it is massively improving.

I am aware of this sentiment. From my experience, when people leave it’s often an emotive/triggered response, and they don’t usually leave the language for the reason they state. Using your example, an IDE failing to import a project is usually the straw that breaks the camel’s back, but the real reason(s) are often something else.

Also I think it’s good to put some perspective here: Scala is often used in highly complex, non-trivial projects, and because of this there is a selection bias at play. To put it differently, the vast majority of Java/Kotlin projects are much simpler in structure than Scala ones (talking about build-level complexity here).

This means that people get a skewed impression, because a project with the typical Java/Kotlin structure, written in Scala, has no issues being imported by IntelliJ or Metals, and if that was all Scala had to deal with then people’s opinions would be different.

And when those projects are as complex as the Scala ones, they are often much worse than the equivalent Scala experience. One ironic experience is that, back in the day, there was an sbt-android plugin that was significantly better than the Kotlin/Gradle plugin when it came to user experience (because sbt has principled solutions for problems like caching and classloader isolation which Gradle still doesn’t have). The issue was that the project was a one-man effort and never got proper support from any of the relevant communities, so it essentially died.

2 Likes

I fail to see how this is the case. If this was actually the impression being given, I think that’s indicative of the fact that, given the benefits of the feature (which in my view are extremely marginal), there are too many cons.

If the feature happened to give significant tangible benefits it would have been a different story.

5 Likes

I think Scala’s main success metric is whether it can kill Kotlin.
And Scala should improve the Maven plugin.

I think we can hold a vote and publicize it widely to finally decide which syntax to use.

On reflection, I now think writing named tuples with square brackets might not be that crazy after all. All we have to do is recognize that the concept has nothing to do with tuples and everything to do with case class literals or records, and stop shoehorning one concept into the other. So, assuming it’s not too late for redesigning named tuples: rename them as records, write them with square brackets and (as per Martin’s addition) use them as case class literals. Leave unnamed tuples alone; their pitfalls (no syntax for arity 0 or 1, and clashes in general with the parenthesis notation used for expression delimitation) were never a problem until now. For one thing, this would settle the question of whether named tuples should be convertible to unnamed ones, and in which direction. And we would get a coherent syntax for data values, with repeated and optional parameters and everything:

val b1: BuildDescription = [
  declarationMap = true,
  esModuleInterop = true,
  baseUrl = ".",
  rootDir = "typescript",
  declaration = true,
  outDir = pubBundledOut,
  deps = [junitInterface, commonsIo],
  plugins  = [
    [ transform = "typescript-transform-paths" ],
    [ transform = "typescript-transform-paths",
      afterDeclarations = true
    ]
  ],
  aliases = ["someValue", "some-value", "a value"],
  moduleResolution = "node",
  module = "CommonJS",
  target = "ES2020"
]

What do you think?
Edit: and with records we could do away with pattern matching on named tuples and avoid the ridiculous val (x = x, y = y) = ...
And: if this change is ever incorporated, would it be possible at the same time to de-experimentalize generic number literals, so that a Vector[BigInt] literal parses seamlessly, for the mathematicians’ happiness?
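
For context, generic number literals currently sit behind an experimental import and go through scala.util.FromDigits; a quick sketch of what de-experimentalizing would buy (assuming the standard FromDigits instance for BigInt):

  import scala.language.experimental.genericNumberLiterals

  // With the feature enabled, an integer literal too large for Long can be
  // typed as BigInt directly, without quoting it or calling BigInt(...).
  val x: BigInt = 123456789012345678901234567890
  val xs = Vector[BigInt](1, 2, 123456789012345678901234567890)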

3 Likes

I have recently been translating various Python and R code to Scala, and this proposal definitely addresses a Scala weakness.

2 Likes

I am on the fence about whether this feature would benefit Scala or not, but if it is implemented, I do believe that in the absence of a target type [x, y, z] should default to a mutable random-access collection.

In Python and JavaScript, we define a list as [1, 2, 3], but we also retrieve an element with the bracket notation: [1, 2, 3][1] == 2.
In Scala we use () for .apply() to retrieve an element from a collection.

I think this proposal adds an inconsistency: coming from Python I would expect [1, 2, 3][1] == 2 and not [1, 2, 3](1) == 2.
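
For reference, element access in current Scala goes through apply, i.e. parentheses:

  val xs = List(1, 2, 3)
  assert(xs(1) == 2)   // xs(1) is sugar for xs.apply(1)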

7 Likes

In that case, wouldn’t a translator tool help better than hoping the Scala compiler understands all idioms?

3 Likes