Pre-SIP: a syntax for aggregate literals

mberndt · July 3, 2024, 8:30pm

A syntax for aggregate literals

Hey there,

this thread was born out of a recent discussion on the Scala Discord.

Motivation

Unlike most other programming languages today like EcmaScript or C++, Scala does not have a literal syntax for collections and objects. It makes up for this with (potentially variadic) apply methods on the relevant types’ companion objects, e. g. List(1, 2, 3) or SomeCaseClass("foo"). This works, but it means that it is often necessary to spell out the name of a type (or rather, its companion object) when it could easily be inferred by the compiler. This makes many very common types of code clunky and verbose, and it is in stark contrast to other parts of the language where types don’t need to be given explicitly when they can be inferred from the context. val definitions don’t need a type, the type of a lambda parameter is usually inferred, and even the type that the lambda expression as a whole evaluates to can be inferred, e. g. it’s possible to pass a lambda expression to a function that expects a Runnable.
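To make the comparison concrete, here is a minimal sketch of the expected-type-driven inference that Scala already performs today. The names runTwice and countRuns are made up for illustration only.

```scala
// Types that are inferred today without being spelled out.
val n = 42                                   // inferred as Int
val doubled = List(1, 2, 3).map(x => x * 2)  // the type of x is inferred

// Hypothetical helper that expects a java.lang.Runnable.
def runTwice(task: Runnable): Unit =
  task.run()
  task.run()

// The lambda below becomes a Runnable purely because a Runnable is expected
// in that position (Runnable has a single abstract method).
def countRuns(): Int =
  var count = 0
  runTwice(() => count += 1)
  count
```

The proposal asks for the same kind of inference for object and collection construction.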

Another problem is that having to spell out the type of a collection makes code harder to refactor. When you change the type of a function parameter from List to ArraySeq, all call sites need to be rewritten even though no semantically relevant change happened.

An example from Li Haoyi’s os-lib library is the os.proc syntax. Since it is impossible for a method to have both parameters with defaults and a variadic argument list, it is necessary to work around this with an additional method:

os
  .proc("ls", "abc") // variadic function call
  .call(cwd = ???) // optional named args

It should be possible to write this as a single method call, and it could be if the command to be executed could be written as a collection literal.
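As a rough sketch of what "a single method call" could look like: runProc below is a hypothetical stand-in, not part of os-lib, and it only builds a description string instead of running anything. It takes the command as an ordinary Seq, which is what leaves room for the defaulted parameter.

```scala
// Hypothetical API shape: the command is a regular collection parameter,
// so a parameter with a default value can follow it.
def runProc(cmd: Seq[String], cwd: String = "."): String =
  s"[$cwd] ${cmd.mkString(" ")}"

// Today the collection type has to be spelled out at the call site.
val inDefaultDir = runProc(Seq("ls", "abc"))
val inTmp = runProc(Seq("ls", "abc"), cwd = "/tmp")
```

With a collection literal, the Seq(...) at the call sites is the only part that would go away.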

But it gets worse when you want to create objects of a deeply nested structure of case classes.

List(
  Person("Martin", Birthday(1958, 9, 5)),
  Person("Matthias", Birthday(???, 7, 11)),
)

This is very typical when writing tests, and it’s clearly very redundant. I often found myself working around this using the only class type that Scala does have literal syntax for, tuples, and then mapping over the list, which is quite clunky. It is compounded by the fact that a common technique to avoid name clashes is to nest the definitions of types inside the companion object of the types that they occur in. E. g. Id is a name that is likely to cause clashes, so people write it like this:
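The tuple workaround mentioned above looks roughly like this. It is a sketch: the field names of Person and Birthday are assumed, and the second row is just illustrative data.

```scala
case class Birthday(year: Int, month: Int, day: Int)
case class Person(name: String, birthday: Birthday)

// Write the test data as nested tuples, then convert it in one place.
val people: List[Person] =
  List(
    ("Martin", (1958, 9, 5)),
    ("Ada", (1815, 12, 10))
  ).map { case (name, (y, m, d)) => Person(name, Birthday(y, m, d)) }
```

The data itself is compact, but every new shape of test data needs its own conversion function.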

case class FooRequest(foo: FooRequest.Id)
object FooRequest:
  case class Id(i: Int)
def callFoo(r: FooRequest) = ???

callFoo(FooRequest(FooRequest.Id(42)))

This is very common in generated code, such as from the Guardrail code generator for OpenAPI schemas. Not only is this code mind-numbingly redundant, you are also going to need a large number of import statements to import all the relevant companion objects into your file.

Proposed solution

The solution I propose is to have square brackets as a new syntax to signify a call to a type’s companion object’s apply method. Which companion object that is is determined by the type expected in that position. Using it in a position where no expected type can be determined is a compile-time error.

In the simplest case that is just a val definition with a type annotation:

val ints: List[Int] = [1,2,3] // desugars to List(1,2,3) because a list is expected

But you could also be calling a function that expects a certain type:

def frobnicate(ints: List[Int], someArg: Int = 42) = ???
frobnicate([1,2,3]) // desugars to frobnicate(List(1,2,3))

Note that this cannot be done with a variadic function: the optional someArg parameter makes that impossible, hence the awkward os.proc(…).call(…) syntax in os-lib. A variadic function also doesn’t allow more than one list argument:

def dotProduct(
  as: List[Double],
  bs: List[Double]
) = ???

dotProduct(
  [1, 2],
  [3, 4]
) // desugars to dotProduct(List(1, 2), List(3, 4))

The List[Person] example could easily be rewritten either like so:

List[Person](
  ["Martin", [1958, 9, 5]],
  ["Matthias", [???, 7, 11]]
)

or like so:

[
  ["Martin", [1958, 9, 5]],
  ["Matthias", [???, 7, 11]]
]: List[Person]

The FooRequest example is also much simpler:

callFoo([[42]])

In addition to not having to spell out the object types here, you also don’t need to import them into your file any longer, which cuts down the boilerplate even more. With this syntax you will no longer miss the forest for the trees and instead focus on what’s important: 42, the answer to life, the universe and everything.

Points to clarify

How does this work for non-case classes? They don’t have an apply method on the companion object!
- it’s fine, just call the constructor instead, just like leaving out the new keyword already does in Scala 3
What if the context of the expression doesn’t require any type, e. g. val without type annotation?
- we could define a default for that case, like Seq, which is already special because of variadic functions. But I feel that’s just too arbitrary
can this be made to work for types like java.time.LocalDate where you need to call LocalDate.of instead of LocalDate.apply?
- I wouldn’t know how
can we eliminate the parens when calling a function with this syntax? E. g. foo[a] rather than foo([a])?
- this clashes with the syntax to apply type parameters. We could use a different syntax than [], but it would be very unfamiliar to most users. {} is already taken, and <> would likely clash with the less/greater-than operators. I think [] is probably best despite the paren issue
what about named parameters and Map initialization?
- Everything between the [] works like it always did: ["foo" -> 3, "bar" -> 5] for a Map, [answer = 42] for named parameters
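For reference, these are the present-day expressions that the literals in the answers above would stand for. Point and Config are hypothetical example types, and the comments show the proposed syntax, which does not compile today.

```scala
// A regular (non-case) class: Scala 3 already lets you omit `new`.
class Point(val x: Int, val y: Int)
val p: Point = Point(1, 2)                    // proposed: [1, 2]

// Map initialization is an ordinary apply call with pairs as arguments.
val m: Map[String, Int] = Map("foo" -> 3, "bar" -> 5)   // proposed: ["foo" -> 3, "bar" -> 5]

// Named parameters work as in any other argument list.
case class Config(answer: Int, verbose: Boolean = false)
val c: Config = Config(answer = 42)           // proposed: [answer = 42]
```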

Prior art

There is ample precedent for this kind of feature. C# has Collection Expressions and Target Typed New (thanks to Li Haoyi for pointing these out). Scala allows us to cover both of these with a single syntax as collections and case classes are initialized the same way: by calling the companion object’s apply method. C++ has list initialization, which covers both collections and classes/structs.

Q & A

Won’t this make code less readable? I won’t be able to see which objects are being created!
- We have experience from both other languages (like C#) and other language features (lambda expressions for SAM types) to suggest that this will improve readability by eliminating distracting redundancy – it is easier to find the important bits in the source code if there are fewer irrelevant ones
- Tooling like Metals can provide visibility into this (i. e. show you what companion object is called), much like it can show inferred types and implicit parameters today
- You still have the option to use the explicit syntax in places where you feel that it helps readability
Do we really need this just to avoid typing Vector every now and then?
- This also works for the initialization of case classes, of which there tend to be more than list literals, and I think it can cut down massive amounts of boilerplate in that area
- It also makes code easier to refactor (e. g. changing a function parameter’s collection type or renaming a case class)

I’d love to hear your thoughts and feedback on this proposal. I’d also like to thank Fabio (SystemFW), Haoyi and Luis (BalmungSan) for their invaluable input that led to this proposal.

sideeffffect · July 3, 2024, 9:39pm

This is a really good idea!

Essentially using [] as syntactic shorthand for apply from a companion object of the type which is imposed from above.

I hope Scala’s type system can be made to work in this top-to-bottom way (besides the ordinary bottom-up way).

This would be a solution to a real problem. Thanks for suggesting it, @mberndt

Edit: The only potential issue I can see is with reading. But only in some cases (in other cases, this proposal would clearly help). And in those cases, tooling would help: types can be displayed, Ctrl+click would still take you to the relevant apply. This feature is worth an experimental status at least IMHO.

sideeffffect · July 3, 2024, 9:54pm

You know how we have this marketing for Scala as Python, but with types?

If this got accepted and implemented, it would turn Scala into a Lisp/Scheme/Clojure, but with types. Pretty cool, I think.

bishabosha · July 3, 2024, 9:55pm

I would suggest that tuple syntax be readapted, rather than square brackets, which currently introduce type arguments in all cases.

mberndt · July 3, 2024, 10:47pm

Thanks @bishabosha!

I think that’s an interesting idea. [] are the norm for collections in most languages (Haskell, JavaScript, Python), but there is precedent for () as well, notably Perl with the my @ints = (1,2,3) syntax. It would also answer the question of what the desugaring would look like when there is no expected type for the expression: it would be TupleN(…) (though of course we don’t need () syntax to make that particular decision).
That said, there is one problem with using (), and it is that tuple syntax doesn’t support named parameters. Unfortunately, (5, x = 42) is nevertheless a valid expression today which evaluates to (5, ()) and as a side effect assigns the value 42 to the var x. So it’s technically an incompatible change, though I’m pretty sure that next to nobody writes code like that because it’s pointless.
A more serious problem with () is that you can currently wrap any expression in as many layers of redundant parens as you want – (((3))) is currently just 3, and I’d like to keep it that way.
I had also considered a syntax like new(), but that didn’t seem like a good idea given that Scala is moving away from new and also because it’s more verbose than []. The point here after all is to make things more concise. It’s true that [] is only used for types currently, but I think that’s actually a good thing because then it can’t clash with any existing syntax.

Ichoran · July 3, 2024, 11:27pm

Specific feedback

The obvious solution here is simply to remove this restriction.

With export statements, you can easily provide all of these for users with one import statement.

This is not a good choice for Scala. Square brackets always mean type parameters. But one can keep the idea and use a more suitable syntax, if it’s a good idea. Four ideas come to mind; maybe you will have more.

regular tuples with <- to form a rough parallel to destructuring in for comprehensions (<- is otherwise rather underused), so val x: List[Int] <- (1, 2, 3)
Decorated parens that mean “this had better be a known type; now use its apply method”. Decorating with : should be syntactically unambiguous, so val x: List[Int] = (: 1, 2, 3 :). It should also make one very happy.
Using a “fill in the thing that you know goes here” symbol in place of the companion name, and the universal symbol for that is _. Thus, val x: List[Int] = _(1, 2, 3).
Think of it as broadcast of parameters and use syntax that is an analogy of propagation of variadic parameters. val x: List[Int] = (1, 2, 3)*. (In this case you wouldn’t need a literal at all–a pre-existing tuple with the correct types should also be broadcast.)

Yes, it will make some code much less readable. What is the argument in

foo([4, [2, 9, true], "", [7], 0.15, [[3, [[5]]]]])

anyway? Yes, a sufficiently powerful IDE presumably will be able to tell you on hover what is actually going on with the 5, but when you make a language feature easy, people will employ it to the logical conclusion, and not everyone uses the most potent tooling all the time (especially since it almost invariably comes with substantial downsides).

So I don’t think we can claim that it won’t result in less readable code in some important cases. Instead, we have to argue that the benefit is worth the risk.

An alternative

Right now, you can already import the apply method and rename it. Whenever you really have a lot of redundancy, you can spell out exactly what you mean in advance.

import Birthday.{apply => b}
import Person.{apply => p}
List(
  p("Martin",   b(1958, 9,  5)),
  p("Matthias", b( ???, 7, 11))
)

(I took the extra liberty of aligning vertically.)

That’s pretty darn clear–you say what your types are. It’s not hard. It isn’t typical style, but I don’t see any reason it couldn’t be.

Another feature that would obviate the need for it

If we get partially-named tuples, and we get broadcast of tuples into function arguments, and we allow the early arguments for tuples to be slotted into varargs, and we allow named parameters after varargs (but they must be named if used; they can’t be positional), then there isn’t really anything left for this proposal to do.

All these other things are also plausible. So one would probably want to rule out some subset of them, because having too many ways to do the same thing makes it hard to read code.

bishabosha · July 4, 2024, 8:09am

actually we have the experimental NamedTuples syntax in 3.5.0-RC2

mberndt · July 4, 2024, 10:43am

Hi @Ichoran, and thanks for engaging with the proposal!

I do want to push back a bit though; I still think that this proposal adds value despite your counter-arguments.

Re variadic functions: the restriction that defaulted parameters and variadic parameters don’t mix exists for a reason – feel free to draft a Pre-SIP to resolve the ambiguities (I would support it!), but as things stand today, this doesn’t work. Also, even if we resolve this, it doesn’t change the fact that you can only have one variadic argument list, so the dotProduct example won’t work.

Re export statements: that is a lot of additional boilerplate that you need to write, it has a run-time cost (because export is implemented with forwarders) and it will be different for every project, whereas the feature I’ve proposed will work consistently everywhere. So I don’t think that idea is going to fly. We’d have to establish some sort of cultural norm to have library authors write these exports, and I don’t see any chance of that happening. Besides, this proposal is meant to eliminate boilerplate, not have people write more of it (i. e. exports).
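For readers who have not seen it, the export approach under discussion might look roughly like this. It is only a sketch: the api object is a hypothetical facade that a library author would have to write and maintain.

```scala
case class FooRequest(foo: FooRequest.Id)
object FooRequest:
  case class Id(i: Int)

// Hypothetical facade: re-exports the nested names under one object so
// that users need a single import.
object api:
  export FooRequest.Id

import api.*

// Id now resolves without the FooRequest. prefix, but it is still spelled out.
val request = FooRequest(Id(42))
```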

As far as syntax is concerned, I don’t really want to paint this particular bikeshed right now – I could certainly live with another syntax if it is suitably terse. @bishabosha’s proposal to re-use tuple syntax is certainly interesting if it could be made to work.
I will point out that _(1, 2, 3) is a non-starter because that is valid syntax today and equivalent to f => f(1,2,3).
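A small sketch of why that syntax is already taken: with a suitable expected type, _(1, 2, 3) is the placeholder form of a lambda that applies its argument to 1, 2 and 3. The value names here are made up for illustration.

```scala
// Placeholder syntax: the underscore stands for the function being applied.
val applyTo123: ((Int, Int, Int) => Int) => Int = _(1, 2, 3)

// The same function written as an explicit lambda.
val explicit: ((Int, Int, Int) => Int) => Int = f => f(1, 2, 3)

val viaPlaceholder = applyTo123((a, b, c) => a * b + c)
val viaExplicit = explicit((a, b, c) => a * b + c)
```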

As for readability, my personal experience is that it’s much more helpful to spell out field names rather than type names. There are countless examples of this:

JSON has become the de-facto standard for REST APIs, and you never spell out type names in JSON – but you always supply all the field names, and people seem to generally agree that this is readable
Kubernetes manifests which are written in YAML. Again, you always spell out all the field names but never the type names. At my company, we internally decided that we want to use a full-blown programming language to generate our Kubernetes manifests, and we went with Typescript over Scala specifically because it doesn’t require us to spell everything out explicitly. I think Scala should be suitable for this kind of use case
I also noticed that the longer I’ve been doing Scala, the more I’ve been using named parameter syntax for constructing case classes, and the more I’ve been irritated by the fact that toString doesn’t give me field names

But it’s also not always necessary, because the meaning of an expression can often be deduced from the data itself. Sure, 5 could be anything, but when you see something like [1958, 9, 5], it doesn’t take a rocket scientist to figure out that 1958 must be the year.

Regarding the “imported apply” idea, I’m not a fan tbh, because now I have to keep all those renamed imports in my head and remember that p and b aren’t functions that might do who-knows-what to my data but just aliases for the apply method. I’d rather have a syntax that tells me “nothing exciting going on here, just creating some objects”. More importantly: imports are just irritating. They don’t add anything meaningful, they only exist to avoid name clashes, and every time I want to write a complex data structure I need to first get all the imports right before my IDE gives me parameter assistance. That’s a huge loss of productivity right there, and I want it to go away.

Regarding the other features you mentioned – partially-named tuples, broadcast into function arguments – I’m afraid I don’t understand how they relate to the problem that this proposal is intended to solve: boilerplate-free initialization of case classes and also collections.

lihaoyi · July 4, 2024, 11:51am

I think this proposal is in the right direction for all the reasons @mberndt gave. Lots of details would need to be ironed out - collections vs case classes, parens vs square brackets, named vs positional params - but it would be a big step forward in making the Scala language more ergonomic

It’s worth calling out that other languages have bits and pieces of this already:

C# has target-typed new expressions, letting you instantiate classes while omitting the type name (Target-typed new expressions - C# feature specifications | Microsoft Learn), and target-typed collection expressions, letting you do the same for collections (Collection expressions (Collection literals) - C# reference | Microsoft Learn)

Span<string> weekDays = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"];

Kotlin is getting target typed collection expressions too https://youtrack.jetbrains.com/issue/KT-43871

val set: Set<Int> = [1, 2, 3] // creates a Set<Int> 
val map: MutableMap<String, Int> = ["one": 1, "two": 2, "three": 3] // creates a MutableMap

Swift’s collection literals are also target typed, e.g. allowing

var favoriteGenres: Set<String> = ["Rock", "Classical", "Hip hop"]

This proposal would have been unprecedented a decade ago, but in 2024 it has a lot of precedent in every other comparable statically-typed compiled type-inferred language. Now that alone is not enough to justify inclusion in Scala, but it should be enough to justify discussion and to put aside the reflexive “this is weird and unnecessary” feeling that accompanies every new feature proposal

Syntax matters, as does syntactic sugar. If it didn’t we’d all still be writing Java 6. There’s obviously a subjective cost-benefit tradeoff to be made for any language feature, but the fact that other languages have adopted this suggests that they found the tradeoff worthwhile, and we might too. If the C#, Swift, and Kotlin folks all agree something is the right thing to do, we should question whether Scala is so special as to justify choosing differently

Ichoran · July 4, 2024, 7:56pm

Varargs limitations

But that’s a different feature.

You certainly can’t

def dotProduct(xs: Double*, ys: Double*) = ???

dotProduct(3, 5, 7, 2)

because there’s no way to tell how much of each vararg to use. If it’s ever a pain point that we can’t

def f(xs: Int*, ys: Double = 1.0) = xs.sum * ys

then the language could be modified to allow it, with the rule being that if a parameter appears after varargs, you must always reference it by name. Incidentally, if you allowed an empty varargs, this would also double as a way to mandate parameter names:

def substr(str: String, *, position: Int, length: Int) = ...

substr("herring", 3, 4)  // Unclear--"r" or "ring"?--and not allowed
substr("herring", position = 3, length = 4)  // Aha

Anyway, if there are deficiencies in varargs itself, we should fix those. Note that you can have varargs at the end of each parameter block, so it’s only the default-after-varargs that is awkward. (And if it’s really really important, you can fake it with using.)
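To illustrate the point about parameter blocks, here is a minimal sketch of what works today: each parameter list may end in its own varargs, and a defaulted parameter can live in a later list. The name scaledSum is made up for this example.

```scala
// One varargs per parameter list is allowed, so two "lists" can be passed.
def dotProduct(xs: Double*)(ys: Double*): Double =
  xs.zip(ys).map((x, y) => x * y).sum

val d = dotProduct(1, 2)(3, 4)   // 1*3 + 2*4

// A default after varargs works if it sits in its own parameter list;
// the empty () is required when relying on the default.
def scaledSum(xs: Int*)(factor: Double = 1.0): Double = xs.sum * factor

val s1 = scaledSum(1, 2, 3)()
val s2 = scaledSum(1, 2, 3)(2.0)
```

The trailing () is the awkward part that the "must be named after varargs" rule would avoid.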

I think your proposal is entirely orthogonal to varargs. The point of varargs is to allow as many arguments of the same type as you need. The point of your proposal is to not have to repeat things that are known. There is of course an interaction: if the function has varargs, your proposal works with that, too. But the concerns are almost entirely separable.

No bikeshedding

Field names–whether yes or no, that’s not this proposal

Okay, but this is exactly the opposite from what you’re proposing. This is about adding redundant information that you ought to already know. Which is better:

[
  {"x" = 5, "y" = 7},
  {"x" = 2, "y" = 9},
  {"x" = 6, "y" = 0},
  {"x" = 4, "y" = 4}
]

Array(
  Vc(5, 7),
  Vc(2, 9),
  Vc(6, 0),
  Vc(4, 4)
)

If you think the latter has unacceptable redundancy, but the former is just clear, well, I think that’s because you’ve gotten used to thinking of data as essentially untyped save for a bundle of key-value pairs. And there’s nothing terribly wrong with that viewpoint–it works fine in many instances. But the idea that data should be typed also has advantages. And objectively (just count the characters!), the kv-pair version is the one with greater redundancy (and less safety).

Now, Scala is lacking a good way to interface with the land of key-value pairs where the keys are strings and the values are well-defined types. That’s what named tuples provides. Check it out! Now if you want, in Scala you can

Array(
  (x = 5, y = 7),
  (x = 2, y = 9),
  (x = 6, y = 0),
  (x = 4, y = 4)
)

Who knows what about data?

I agree that it’s a bit of a pain; the reason I mentioned it is because it makes far more explicit what the data types are. You have said so right there.

If you worry about whether something changes your data, can’t I worry about having no idea what the data even is? Presumably there was some good reason why the type was FooRequest not Int. And some good reason why FooRequest holds a FooRequest.Id not an Int. [[5]] says to me, “I know you think types are important, but I don’t, just use this data if you can”. That’s fair enough, but “hey, there’s a good reason this isn’t a bare Int” is also fair enough.

Now, I do agree that the FooRequest(FooRequest.Id(5)) thing is kind of ridiculous. You ought to be able to tell from context what is what, which is the point of the relative scoping proposal.

This would get it down to FooRequest(Id(5)) possibly with an extra dot or two before Id.

Your proposal would take it all the way down to FooRequest([5]). I can imagine this being even better, but I also can imagine this hiding an important distinction. This isn’t exactly an objection, but I do want to point out, again, that there are tradeoffs here. It’s not all just rainbows and ponies; people decided for some reason that being explicit was important, and you’re overriding that.

This is exactly backwards from the reasoning for requiring explicit infix to have infix notation, incidentally. People there, including @lihaoyi, were arguing strenuously that the library designer should be in charge of such decisions of usage. I argued otherwise, to empower the user over the library designer.

So I am sympathetic to making the same argument here: go ahead and [[5]] it if you want to (and if there’s only one possible meaning given the types available).

But I cannot accept both arguments at once; it’s simply inconsistent.

Why other features matter

With varargs, you have both a literal and a broadcast version:

def f(xs: Int*) = xs.sum

f(2, 3)  // Fine

val numbers = Array(2, 3)
f(numbers*)  // Also fine
f(Seq(2, 3)*)  // This is cool too

Is there any reason to restrict this capability to varargs? Not really. Maybe you want it guarded behind a keyword…but maybe not? The guarded version of the spread operator is proposed here.

Your proposal would, I think, supersede that because it’s not that hard to have an extra [] or ()* or whatever; the point is to not have to type the name of whatever you’re unpacking. But on the other hand, if you just view it as a spread, like with varargs, then

Array[Vc](
  (5, 7)*,
  (2, 9)*,
  (6, 0)*,
  (4, 4)*
)

is very close to the feature you’re proposing. The main difference is with named arguments, where your proposal parallels function arguments, but named tuples can’t be partially named.

So, anyway, it’s important to consider all of these things together, because many of them are trying to accomplish similar things, and we don’t want to end up with three ways to do the same thing.

Well, we might. But we can’t assess the tradeoff fairly by ignoring or undervaluing parts of it, and championing others in those cases where it shines. My goal here is to illuminate the tradeoffs, not reject the proposal.

Also, other languages have different tradeoffs. C# has the same feature in two different ways (one for most objects, where you have to say new() over and over again, and one for collections which has a bunch of restrictions on what is considered a collection).

Kotlin “is getting” sounds overly optimistic; the last word on that thread is, “We’re exploring possible designs for collection literals but aren’t ready to provide or commit to a specific version”, suggesting that this isn’t an easy thing to get right.

Swift has decided that it is an opt-in feature to be initialized by array literals or dictionary literals as part of their general initialization with literals capability.

These are all rather different tradeoffs than we’re discussing here.

So definitely the “wow this is weird, we shouldn’t” reflex is inappropriate. But it’s also important to make sure this fits Scala well, isn’t compromising aspects of Scala that are its strengths, and that out of various ways to accomplish something similar, we get ones that cover the important use cases but don’t provide too many different ways to do the same thing.

In particular, if we go with this, I think we should be very clear on (1) what else it would render unnecessary, and (2) how big a mess you can get yourself into by leaning on the feature too heavily.

Some questions

What happens if there are multiple apply methods?

class C private (i: Int):
  override def toString = s"C$i"
object C:
  def apply(i: Int) = new C(i)
  def apply(i: Int, j: Int) = new C(i+j)
  def apply(i: Int, s: String) = new C(i + s.length)

val a = Array[C]([2], [3, 4], [5, "eel"])

Does it work with unapply too?

case class Box(s: String) {}
val b = Box("salmon")

// Does this work?
b match
  case ["eel"] => "long!"
  case ["salmon"] => "pink!"
  case _ => "dunno"

val b2: Box = ["herring"]

// Does this print "I am a herring"?
for [x] <- Box(b2) do println(s"I am a $x")

Does it work if the type isn’t a class? Does it see through type aliases?

opaque type Meter = Double
object Meter:
  inline def apply(d: Double): Meter = d

val a = Array[Meter]([2.5], [1.0], [2.6], [0.1])


type IntList = List[Int]
object IntList:
  def apply(i: Int, j: Int, k: Int) = k :: j :: i :: Nil

val xs: IntList = [3, 2, 1]


type ListOfInt = List[Int]

// ListOfInt.apply not found, or uses List[Int].apply?
val xs: ListOfInt = [3, 2, 1]

Does it trigger implicit conversions?

import language.implicitConversions
given Conversion[Option[String], String] with
  def apply(o: Option[String]) = o.getOrElse("")

val o: Option[String] = None
val xs: Array[String] = ["eel", Some("bass"), o]

If a class can take itself (or a supertype) as an argument, can you nest [] as deep as you want?

val s: String = "minnow"

// One of `String`'s constructors takes a `String`
val ss: String = [[[[[[[[[["shark"]]]]]]]]]]

som-snytt · July 4, 2024, 8:09pm

Wordle 1,111 4/6*

Somehow my wordle is still in the clipboard.

I meant to say

dotProduct(xs = 3, 5, ys = 7, 2)

This came up on a Scala 3 PR that tweaked named args, where I thought this was the intended semantics. Namely, ys means we are now in the ys.

I think that is doable. The current rules are more about positional vs named, rather than varargs. So it can be current rules until you encounter varargs. After that, any change of parameter must be a named arg.

mberndt · July 5, 2024, 12:02am

Hi @Ichoran, and thank you for taking the time once again.

varargs

I think your proposal is entirely orthogonal to varargs

It does solve a problem that varargs as they exist today have, and the fact that this problem could also be solved in other ways doesn’t change that. Also the point here isn’t so much that varargs need to be fixed, it’s to demonstrate that varargs are not a substitute for the syntax that I’m proposing.

redundancy

Okay, but this is exactly the opposite from what you’re proposing.

No, it’s not as one-dimensional, my point isn’t that every saved keystroke = moar gooder. What I’m proposing isn’t to write everything with as little redundancy as possible, it is to give users a choice which redundant parts they want to spell out and which ones they prefer to have inferred from the context. Today, users can leave out parameter names, and the compiler knows which parameter it is from the position in the parameter list. Users don’t have that choice for redundant type names today.
I believe choice is good to have because which one is more legible depends on a lot of factors including the application domain, the data itself and the expected level of knowledge of the reader. One size does not fit all, and never allowing the type name to be spelled out would be as wrong as enforcing it.

about named tuples

I think that’s because you’ve gotten used to thinking of data as essentially untyped save for a bundle of key-value pairs.

That is not how I think about data at all, and I wouldn’t be using Scala if I did.

Now, Scala is lacking a good way to interface with the land of key-value pairs where the keys are strings and the values are well-defined types. That’s what named tuples provides. Check it out!

For now, named tuples are experimental, and I for one am not convinced that we want an entirely different category of types when we already have case classes – I would rather have a convenient syntax available to create case classes instead of rewriting all my case classes into named tuples. Not to mention that I might not even control many of those case classes, as is the case for example for the rather complex data types in zio-k8s.
And there’s more: named tuples can’t have defaults either which, again, makes them completely unsuitable for k8s manifests or the kind of code that Guardrail will generate for typical OpenAPI schemas. Moreover, named tuples require me to always specify each and every field name, and I don’t want that either.
So no, named tuples are very much not a substitute for what I have in mind, they’re not even close, and perhaps this proposal could supplant them entirely by making case classes more convenient than they currently are.
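To show what I mean by defaults and optional field names, here is a small sketch with a made-up Container class (it is not taken from zio-k8s or Guardrail). Case classes allow both today, as long as the type name is spelled out.

```scala
// Hypothetical manifest-like case class with defaulted fields.
case class Container(
  name: String,
  image: String,
  replicas: Int = 1,
  labels: Map[String, String] = Map.empty
)

// Defaults can be omitted, and naming a field is optional per argument.
val web = Container("web", image = "nginx")
val scaled = Container("web", "nginx", replicas = 3)
```

Under the proposal only the leading Container would disappear when the expected type is known; defaults and optional names would keep working as in any argument list.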

people decided for some reason that being explicit was important, and you’re overriding that.

It seems to me that that might have just been a historical accident more than anything else. As Haoyi and I have pointed out, many people, like the designers and users of C#, C++ and other languages, have apparently come to the conclusion that being explicit is not that important after all. C++ initialization lists work for pretty much everything for a reason.

Some questions answered

What happens if there are multiple apply methods?

[stuff] desugars to ExpectedType(stuff). Hence, Array[C]([2], [3, 4], [5, "eel"]) desugars to Array[C](C(2), C(3, 4), C(5, "eel")). It would just work.
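The desugared form can be checked today. This sketch repeats the class C from the question (with the third overload using its s parameter) and relies on ordinary overload resolution to pick the apply method for each element.

```scala
class C private (i: Int):
  override def toString = s"C$i"

object C:
  def apply(i: Int) = new C(i)
  def apply(i: Int, j: Int) = new C(i + j)
  def apply(i: Int, s: String) = new C(i + s.length)

// What Array[C]([2], [3, 4], [5, "eel"]) would desugar to:
// each call selects its overload from the argument types.
val a = Array[C](C(2), C(3, 4), C(5, "eel"))
val rendered = a.mkString(", ")
```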

Does it work with unapply too?

For now, my idea is just the desugaring of expressions above. Creating objects is much more common than destructuring (as evidenced by the fact that most languages had facilities for the former much earlier than the latter, if they have it at all), therefore I think the current destructuring syntax is probably good enough. But I could change my mind on this. If we want to replace named tuples with this, then it might be necessary to have that, too.

Does it work if the type isn’t a class?

If ExpectedType(stuff) works, then so does [stuff], it’s really quite straight-forward I would say!

The original proposal said something about companion objects and apply methods – I’ve changed my mind about that as it’s unnecessarily specific and doesn’t cover e. g. regular (non-case) classes.

Does it see through type aliases?

Yes (unless opaque)

Does it trigger implicit conversions?

Implicit conversions work like you would expect them to.

val xs: Array[String] = ["eel", Some("bass"), o]
// desugars to
val xs: Array[String] = Array("eel", Some("bass"), o)

If a class can take itself (or a supertype) as an argument, can you nest [] as deep as you want?

Only if that is the only unary constructor of that class. val c: C = [[x]] would desugar to val c: C = C([x]). If C has more than one unary constructor, say one that takes C and another that takes Int, then it’s not clear what the expected type of the expression [x] would be, so it’s not clear if it should desugar to C(x) or Int(x). At that point you get a compiler error.
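As a rough illustration of the ambiguous case, the sketch below gives a hypothetical C two unary apply overloads, one taking C and one taking Int. Written explicitly, the nesting is unambiguous because every call names its type; under the proposal, [[x]] would have no single expected type for the inner [x], which is where the compiler error would come from.

```scala
// Hypothetical class with two unary apply overloads (an assumption for illustration).
class C private (val depth: Int)

object C:
  def apply(i: Int): C = new C(0)          // wraps a plain Int
  def apply(c: C): C = new C(c.depth + 1)  // wraps another C

val x = 7

// Explicit form: each call spells out which constructor is meant.
val c: C = C(C(x))
```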

Ichoran · July 5, 2024, 2:01am

Thanks for the examples! So let’s think about the simplest rule that would pretty much work.

Let’s say that is it. There’s nothing else going on: it is entirely a syntactic, not a semantic desugaring, save that we need the semantic awareness of whether a type is already expected at a position.

We probably need [...] to function as a type inference barrier. Although one can imagine a solver that would figure out from the types inside [...] what it could possibly match, and from that figure out what calling types could be expected, and so on, it would be extremely opaque to programmers even if the compiler could often figure out the puzzle.

Furthermore, we’re expecting [...] to parallel method arguments (granted, only on the apply method, but that can be as general as any method). That means we need to figure out what to do about named arguments, multiple parameter blocks, and using blocks. For example, what if we have

def apply(foo: Foo)()(bar: Bar)(using baz: Baz) = ???
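For reference, this is roughly what a call to such an apply looks like today. Widget, the empty Foo/Bar/Baz classes, and the given named defaultBaz are all hypothetical names introduced only for this sketch; any literal syntax would need to express the same three explicit parameter lists plus the using clause.

```scala
class Foo
class Bar
class Baz

// Hypothetical target type whose apply has the shape from the example above.
class Widget private (val foo: Foo, val bar: Bar, val baz: Baz)

object Widget:
  def apply(foo: Foo)()(bar: Bar)(using baz: Baz): Widget =
    new Widget(foo, bar, baz)

given defaultBaz: Baz = new Baz

// Explicit call: three parameter lists, with the using clause resolved implicitly.
val w: Widget = Widget(new Foo)()(new Bar)
```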

But hang on! Relative scoping was already suggesting that we use . (or ..) to avoid having to specify the class name over and over again. Bare . in the right scope would then just be…apply!

So if we write

val xs: Array[Person] = .(
  .("John", .(1978, 5, 11)),
  .("Jane", .(1987, 11, 5))
)

it’s arguably the exact same feature, and since we are literally using the same syntax for the constructor/apply call (with . in for the class name), there aren’t any weird gotchas to think through. Everything already works; the only thing we need to specify is where the relative scope is “we expect this type”.

This was suggested for things like .Id(3), where a single dot looks reasonable, but to me anyway it looks extra-weird without any identifier. Even though it’s longer, the double dot feels better to me:

val xs: Array[Person] = ..(
  ..("John", ..(1978, 5, 11)),
  ..("Jane", ..(1987, 11, 5))
)

So I think the two features end up completely unified at this point.

class Line(val width: Double) {}
class Color(r: Int, g: Int, b: Int)
object Color:
  val Red = Color(255, 0, 0)

class Pencil(line: Line, color: Color) {}

val drawing = Pencil(..(0.4), ..Red)

would just work, all using the same mechanism.
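For comparison, here is a sketch of the fully explicit version that compiles today, assuming Red lives in Color's companion object and the constructor parameters are exposed as vals (both assumptions added for this example). The `..` forms would simply stand in for the repeated Line and Color names.

```scala
class Line(val width: Double)

class Color(val r: Int, val g: Int, val b: Int)
object Color:
  // Assumed to be a companion member so that it can be reached as Color.Red.
  val Red = new Color(255, 0, 0)

class Pencil(val line: Line, val color: Color)

// Today's spelling of Pencil(..(0.4), ..Red): every type name is written out.
val drawing = Pencil(Line(0.4), Color.Red)
```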

Furthermore, it is hard to see why [0.4] should work and .Red or somesuch should not work. The “type is known and saying the name over again is redundant” thing is similarly bad.

Catching two birds with one net seems appealing to me.

spamegg1 · July 5, 2024, 7:07am

Although I agree with Ichoran’s points, I oppose both this and the Relative scoping idea… they are both horribly confusing and unreadable. Once one (or both) of these features are out, they will spread everywhere, everyone will be using them where they are not needed at all, purely out of sheer laziness and silly minimalist aesthetic reasons. (We live in an age where people cannot be bothered to say full words, instead they abbreviate them to the first 3-4 letters.)

Just leave it as it is, I have no problem writing Array and Person. There is such a thing as too much conciseness. It would be fine as a library, but should not be made part of the language (even as opt-in).

soronpo · July 5, 2024, 7:28am

To me a sign of a good feature is a useful one.

bishabosha · July 5, 2024, 10:44am

I am thinking now it is probably good to consider this alongside the spread heterogeneous arguments proposal

edit: this one

mberndt · July 5, 2024, 11:59am

Hey @Ichoran,

Thanks for engaging once again!

We probably need [...] to function as a type inference barrier.

Oh absolutely, every other way lies madness.

Furthermore, we’re expecting [...] to parallel method arguments (granted, only on the apply method, but that can be as general as any method). That means we need to figure out what to do about named arguments, multiple parameter blocks, and using blocks.

Good point. While I had thought about named parameters (and came to the conclusion that [foo = 42] should work fine), I hadn’t considered multiple parameter lists or using clauses. Would they be problematic though? It seems straight-forward enough:
[foo = 42][bar][using baz].

The only potential issue here is that putting [bar] after an expression usually means “call this method and supply the type parameter bar”. But I think it’s not a problem because calling a method on a […] expression doesn’t make sense. You need to know what the type of an expression is to call a method on it, but […] expressions don’t know what their type is, they need to have it imposed from the outside. So it should all be fine.

it’s arguably the exact same feature, and since we are literally using the same syntax for the constructor/apply call (with . in for the class name), there aren’t any weird gotchas to think through. Everything already works; the only thing we need to specify is where the relative scope is “we expect this type”.

This is actually a really interesting thought, which led me to another idea. We already have abbreviated lambda syntax with the _ placeholder. When a _ occurs in an expression, it is desugared by replacing the _ with an identifier, x say, and then prefixing the whole thing with x =>. So _ + 3 becomes x => x+3.
We already have a set of rules here to sort out the scoping details (e. g. does f(g(_)) mean x => f(g(x)) or f(x => g(x))? – it’s the latter).
What if we used the exact same set of rules, but with some other token – @, say – that is then replaced with the expected type?
val x: List[Int] = @(1,2,3) would desugar to val x: List[Int] = List(1,2,3). Up to this point, this is equivalent to the [] syntax we have been discussing, but unlike that proposal, it isn’t limited to mere function application. For example this would work:

def foo(date: java.time.LocalDate) = ???

foo(@.of(1958, 9, 5))
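A small sketch of the existing rules this idea borrows from, plus the explicit form that foo(@.of(1958, 9, 5)) would presumably expand to. The helpers f and g and the body given to foo are assumptions added so that the snippet runs; the @ token itself is not valid Scala today, so only the expanded forms appear here.

```scala
import java.time.LocalDate

def g(x: Int): Int = x * 2
def f(h: Int => Int): Int = h(10)

// Placeholder syntax as it exists today.
val addThree: Int => Int = _ + 3   // expands to x => x + 3
val viaPlaceholder = f(g(_))       // expands to f(x => g(x)), not x => f(g(x))
val viaExplicit = f(x => g(x))

// Explicit equivalent of foo(@.of(1958, 9, 5)): @ replaced by the expected type.
def foo(date: LocalDate): Int = date.getYear
val year = foo(LocalDate.of(1958, 9, 5))
```

The same scoping question (how far out the expansion reaches) would have to be answered for @, which is why reusing the placeholder rules is attractive.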

The downsides are that it’s a tiny bit more verbose and that we wouldn’t be using [] for lists like most languages do – but Scala isn’t other languages, so that part is probably fine.
The upside is that it would be more flexible (the LocalDate thing) and that we’d be reusing a set of existing syntax rules from abbreviated lambdas, so it should be easy to teach.
I have to say this feels absolutely right to me, I love this idea! Thanks for coming up with that (after all it’s essentially the same as the ..(1978, 5, 11) syntax that you proposed).

Re merging into a different proposal: I had proposed that to @soronpo, but he preferred to have two separate proposals – I’m prepared to discuss this when he is. As for the heterogeneous spread thing – I need to look into it.

soronpo · July 5, 2024, 12:08pm

As discussed on Discord, I proposed to expand the experimental Generic Numeric Literals into Generic Constructor Literals.
So similarly we can define an implicit of something like:

trait FromLiteralList[T]:
  def fromList(args: Any*): T

It would trigger for literal lists defined by [] (e.g., ["hello",[1, 2, some_param]]).
Then we can add a specific FromLiteralList for case classes that enforces the types and can recursively summon FromLiteralList or FromDigits for each of the target case class arguments.

case class Foo(arg1: Int, arg2: String, bar: Bar)
case class Bar(arg1: String, list: List[Int])

val foo: Foo = [1, "2", ["bar1", [1, 2, 3, 4, 5, 6]]]
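To give a feel for how the recursion could work, here is a heavily simplified, runtime-checked sketch. It is not the proposed mechanism: the Lit wrapper and lit helper are hypothetical stand-ins for the [] syntax, the instances are hand-written rather than derived, and type errors surface as exceptions instead of at compile time.

```scala
trait FromLiteralList[T]:
  def fromList(args: Any*): T

// Hypothetical stand-in for an untyped `[...]` literal.
final case class Lit(args: Seq[Any])
def lit(args: Any*): Lit = Lit(args)

case class Bar(arg1: String, list: List[Int])
case class Foo(arg1: Int, arg2: String, bar: Bar)

given FromLiteralList[List[Int]] with
  def fromList(args: Any*): List[Int] =
    args.toList.map {
      case i: Int => i
      case other  => throw IllegalArgumentException(s"expected Int, got $other")
    }

given FromLiteralList[Bar] with
  def fromList(args: Any*): Bar = args match
    case Seq(a: String, Lit(xs)) =>
      // Recursively build the nested list from the inner literal.
      Bar(a, summon[FromLiteralList[List[Int]]].fromList(xs*))
    case other => throw IllegalArgumentException(s"cannot build Bar from $other")

given FromLiteralList[Foo] with
  def fromList(args: Any*): Foo = args match
    case Seq(i: Int, s: String, Lit(xs)) =>
      Foo(i, s, summon[FromLiteralList[Bar]].fromList(xs*))
    case other => throw IllegalArgumentException(s"cannot build Foo from $other")

// Stand-in for: val foo: Foo = [1, "2", ["bar1", [1, 2, 3, 4, 5, 6]]]
val foo: Foo =
  summon[FromLiteralList[Foo]].fromList(1, "2", lit("bar1", lit(1, 2, 3, 4, 5, 6)))
```

A real implementation would presumably derive these instances and check the element types statically, which this sketch does not attempt.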

Sporarum · July 6, 2024, 9:35am

This is, to me, a much better way to implement this feature: it’s more general and has clearer semantics.

As for if this feature should exist in the first place, I must say I’m not super convinced.

A point to remember is that in C++ when declaring a variable you have to specify a type, always, even if just “auto”.
It therefore makes a lot of sense to avoid repetition by removing the need for specifying the class name twice.
And this is not something we have in Scala, since variables get their types inferred.

Furthermore, having used this feature in C++, it very easily leads to writing code that’s obvious when you write it, and you have all the context in your head, but very hard to read afterwards:
Of course this can be managed with code etiquette/best practices/etc but it’s another choice, another burden on the programmer.

I have never thought of doing this before, but I must say it seems very elegant (as long as it is declared very close to the use-site).
It has the advantage that the context is spelled right in front of you, making re-reading easier.
Its syntactic weight is also an upside for me: it nudges users into not using it unless there’s really a lot of data.
It’s also easier to refactor out: just replace all bs with Birthday and you’re done!

There’s also the tooling support question: when you have a name, you can control-click it and it brings you to the definition; with just a pair of parentheses or brackets, it’s not as clear.
Especially since it’s harder to aim for a single character than a full class name.