Pre-SIP: a syntax for aggregate literals

It would be very simple to implement in comparison to other proposals, as it would reuse existing core features at the desugaring phase.

Sure, but easy to do is not a good argument for adding a feature.

It would add syntactic overload just because “other languages also have specific syntax for this”. One of the strengths of Scala is that it has a lot of features that do not require these bespoke solutions for every little thing.

Both the prototype by @JD557 (scastie) and my earlier experiment that uses a macro to delegate to the companion of the inferred type (scastie) allow more or less the proposed language extension, except using method call syntax on some imported name instead of overloading brackets, i.e., some variant of val x: List[Int] = *(1, 2, 3).
Maybe there are some inconvenient edge cases with those implementations, but it seems much more fruitful to explore those implementations over syntactic additions to the compiler.
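For illustration, here is one way such a library-level approach could look, sketched with a type class rather than a macro. All names here (`AggregateLiteral`, the `*` method, the instances) are made up for this example and are not the actual prototypes linked above:

```scala
// Hypothetical sketch: an imported method `*` whose result type is driven
// by the expected type via a type class. Purely illustrative names.
trait AggregateLiteral[A, E]:
  def fromElems(elems: Seq[E]): A

object AggregateLiteral:
  given [E]: AggregateLiteral[List[E], E] with
    def fromElems(elems: Seq[E]): List[E] = elems.toList
  given [E]: AggregateLiteral[Set[E], E] with
    def fromElems(elems: Seq[E]): Set[E] = elems.toSet

// The expected type fixes A, the arguments fix E, and the given instance
// does the construction.
def *[A, E](elems: E*)(using lit: AggregateLiteral[A, E]): A =
  lit.fromElems(elems)

val xs: List[Int] = *(1, 2, 3)
val ys: Set[String] = *("a", "b")
```

Libraries could opt in by providing instances for their own types, which is exactly the kind of explicit adoption discussed below.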

Some other notes:
• The “lookup a method in the companion object of the inferred type” prototype is ~30 lines of code, not sure how that does not count as simple to implement.
• Is it even feasible to add type class instances with the frozen standard library?
• The only need people have expressed for this feature was to make certain patterns in external libraries more convenient; were there any requests to make Seq(1, 2, 3) and Map("a" -> "b") more convenient?
• My impression is that the need was specifically for cases with many constructors, where it is tedious to remember which name is required at which position in the structure, because the structure is known to the compiler anyway. Allowing this use case would require explicit adoption by third-party libraries.
• Yet, how long until third party libraries could realistically make use of this? Given that it is such a minor syntactic convenience, and that libraries are encouraged to stay on old Scala versions, it seems this might take some years. I think it’s likely that people would just keep using the established patterns indefinitely.


I can’t follow this reasoning at all, in fact I think the opposite is true: the “# as placeholder for companion object of expected type” re-uses existing machinery to a much greater degree than the typeclass based proposal.

  • like the typeclass based proposal it re-uses the compiler’s existing notion of an expected type (we know the compiler has such a notion because it can infer argument types for lambda expressions)
  • unlike the typeclass based proposal, it re-uses the existing scoping rules for lambda expressions with _ placeholders
  • unlike the typeclass based proposal, it enables re-use of a vast amount of functionality in and around the companion object, like the apply method, but also things like the of method on LocalDateTime or collection conversions in expressions like myList.to(#) where the expected type is some other collection type
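To make the last point concrete, here is the hypothetical `#` syntax (shown in comments, since it is not valid Scala today) next to the code it would desugar to, using only the companion object of the expected type:

```scala
import java.time.LocalDateTime

// Hypothetical: val t: LocalDateTime = #.of(2024, 1, 1, 12, 0)
// would desugar to a call on the companion of the expected type:
val t: LocalDateTime = LocalDateTime.of(2024, 1, 1, 12, 0)

// Hypothetical: val v: Vector[Int] = myList.to(#)
// would desugar to passing the companion as the factory:
val myList = List(1, 2, 3)
val v: Vector[Int] = myList.to(Vector)
```

No library changes are needed for either case, because both `LocalDateTime.of` and the `Vector` companion already exist.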

And that is actually the main reason why I think this design is much superior: it actually fits with the language as it exists today. The typeclass based proposal is a Swift feature bolted onto Scala to make it look more like Python.

And that is another reason why the “companion object placeholder” idea is a better design: the moment it’s released, you can start using it. Your libraries don’t need to change. You don’t need to add typeclass instances to your code. It just works.

That’s indeed really not a good argument to introduce some odd and extremely irregular syntax.

“Because the mainstream does it” was never an argument in Scala. Actually the opposite. Scala dismissed a lot of mainstream stuff to create in the end superior and groundbreaking solutions, leading the way to substantial improvements of the status quo in the whole programming language space! We should not look back at the others, we should look forward to still be the one who gets copied, not the other way around.

I like the idea.

I started to use Scala much more for “scripting” since Scala-CLI, and I love it. But in that domain such shortcuts as proposed would be really welcome. For a single Map or Seq it makes of course no difference. But if you write code like in a primitive dynamic language (which you often do when you write “scripts”), you end up with a lot of nested Maps and Lists. Then even the three letters start to matter, as they’re just clutter. I would prefer a shorter, visually more lightweight syntax for that use case. But I’d take the lib solution anytime over some odd syntactic overload!
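A taste of what I mean (hypothetical script-style data, names made up): every nesting level repeats a constructor name even though the shape is fully determined by the declared type.

```scala
// Hypothetical script-style data: Map and List repeated at every level.
val config: Map[String, List[Map[String, List[Int]]]] = Map(
  "servers" -> List(
    Map("ports" -> List(80, 443)),
    Map("ports" -> List(8080))
  )
)
```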

If nothing happens here I’m going to steal the code that was proposed so far… I like it. I’m very grateful to @JD557 for providing it!

Maybe someone wants to publish a micro-library should this here not move forward, so no code needs to be stolen? :smiley:

I’m not sure about the “katana”, though. I think @odersky is right that it seems a little bit “too powerful”. JD557’s solution also works with type classes, which is imho the right approach, as it limits the shorthand syntax to only where TC instances are provided. That seems about right on the power level. It’s flexible, but not too magic. (Granted, a few years back TCs were considered advanced, and sometimes “too magic”. But we’re over this, thankfully!)

There is no way to consistently argue that the _ placeholder in lambda expressions is OK but the # placeholder for the companion object is not. It’s completely arbitrary, and probably based on the fact that people have had years to get used to _ whereas # is a new idea.


I don’t think you can reasonably nest underscores to arbitrary levels, where each instance may have a completely different meaning. And if someone tried to actually write something like that, it would very likely not pass code review anyway.

I would agree with you if it were only one level deep (like the underscore usually is). But the construct would allow, and actually encourage, writing for example #(#(#(foo, bar))), where each # could be something completely arbitrary depending on where it is written.

If you moved that construct elsewhere, arbitrary things could happen. Maybe it would even compile, but do something unexpected…

So I don’t think the analogy really holds.

Shorthands are nice, but too much overly powerful shorthand syntax results in so-called “write-only code”. Usually you don’t want to encourage people to create “write-only code”. Nobody wants to read “Perl one-liners” ever again! But an especially simple and short symbolic syntax would do exactly this. That’s Martin’s argument, and I think it’s reasonable.

OTOH, if I had *(*(*(foo, bar))) I actually know what this is. Without much context. The context would only tell me which kind of (nested) Sequence this is. But it couldn’t be something completely different. That’s a big difference.


The Scala language absolutely allows nesting underscores as deep as you like, it doesn’t impose any restrictions on that.

val f: (((((((() => Unit) => Unit) => Unit)
  => Unit) => Unit) => Unit) => Unit) => Unit =
  _(_(_(_())))

Is that reasonable code? Of course not, but we don’t impose arbitrary restrictions on underscores to prevent it. Instead, we trust programmers to not use the feature in this way despite the fact that they absolutely could do so. So yes, the analogy absolutely holds, and practical experience shows that people do know how to handle such features.

This whole idea of “you can nest this stuff, which makes for unreadable code, hence we can’t allow it” is just complete hogwash. We can nest loops, for comprehensions, objects, classes, traits, if expressions, matches, lambdas and probably a thousand other things, and when somebody says “but it hurts when I do that”, the answer is always the same: “don’t do that then”. This case is no different. Not one bit.


If you like bracket soup so much, where everything is context dependent, have you considered moving to the Red language? :smile:

Also, could you provide a realistic example of such nested underscores, analogous to all the nested, realistic examples (which people would actually start to write!) of the proposed feature?

I’m still not convinced the analogy holds.

The whole point of the example is that it’s not realistic, and that we still didn’t place arbitrary restrictions on _, despite the fact that it makes such nonsense possible. That didn’t turn out to be a problem, and there’s no reason to think that # would be any different.
It probably would be used and nested more than _, because deeply nested data structures occur more frequently than deeply nested lambda abstractions. There’s a word for features that are used a lot. They’re called useful.


On a more constructive note: @mberndt could your proposal be implemented as a compiler plugin? As a prototype, to actually play with the feature?

I mean, I see some value in some “data literal” syntax. There could be maybe use-cases which are fine and safe. But I’m honestly not sure whether it would result in maintainable code.

One needs to take things like refactorings into consideration. But also that code isn’t always written by the most reasonable people…

Scala is nice in that regard, as it doesn’t have many “foot guns” compared to most other languages. I think one reason for that is that it does not lightheartedly adopt everything that looks “nice and convenient” at first. It’s always a balance between safety and power. Think of C++: it’s very powerful, it has all the features, and it lets you do whatever you want, however you want. But it’s very easy to shoot yourself in the foot with that language. Some people might be able to handle all that, but frankly most can’t. I would not like to see Scala introduce potential foot guns, even if they were only foot guns for less experienced people.

But like I said, maybe it’s all fine, and this would be a nice addition. So how about creating a compiler plugin? What’s actually needed from the compiler to implement this?


I am probably vastly oversimplifying things here but here it goes.
Not sure if anyone mentioned approaches like this in this thread, I apologize if I am repeating something…

Sequences

Could we (ab)use tuples for this?
Automatically convert a tuple of the correct type to a List, for example:

val list: List[Int] = (1, 2, 3)

Maps

Essentially a list of Tuple2s…

val map: Map[Int, Int] = (
  1 -> 1,
  2 -> 2,
  3 -> 3
)

Case classes

Now named tuples are a logical choice:

case class Point(x: Int, y: Int)
val point: Point = (x = 1, y = 2)

We can go from a case class to a named tuple, so why not the other way around too…
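Going from a tuple to a case class is in fact already expressible in library code via the compiler-synthesized Mirror. A minimal sketch (`fromTuple` is an illustrative name, not an existing API):

```scala
import scala.deriving.Mirror

case class Point(x: Int, y: Int)

// Sketch: the compiler-provided Mirror for a case class already knows how
// to build an instance from a tuple of its field types.
def fromTuple[P](using m: Mirror.ProductOf[P])(t: m.MirroredElemTypes): P =
  m.fromProduct(t)

val p: Point = fromTuple[Point]((1, 2))
```

The named-tuple variant would additionally check the labels, but the construction machinery is the same.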


This approach reuses existing syntax, and it is mostly easy to grasp, since the concepts are similar.
It reminds me a bit of Haskell syntax.

I was playing around with this and implicit conversions, but I don’t think this can currently be done without some changes to the compiler:

Type inference issues aside (maybe someone smarter than me can fix some of those), there’s always going to be the Tuple1 elephant in the room. :confused:
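For what it’s worth, a fixed-arity special case does seem expressible with a plain (non-inline) `Conversion` today; it is the arity-generic version that runs into the inference trouble. A minimal sketch, fixed at arity 3 purely for illustration:

```scala
import scala.language.implicitConversions

// Illustrative only: one conversion per arity would be needed, which is
// exactly why a real implementation wants to abstract over tuple arity.
given [E]: Conversion[(E, E, E), List[E]] =
  t => List(t._1, t._2, t._3)

val list: List[Int] = (1, 2, 3)
```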


I have actually had this idea before, I just haven’t expressed it in this thread. But Martin shot down both the original [] proposal as well as the # idea, and assuming he doesn’t change his mind, there is room for a separate proposal to re-use named tuple literal syntax for case classes. I’ve been wanting to create a separate thread for this for some time now, I just didn’t get around to it…


I’m pretty sure it can be done with macros. I’ve done similar things.


After trying it out, the only thing that prevents me from implementing it is the problem of covariance within implicit conversions.


Cool update!!!

I created a strawman fromtuple library that utilizes implicit conversions. Indeed, due to the bug “inline implicit conversion infers `Nothing` for covariant types” (Issue #19388 · scala/scala3 · GitHub), the conversion cannot be applied directly on the collection types, but I created an opaque type ~[T] that can be used as a wrapper for the target type to trigger the implicit conversion and force an invariant conversion.

The library converts compositions of tuples to:

  • List/Set/Seq/ListSet/Map/ListMap (Map/ListMap require (key1 -> value1, key2 -> value2, ...) tuple patterns)
  • New class instances, using the default class constructor.

It also supports:

  • Int to Long and Double weak conformance
  • Types that do not match the above patterns first try summoning a Conversion before giving up.
  • Type mismatch errors (or custom Conversion implicit errors) are positioned at the specific arguments that are at fault. Because of this, the user gets a much better error-handling experience than with manual collection composition!

Here are a few examples:

import fromtuple.*
import collection.immutable.{ListMap, ListSet}
val l1: ~[List[Int]] = (1, 2)
val ll1: ~[List[List[Int]]] = (l1, l1)
val ll2: ~[List[List[Int]]] = ((1, 2), (3, 4))
val ll3: ~[List[List[Long]]] = ((1, 2), (3, 4))
val ll4: ~[List[List[Double]]] = ((1, 2), (3, 4))
val l2: ~[Seq[Int]] = (1, 2)
val ll5: ~[Set[Seq[Int]]] = (l2, l2)
val ll6: ~[Seq[ListSet[Int]]] = ((1, 2), (3, 4))
val m1: ~[Map[String, Int]] = ("k1" -> 1, "k2" -> 2)
val m2: ~[ListMap[Int, String]] = (1 -> "v1", 2 -> "v2")
val ml1: ~[Map[String, List[Int]]] = ("k1" -> (1, 2), "k2" -> (3, 4), "k3" -> l1)
val ml2: ~[ListMap[Double, ListSet[Long]]] = (1 -> (1, 2), 2.0 -> (3L, 4), 3 -> (1, 2L))
case class Foo[T](x: T, y: Int)
class Bar(val x: Int, val y: Int, val z: String)
val c1: ~[Foo[Int]] = (1, 2)
val c2: ~[Foo[String]] = ("1", 2)
val c3: ~[Bar] = (1, 2, "3")
val c4: ~[Foo[List[Int]]] = ((1, 2, 3), 4)

Compiler error example:

import fromtuple.*
val x: ~[Map[Double, Set[Long]]] = (1 -> (1, "2"), 2.0 -> (3L, 4), "3" -> (1, 2L), 4)
-- Error: Spec.test.scala:2:45 ------------------------------------------------
  |val x: ~[Map[Double, Set[Long]]] = (1 -> (1, "2"), 2.0 -> (3L, 4), "3" -> (1, 2L), 4)
  |                                             ^^^
  |                                             Found: ("2" : java.lang.String)
  |                                             Required: scala.Long
-- Error: Spec.test.scala:2:67 ------------------------------------------------
  |val x: ~[Map[Double, Set[Long]]] = (1 -> (1, "2"), 2.0 -> (3L, 4), "3" -> (1, 2L), 4)
  |                                                                   ^^^
  |                                                                   Found: ("3" : java.lang.String)
  |                                                                   Required: scala.Double
-- Error: Spec.test.scala:2:83 ------------------------------------------------
  |val x: ~[Map[Double, Set[Long]]] = (1 -> (1, "2"), 2.0 -> (3L, 4), "3" -> (1, 2L), 4)
  |                                                                                   ^
  |                                                                                   Invalid `key -> value` pattern for Map
3 errors found

If we fix the Scala bug above then we can change the library so that there is no need to use ~[T].


The idea of implicit tuple conversions is just bad, because it doesn’t handle lists with one item in a sensible way.

@soronpo one issue with the tuple-syntax approach is the ambiguity when it comes to 1-element collections: is val x = (1) an Int or a Seq[Int]?


The conversion requires explicit type ascription. What is the ambiguity? val x = (1) triggers no conversion.


There is no ambiguity, (42) is unambiguously an Int, which makes it impossible to express List(42) using that syntax.
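This can be checked directly in plain Scala: parentheses around a single expression are just grouping, so there is no one-element tuple for a conversion to hook into.

```scala
// (42) is just a parenthesized Int, not a Tuple1.
val i = (42)           // inferred as Int
val t1 = Tuple1(42)    // the only way to write a one-element tuple
```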

Add to this the problem that this provides minimal utility because it’s limited to collections, and the problem that it can be confused with actual tuples, and you get a feature that I certainly wouldn’t use, nor recommend to anybody else. At least Martin’s proposal can’t be confused with tuples…