Pre-SIP: a syntax for aggregate literals

I think there is a way that can satisfy most concerns raised here and on the relative-scoping thread. Here is what I think we should do:

Relative-scoping

Relative scoping, as in accessing the target type’s companion object methods/values, will only be available for named argument placement. Due to this restriction, I think we can reduce the relative scoping token to a single ., since the ambiguity is removed, IIUC. For discussion: with this restriction, we can even consider removing the need for a leading relative scoping token entirely.

case class Foo(arg: Int, bar: Bar)
object Foo:
  def func(): Foo = ???
enum Bar:
  case Bar1, Bar2, Bar3

val foo1: Foo = Foo(arg = 0, bar = .Bar1)  //OK! (we can even consider removing the need for leading `.`)
val foo2: Foo = Foo(0, .Bar1) //error (relative scoping only available for explicit named arguments)
val foo3: Foo = .func() //error (we could allow this, but I think not)

Note: Due to the above restriction, we need to have named pattern matching officially in the language to have relative scoping within pattern matching.

Aggregated literals

Aggregated literals, as in dropping the explicit type constructor name when invoking apply, will be possible under the following restrictions:

  1. Only named values or argument placement.
  2. The syntax of [] is used, and must always have named argument positioning, unless there is a single varargs argument.
case class Foo(arg: Int, bar: Bar)
enum Bar:
  case Bar1, Bar2, Bar3
case class Baz(arg: Int*)

val foo1: Foo = [arg = 0, bar = .Bar1]  //OK!
val foo2: Foo = [0, Bar.Bar1]  //error: missing argument names
val baz: Baz = [0, 1, 2, 3] //OK!

I’m opposed to restricting relative scoping to named arguments because it’s simply unnecessary in many (perhaps even most) cases. .of(year = 1958, month = 9, day = 5) isn’t clearer than .of(1958, 9, 5), it’s less clear because the signal-to-noise ratio is worse. And worse than that, named parameters don’t even work with Java methods.

What is so wrong with simply allowing the developers to make their own decisions, like adults do? I’m honestly sick and tired of being told that we can’t give powerful features to capable and responsible developers because of a few fools who might abuse them to shoot themselves in the foot. Especially given that pretty much the worst thing that can happen is that somebody needs to enable parameter name hints in their editor.

We don’t know better than other people how they should write their code.


Actually, it’s much clearer to me. But what I proposed does not prevent you from writing .of(1958, 9, 5). The named argument placement is a restriction for invoking relative scoping. As in, Foo(arg = 0, date = .of(1958, 9, 5)) works, but Foo(0, .of(1958, 9, 5)) won’t.


I find that hard to believe given that nobody writes dates like that anywhere ever. Humans are really very good at figuring things out from context and world knowledge, and that’s why ISO date format is 1958-09-05 and not year: 1958, month: 9, day: 5. Because everybody can figure out it’s a date just by glancing at it.

Why, why complicate things with additional arbitrary rules that make them less orthogonal than they need to be? Why this insistence that you know better than the people using the language how they should be writing their code? I’m sorry I’m getting emotional here, but this idea that adults are really children who need to be protected from themselves is spreading everywhere, and it needs to stop. You don’t make the world a better place by preventing every bad thing that could happen, however minor. You make it a better place by giving the capable and well-intentioned all the possibilities to make good things happen. Especially when the worst possible downside can easily be mitigated by decent tooling.

But getting back to the technical side: allowing relative scoping only for named parameters means I can’t use it for variable definitions, I can’t use it in a List, I can’t use it for a function’s return value, I can barely use it in a tuple (_2 = .Bar1, srsly?). To me, that’s unacceptable.

The restriction of limiting unqualified lookup to enum cases and case classes that inherit from a sealed trait should really be enough.

That is untrue. Without the context I really had no idea what you meant by .of(1958, 9, 5). That’s why date = .of(1958, 9, 5) makes sense, but without any reference to what those numbers mean it’s absolute hell. You need some kind of anchor for the reader to understand the context. If we take away full constructors, we need to at least leave the argument name.


I’m going to once again come back to a point that I made earlier, which is that while you’re placing all your focus on this .of(...) expression, you’re neglecting all the cases where you have even less information about what the thing you’re passing to the Person constructor might be. Nobody is complaining about the fact that Person("Martin", _) is a perfectly valid expression to construct a function of type LocalDate => Person, nor did anybody ever insist that this must be written Person("Martin", birthday = _). The reason is that the language couldn’t possibly understand all the context that a human reader can factor in while reading the code, and thus whether to make this explicit is a choice that the programmer must make. In an expression like Person("Martin", .of(1958, 9, 5)), it is relevant context that I know what Martin looks like and therefore have a rough idea of what his year of birth might be. Humans do this kind of subconscious cross-referencing all the time, and it’s not something that any programming language will ever understand. Another example of context is variable names. What if the expression isn’t .of(1958, 9, 5) but .of(year, month, day)? It’s just not possible to argue that anybody could mistake that for anything other than a date.

The irony here is that I probably would use a named parameter for this case in order to distinguish between a person’s birthday and other possible relevant dates (like wedding date, signup date or whatever). But when I just write LocalDate.of(y, m, d) I don’t have to do that either, so adding this rule for a relative scoping expression doesn’t really solve the problem, especially given that people can just import LocalDate.of. So you’re not enforcing readable code, you’re just enforcing longer import lists.
And there’s a lesson here: you cannot enforce readable code through language rules. Readable code is the result of developers giving a fuck, and no amount of language legislation is going to change that.
Another example is my zio-aws example from earlier. s3.createBucket([bucket = ["foo"]]) is good code, adding a parameter name is pure noise, and adding noise makes code worse, not better.

When it helps, developers have the possibility to use named parameters. I like named parameters, I probably use them more than the average developer. But I don’t want to have the language tell me when I need to use them, and to me, all these arbitrary restrictions (arbitrary as in not forced by technical reasons) frankly just feel like somebody else trying to force their ideas of what good code should look like on me, when they have no idea what my project or the people working on it are like.
It also shouldn’t be forgotten that these additional restrictions make the language not only harder to learn (because there’s more arbitrary rules to memorize) but also less fun to learn. We should strive to have a language where, while you’re learning it, you have those moments where it clicks and you realize that, wait, you can put those two things together in that way too? How cool is that? And that moment shouldn’t be destroyed because your mommy comes in and tells you, no, you can’t do that, it’s too dangerous for you.

I also see that you haven’t considered my other points. If relative scoping is only permitted for named parameters, then relative-scoping expressions

  • can’t be used in val definitions
  • … or return values
  • … or Lists
  • don’t work in Java methods (no named parameters)
  • are ugly in tuples
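To make the bullet list concrete, here is a sketch in the Pre-SIP’s hypothetical syntax (none of this compiles today; it only illustrates where the named-argument-only rule would bite):

```scala
enum Bar:
  case Bar1, Bar2, Bar3

// Under the proposed named-argument-only restriction:
val b: Bar = .Bar1                        // error: a val definition is not a named argument
def pick(): Bar = .Bar2                   // error: return position
val bs: List[Bar] = List(.Bar1, .Bar3)    // error: positional arguments
val t: (Int, Bar) = (_1 = 1, _2 = .Bar1)  // works, but only via ugly named tuple fields
```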

Perhaps some teams like to put guardrails like that around themselves, and that’s fine, they can write a scalafix rule for it. I see this kind of rule firmly in the territory of linting tools, not the language proper.


Oh and one more thing: while it’s probably pretty obvious that I don’t agree with everybody on everything, I very much do appreciate the civil discussion of ideas and the time that people have been putting into it. Thanks everybody for your continued engagement.

Adding new syntax and an entirely new way to impose restrictions seems suboptimal. If someone encounters these examples in a codebase, they won’t easily understand what is happening. The code is non-discoverable and difficult to search for.

I prefer the simplicity of @mberndt’s earlier proposal, which recommends using a new symbol to infer the companion object without any magic – just an import:

import scala.compiletime.companion as <>

val l: List[Int] = <>(1,2,3,4)

(Note: <> is chosen arbitrarily, since @ is not an allowed symbol)

Delegating the choice of something short to reduce boilerplate is slightly cheating, but it is still an improvement, because this renaming import is needed only once, not once per constructor.

Specifically, (companion.abc(xyz): T) would compile to T.abc(xyz), which seems nearly achievable with macros, except that the inferred return type seems inaccessible.
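A minimal sketch of the intended expansion, written in today’s explicit syntax (the `<>` form is hypothetical; the right-hand sides are what it would expand to):

```scala
import java.time.LocalDate

// Hypothetical:  val l: List[Int] = <>(1, 2, 3, 4)
// would expand to a call on the expected type's companion object:
val l: List[Int] = List.apply(1, 2, 3, 4)

// Hypothetical:  val d: LocalDate = <>.of(1958, 9, 5)
// would expand to:
val d: LocalDate = LocalDate.of(1958, 9, 5)
```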

This approach is clear in terms of documentation and how to find it. There are no new principles to learn, just some inferred types and forwarded methods.

It is not as concise as some other proposals, but it seems to be an improvement without real downsides, except possibly conflicting with a better proposal.

The problem seems not so much to automatically import members of the companion object but rather, if we suppose apply is among these, what to do with it. I think we cannot just take any matching parentheses expression as a constructor, it is stretching things too much. On the other hand if we want to use some symbol such as <> then why not just alias apply as in:

object Foo:
  def apply(abc: String): Foo = ???
  def <> = apply

import Foo.*

<>("xyz")

Then the question reduces to whether or not to automatically import companion object members, and I do not see a reason why not.

Importing (with or without renaming) the apply method needs to be done explicitly for each type.
Having a single global symbol that (essentially) resolves to the inferred type’s companion object solves that problem.

I do agree that automatically importing companion object members could also address the boilerplate concern. I think I would prefer automatic imports over new syntax (i.e., over [] and .).

However, they also seem to have more far-reaching consequences. There was a similar feature in Kotlin which was restricted at some point, if I recall correctly. Basically, the issue is that imports are also available in inner scopes, so nested definitions get a lot of imports, and some may be undesirable. My hunch is that it would not be as bad in Scala, because Kotlin used that feature for mutable builders (thus, combined with automatic imports, you just had random side effects), whereas in Scala you would just construct immutable data, so resolving an unexpected method would likely just fail to compile.
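The scoping concern can be simulated with an explicit wildcard import; under the proposal this import would happen implicitly whenever the expected type has a companion (the names here are made up for illustration):

```scala
case class Config(port: Int)
object Config:
  def default: Config = Config(8080)

def startServer(): Int =
  import Config.*  // the proposal would do this automatically
  // `default` now silently resolves to Config.default. With immutable data
  // the worst case is a surprising-but-typed resolution or a compile error,
  // unlike Kotlin's mutable builders, where it meant stray side effects.
  default.port
```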


Re the import scala.compiletime.companion idea: I think it wouldn’t really buy us much because it would still have very special semantics.

Let’s take a very simple example like

import scala.compiletime.companion as <>
class Foo(x: Int)

val foo: Foo = <>(42)

The issue here is that the expression we need to figure out a type for in order to compile this is <>, but the expression that that type needs to be determined from is <>(42), not <>. This is completely different from how any kind of identifier in Scala works today, it will require support from the compiler, and hence I disagree with the idea that there are “no new principles to learn” here – there definitely are. And I think hiding very different behaviour behind a familiar syntax is actually more confusing than just having a separate syntax, like we do for _ lambda expressions, which are the most similar feature that we have today.

I do not propose that companion by itself has any interesting meaning; rather, what is necessary is that in (companion.abc(xyz): T) the method call is present and the return type is somehow inferred.

But I think if you want to, the way to think about companion[List[Int]] is that it returns List (the companion object). And type inference is adapted, such that companion.abc(xyz): T infers companion[T].abc(xyz).

To make this more concrete I implemented something as close to my proposed syntax as I could get using a macro (non-transparent), this is how it can be used:

case class SomeType(v1: Int, v2: String) derives Syntax

object SomeType { def test(): SomeType = new SomeType(1, "test") }

case class NestedType(v1: Int, v2: SomeType) derives Syntax

object Test {

  import companion as <>

  def main(args: Array[String]): Unit = {
    val res1: SomeType = <>(42, "apply")

    val res2: SomeType = <>.test()

    val res3: List[SomeType] = List(<>.test(), <>(-1, "list"))

    val res4: NestedType = <>(42, <>(12, "type 1"): SomeType)

    ()
  }
}

A major limitation is that the implicits hack I use to convince type inference to infer the return type to get the companion object is not very stable, so in res4 the annotation is necessary.
Also, because this is a macro based hack, it has terrible error messages when it does not work.

With the above limitations, this is clearly useless as implicit syntax, but a proper implementation in the compiler might be able to address those.

Being able to mostly express this scheme using existing concepts is what I meant with “no new principles to learn”.
Yes, this uses quite a few advanced features, but conceptually it’s just method calls (albeit “generated” forwarder methods) and type inference (of the return type). I guess the strange part is why these methods would exist on companion, but 🤷.

A non-macro sketch of the concept is here:
Scastie - An interactive playground for Scala.
the macro is here:
Scastie - An interactive playground for Scala.


Yes, I understand. My point is that this would be an identifier that behaves completely differently from any other identifier (its type would be determined by a different expression), so at that point it’s effectively a language feature that users would need to learn, and it would have to be supported by the compiler and any other kind of sophisticated tooling (IDEs). It would be like allowing Scala users to use a different character than _ for abbreviated lambda expressions, and I don’t see the point of that.

I agree that this is nicer to read even if it may be a surprise that github is in scope. But I’m hesitant to introduce new syntax like [elem] for values.

What about this: @lihaoyi ?

  1. a language import that switches on bringing companion object members into scope when the expected type has a companion
  2. a language import that switches on converting a named tuple into an apply call of the expected type

Something like:

import language.{companionScope, tupleApply}

def pomSettings: PomSettings = (
  description = artifactName(),
  organization = "com.lihaoyi",
  url = "https://github.com/com-lihaoyi/scalasql",
  licenses = apply(MIT),
  versionControl = github(
    owner = "com-lihaoyi",
    repo = "scalasql"
  ),
  developers = apply(
    (id = "lihaoyi", name = "Li Haoyi", url = "https://github.com/lihaoyi")
  )
)

We can also do away with the one-element list apply(MIT) if we allow single-element tuples to be adapted, as in (MIT), if we can accept that parens around an expression alter its meaning when language.tupleApply is imported.

Didn’t we want to move away from language imports?
(IMO for good reasons)


I would like to hear your reasoning for this. New (experimental) syntax was recently added for named tuples, and unlike my proposed syntax, it could be used only to create those, so it has a much smaller power-to-weight ratio.
Actually, we could probably supplant the named tuple syntax with this proposal and use the [] syntax for named tuples as well. After all, it’s still experimental, so nobody is using it and backwards compatibility is not an issue (and the reason I’d prefer [] is that I’m absolutely positive that wrapping any expression in () must remain a no-op).

Why would it be limited to named tuples rather than tuples of any kind, or even mixed parameter lists where some arguments are named and others are positional? We can philosophize and debate all day long about what is readable and what isn’t. But we could also take a look at what people actually do in the real world, and we’d notice a pattern: they come up with all kinds of operators and DSLs to be able to write composite data structures with a compact notation. They use string interpolators, like ivy"org.slf4j:slf4j-api:1.7.25". Or they use operators, like "org.slf4j" % "slf4j-api" % "1.7.25". Why not just provide people with a convenient way to do this stuff, i.e. ["org.slf4j", "slf4j-api", "1.7.25"]?
Requiring argument names would also preclude this from working with collections, which makes it a non-starter as far as I’m concerned.

Please please no, parens around an expression must remain a no-op, otherwise you’re going to have people accidentally converting stuff to type-safe wrapper classes all the time. I don’t scare easily, but that way lies only madness. Parens around a single expression being a no-op is hard-wired into basically every programmer’s brain, except maybe Scheme programmers’. What’s wrong with just using unambiguous syntax like []?

💯


We could use some new kind of brackets, like

  • [[]] – doubling has served to disambiguate in the past, for instance :: and : have opposite meanings vs. Haskell
  • Since we use [] the way other languages use <>, perhaps angle brackets are free to use for this
  • Maybe something with a colon or into would convey “this expression’s meaning is based on the expected type”
  • Or, perhaps we could do something like >() where > is “just an object” whose apply method is somehow defined according to the apply method of the expected type.

Hey @nafg,

thanks for joining the discussion.

Is there an ambiguity problem with the proposed [] syntax? Because I’m not aware of one, expressions cannot currently start with [. At first I thought that there might be a problem with infix operators:

object A:
  def +[A] = ()

But A + [Unit] is currently a syntax error and not, as one might expect, a call to the + method with an explicit type parameter.

I thought about it, but < and > are currently valid identifiers, so you can do this:

object < :
  object foo:
    def apply(a: Any) = ()

object >

Now < foo > is a valid expression.

That’s essentially @ragnar’s idea, no?

Anyway, I think the main point of contention here isn’t so much the syntax, it’s whether the feature in its unrestricted form (as I originally proposed) is too easily abused to produce unmaintainable code, and if that is the case, how it could be nerfed to prevent that. I feel that such concerns are misplaced (because whether something is maintainable or not depends too much on the context, and also because I think that enforcing arbitrary and taste-based “readability” rules is firmly in the territory of linters, not the compiler), but I think that’s the main objection.

At least one example of expressions that start with [ is Scala 3 polymorphic function literals:

val e0 = Apply(Var("f"), Var("a"))
val e1 = mapSubexpressions(e0)(
  [B] => (se: Expr[B]) => Apply(Var[B => B]("wrap"), se))
println(e1) // Apply(Apply(Var(wrap),Var(f)),Apply(Var(wrap),Var(a)))

The [B] => (se: Expr[B]) => Apply(Var[B => B]("wrap"), se) is a lambda expression that starts with [.


You don’t see a potential for confusion with type arguments/parameters? Your single-element ambiguity could be solved with a trailing comma: (MIT,)
