Pre SIP: Named tuples

Well, it gains some explicitness.

That is clearly a value in some contexts.

But I think it’s counter-productive in the context where (named) tuples are most desirable, namely in code dealing with ad hoc data (as often found in Spark processing or DB query code).

I’m usually arguing for “theoretical purity”. But not in this case here, as I think this kind of “purity” ruins the most compelling use-case.

Also I hope now everybody sees clearly the parallel to positional function arguments. That’s just something that can’t be dismissed, imho!

Dear SIP Committee members, please try to look at the thing from the use case perspective and not insist on theoretical purity where it makes no sense. That’s just not Scala.

Will it be possible to:

  • use reflection (type tags) to create a factory for any named tuple?
  • use match types to convert the types in a named tuple?

Yes, the representation of named tuples as NamedTuple instances is exposed. So one can use the usual generic programming for tuples also for named tuples. named-tuple-strawman-2.scala in my PR shows some examples (it’s still in a rough state).
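
For a flavor of what that reuse looks like, here is a minimal sketch of my own (assuming the experimental namedTuples language import; this is not code from the PR):

import scala.language.experimental.namedTuples

val person = (name = "Bob", age = 33)

// The value part is an ordinary tuple, so generic tuple operations apply:
val plain: (String, Int) = person.toTuple
val longer = plain ++ (true, 2.5)   // generic Tuple concatenation
val arity = plain.size              // 2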

EDIT: that was intended as a reply to:

This program won’t compile in Swift:

func f<S: Sequence<(a: Int, b: Bool)>>(_ s: S) {}
f([(1, true), (2, true)]) // Error!

We’ll get this error:

global function ‘f’ requires the types ‘(Int, Bool)’ and ‘(a: Int, b: Bool)’ be equivalent

The conversion from (Int, Bool) to (a: Int, b: Bool) will only work if it happens with a literal (my suggestion) or at some specific AST positions, like assignments and return values. These are the confusing rules that I think we should not emulate.

I have enough lines of Swift under my belt to confidently say that:

  1. The above error is not making my life harder.
  2. I would not be sad if we only had literal conversions and not the other ones.

But anyway, I’m happy to rest my case now.

My closing argument is that I think insisting on a subtyping relationship is making us miss an opportunity to explore a way to improve the whole tuple conversion problem. It is orders of magnitude harder to add restrictions than it is to lift them later. If we started without subtyping, we’d have a chance to better identify the conversions that really cause pain, if any, and we’d be better equipped to know how to solve them. If we rush into one, it will be difficult to go back if we realize that there was a hidden trap. Besides conversions, I am also quite convinced that the absence of subtyping would make tuple operations simpler to define; and, perhaps more importantly than anything else, it would better define the role of named tuples by creating less overlap with case classes. You can see my original post for my rationale.

IMO, the fact that we can nicely fit a subtyping relationship into the type system should weigh less than the cost of making that relationship counterintuitive. In fact, I’ll even say that it is an appeal to purity that runs against @MateuszKowalewski’s pleading. Further, the correspondence to named parameters seems incidental: named parameters do not work like tuples in Scala because they are not positional. I’m also not convinced that modeling functions as arrows from tuples to tuples is an idea that scales in practice. It is appealing because it works on the surface, but then you get into passing conventions, default arguments, etc.

Actually, one of the biggest headaches I have is getting same-typed positional arguments in functions straight. I used one of the examples before: ranges specified by start and end vs ones specified by start and length. It’s even worse with coordinates. Is it (x0, y0, width, height)? Is it (x0, xN, y0, yN)? Maybe (cx, rx, cy, ry)? I return things in tuples and forget the order and muck stuff up that way all the time, alas. If I could get this straight reliably enough, I might like Python more than I do. So if it were possible to make some named arguments mandatory, I would leap on that in an instant.

I complain not for some abstract reason of purity, but rather because it’s an actual pain point. One of the worst remaining, actually; almost everything else in Scala 3 is a joy to work with. (I don’t see how this could be easily fixed, alas; much of the problem comes from Java libraries.)

However, in coming up with my working example of named opaque tuples without subtyping, I realized that I’m almost completely satisfied with a pure library-level solution if I want it the non-subtyped way, and I think most everyone else could be too. So the compiler can do whatever, and if it turns out not to be the preferred solution, people don’t have to use it and can have nearly seamless functionality with what they prefer.
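
A rough sketch of the shape such an encoding can take (the names Named, named, and dropNames are invented for illustration, not my actual code): names live only in a phantom type parameter, and crossing between named and unnamed is always explicit, in both directions.

object namedlib:
  opaque type Named[Names <: Tuple, Values <: Tuple] = Values
  extension [V <: Tuple](v: V)
    def named[N <: Tuple]: Named[N, V] = v     // explicit in
  extension [N <: Tuple, V <: Tuple](nt: Named[N, V])
    def dropNames: V = nt                      // explicit out

import namedlib.*
val p = (1, 2).named[("x", "y")]   // no subtyping: the wrapping is spelled out
val back: (Int, Int) = p.dropNames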

Except for the fact that once it’s in the compiler, it has to stay there forever!

It would be a shame to discover later that we decided on the wrong subtyping relation, and be unable to change it!
(Deprecating a feature is kinda possible; replacing a feature by its opposite is impossible, even when breaking compatibility as from Scala 2 to Scala 3, as that would be extremely confusing to users.)

I think the overwhelming take-away from this thread is that there is no consensus among us on what is best, even though we all want named tuples!
Therefore I think the wisest choice is to do as some have proposed:
No subtyping between named and unnamed, in either direction, until we have enough examples under our belt to see what is more useful in practice.

P.S.: I really don’t buy the parallel with function parameters; for example, the return type of a function does not depend on the ordering of its arguments, whereas the types of named tuples do:

def foo(x: Int, y: Int) = x + y
foo(x = 2, y = 4) // : Int
foo(y = 4, x = 2) // : Int

(x = 2, y = 4) // : (x: Int, y: Int)
(y = 4, x = 2) // : (y: Int, x: Int)

You know, people say this about static types all the time :grinning:

I mean, the most popular language in the world is Python, and Python has named tuples, and does not allow automatic injection, as I have shown above; in fact the subtype relationship is the opposite of this proposal’s. @alvae demonstrates the same for Swift. This argument seems objectively false.

Swift does this: named arguments are named and positional arguments are positional, according to how they are defined, and it enforces that the order of the named arguments match the definition. Python allows keyword-only arguments as well, albeit opt-in and without ordering enforcement. You could argue Swift’s behavior is just a legacy of Objective-C, but Python chose to add these explicitly in PEP 3102, which is an excellent read on why they are a reasonable idea.

IMO opt-in keyword-only arguments would be a strict improvement over Scala allowing every callsite to make a different choice, similar to how:

  • Scala 3’s strict empty-parens-in-method-call handling is better than Scala 2’s “use as many or as few empty parens as you like” at the callsite,
  • Scala 3 chose to only allow alphanumeric methods marked infix at the definition site to be called infix, vs. Scala 2 letting each callsite make a choice,
  • Scala’s definition-site variance is better than Java’s use-site variance.

In all cases, they remove an unnecessary degree of freedom and room for error, while preserving the flexibility at the definition site to dictate how the callsites will function.

Maybe not everyone agrees on the details, but keyword-only arguments are not the unspeakable abomination that you seem to suggest they are. It’s a pretty reasonable choice, one that many languages have made, and it would actually fit perfectly into the Scala 3 goal of removing unnecessary flexibility from the language while preserving its core expressiveness.

I don’t think this is true? Since when was Scala the “convert everything to everything” language? We literally just discussed how we can limit or remove implicit conversions, so as to explicitly discourage “converting everything to everything” as a way of using Scala. We literally brought up, in this thread, the removal of the conveniently “convert everything to everything” JavaConversions import!

Every language has cases where it doesn’t do something implicitly because there is no obvious/agreed-upon semantics, I don’t see why users would be upset at all. It would simply be another day that ends with a “y”. I would expect professional software engineers to reject their colleagues’ code at code review if written in such a way that they could not agree on a semantic intuition, even if the code was technically valid.


@soronpo, @Ichoran, @alvae, and others have all made good arguments grounded in facts, often with code snippets demonstrating the ground truth. I have tried to provide these as well. The arguments in favor of unnamed <: named have not been convincing.

@alvae and @Sporarum are right to say that it is harder to remove things than to add them. This is objectively true. We all want named tuples, and we all disagree on which possible subtyping is a terrible idea. The obvious path forward is to go with named tuples without subtyping, have concise explicit conversions (as we do with .asJava, .asScala, or the proposed .convert), and leave the door open to adding subtyping or adding implicit conversions later.

Could you point to some discussion that argues to remove overloading because it’s evil?

I’ve never heard anyone say something like that. Some links would be helpful.

Instead I’ve heard about cases where people were asking for “even more overloading”… And that actually got implemented recently, no matter that “overloads are evil”:

https://docs.scala-lang.org/sips/multi-source-extension-overloads.html

But maybe I just misunderstood you? Maybe you don’t argue for removing that evil feature, but instead for making it more powerful, so it works in more cases?

The latter is something I’ve actually heard a lot: overloads don’t work well enough, and it would be good if overloading got better and worked in more cases. (This actually got better with the introduction of @targetName, btw.)

This looks like a bug to me…

And this error message would be just nuts in Scala. Because at runtime both types are equivalent!

There is also no “safety” argument to be made, because if you really wanted to be safe you wouldn’t use tuples at all. You would write something like:

case class IntBoolTuple(a: Int, b: Boolean) // Excuse the not very creative name…

def f(s: Seq[IntBoolTuple]) = ???
f(Seq(IntBoolTuple(1, true), IntBoolTuple(2, true)))

So Scala gives you already the tools to be safe. If you think you need it, just go for it!

But Scala currently doesn’t give you the tools to be less explicit. That’s the whole point of this proposal!

This feature needs to be usable in a “dynamic” context. We don’t need just another “super safe” tool; we have plenty of those. One could even say that in the end Scala got ruined by the people who constantly insist on theoretical purity, because in the mass market nobody actually cares. The majority of people are using dynamic languages. JavaScript and Python eat the world! That’s the reality we need to deal with. (No matter whether what the majority does is sane, it’s a fact that they won’t even look at something where they can’t do “the simple things” easily. They will just use Python and call it a day!)

Oh, I just realized that my above example isn’t safe and explicit enough. To make the people in this thread really happy, we actually need to write it like:

case class IntBoolTuple(a: Int, b: Boolean) // Excuse the not very creative name…

def f(s: Seq[IntBoolTuple]) = ???
f(Seq(IntBoolTuple(a = 1, b = true), IntBoolTuple(a = 2, b = true)))

I don’t see what’s there to “explore”.

You need some form of “conversion”, no matter what. Because otherwise this feature is unusable.

The only question is what exactly these “conversions” look like.

We know from bitter experience that having bidirectional implicit conversions is a trap.

So there are actually only two possibilities left: using subtyping in one direction and (implicit) conversions in the other (subtyping in both directions doesn’t work, of course), or having only manual conversions (of which one could be made implicit in the end).

But now we’re back to square one: the feature wouldn’t be convenient to use. You would have to put extra noise into your code just to achieve the obvious. That’s not Scala! That’s awful Java style, where the type system is not helpful at all but instead stands in your way the whole time, nagging you to write down absurd amounts of noise just to make the compiler happy.

That’s true. But by this argument we should avoid any new feature, because it could blow up later on.

Better safe than sorry, right? So just use assembly code, because that’s at least very explicit! (Fun fact: that was actually the major line of argumentation against higher-level languages and all the “implicit compiler magic” a few decades back.)

The possibilities are already known. I’ve written them down. I don’t see how this could change.

This claim looks plain wrong to me.

Without subtyping you wouldn’t be able to use generic methods on tuples without repeating all the field names. This would substantially hinder usability, especially where named tuples would be used most, namely in less strictly typed code.

I don’t get this.

For me, named tuples are “proto case classes”: case classes where you haven’t yet committed to giving them a name in the current state of development.

Intuition isn’t a very good argument. Because it looks different for different people.

For me having Tuple <: NamedTuple looked quite intuitive, given that I see a strong analogy to function parameters.

Nobody claimed otherwise.

I see some confusion around that, because people seem not to distinguish between the definition of a parameter list (which always has named parameters!) and the call site, where you’re free to use positional application or a kind of “structural application” (using names).

Nobody proposed that. That’s a misunderstanding I guess.

ParameterLists aren’t NamedTuples. They are a much more complex beast, involving “structural typing” and some other funny stuff.

All I’ve said is that NamedTuples can be seen, in some specific circumstances, as a special case of ParameterLists (hence the subtyping relationship I’ve drawn).

If that’s a major source of bugs for you, why don’t you just use named parameters in functions? This would work even with Java APIs…

Or could it be that writing named parameter lists everywhere is just too tedious and creates too much noise in the code?

That sounds great! :slightly_smiling_face:

So there could be some “super safe” lib for whoever wants that, and a built-in solution which employs “name inference” for the people who think you don’t need to be super explicit about everything when the compiler can easily figure out what you actually want without forcing you to write it down.

I think this example is misleading.

You’re using the “structural part” of parameter lists. So you would need to compare with passing Records, not (named) tuples. But we don’t have Records, so this is moot.

The correct analogy is imho:

var foo: (Int, Int) = ???
foo = (x = 2, y = 4) // : (x: Int, y: Int)
foo = (y = 4, x = 2) // : (y: Int, x: Int)

The question would be what the type of foo is each time after assignment.

One could argue that it’s the type noted in the comment, and such a type ascription should actually work… (But then only the first of the assignments would work, of course.)

Exactly!

And then they go and use PySpark…

One of the last bastions of Scala is falling, exactly because it’s not convenient enough to use Scala for such tasks.

Going further in that direction will kill Scala. Please stop that. We need to appeal to the mass market right now, not to some hardcore Haskell expats. The Haskell folks already have their toys; they don’t need even more.

I think there may be truth in that. Such a dedicated feature looks like an improvement to the language in general.

But first of all, that’s not the proposal on the table here, and it would also be just another tool for when you need super strict code.

What is on the table are named tuples. And the question is whether we want “name inference” working, so you don’t need to commit to some specific field names in calling code (like you don’t need to commit to some explicitly given types when you rely on type inference).

The discussion of whether type inference is “good” or “plain evil” because you’re not explicit any more is imho settled in Scala.

In other languages (like C++ with auto, or C# with var) people are still discussing it, and the arguments look pretty close to what came up in this thread: people lament that with type inference there is room for bugs in case the compiler infers “the wrong type”, and that therefore you always need to be explicit to be “safe”.

I think “name inference” for tuples isn’t much different from type inference…
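
To make the analogy concrete (the second line assumes the proposed unnamed-to-named adaptation is accepted):

val n = 42                         // type inference: the type Int is never written
val p: (x: Int, y: Int) = (1, 2)   // "name inference": the names x and y are never written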

That’s out of context.

The case is about converting named and unnamed tuples back and forth, not “everything”.

This is definitely needed, one way or another, if only to satisfy the LSP: both tuple types are the same at runtime and therefore interchangeable, so you need to be able to exploit this fact in your code. (Ideally at zero runtime cost.)

If I want to express:

val bob: Person = ("Bob", 33)
bob.zip(("Miller", "years"))

I would find it super annoying if I needed to clutter my code with noise “only to make the stupid compiler happy” and force it to do the obvious. Code like:

val x: Person = (name = "Bob", age = 33)
bob.zip((name = "Miller", age = "years"))

or

val x: Person = ("Bob", 33).named
bob.zip(("Miller", "years").named)

includes noise for no reason.

That’s not what I expect from Scala!

Being explicit for the sake of being explicit is the most terrible part of a language like Java.

The compiler and the type system aren’t helpful, they stand in the way!

And no, that has nothing to do with “safety”. If I wanted to be “safe” I would never use underspecified types like Ints or Strings, and I would likely not use tuples (named or unnamed) at all, because that’s not what tuples are good for.

A completely unrelated idea (thinking out loud):

What if there were no “named tuples” at all, but we had some syntax shortcuts to define “new types”, and also some convenient way to list the type params of a tuple at the value level, so we could have named “selectors” on tuples?

type Person = ([Name = String], [Age = Int]) // or something like that…

Imagine the named type parameters would also automatically introduce “new type wrappers” (and some magic for named “selectors”).

You could also import (through some compiler-provided magic) implicit conversions that could “lift” plain values into the “new type wrappers”, and for compatibility in the other direction the opaque “new type” would be a subtype of the wrapped type.

Depending on whether you have an import like

import Person.given.implicitFieldWrapperConstructors // The name could be better…

in scope, code that tries to assign ("Bob", 33) to something typed as Person would either work or fail.

If the implicit conversion is imported, you could write:

val bob: Person = ("Bob", 33)
// or
val bob = ("Bob", 33): Person

Otherwise you would need to be explicit and use the compiler-generated “new type” constructors:

val bob: Person = (Name("Bob"), Age(33))
// or even just:
val bob = (Name("Bob"), Age(33))

There is nothing to argue regarding subtyping, as there would be only “normal tuples”. Just that you could make them more expressive by using very specific types for the fields, which the compiler could mostly generate for you.

What would be missing is a way to get at the “field names” in user code. As there are actually no field names at all, you need some way to get the types and lower them to the value level (for example as singleton String types, for which some further type-level machinery already exists). Something like that would also need to happen when you use “selectors” on the tuple.

The problem: everything mentioned sounds very magical to me… Not sure this makes any sense at all…

I had this glimpse of an idea because I think part of the problem is the usage of underspecified types for tuple fields. All the “awkward” examples so far rely on that fact. But with more expressive (new) types you could never confuse an (Int, Int) with an (Int, Int). It would be something like (Int, Int) vs. (XAxis, YAxis) which is clearly not compatible by nominal typing without any conversions in place, even when XAxis <: Int and YAxis <: Int. :grin:

My point is:

Maybe there is no issue with “name inference” and the actual core of the problem is “primitive obsession”?

Maybe the language would profit from some (general) tools to fight “primitive obsession” in the first place, so there wouldn’t be any real issue with “name inference” for tuples at all?
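
(For what it’s worth, today’s opaque types already give a zero-cost version of the XAxis/YAxis idea above; a minimal sketch of my own:)

object axes:
  opaque type XAxis <: Int = Int
  opaque type YAxis <: Int = Int
  object XAxis { def apply(i: Int): XAxis = i }
  object YAxis { def apply(i: Int): YAxis = i }

import axes.*
val point: (XAxis, YAxis) = (XAxis(2), YAxis(4))
// val wrong: (XAxis, YAxis) = (2, 4)   // rejected: an Int is not an XAxis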

There are other languages with named/positional record blends, for instance Dart’s Records,

which are fixed-sized, heterogeneous, and typed

Dart records are a generalization of both tuples and structural records, where unnamed fields fall back to positional naming.

I really like that they have kind of unified “records” with “parameter lists”.

Fields in record expressions and type annotations mirror how parameters and arguments work in functions.

But the rest looks confusing. It’s more like the Python design with separate kwargs as I read it.

Also I’m not sure what I should make of:

void main() {
  // ({int x, int y, int z}) point = (1, 2, 3);
  // ^ Error: A value of type '(int, int, int)' can't be assigned
  //   to a variable of type '({int x, int y, int z})'.
  
  ({int x, int y, int z}) point = (1, 2, 3) as ({int x, int y, int z});
  // ^ (runtime) Uncaught Error: TypeError: Record (1, 2, 3): type '(int, int, int)'
  //   is not a subtype of type '({int x, int y, int z})'
  
  print('${point}');
}

I read it like this: in Dart a “tuple” (Record (1, 2, 3)) has a different runtime representation than a “record” (type ({int x, int y, int z})).

So it makes sense that there is no subtype relation at compile time either (as seen in the first part of the code, which doesn’t use the as cast). So I don’t know what this means for Scala.

All other typed languages I know do allow implicit conversions in both directions, and it seems Swift is no exception. In “Redeclaration of named tuple members - Using Swift - Swift Forums” I see this code:

let counter = 119
let foo = counter.quotientAndRemainder(dividingBy: 60)
let bar: (Int, Int) = foo
let final: (min: Int, sec: Int) = bar
Subtyping can be tricky because it depends on the precise details of how types are modelled. I remember many people including luminaries like Bertrand Meyer having argued heatedly that types like mutable arrays must be covariant, and yet, it’s not true. So to get to the bottom of this, and to be sure that the proposal is indeed sound, we need a detailed semantic argument, that I am going to attempt to give now. Please bear with me, if you are interested in this stuff.

Let me start with this puzzling question: Why is Nil a sub-type of all lists whereas {} is a super-type of all records?

I am going to give a layman’s domain-theoretic interpretation of subtyping to clarify this question. Domain theory models types as a kind of sets and allows for an explicit bottom value to represent “undefined”. I wrote “a kind of sets” since values in domain theory have structure, with bottom being “smaller” than other values. Types in domain theory are downward-closed sets with respect to this structure; that’s why they are called “ideals”. The type consisting of only bottom is essentially Nothing in Scala.

Lists can be seen as functions from ints to element values. They are defined on all ints, since I can take xs(i) with any index i I choose. It’s just that the result might be bottom. It is bottom if the index is outside the domain of the list. So Nil is the type of exactly one function, the one that maps every index to bottom. That makes it a smaller type than the type of all lists (in the types-as-sets intuition). OK so far?
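
In plain Scala terms (with a thrown exception standing in for bottom), this reading looks as follows; my own illustration:

val xs = List(1, 2, 3)
xs(1)      // defined: 2
// xs(7)   // "bottom": throws IndexOutOfBoundsException

// Nil is the one function that maps *every* index to bottom:
// Nil(0), Nil(1), ... all throw, so Nil inhabits a smaller type
// than lists that are defined somewhere.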

But records can also be seen as functions from names to element types N -> T_N, very similar to lists. So why is the empty record then a supertype of all other records? It’s worth pondering this for a while before moving on, because it is not at all obvious.

.
.
.

One difference is that record selection is not allowed to return bottom. Therefore, {} cannot be a subtype of other records, since selection with any arbitrary name is not defined. But why is it a supertype? The short answer is, because people wanted it to be. The longer answer is, people found it convenient that longer records are in a subtype relationship with shorter records, so they came up with the following construction:

The values of a record like {name: String, age: Int} are functions that map "name" to a String, "age" to an Int, and every other name to anything, including possibly bottom.

So that’s why {name: String, age: Int} <: {age: Int}: Both map "age" to Int, and the left record type maps "name" to String but the right record type maps it to any value, including bottom. There are fewer record values in the left type than in the right type, hence, subtyping holds. And by the same construction, every record is a subtype of the empty record. Furthermore, since field selection is not allowed to return bottom, selecting undeclared fields of a record must be rejected at compile time.
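
Scala’s structural refinement types exhibit exactly this width subtyping; a small self-contained example of my own:

import scala.reflect.Selectable.reflectiveSelectable

type HasAge = { def age: Int }

def ageOf(x: HasAge): Int = x.age

val bob = new { def name = "Bob"; def age = 33 }
ageOf(bob)  // fine: the record type with more fields is the subtype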

Aside: This was not at all obvious when record subtyping was invented. In fact I remember a paper (I believe in CACM or SIGPLAN Notices) with a heated argument that record subtyping was exactly the wrong way round and that Wirth was an idiot for having proposed it like this for Oberon. The argument went like this: obviously, a record type with fewer fields describes fewer values than a record type with more fields, so shorter records must be subtypes of longer ones! I think that paper passed peer review, so more than one person must have thought like this at the time.

Now, going back to named vs unnamed tuples. They are both tuples, that is, functions that map an index range to values of index-dependent types. The names come into play like this: they are a separate member function that maps names to either directly name-dependent types, or to indices, whatever modeling you choose. The function exists only at compile time, but this should not matter for the modeling. Let’s call these functions dictionaries. And let’s pick indices, so the naming function is a function from String to Int that returns a non-negative number for every defined name, and bottom for the rest.

Now, for an unnamed tuple, this function is not defined anywhere, so it maps every name to bottom. By contrast, the function for a named tuple maps all defined names to integers, and the rest to bottom.

Concretely, we can model these dictionary functions as lists of pairs of names and value types. A name might not be in the dictionary; then selecting that name will return bottom (which is a legal value at compile time), and that will produce a compile-time error. An unnamed tuple has Nil as the type of its dictionary, which means that its dictionary is a subtype of any named tuple’s dictionary. And that explains it.

In summary, it’s a subtle point, and it all comes down to whether bottom is a legal value or not when selecting a field or finding the index of a name at compile time. Or, more implementation-oriented: unnamed <:< named because dictionaries are lists, not records.
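
The “dictionaries are lists, not records” punchline can even be observed with plain Scala values (my own rendering, with List[Nothing] playing the role of the unnamed dictionary):

type Dictionary = List[(String, Int)]   // name -> index pairs

val namedDict: Dictionary = List("x" -> 0, "y" -> 1)
val unnamedDict: List[Nothing] = Nil    // defined nowhere: every lookup is "bottom"

// List[Nothing] <: List[(String, Int)] by covariance, so the unnamed
// dictionary is accepted wherever a named one is expected: unnamed <: named.
val ok: Dictionary = unnamedDict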

ADDENDUM: I have shown one domain-theoretic construction that shows that unnamed <: named has a model and is therefore sound. That does not say that the other direction is unsound. The other direction also has models, even though they are somewhat tricky to set up.

The most obvious approach (which I believe matches the intuitions of some of you) is that named tuples are records that have an erased dictionary field, and unnamed tuples are records without that field. But that does not work, since tuples are not records; they don’t have width subtyping. So the next model is that tuples are really single-element records {values = <tuple>} and named tuples are two-element records {values = <tuple>, names = <dictionary>}. That can work, but I find it overall a bit simpler to say that tuples always have a value part and a dictionary part, and that the dictionary is Nil for unnamed tuples.

I have argued at length before that unnamed <: named is useful and named <: unnamed is hurtful, so it should be clear which direction we should pick. And now that we have established soundness, I see no obstacle to picking the direction that is right.

First, thank you for this very clear proof of soundness; I believe it to be extremely valuable.

I however have an issue with one tiny part of your conclusion:

It presents a false dichotomy: before “which direction”, we really need to ponder “whether we need a direction”
(which I argue is “no”, and we can always add one later, especially given your preferred choice only adds 4 characters to the implementation!)
Even assuming we need to choose, I am still not convinced that this is the right direction; I would need more examples using both systems, and in particular I am wary of cases like this:

// following the japanese convention
type Hito = (lastname: String, firstname: String)

// other file
type Person = Hito

// yet another file
("John", "Smith"): Person // (lastname = "John", firstname = "Smith")

This is a somewhat convoluted example, but I believe it highlights what can happen when there are multiple degrees of separation between the unnamed tuple value and the named tuple type.

Online, not really, but in person it was a common occurrence when discussing with compiler experts.

As a user who loves regularity, I love overloading, and I dislike that some features do not work with it

As a compiler engineer, I hate edge-cases and obscure bugs, and overloading creates a lot of them

(I will not comment on it further in this thread, as it has a different purpose)

I don’t really understand: we have records in the form of structural refinement types, and vars do not change their types; foo will always have type (Int, Int).

From the perspective of semantic subtyping, subtyping is defined on top of semantics. Here, the runtime semantics of named tuples is defined to be the same as that of unnamed tuples; they are therefore equivalent from the perspective of semantic subtyping.

However, that does not mean they should be defined to be equivalent in the language. In my opinion, this fact seems to suggest that the theory of semantic subtyping has no preference regarding the subtyping relationship of named/unnamed tuples. This is because type theory primarily cares about runtime safety, but in practice we also want subtyping to improve usability, avoid more errors, reduce engineering cost, etc.

On the other hand, from the perspective of semantic typing there is a question: is the erasure semantics of named tuples essential, or just accidental (an implementation convenience)? If there’s scope to give different semantics to named tuples, then the subtyping story would be different. In particular, programmers might have an intuitive (wrong?) semantics of named tuples that differs from the erasure semantics; that might impact usability/learnability.

In Scala 2 the supertagged library provided extremely brief syntax for defining newtypes, and it avoids nearly every issue with AnyVal-based wrappers. Having syntax for this would make named tuples functionally irrelevant, and it is potentially achievable via very simple sugar that side-steps most of the disagreements in this thread (I’m using Scala 2 here because supertagged hasn’t been ported to Scala 3, and the differences are minimal):

object Test {
  def foo: ([Name = String], [Age = Int]) = ???
  
  // possible expansion (re-using any existing definitions of this type defined in the scope of Test):
  object Name extends supertagged.NewType[String]
  type Name = Name.Type
  object Age extends supertagged.NewType[Int]
  type Age = Age.Type
  def foo: (Name, Age) = ???
} 

The way supertagged solved this was to provide supertagged.TagType[T], which has a subtype relationship with T, and supertagged.NewType[T], which does not. In practice, this makes it extremely easy to distinguish and control the behavior you want WRT automatic conversion between the newtype and the base type.
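
In Scala 3 the same two behaviors can be hand-rolled with opaque types; a rough approximation of my own (not the supertagged API):

object tags:
  opaque type Tagged[T] <: T = T   // like TagType: subtype of T, widens for free
  opaque type Boxed[T] = T         // like NewType: no subtyping in either direction
  def tag[T](t: T): Tagged[T] = t
  def box[T](t: T): Boxed[T] = t
  extension [T](b: Boxed[T]) def unbox: T = b

import tags.*
val widened: Int = tag(5)          // Tagged[Int] flows out as an Int
val explicit: Int = box(5).unbox   // crossing back from Boxed must be explicit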

If this were elevated to syntax, it could be the difference between ([Name = Int]) and ((Name = Int)), mirroring open and closed bounds.

Because it doesn’t work with Java:

scala> "fish".substring(startIndex = 1, endIndex = 3)
-- Error: ----------------------------------------------------------------------
1 |"fish".substring(startIndex = 1, endIndex = 3)
  |                 ^^^^^^^^^^^^^^
  |method substring in class String: (x$0: Int, x$1: Int): String does not have a parameter startIndex
-- Error: ----------------------------------------------------------------------
1 |"fish".substring(startIndex = 1, endIndex = 3)
  |                                 ^^^^^^^^^^^^
  |method substring in class String: (x$0: Int, x$1: Int): String does not have a parameter endIndex

And if it did work, I’d have to remember whether people used start and end or startIndex and endIndex or i0 and iN or from and until or what.

In Scala code I do sometimes use the names, though since they’re not required people haven’t put much thought into normalizing them:

scala> Vector(1, 2, 3, 4).slice(start = 1, end = 3)
-- Error: ----------------------------------------------------------------------
1 |Vector(1, 2, 3, 4).slice(start = 1, end = 3)
  |                         ^^^^^^^^^
  |method slice in trait IndexedSeqOps: (from: Int, until: Int): Vector[Int] does not have a parameter start
-- Error: ----------------------------------------------------------------------
1 |Vector(1, 2, 3, 4).slice(start = 1, end = 3)
  |                                    ^^^^^^^
  |method slice in trait IndexedSeqOps: (from: Int, until: Int): Vector[Int] does not have a parameter end

but

scala> Vector.range(from = 1, until = 3)
-- Error: ----------------------------------------------------------------------
1 |Vector.range(from = 1, until = 3)
  |             ^^^^^^^^
  |method range in trait IterableFactory: (start: A, end: A)(implicit evidence$3: Integral[A]): Vector[A] does not have a parameter from
-- Error: ----------------------------------------------------------------------
1 |Vector.range(from = 1, until = 3)
  |                       ^^^^^^^^^
  |method range in trait IterableFactory: (start: A, end: A)(implicit evidence$3: Integral[A]): Vector[A] does not have a parameter until

It’s not primarily that it creates too much noise in the code; it’s that the experience is too low-quality because of inconsistencies. Because function parameter names are always optional, they’re often not given very much thought. In my personal library, I try very hard to be internally consistent, but if I’m being consistent with naming, I can also be consistent with semantics, so I always use i0 and iN indices over start and length if possible. At that point, since it’s always indices, and I always use the exclusive end index, I don’t need the names any longer.

I can see that named tuples are good in many use cases, but what is sad is that there are a lot of similar record-like language features (structural types, case classes, named tuples), and the interoperability between them is not that good.

As a Scala user, it is a burden to choose which kind of type your API design is based on, and it is another burden if you want to convert between them.

I wonder if there is a possibility to design a single record-like feature that is:

  1. supporting named fields
  2. extensible and composable, i.e. multi-inheritance of records; adding new fields would not break ABI backward-compatibility
  3. supporting type-level HMap / HList operators
  4. supporting both Scala.js and JVM
  5. as efficient as possible