Of course that’s possible:
def divmod(x: Double, y: Double): (divisor: Double, modulo: Double) =
  (x / y, x % y)
That’s excellent. However I’m not sure about the subtyping relationship. To me it would seem natural to have named tuples as a subtype of their unnamed tuple structures. But maybe there are major use cases that this would cause a problem for that I’m not aware of.
I can’t think of a use case for mixing named and unnamed off the top of my head, but is there a specific reason why it can’t be allowed? Couldn’t the numeric position names be synthetically added to any params without an explicit name?
So:
val bob = (name = "Bob", 47)
val bob2 = (name = "Bob", _2 = 47)
assert( bob == bob2 )
I’ve just seen this comment from the PR discussion:
In the current implementation, we require all elements of a tuple to be named or none to be named. Maybe we can relax this and only require that a tuple type consists of either all named fields or no named fields.
While that isn’t exactly what I asked, it does seem to address my underlying thought. I like the idea of tuple types requiring names, but values being flexible. That aligns very nicely with the parameter list analogy.
It’d be good to see how this interacts with typeclass instance definitions. In particular, I’d expect the standard library to let me do:
type Person = (name: String, age: Int)
val bob: Person = (name = "Bob", age = 33)
val alice: Person = (name = "Alice", age = 35)
List(bob, alice).sorted // List(alice, bob)
Which looks like it would require a new instance:
given lexicographic[N <: Tuple, V <: Tuple]: Ordering[NamedTuple[N, V]] with
  def compare(x: NamedTuple[N, V], y: NamedTuple[N, V]): Int =
    summon[Ordering[Tuple]].compare(x.toTuple, y.toTuple)
… except this doesn’t work because we’re already missing a generic instance for Ordering on Tuple, and I’m not exactly sure what’s the best way to define one (and how it would interact with the existing instances for Ordering on specific Tuples).
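For what it’s worth, here is a minimal sketch of what a generic lexicographic Ordering on plain tuples might look like in Scala 3. The names elementOrderings, fromElements and lexicographic are made up for this illustration, and it is deliberately an inline def rather than a given, precisely to sidestep the open question of how it would interact with the existing fixed-arity instances. It says nothing about how NamedTuple itself should be handled.

```scala
import scala.compiletime.{erasedValue, summonInline}

// Hypothetical helper: summons an Ordering for each element type of T.
inline def elementOrderings[T <: Tuple]: List[Ordering[Any]] =
  inline erasedValue[T] match
    case _: EmptyTuple => Nil
    case _: (h *: t) =>
      summonInline[Ordering[h]].asInstanceOf[Ordering[Any]] :: elementOrderings[t]

// Hypothetical helper: compares element by element, first difference wins.
def fromElements[T <: Tuple](ords: List[Ordering[Any]]): Ordering[T] =
  new Ordering[T]:
    def compare(x: T, y: T): Int =
      x.productIterator
        .zip(y.productIterator)
        .zip(ords.iterator)
        .map { case ((a, b), ord) => ord.compare(a, b) }
        .find(_ != 0)
        .getOrElse(0)

// Not a `given` on purpose: it has to be passed explicitly.
inline def lexicographic[T <: Tuple]: Ordering[T] =
  fromElements[T](elementOrderings[T])

@main def orderingDemo(): Unit =
  val people = List(("Bob", 33), ("Alice", 35))
  println(people.sorted(using lexicographic[(String, Int)]))
```

A named-tuple instance could then delegate to something like this through toTuple, assuming the ambiguity with the existing instances can be resolved.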
Finally!
Thank you so much, I love it!
The first question that comes to my mind: Could named tuples actually become “proper structs” on Scala Native?
The other thing is, as far as I understand this proposal, it’s not like this would somehow “close the road to enhanced structural types” at all. Quite the contrary, I think: Named tuples could in theory become the underlying structure for some future “HMaps”, so we could get structural types like in TypeScript also on top. It would take some (type level?) conversion mechanism to make this happen, where you “materialize” records / structural types as named tuples for the runtime representation. But this looks doable in the next step.
Regardless of possible future extensions, named tuples are a great addition on their own. It was proposed so often. It’s great that Scala listens once more to its community and makes such superb improvements! With this improvement the language would get one step closer to perfection, imho. We really just need a marketing department to spread the word.
While the symmetry between tuples and parameter lists is nice, I do think the relationship is fundamentally the wrong way around. I agree with @sjrd in https://github.com/lampepfl/dotty/pull/19075#issuecomment-1827403119 when he said that the direction of sub-typing is confusing: subtypes should have stronger contracts (whether encoded in the type system or not), more information known about them, and define more methods than their supertypes. In this case:
Stronger contract: (x: Int, y: Int) cannot be passed in place of (y: Int, x: Int), whereas all (Int, Int)s are interchangeable.
More information: x is not just an Int, but an Int along the X-axis.
By this logic, named tuples should definitely be a sub-type of positional tuples.
@odersky mentioned in Named tuples experimental first implementation by odersky · Pull Request #19075 · lampepfl/dotty · GitHub that Python is not statically typed, and thus does not set a precedent here, but that is incorrect: Python’s namedtuple is unambiguously a subtype of tuple.
When you look at it from a dynamic perspective, an instance of namedtuple supports both a namedtuple’s .foo/.bar/.qux, as well as a positional tuple’s [0], [1], [2]. By duck-typing, namedtuple can be passed anywhere a tuple can be, and is thus a subtype.
namedtuples have a dynamic isinstance relationship with tuple, and not the other way around:
from collections import namedtuple
Foo = namedtuple("Foo", "a b c")
isinstance(Foo(1, 2, 3), tuple) # True
isinstance((1, 2, 3), Foo) # False
Statically, NamedTuples are likewise treated as subtypes of Tuples of the same arity:
from typing import NamedTuple, Tuple
# Define a named tuple
class Point(NamedTuple):
    x: int
    y: int
# Function expecting a positional tuple
def process_tuple(t: Tuple[int, int]):
    print(t[0], t[1])
# Create a named tuple
point = Point(1, 2)
# Call the function with the named tuple
process_tuple(point) # OK
# Function expecting a named tuple
def process_named_tuple(t: Point):
    print(t.x, t.y)
# Call the function with a positional tuple
process_named_tuple((1, 2)) # main.py:23: error: Argument 1 to "process_named_tuple" has incompatible type "tuple[int, int]"; expected "Point" [arg-type]
IMO we should follow the Python convention that NamedTuple <: Tuple, rather than the Scala argument-list convention that positional arguments can be used anywhere a named argument can be used.
Long-term, IMO we should probably have a stricter separation of positional and named arguments the same way Python/Swift have. The status quo way we can swap named/positional parameters is looser than it should be; Python, for example, separates positional-only and keyword-only parameters with its / and * syntax.
Rather than twisting the named/positional-tuple subtyping relationship to match what we currently have for named/positional arguments, IMO we should follow Python’s lead for the subtyping of named/positional-tuples and slowly twist the named/positional arguments to be more Python/Swift-like instead.
Notably, Swift allows unrestricted conversions in both directions between named and unnamed tuples, so both assignments below work.
var tuplePositional: (String, Int) = ("Suresh Dasari", 200)
var tupleNamed: (name:String, id:Int) = tuplePositional
print("Hello World \(tupleNamed)")
var tupleNamed: (name:String, id:Int) = (name: "Suresh Dasari", id: 200)
var tuplePositional: (String, Int) = tupleNamed
print("Hello World \(tuplePositional)")
I can understand why we may not want to do that at the subtype level, because subtyping transitivity would mean that they’re the same type. But we could conceivably provide implicit conversions since those are non-transitive by default. But we probably should just stick to a one-directional subtyping relationship and provide an explicit cast/conversion method to go the other way.
On an unrelated note, named tuples have a very nice synergy with https://contributors.scala-lang.org/t/unpacking-classes-into-method-argument-lists/6329/93:
… case classes as a type to be passed to unpack.
unpack makes conversion between named tuples and case classes trivial: you would be able to call MyCaseClass(*myNamedTupleInstance) or (*myCaseClassInstance) to convert between them without needing any explicit library or language support. These conversions would “just work”.
For reference, C# also allows conversions in both directions (but don’t ask how it’s implemented, IDK):
using System;
public class Program
{
    public static void Main()
    {
        (String, int) tuplePositional = ("Suresh Dasari", 200);
        (String Name, int Id) tupleNamed = tuplePositional;
        Console.WriteLine($"Hello World {tupleNamed}");
        (String Name, int Id) tupleNamed2 = (Name: "Suresh Dasari", Id: 200);
        (String, int) tuplePositional2 = tupleNamed2;
        Console.WriteLine($"Hello World {tuplePositional2}");
    }
}
[ see: C# Online Compiler | .NET Fiddle ]
Note: In C# the names seem to be just a compile-time entity. The code above prints the same output twice.
Scala could do better, as I understand this proposal here.
OTOH I would argue against imitating Python. Their *args / **kwargs distinction always felt like a major kludge to me. It looks like something that was made this way because it wasn’t feasible to hide this distinction when Python was created. But our computers and software are much more powerful today. Just let the machine figure out how to convert those structures as needed, and hide the tedious details from the user. (Also having static types helps with that, I guess. Something Python didn’t have when *args / **kwargs was invented. Otherwise this would be just different types of tuples / arguments, I think…)
In general I like the idea to unify parameter lists and (named) tuples very much. Imho parameter lists are nothing else than some funky kind of named tuple(s). But there need to be conversions in all directions to make something like that convenient to use. Forcing end-users to fiddle with *args / **kwargs is the opposite of convenience!
Conversions by subtyping in both directions are fundamentally wrong, since they make all names meaningless (detailed argument in the linked thread). Subtyping is supposed to be transitive. So with subtyping directions in both ways we get:
(foo: Int, bar: Int) <: (Int, Int) <: (baz: Int, bam: Int)
That’s clearly not something we want to have.
That leaves one direction available. Arguably unnamed <: named is much more useful than the other way around.
If we assume named <: unnamed then named tuples are regular tuples. The names are conceptually on the side and enable us to also do selection by name. We can use all tuple functions including _1, _2 on named tuples. But that means we can also use functions like ++ on tuples. Both of the following would be OK:
Bob ++ (1, true)
(1, true) ++ Bob
and would yield an unnamed tuple. Furthermore, we can also write the erroneous Bob ++ Bob and instead of reporting an error about duplicate names this would be forced to strip the names and return an unnamed 4-tuple. I have the impression this would lead to brittle code. Imagine concatenating two long tuples from a database that accidentally share a name and instead of an error you get an unnamed tuple!
Moreover we can re-define none of the operations of Tuple. An example: Arguably map on a named tuple should return a named tuple. But the map operation is already taken on Tuple, and it is an inline function, so we cannot override it. This means we need a new operation like namedMap to do the “correct” map on NamedTuple. And a user naively using map on a named tuple would lose all the names. Yuck!
By contrast unnamed <: named is undeniably useful and has none of the shortcomings of the other direction. But what is its semantic intuition?
If unnamed <: named then tuples are named tuples. Semantically, a named tuple is modeled as a type level tuple of singleton types representing its names and a regular tuple of values. So, what are the names of an unnamed tuple? By the subtyping relation we are forced to have them be Nothing. So an unnamed tuple is a named tuple where each name type is Nothing. Once you think of it, it makes sense: There is no name, so the type of the name is Nothing. Just like the type of head on an empty list is Nothing since it does not return a result.
Nothing is not intuitive at all at first, but arguably Scala made some breakthroughs in PL design by embracing it. This seems to be another case where Nothing is really useful!
As an aside, Kotlin creator Andrey Breslav gave a nice talk recently that recounted that C# specifically suffered from not having Nothing, which meant that throw had to be a statement instead of an expression. Funny that C# and TypeScript fall into the same trap again for named tuples.
I have shown that if we take subtyping seriously then unnamed <: named is the only relation that is useful and makes sense. The reverse relation named <: unnamed is actually hurtful, since it forces us to re-use all operations on tuples unchanged also on named tuples even where this does not make sense. I also want to already stave off any arguments that we should fiddle with subtyping, for instance by adding some additional rules or restrictions. IMO that’s a rabbit hole not worth jumping into.
My argument was strictly about subtyping. Conversions between tuples are another topic entirely. Of course you should always be able to map named to unnamed tuples by an explicit conversion. In fact it already exists:
val pair: (String, Int) = Bob.toTuple
We could add that functionality to a Conversion on tuples, so the conversion could be used implicitly.
Should we forgo subtyping altogether and instead define implicit conversions in both directions? Arguably, that’s a bad idea since we would then never be sure what conversion the compiler applied, which would in turn determine the type of a tuple. We quickly realized that bijective implicit conversions were a really bad idea when we defined JavaConversions. Let’s not make the same mistake here!
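To make the one-directional option concrete, here is a rough sketch of a Conversion that only strips names. Labeled is a hypothetical stand-in invented for this illustration (a wrapper whose first type parameter carries the names), not the proposal’s NamedTuple, and stripNames is likewise a made-up name; the point is only to show the shape of a single, non-bijective conversion.

```scala
import scala.language.implicitConversions

// Hypothetical stand-in for a named tuple: N carries the names as
// singleton string types, V is the underlying tuple of values.
final case class Labeled[N <: Tuple, V <: Tuple](values: V)

// A single conversion, from named to unnamed only. Nothing converts back.
given stripNames[N <: Tuple, V <: Tuple]: Conversion[Labeled[N, V], V] =
  (labeled: Labeled[N, V]) => labeled.values

val bob = Labeled[("name", "age"), (String, Int)](("Bob", 33))

// Accepted only because of the Conversion above.
val pair: (String, Int) = bob
```

Because there is no conversion in the opposite direction, the compiler has at most one candidate to apply here, which avoids the JavaConversions-style ambiguity.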
Stripping names is what Python does when you append tuples and namedtuples. I agree it’s not an ideal result.
Would one solution be to have some kind of “partially named” tuples? That’s effectively what we have with argument lists, where you can call foo(1, 2, 3, x = 4, y = 5, z = 6). What if we define (1, 2, 3) ++ (x = 4, y = 5, z = 6) to be (1, 2, 3, x = 4, y = 5, z = 6)? Would such a thing be possible?
Moreover we can re-define none of the operations of Tuple. An example: Arguably map on a named tuple should return a named tuple. But the map operation is already taken on Tuple, and it is an inline function, so we cannot override it. This means we need a new operation like namedMap to do the “correct” map on NamedTuple. And a user naively using map on a named tuple would lose all the names. Yuck!
This seems like an implementation issue. Not to say those aren’t real, but I wonder if a way could be found to work around it? e.g.
Could .map be made into a normal method or extension method, rather than inline, such that named tuples could do a covariant override with a more specific return type?
Could .map be made transparent inline, such that when called on a named tuple it would use an inline if to also return a named tuple?
Using Nothing works, but I’m not sure it’s good. By that logic, any type is a subtype of anything else!
scala.Tuple2[T1, T2] is a sub-type of scala.Tuple3[T1, T2, Nothing]
java.lang.Object is a sub-type of java.lang.CharSequence, with covariant overrides def length(): Nothing = ??? and def charAt(index: Int): Nothing = ???
In effect, using Nothing is just throwing out static typing and going to a Pythonic “call whatever method you want, it’ll just blow up with an exception if it’s invalid”. The fact that Nothing can be used in types/expressions is more convenient than throw being only usable in statements, but typically it’s still just throwing exceptions at runtime, and is a code smell that usually indicates you have your subtyping relationships wrong or backwards.
I’m not sure if there’s some clever Scala 3 type-level logic stuff or compiletime.ops stuff going on that will mean this isn’t as dangerous as normal usages of Nothing in Scala code. And sometimes you really can’t capture everything you want in the type system, and have to resort to Nothing/throw-ing in corner cases. But the fact that the proposal intentionally uses Nothing in a type definitely seems suspicious and suggests that we’re doing it wrong.
Basically the objections to NamedTuple <: Tuple seem like implementation restrictions that in theory could be made to work while preserving the Liskov Substitution Principle, while the implementation of Tuple <: NamedTuple using Nothing seems like a direct violation of LSP. It’s possible that there may be other factors that weigh in on this decision, but the issues with Tuple <: NamedTuple definitely seem more serious: implementations can change and be tweaked and restrictions lifted, but LSP is pretty fundamental and isn’t going away anytime soon, and typically violating LSP means an endless stream of edge cases that will never work quite right.
Just thinking out loud:
What if all functions would just take NamedTuples as arguments?
If you pass the subtype, an (unnamed) Tuple, it can always be “up-cast” to the named version, as you can map the positional arguments onto the names (the tuples need to have the same length to be even considered related by subtyping). The “up-cast” seems also somehow related to the fact that function params are contra-variant.
Passing positional arguments to functions (which have param lists, which have named arguments) works just fine in today’s Scala. The names of function parameters are forgotten / erased at runtime. The same would be the case with NamedTuples at runtime.
Tuples in general look really very related to param lists…
Thinking in subtypes, I guess, it would look something like:
Tuple <: NamedTuple <: ParameterList
Functions always take ParameterLists.
But it’s always fine to pass a subtype (LSP). So passing positional arguments (a Tuple) is OK, and will never do harm.
No, Nothing used in that way does not compromise type safety. My argument did not assume that any part of Scala changes. In particular, a type with a field x: Nothing was never equivalent to a type without that field and isn’t now either. So the argument that this amounts to dynamic typing is wrong.
No, it isn’t.
No, that does not hold either.
Is LSP violated? That’s an interesting question. LSP is about runtime behavior, where all tuples are equivalent, so that means LSP must hold in that sense. But there is something interesting going on with Nothing when it comes to compile time.
If I say, x: String and then I refine to x: Nothing, can I still do everything with it I could do before? Of course not: For instance I can’t take the length of x if its type is Nothing. Even though I could upcast x to String and then take the length again. So that shows that LSP does not work for compile-time subtyping. (There are other violations as well which are not linked to Nothing, for instance connected to overloading or implicit resolution.)
If unnamed <: named then what I would have expected semantically is that an unnamed tuple is a special case of a named tuple where all the fields are named _1, _2, .... So (Int, Int) <: (arg1: Int, arg2: Int) does not make sense to me unless also (otherName1: Int, otherName2: Int) <: (arg1: Int, arg2: Int).
So intuitively I would expect either all tuples with the same field types are considered the same type or that named tuples are subtypes of their unnamed counterparts.
The accessors _1, …, _22 are not defined for named tuples. It’s wrong to equate (Int, String) with (_1: Int, _2: String). If you do that, you will no longer be able to map between named and unnamed tuples at all.
To repeat, the following model holds for the type structure at compile time:
(name: String, age: Int) ~~ (("name", "age"), (String, Int))
(String, Int) ~~ ((Nothing, Nothing), (String, Int))
Here, "name" and "age" are the singleton types with these string literals as values.
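The model can be checked with ordinary Scala 3 types. The sketch below is only an illustration: PersonModel, UnnamedModel and OtherModel are hypothetical type aliases that mirror the pairs written above, not the actual NamedTuple encoding. It shows that, under this model, the unnamed shape conforms to the named one while a differently named shape does not.

```scala
// Hypothetical encodings following the model above; plain type aliases,
// not the library's NamedTuple.
type PersonModel  = (("name", "age"), (String, Int))
type UnnamedModel = ((Nothing, Nothing), (String, Int))
type OtherModel   = (("first", "years"), (String, Int))

// Compiles: Nothing conforms to every singleton name type and tuples are
// covariant, so the unnamed model is a subtype of the named model.
val unnamedIsNamed = summon[UnnamedModel <:< PersonModel]

// Would not compile: "first" and "name" are unrelated singleton types,
// so differently named tuples do not conform to each other.
// val otherIsPerson = summon[OtherModel <:< PersonModel]
```

So in this reading unnamed <: named does not imply that all named tuples with the same element types are interchangeable.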
I’m not considering the (current draft) implementation in my argument, nor do I think it should be part of the conversation.
My modeling was not primarily related to the implementation. It’s the modeling we need to keep in our heads to understand the semantic intuition of the proposal. The implementation just follows that model (and quite loosely, at that, since instead of structural expansion like this it uses an opaque type that reflects it).
I just want to give a few clarifications about Swift and its treatment of named and positional things.
In Swift, functions always have positional arguments only. If you write this:
func divide(a: Int, b: Int) -> Int { a / b }
The compiler will enforce that a appears before b at all call sites.
divide(b: 2, a: 6) // error: argument 'a' must precede argument 'b'
The reason is that Swift lets you add labels on positional parameters. Those labels don’t even have to match the name of the parameters:
func divide(dividend a: Int, divisor b: Int) -> Int { a / b }
print(divide(dividend: 6, divisor: 2))
Why is this important? Because we can also understand the “names” of a tuple the same way. They are just labels for positional elements. All tuples in Swift, named or otherwise, can be accessed via integer indices.
func divide(_ a: Int, by b: Int) -> (quotient: Int, remainder: Int) {
    (quotient: a / b, remainder: a % b)
}
let x = divide(6, by: 2)
print(x.0) // 3
print(x.remainder) // 0
So one way to think about named tuples is to see them as just tuples with some information that lets the compiler relate a name to a position. This interpretation gives us more lenience to define subtyping and/or conversion.
I have to admit that I’m not particularly moved by the beauty of the theory and/or implementation of named <: unnamed or unnamed <: named. What matters to me is how useful the relation will be in my programs, and part of that includes intuition. It does not take mental gymnastics to understand that throw E returns Nothing, regardless of what that particular interpretation of throw buys for the calculus or the implementation. The same can’t be said about the arguments that have been presented in favor of unnamed <: named.
The subtyping relationship in Swift between types with labels is murky so I don’t think it is necessarily something to emulate. My (likely unpopular here) personal opinion is that not having a subtyping relationship at all might not be a bad bet. I’d also add that it’s always easier to relax constraints than tightening the screws after the fact. So perhaps it would be best to start without subtyping and identify where exactly that choice causes unbearable pain.
That being said, one thing that gets annoying without implicit conversion is the boilerplate necessary to create new tuple instances. Let me rewrite my Swift example in Scala to illustrate:
def divide(a: Int, b: Int): (quotient: Int, remainder: Int) = {
  (quotient = a / b, remainder = a % b)
}
I claim that it would be mighty convenient if we didn’t have to repeat the labels/names of the tuple in the return value. Similarly, if we have a method def f(x: (a: Int, b: Int)) we probably want to be able to call f((1, 2)). But I also claim that the conversion isn’t that important in other use cases. For example, I don’t think it is the end of the world if the compiler complains when I write this:
def foo(x: (Int, Int)): Int = ???
foo(divide(6, 2))
After all, there is a very real possibility that I misused the result of divide. Having to pause and say “here’s how you get from (quotient: Int, remainder: Int) to (Int, Int)” might actually be beneficial for the understandability of this code.
I think one way to define very simple conversions between named and unnamed tuples is to restrict them to tuple literal expressions. If locally the compiler is able to infer the names that we left out, then all is well. Otherwise, (a: T, b: U) is neither supertype nor subtype of (T, U) and the user must take the appropriate step to convert their types. We can always bikeshed syntax for that.
There are many reasons why I’m not riding the subtyping train, but for tuples specifically, one is that I don’t think tuples should be a substitute for named types. In fact, I claim that the opening example shows a bad use case for named tuples:
type Person = (name: String, age: Int)
val amy: Person = (name = "Amy", age = 33)
What is the argument for not having defined a case class here? For essentially the same number of keystrokes, we get a type that also supports pattern matching and for which subtyping is clearly defined and unambiguous. So if we want to do fancy things with implicit conversions on assignment or at function boundaries, we already have the right tools for the job.
The fact that a case class has a heavier bytecode footprint isn’t a very compelling argument to me either. It’s good to know if I have to optimize my code one day but otherwise I’ll always lean on the side of using fewer features.
What I think is far more compelling is to have
a convenient lightweight way to return multiple results from a function
This use case doesn’t deserve a sophisticated subtyping relationship, only a simple way to create instances, match on them, and select their members. The simple conversion scheme that I described above is sufficient for that.
FWIW, I’ll add that in my experience with Swift, a lot of code starts with a tuple (labeled or not) and ends with a named struct because eventually one wants to properly document a type and its properties. So most uses of tuples in Swift are at function boundaries in things like Dictionary.init(uniqueKeysWithValues: Sequence<(key: Key, value: Value)>).
I’m not at all familiar with database oriented applications so I won’t comment on it. I’ll only say that I strongly suspect database people have thought of ways to deal with records sharing names and that is where we should look for answers if we haven’t yet.
If we adopt the view that “names” are merely labels for positional things, then there are a few restrictions we can lift.
For example:
It is illegal to mix named and unnamed elements in a tuple
Why? That is perfectly fine in Swift:
let x: (a: Int, String) = (1, "hello")
print(x.1)
We can just get tuples that happen to not have labels for some specific elements. Anyway, we can still access those elements using their position, as shown in the example.
or to use the same name for two different elements.
Why?
That’s a little more experimental (at least we can’t do it in Swift) but we could simply say that if multiple elements have the same label, then the compiler reports an ambiguity if we try to use it. Again, all elements can be unambiguously accessed by their position anyway.
I think this approach also solves the problem of concatenating two tuples with overlapping names. We just get one whose unambiguous elements can be accessed by name and the other must be accessed by position. The label information is still useful because if we later split the combined tuple we might be able to unambiguously name its parts.
Inventing syntax and APIs because I don’t know how to express these operations in Scala:
val x = (a = 1, b = 2)
val y = (c = 3, b = 4)
// z has type (a: Int, b: Int, c: Int, b: Int)
val z = x ++ y
// compile-time error
print(z.b)
// OK
print(z(1))
// w has type (a: Int, b: Int, c: Int)
val (w, _) = z.splitAt(3)
// OK
print(w.b)
For what it’s worth, my intuition was also that named <: unnamed, and it seems to make more sense
I was about to write something along the lines of:
But Alvae beat me to it ^^’
There is precedent for this:
val x: Double = 2 // implicit conversion from int literal to double literal