If Tabula wants to make you type extra stuff, because the culture in Scala is to type less stuff than Java but more stuff than Python, that’s its prerogative. However, there is no reason why you can’t get from that to something that is at least as compact as R.
val ds = Dataset(
"Name" -> c(
"Braund, Mr. Owen Harris",
"Allen, Mr. William Henry",
"Bonnell, Miss. Elizabeth",
),
"Age" -> c(22, 35, 58),
"Sex" -> c("male", "male", "female")
)
is a trivial amount of code away:
def c[A](elements: A*) = A.toSeq
extension (dsc: Dataset.type)
def apply(kvs: (String, Seq[?])*) = Dataset(kvs.toMap)
With a lot more effort, one could make it work with tuples. With more effort yet, one could make it work with named tuples.
val ds = DataSet((
name = (
"Braund, Mr. Owen Harris",
"Allen, Mr. William Henry",
"Bonnell, Miss. Elizabeth",
),
age = (22, 35, 58),
sex = ("male", "male", "female")
))
We can do this already. Someone just has to want to enough.
Language features do attract use cases. But they can also clash with the existing feel. And that’s where the current proposal is iffy. It’s not that the feature itself is bad; I have hardly seen anyone saying, “I love extra boilerplate!” Rather, it’s that there don’t seem to be good choices that don’t induce some sort of pretty fierce clash between how the language feels and the new feature.
And, furthermore, we don’t seem to have a clear consensus on how to minimize the clash.
For example, I think that the correct way to interpret [22, 35, 58] is as a type, because [...] is always a type in Scala, but which can be reified into the corresponding collection. This would go for [xs.length, 2 + 7, foo(foo(foo(1)))], too; but that would require an extension to the type system to allow code literals to have a corresponding type, and for a code literal type to be treated as an inlineable piece of code. I haven’t worked through all the details, but I think this is a sound and self-consistent, albeit rather ambitious way to make the [...] syntax make sense within Scala. It’s a huge amount of work. Other people have other ideas, with their own arguments for and against. But overall, the biggest barrier is the friction, because (...) already means something, [...] already means something, and {...} already means something.
There are various things that don’t mean anything, but they also don’t mean anything in other languages, so the familiarity is zero. For instance, this is totally compatible, but I’m not sure anyone would go for this:
val ds = Dataset $
"Name" -> $
"Braund, Mr. Owen Harris"
"Allen, Mr. William Henry"
"Bonnell, Miss. Elizabeth"
"Age" -> $ 22; 35; 58
"Sex" -> $ "male"; "male"; "female"
Here $ would be a multi-expression token, which opens a block where the value from every statement is returned, with the whole thing typed as a tuple or array depending on expected type.
No language I know of has anything like this. It’s arguably even cleaner than Python. But it looks super weird.