I’m not sure that this problem is worth solving: collection literals are pretty much guaranteed to be of limited length. And even if it does turn out to be significant, it should be possible to solve it at the library level – we can rewrite `Seq(1,2,3)` to the equivalent builder code with a macro. I also think it would be a mistake to conflate it with aggregate literals, because then you would need to rewrite a lot of existing code to use the new feature in order to get the performance benefits. Let’s try to make something that works well in existing code bases.
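To make that concrete, here is a minimal sketch of the kind of builder code such a macro could expand `Seq(1, 2, 3)` into. The expansion shape is just an assumption for illustration, not an actual macro implementation:

```scala
// Hypothetical hand-written expansion of Seq(1, 2, 3); a real macro
// would generate something equivalent at compile time.
val b = Seq.newBuilder[Int]
b.sizeHint(3)     // avoid intermediate resizing
b += 1
b += 2
b += 3
val result: Seq[Int] = b.result()
```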
Good point, we can always just make `def apply` a macro as necessary.
One issue with the naive target-type companion-apply-inference approach is that it doesn’t work well when the types aren’t exact. E.g. in Mill we would like to be able to use aggregate literals where `Task[Seq[T]]` is expected, and IIRC other libraries in the ZIO or Cats ecosystems have similar requirements.

On further thought, we may be able to work around this issue in Mill by providing a variadic `apply` method in `Task`’s companion object, and it can be a macro if necessary (it probably is for Mill).
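A minimal sketch of that workaround, assuming a much-simplified `Task` (Mill’s real `Task` is more involved and macro-based; the names here are illustrative only):

```scala
// Simplified stand-in for Mill's Task.
class Task[T](val body: () => T)

object Task:
  // Hypothetical variadic apply so that Task(1, 2, 3): Task[Seq[Int]]
  def apply[T](xs: T*): Task[Seq[T]] = new Task(() => xs)

val t: Task[Seq[Int]] = Task(1, 2, 3)
```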
Another approach would be to make aggregate literals always return a `SeqLiteral[T]` type, and then the various libraries can all define implicit conversions from `SeqLiteral` to `Seq` or `Vector` or `Task[Seq]` as desired. Not sure what the tradeoffs are, but it seems like this approach would work as well.
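Roughly, and purely as a sketch (the `SeqLiteral` shape and these conversions are assumptions, not a worked-out design):

```scala
// Hypothetical intermediate type produced by every aggregate literal.
final class SeqLiteral[T](val elems: Seq[T])

// Libraries could then opt in with conversions to their own types:
given [T]: Conversion[SeqLiteral[T], Seq[T]] = _.elems
given [T]: Conversion[SeqLiteral[T], Vector[T]] = _.elems.toVector
```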
That’s a very interesting idea!

It would avoid tying this feature to some specific lib. It would make it nicely generic and extensible.

In theory one could even change the syntax after the fact without affecting any users, as even different syntax would expand to a `SeqLiteral` (or a `MapLiteral`), and any further handling happens in user space (some libs).

I think it would also allow experimenting with “object literals” later on without breaking anything, because interpretation of `SeqLiteral`s (or `MapLiteral`s) as object constructors would again be something in user space.

The idea is very much in the spirit of Scala, where features are expressed as types in the compiler. (There was a meme on Reddit some time ago joking about the amount of “types of types” in Scala, but having a lot of specialized machinery implemented this way is actually a good design, I think.)
A different thing: do we also need `SetLiteral`s, to complete what Python offers?
Actually, I have identified this problem before for ZIO’s `Optional` type, and also proposed a solution: make it possible to override which type’s companion object a `[]` expression should refer to. At the time I couldn’t find any other use for it besides `Optional`, but it would be useful for Mill’s `Task` as well.

We could also consider a scheme where, if the companion object (`Task`, in this case) doesn’t have an `apply` method, it looks for implicit conversions to the expected type in the companion object, and if it finds one, it will try the convertee type’s companion object’s `apply` method.
Example:

```scala
enum Optional[+A]:
  case None
  case Some(a: A)

object Optional:
  implicit def toOptional[A](a: A): Optional[A] =
    Optional.Some(a)

val foo: Optional[List[Int]] = [42]
```
In this example, it would see that `Optional` doesn’t have an `apply` method, so it would look for a suitable implicit conversion in the companion object and would find `toOptional`. So it unifies the return type `Optional[A]` with the expected type `Optional[List[Int]]` and finds that `A =:= List[Int]`. And `List` has a variadic `apply` method, so `[42]` is desugared to `List(42)`.

If there are several implicit conversions, we can ignore those whose return type doesn’t unify with the expected type of the aggregate literal, as well as those where the convertee type doesn’t have a variadic `apply` method.
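Spelled out, and reusing the `Optional` definition above, the elaboration would hypothetically be:

```scala
// What the compiler would conceptually rewrite
//   val foo: Optional[List[Int]] = [42]
// into, under the proposed scheme:
val foo: Optional[List[Int]] = Optional.toOptional(List(42))
```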
Such an approach would make things like `Optional` or Mill’s `Task` work while not requiring any additional language features.
OK, actually this is not right: `[<FOO>]` would unify with `Optional[A]`, so `<FOO>` must be the `A`; then `A =:= List[Int]`, and `42` doesn’t work.
What has been done in Swift/Rust, and even in Scala with generic number literals, is just some type class that specifically resolves from the syntax to a type, e.g. `given [A] => FromArrayLiteral[List] => List[A] = ???`
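A fuller sketch of that type-class shape, with an assumed name and signature (not an agreed design), might look like:

```scala
// Hypothetical type class resolving array-literal syntax to a collection type.
trait FromArrayLiteral[C[_]]:
  def fromArrayLiteral[A](elems: A*): C[A]

given FromArrayLiteral[List]:
  def fromArrayLiteral[A](elems: A*): List[A] = List(elems*)

given FromArrayLiteral[Vector]:
  def fromArrayLiteral[A](elems: A*): Vector[A] = Vector(elems*)
```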
Hi @bishabosha,

I think there’s a misunderstanding here. I was proposing to extend the “expected type” mechanic in order to better handle implicit conversions.

The expected type in my example is `Optional[List[Int]]`. The idea would be that if the expected type’s companion object – `Optional` in this case – doesn’t have a variadic `apply` method, then it would look for an implicit conversion inside `Optional` whose return type can be unified with the expected type. So it finds the `implicit def toOptional[A](a: A): Optional[A]` and unifies its return type, `Optional[A]`, with the expected type, `Optional[List[Int]]`, which yields `A =:= List[Int]`. And now it looks at the convertee of this implicit conversion, namely `a: A`, and sees if it can find a variadic `apply` method in its companion object. Since unification established that `A` is `List[Int]`, the relevant companion object is now `List`, and it has a variadic `apply` method. Hence, the expression `[42]` would be desugared to `List(42)`.

I hope this explanation made the idea clearer.
I think on balance I’d prefer a scheme where we need a type class to decide the result type of an aggregate literal. Something like this:
```scala
/** A typeclass to map sequence literals with `T` elements
 *  to some collection type `C`.
 */
trait FromArray[T, +C]:
  inline def fromArray(inline xs: IArray[T]): C
```
`FromArray` is what I call an inline type class: it’s a type class with inline methods that can be implemented with macros. Here are some given instances:
```scala
/** Straightforward mapping to Seq */
given [T] => FromArray[T, Seq[T]]:
  inline def fromArray(inline xs: IArray[T]) = Seq(xs*)

/** A more specific mapping to Vector */
given [T] => FromArray[T, Vector[T]]:
  inline def fromArray(inline xs: IArray[T]) = Vector(xs*)

/** Some delayed computation */
case class Task[T](body: () => T)

/** A delaying mapping to Task */
given [T] => FromArray[T, Task[Seq[T]]]:
  inline def fromArray(inline xs: IArray[T]) = Task(() => Seq(xs*))
```
The idea is that an aggregate literal like `[a, b, c]` with elements of type `A` and expected type `C` will search for a `FromArray[A, C]` instance `fa`. If one is found, it will expand to `fa.fromArray(IArray(a, b, c))`. Since `fromArray` is an inline method with an inline parameter, it can be implemented as a macro that inspects its argument. So it could even produce some builder pattern. In other words, the aggregate literal is treated by the compiler as if it was a call `seqLit(IArray(a, b, c))` where `seqLit` is defined as follows:
```scala
inline def seqLit[T, C](inline xs: IArray[T])(using inline fa: FromArray[T, C]): C =
  fa.fromArray(xs)
```
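For example, with the `Vector` instance above in scope, an aggregate literal with expected type `Vector[Int]` would elaborate to a call like:

```scala
// [1, 2, 3] with expected type Vector[Int] would become:
val xs: Vector[Int] = seqLit[Int, Vector[Int]](IArray(1, 2, 3))
```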
If the expected type of an aggregate literal is undefined, the implicit search will be ambiguous. In that case we can default to some type. The most user-friendly option is probably to default to `Seq` for plain aggregate literals and to `Map` for literals where all elements are pairs of the form `a -> b`.
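Presumably the `Map` case could be served by the same type class; a sketch of such an instance (my assumption, not part of the strawman text):

```scala
/** Hypothetical mapping for pair elements to Map */
given [K, V] => FromArray[(K, V), Map[K, V]]:
  inline def fromArray(inline xs: IArray[(K, V)]) = Map(xs*)
```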
Note that if `seqLit` was not declared an inline method, the code would be rejected with an error:

```
-- Error: seqlits.scala:21:15 --------------------------------------------------
21 |  f.fromArray(xs)
   |  ^^^^^^^^^^^^^^^
   |  Deferred inline method fromArray in trait FromArray cannot be invoked
```
In other words, methods of inline type classes can be invoked only in a context where the type class instance is statically known. I think that’s what we want here, anyway.
I prototyped this scheme in a test file that is added in A strawman for aggregate literals by odersky · Pull Request #21993 · scala/scala3 · GitHub.
FWIW, as far as I understand this approach is precedented in Swift, which has already been mentioned early in this thread. Specifically, one can peruse ExpressibleByArrayLiteral.
I like the type class approach. It seems in line with other things in the collections.

Type classes are now a broadly used and well-known concept in modern programming languages, so there is no reason to avoid them anymore (like happened in the fight against `CanBuildFrom`, which was justified by “it’s confusing to newcomers”). IMHO we should finally use more type classes across the whole standard library; but that’s another point.
So after the important things here are taken care of, I guess we can do some bikeshed discussion?

The point is: I don’t like the proposed syntax.

Before Martin came here, more or less everybody agreed that using `[]` is a very bad idea, as this syntax is reserved exclusively for types. I don’t get why it’s now OK to break this long-standing rule more or less en passant.

Scala even went against almost all languages and does not use `[]` for indexing. But now it’s OK to use this syntax for something that, in this sense, is not really found in other languages? (For example, `[]` is a heterogeneous array in JS and a few other languages, so more like Scala tuples.)

I liked the prefixed parens much more for the sequence literals.
Maybe not using `#` but instead `*`, as this is also used for varargs, which are related to sequences. (Also, `*` is one key left of `(` on a US keyboard.)
If I’m not mistaken, this concept is basically identical to the macro implicit conversions that the com-lihaoyi libraries make heavy use of today, with the same “can be invoked only in a context where the type class instance is statically known” requirement. It also seems very similar to what we already do in the experimental Numeric Literals.

The main difference is that the inline typeclass as described here would require a bootstrap `def seqLit` to trigger, whereas an implicit conversion can be triggered by either:

1. a mismatch between an expression type and a target type
2. a method call to a non-existent method on the expression type
(1) is the case that `com-lihaoyi` often needs (`sourcecode.Text`, `os.PathChunk`, `mill.Task`), while (2) is the case that com-lihaoyi usually does not want but sometimes does (e.g. in FastParse).
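As an illustration of trigger (1), here is a minimal sketch loosely modeled on the `os.PathChunk` pattern; all names below are illustrative stand-ins, not the real library code:

```scala
import scala.language.implicitConversions

// Illustrative stand-in for a library type like os.PathChunk.
class PathChunk(val segment: String)

given Conversion[String, PathChunk] = s => PathChunk(s)

def child(base: String, chunk: PathChunk): String =
  base + "/" + chunk.segment

// The String argument mismatches the PathChunk parameter type,
// which triggers the conversion (case 1).
val p = child("/usr", "etc")
```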
I suspect that with a bit of tweaking, we’d be able to re-use the same inline type class concept to represent all three concepts (aggregate literals, numeric literals, macro implicit conversions), maybe with a bit less power than present-day implicit conversions (i.e. we usually want (1) above and usually do not want (2)), and maybe have it extensible to other use cases users may come up with in the future that we may not agree on standardizing yet (Haskell-style overloaded strings?? aggregate literals for case classes???)
I agree that there are downsides to overloading the meaning of `[ ]`.

Can’t we just use the tuple syntax `(a, b, c)` and turn it into `seqLit(a, b, c)` using the type class scheme proposed by @odersky, and perhaps some compiler magic if needed?
No, there is no `seqLit`. `seqLit` was just an artifact to simulate the behavior before we have a syntax for seq literals.
Yes, I think we can use the same approach also for numeric literals and macro conversions. I already outlined a draft for macro conversions in a comment for SIP-66 - Implicit macro conversions by Iltotore · Pull Request #86 · scala/improvement-proposals · GitHub.

The advantage of a typeclass approach over implicit conversions is that implicit conversions come with strings attached: you need a language import to enable them. In the future we might offer escape hatches where this is not needed, but that’s not fully worked out yet. Since we don’t need the full power of a conversion for aggregate and numeric literals, I prefer not to use an implicit conversion for them in the first place.
We almost can, but the issue is that the single-element-in-parentheses `(foo)` already has a specific meaning that is explicitly not a tuple.

- We could fake it with compiler magic or implicit conversions, but that adds either some sketchy conversions from `Seq[T]` to `T`, or some sketchy compiler magic to do the same.
- We could have some special syntax for one-element lists, like Python’s `(foo,)` single-element tuple syntax, but as you know most sequences are small, and so this one-element-list scenario probably comes up a lot.

In the end the issue is: do we overload parens (used for tuples and grouping) or do we overload square brackets (used for types)? Both have some degree of overloading, and both could work. Overall I fall on the side of preferring brackets because of the universality of that syntax across all other languages, which for me wins over sharing syntax with Scala tuples.
Thanks. You almost convinced me, but I’m still on the fence… Still hoping for some solution that nobody has thought of yet that is as clean as `(a, b, c)`.
I wonder if `*` could be used judiciously here, since it’s already related to `Seq`s and varargs / multi-valued parameters… (Is this the “splat” operator? Sorry, I’m not sure what the Scala community calls this operator/syntax.)
I think it’s valuable to keep `(a, b, ...)` immediately recognizable as a tuple, distinct from regular collections. Its usage in both value and type positions doesn’t seem to cause any issues.

```scala
val t: (Int, String) = (1, "a")
```

Furthermore, `[A, B, ...]` already means some kind of sequence (of types), so using it as a sequence of values is not too far off.
@bishabosha wrote this in another thread, but it’s relevant here too:
Nice to have more fresh ideas on the syntax dilemma on the table! I think it makes sense, given Scala’s other syntax choices.
`<` and `>` are legitimate term/operator tokens. I don’t see this as a viable option without ambiguities in the parser.