Relative scoping for hierarchical ADT arguments

jducoeur · May 12, 2024, 6:58pm

Oh, I like that: not only is it unambiguous, it is intuitively sensible. The ..Red form very much reads as “I’m eliding the stuff that goes here” in a way that IMO leads one to ask the right questions.

soronpo · May 12, 2024, 9:06pm

That’s also fine with me.

lihaoyi · May 12, 2024, 10:33pm

Just to push on this a bit more, I don’t think it’s as bad as you make it seem. We already have ad hoc rules for semicolon inference, operator precedence/binding, and underscore shorthand, which don’t cause too much confusion in practice:

1. There are certainly edge cases in Scala semicolon inference where e.g. two newlines behave differently from one newline, but by and large it is not a real problem for users. We even had breaking syntax changes in Scala 3 around this (e.g. Surprising line continuations in Scala 3)
2. Precedence/binding confusion does happen sometimes, but no more than in any other programming language. And adding parens to fix precedence issues, similar to what you described users would need to do for dot companion shorthand, is something people have been doing since they were 8 years old in math class
3. We have a similar bunch of ad hoc rules for _.foo shorthand, which binds to the nearest enclosing (), ,, or = sign. Again, there are edge cases, and people do hit them sometimes, e.g. how { println("hello"); _.foo } desugars, but only very rarely
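
To make the { println("hello"); _.foo } edge case concrete, here is a minimal sketch in current Scala. The object name and the counters are illustrative only; a counter stands in for println so the effect can be checked.

```scala
object PlaceholderScope {
  var blockRuns = 0
  var lambdaRuns = 0
  val words = List("a", "bb", "ccc")

  // The placeholder expands only around `_.length`, so the block runs once
  // and evaluates to a function, which map then applies to every element.
  val viaPlaceholder: List[Int] = words.map { blockRuns += 1; _.length }

  // With an explicit lambda the whole block is the function body,
  // so the side effect runs once per element.
  val viaLambda: List[Int] = words.map { s => lambdaRuns += 1; s.length }
}
```

Both calls return the same list, but blockRuns ends at 1 while lambdaRuns ends at 3, which is the kind of surprise the underscore rules can produce.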

These features could easily have been made unambiguous by adding more syntax - explicit semicolons, explicit parens around every expression, explicitly named/scoped parameters for every lambda - but I think they are better off being concise despite the edge cases.

I think dot companion shorthand falls in a similar category of language feature, and would benefit from the single-dot making it as concise as possible while still being semantically unambiguous

It’s arguable that these shorthands can cause compounding confusion greater than the sum of their parts, hence the caution about adding one more shorthand despite the existing ones being OK. CoffeeScript may be an example of that. But Swift has a syntax very similar to Scala’s: semicolon inference, method chaining, operator precedence, and dot companion shorthand. The theoretical ambiguity does not turn out to be a problem in practice, and people seem to read and write code like the snippet below, which combines method chaining and dot shorthand, without issue:

let newButton = UIButton(type: .custom)
    .backgroundColor(.blue)
    .title("Just a button").titleStyle(font: .systemFont(ofSize: 12), textColor: .white)
    .touchUpInside(target: self, selector: #selector(buttonAction))

som-snytt · May 13, 2024, 1:01am

The rule is not ad hoc. The informal explanations are ad hoc.

But I think “leading infix” from point 1 deserves its own line item separate from semicolon inference.

As with optional braces, “proper formatting” makes everything just work, but deviation suddenly requires rules you can’t remember (and wouldn’t want to if you could).

Maybe the test is: What are the unintuitive ways people will bend or break the syntax?

Ichoran · May 13, 2024, 3:41am

Well, one can certainly exaggerate how bad it is, though underscore shorthand was bad enough to get substantial changes from Scala 2 to 3, and operator precedence is an ongoing pain point when it gets too ad hoc: the situation with : in extension methods vs regular methods isn’t great.

So, yes, it’s not necessarily a disaster. Wouldn’t be the first time.

However, it’s bad enough that I think alternatives are worth thinking about.

foo(Color.Red)  // works

foo(
  println("Hi")
  Color.Red
) // works

foo(.Red) // works

foo(
  println("Hi")
  .Red
)  // FAILS

foo(
  println("Hi")
  (.Red)
)  // works

This isn’t obviously a rare use-case.
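
The failing case comes from a rule that already exists: a line that starts with a dot continues the expression on the previous line. A minimal sketch in current Scala follows; the object, method and counter names are made up for illustration.

```scala
object LeadingDot {
  var sideEffects = 0
  def show(s: String): String = s

  // The second line starts with `.`, so no statement separator is inferred:
  // this block is the single expression "hi".toUpperCase.
  val chained: String = show {
    "hi"
    .toUpperCase
  }

  // Without a leading dot the newline separates two statements:
  // the first runs for its effect, the second is the block's value.
  val separate: String = show {
    sideEffects += 1
    "bye"
  }
}
```

Under this rule a bare .Red on its own line would be read as a selection on the preceding println("Hi"), which is why the parenthesised (.Red) form is needed in the example above.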

So, anyway, I agree that it’s not a showstopper. But given that (1) .Foo is kinda clunky anyway and (2) there seems to be a pretty good solution, I think it’s worth carefully assessing whether ad-hoc rules are worth it for enabling .Foo.

SethTisue · May 14, 2024, 7:00pm

2 posts were split to a new topic: Status of experimental numeric literals

odersky · May 20, 2024, 8:32am

After thinking about it for a while, I have come to the conclusion that the .Red syntax is not a good fit for Scala. It causes syntactic ambiguities and I find it an eyesore, since . is so entrenched as an infix operator. Even if we make an analogy with path separators /, the prefix . is still different since prefix / indicates the global scope but prefix . indicates a very specific local scope.

That said, we could come back to the alternative without the leading dot. Why was that dismissed? Ambiguities could be resolved by ordering, i.e. Red as a member of the companion of the target type would be considered only if it does not resolve to anything by other means. We do lots of disambiguation rules like that for selections. So far, it would be the first for simple identifiers, but there’s no hard rule why identifiers could not have fall-back resolvers.

The main reason I can see against it is that it would be fragile. An identifier like Red in a program would be OK or give a “not found” error, depending on where it appears. If you see lots of code that uses Red without qualification, you might be surprised if your use does not pass. And the reasons for this could be subtle. For instance, adding an overloaded variant to a method would mean that the method arguments now need full qualification since no target typing is available.

Sporarum · May 20, 2024, 10:16am

I don’t think this is a good analogy (which is why it seems counterintuitive). I think a better analogy would be the possessive “’s” in English: Mike’s red ↔ Mike.Red. In that way, a prefix period is like using “one’s”, “someone’s” or “his”: his red ↔ .Red

In that light, it makes a lot more sense, and feels more intuitive

Maybe we could add something instead of removing the period, to be more in line with “his”, for example *.Red, ?.Red or ...Red

(But regardless, I am not sure either analogy is useful in deciding if the period is a good choice)

lihaoyi · May 20, 2024, 3:23pm

I don’t think “dismissed” is the right word here. They were discussed in detail and their tradeoffs enumerated repeatedly. I can repeat some of them again below:

1. .foo syntax is ambiguous in a small number of cases due to method chaining over multiple lines. But un-prefixed foo syntax is ambiguous all the time with any variable that you may have in scope with the same name
2. Un-prefixed foo can be disambiguated by resolution fallback rules in the typer, which is less obvious both to machines and to humans than .foo syntax, which can be disambiguated by precedence rules in the parser. Having to run a full typechecking name resolution to figure out where the feature is taking effect is much more involved for machines and humans than simply parsing the code in question (even if you then can only expand the .foo to its fully qualified path during/after typechecking)
3. Un-prefixed foo can come in two variants: either it is opt-in with a flag (per-method param, or per-type) to enable, or it applies universally to every definition site
   1. If it is opt-in, that means it cannot be used on existing libraries unless retrofitted. This is less than ideal, since there’s a ton of existing code that could benefit right away, e.g. all code using enums, sealed traits with their cases in the companion, types with factory methods in their companion, etc.
   2. If it is on by default for everyone, then it probably brings into scope way too much stuff into every expression that has a target type. “everything in the companion object Foo” is a lot of stuff to bring into scope every time a type Foo is expected
   3. Alternately, it could only apply to special members of the companion, e.g. only the constructor. This limits the scope pollution, but limits the usefulness vs. the original implementation in Swift, where calling factory methods of the target type on its companion was a major use case (including those taking parameters)
4. IIRC @odersky himself has repeatedly rejected the idea of scope injection, or bringing additional identifiers into scope in a user-configurable way. I can’t google up any examples at the moment, but I quite clearly remember that being the case, going back long before this particular proposal. Therefore it is not surprising that an approach that involves bringing new identifiers into scope in a user-configurable way is deemed unlikely to get support

It’s not so much dismissal as a study of pros and cons. I don’t dispute that .foo has an “ick” factor and looks very unusual. But Swift seems to demonstrate that the “ick” factor is a non-issue for a wide base of not-necessarily-sophisticated users (iOS app developers), which to me indicates that the Scala community should be able to get used to it as well.

If we decide to go with an un-prefixed foo, I would be fine with that too. It’s just the arguments in favor of having a prefixed .foo do seem very reasonable

odersky · May 20, 2024, 6:16pm

If it is on by default for everyone, then it probably brings into scope way too much stuff into every expression that has a target type. “everything in the companion object Foo” is a lot of stuff to bring into scope every time a type Foo is expected

I think after an initial experimental phase it should be always on, or we should drop the idea. I am against adding additional mode switches. As for the concern about bringing “way too much stuff” into scope, I was imagining restricting it to members that actually return a value of the target type. E.g. if the expected type is Color, then Red could be referenced unqualified but values could not.
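
As a rough illustration of which companion members that restriction would pick out, here is a sketch in current Scala. The Color hierarchy, the gray factory and the List-typed values member are assumptions made for the example, and nothing here implements the proposal; the comments only mark which members have the target type.

```scala
object TargetTypedMembers {
  sealed trait Color
  object Color {
    case object Red extends Color                               // a Color: would be eligible
    case object Blue extends Color                              // a Color: would be eligible
    final case class Rgb(r: Int, g: Int, b: Int) extends Color
    def gray(level: Int): Color = Rgb(level, level, level)      // returns Color: would be eligible
    val values: List[Color] = List(Red, Blue)                   // a List, not a Color: would not be
  }

  def paint(c: Color): String = s"painting $c"

  // Today every argument has to be qualified (or imported) explicitly.
  val qualified: String = paint(Color.Red)
  val viaFactory: String = paint(Color.gray(10))
}
```

Under the restriction described above, paint(Red) and paint(gray(10)) would presumably resolve through the fallback, while values would stay out of scope.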

It’s true that this is a form of scope injection and I am generally not a fan of that. So I am still sitting on the fence here. However, if we want to have this form of target scoping, then I would prefer unqualified names over the prefix dot.

soronpo · May 20, 2024, 7:34pm

I gave a solid use-case above where this limitation is too restrictive.

Sporarum · May 20, 2024, 8:17pm

Which one ?
I could not recall an example where expected type alone was not enough

soronpo · May 20, 2024, 8:58pm

lihaoyi · May 21, 2024, 3:39am

I think this sounds like a reasonable restriction. That should significantly cut down on the scope pollution, while still bringing in everything that would be useful. And if we make it a fallback scope only looked up if the existing name resolution falls through, it would be 100% source and binary compatible

Restricting it to only members that return a value of the type does rule out things like factory methods inside nested objects, e.g.

trait Foo
object Foo{
  object stuff{
    // returns a Foo, but lives in a nested object rather than directly in the companion
    def nestedFactory(): Foo = new Foo {}
  }
}

But maybe that’s an uncommon enough use case it’s OK.

One thing I’d like to call out, that maybe hasn’t been said explicitly here, is that this “relative scoping” should work for pattern matching as well. e.g. This is the case in Java enums and switch statements, where un-qualified names are required:

enum Level {
  LOW,
  MEDIUM,
  HIGH
}
class HelloWorld {
    public static void main(String[] args) {
        Level myVar = Level.MEDIUM;
        switch(myVar){
            case LOW: System.out.println("low"); break;
            case MEDIUM: System.out.println("medium!"); break;
            case HIGH: System.out.println("high!!!"); break;
        }
        
    }
}

And in Swift, where qualified names are allowed, but dot-prefixed shorthand is normally used:

        switch state {
        case nil:
            removeLoadingSpinner()
            removeErrorView()
            renderContent()
        case .loading?:
            removeErrorView()
            showLoadingSpinner()
        case .failed(let error)?:
            removeLoadingSpinner()
            showErrorView(for: error)
        }
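
For comparison, this is roughly what the same kind of match looks like in Scala today, where patterns need either the qualified name or an explicit import. The Level hierarchy mirrors the Java enum above; the object and method names are illustrative.

```scala
object LevelMatch {
  sealed trait Level
  object Level {
    case object Low extends Level
    case object Medium extends Level
    case object High extends Level
  }

  // Fully qualified patterns: required when nothing is imported.
  def describeQualified(l: Level): String = l match {
    case Level.Low    => "low"
    case Level.Medium => "medium!"
    case Level.High   => "high!!!"
  }

  // An explicit import brings the short names into scope for the match.
  def describeImported(l: Level): String = {
    import Level.{Low, Medium, High}
    l match {
      case Low    => "low"
      case Medium => "medium!"
      case High   => "high!!!"
    }
  }
}
```

Relative scoping in patterns would aim to give the second form without the import, using the scrutinee's type as the target type.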

rjolly · May 21, 2024, 12:31pm

What about marking at the definition site which parameter allows relative scoping, as in:

final case class Shape(@relative geometry : Shape.Geometry, @relative color : Shape.Color)

Or even do this for target typing as such:

final case class Shape(@targetType geometry : Shape.Geometry, @targetType color : Shape.Color)

It would allow disambiguation in case of overloaded variants.
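
A sketch of what the definition site could look like, assuming a user-defined marker annotation. The relative annotation here is hypothetical and the compiler attaches no meaning to it today, so the call site below still has to qualify its arguments; the members of Shape are likewise invented for the example.

```scala
import scala.annotation.StaticAnnotation

object RelativeAnnotationSketch {
  // Hypothetical marker annotation: declared by hand, with no compiler support.
  final class relative extends StaticAnnotation

  object Shape {
    sealed trait Geometry
    case object Circle extends Geometry
    case object Square extends Geometry

    sealed trait Color
    case object Red extends Color
    case object Blue extends Color
  }

  final case class Shape(@relative geometry: Shape.Geometry, @relative color: Shape.Color)

  // Without language support the arguments are still written in full.
  val s: Shape = Shape(Shape.Circle, Shape.Red)
}
```

With the proposed behaviour, the marked parameters would presumably accept Shape(Circle, Red) directly.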

Sporarum · May 21, 2024, 1:38pm

Wouldn’t it work tho ?

options.CompilerOptions.ParserLogLevel has a companion object options.CompilerOptions.ParserLogLevel with a member INFO of that type ?
(since ParserLogLevel <: LogLevel)

soronpo · May 21, 2024, 1:55pm

Not with the proposed restriction. LogLevel.INFO is not of type ParserLogLevel. It can only become one through implicit or explicit conversion.

Sporarum · May 21, 2024, 1:59pm

But it is of type LogLevel, no ?

Ichoran · May 21, 2024, 5:37pm

Definition site differences aren’t apparent when reading code. Having magic be fickle in response to the whimsy of the library designer just means you can’t rely on it, so your code style standard for readability should be: always use the fully qualified name.

So I think this would defeat the point.

hepin1989 · May 22, 2024, 6:40am

github.com/dart-lang/language

Import shorthand syntax

opened 02:52AM - 29 Oct 19 UTC

lrhn

feature small-feature import-shorthand

Most recent version: https://github.com/dart-lang/language/blob/main/working%2F0…649%20-%20Import%20shorthand%2Fproposal.md

Original design below (which also has some good parts, a final result might be something in-between).

-----------------------------

This is a proposal for a shorter import syntax for Dart. It is defined as a shorthand syntax which expands to, and coexists with, the existing import syntax. It avoids unnecessary repetitions and uses short syntax for the most common imports.

## Motivation

Dart package imports are fairly verbose because they are based on URIs with no shorthands. A fairly typical import would be:

```dart
import "package:built_value/built_value.dart";
```

The repetition alone is grating, and Dart imports can typically be split into three groups:

* Platform libraries, `import "dart:async";`.
* Third-party packages, `import "package:built_value/built_value.dart";`.
* Same package relative import, `import "src/helper.dart";`.

The package imports are the ones with most overhead. For the rest, the surrounding quotes and trailing `.dart` are still so ubiquitous that they might as well be assumed.

## Syntax

The new syntax uses no quotes. Each shorthand library reference is provided as a *URI-like* character sequence containing no whitespace, and consisting only of identifiers/reserved words separated or prefixed by colons (`:`), dots (`.`) and slashes (`/`).

The allowed formats are:

* A single shorthand Dart package name.
* A shorthand Dart package name followed by a colon, `:`, and a relative shorthand path.
* A `/`, `./` or `../` followed by a relative shorthand path.

A *shorthand Dart package name* is a *dotted name*: A non-empty `.` separated sequence of Dart identifiers or reserved words. Such a sequence can have just a single element and no separator.

A *relative shorthand path* is a non-empty `/` separated sequence of dotted names.

The grammar would be:

```
# Any sequence of letters, digits, `_` and `$`.
<SHORTHAND_IDENTIFIER> ::= <INTEGER_LITERAL> | <INTEGER_LITERAL>? (<IDENTIFIER> | <RESERVED_WORD>)
<DOTTED_IDENTIFIER> ::= <SHORTHAND_IDENTIFIER> | <DOTTED_IDENTIFIER> '.' <SHORTHAND_IDENTIFIER>
<SHORTHAND_PATH> ::= <DOTTED_IDENTIFIER> | <SHORTHAND_PATH> '/' <DOTTED_IDENTIFIER>
<SHORTHAND_URI> ::= <DOTTED_IDENTIFIER> (':' <SHORTHAND_PATH>)? | './' <SHORTHAND_PATH> | '../' <SHORTHAND_PATH> | '/' <SHORTHAND_PATH>
<uri> ::= ... | <SHORTHAND_URI>
```

Since a shorthand URI can only occur where a URI is expected, and a URI is currently always a string, there is no ambiguity in *parsing*. Tokenization is doable, but will probably initially allow whitespace between tokens because it doesn't yet know that it's a shorthand sequence. When it recognizes that a URI is expected and a non-string follows, it must combine the following tokens only as long as there is no space between them. We can allow spaces between identifiers/keywords and `:`, `.` and `/`, but it will be harder to read and it makes the grammar less extensible.

The shorthand syntax can also be used for `export` and `part` declarations. It does not work for `part of` declarations because `part of foo.bar.baz;` is already valid syntax. We could allow only *relative* (`./` or `../`) shorthands for `part of` declarations, or we may want to disallow this existing syntax so that you can use the full shorthand syntax with no exceptions. (Please do disallow the old part-of format where you use the parent library *name*).

(Not sure this works as written. Something like `x.2.4e2` can be parsed as containing a double literal. The grammar above doesn't allow that. A less distinguishing approach could be:

* Require a shorthand to start with `[a-zA-Z_$0-9./]`.
* Include every character up to the first following `;`, whitespace, quote or comment. (Tokenization will be very confusing if the shorthand URI can contain `//` or `/*`, or a string quite.)
* Check later whether its valid according to the grammar above.

That seems more viable than trying to guess which tokens can be accepted, just accept any that can be included whole into the combined lexeme.)

## Semantics

An import of a single-identifier package name, `name`, is equivalent to an import of `"package:name/name.dart"`. This is the most common form of package imports, and it gets the shortest syntax.

An import of a dot-separated package name, `some.prefix.last`, is equivalent to an import of `"package:some.prefix.last/last.dart"`. The single-identifier case is just the special case where there is no prefix.

An import of a package-colon-path sequence, `name:path` is equivalent to an import of `"package:name/path.dart"`. (Notice the added `.dart`). This is used for packages which expose more than one library.

An import of a relative URI path, *path*, one starting with `/`, `./` or `../`, is equivalent to an import of <code>"*path*.dart"</code>.

The package name `dart` is special cased so that an import of `dart:async` will import `"dart:async"`, and an import of just `dart` is not allowed because there is no `dart:dart` library. This allows us to treat `dart:` as a platform supplied package with libraries `core.dart`, `async.dart`, etc., which may actually be an improvement over the current special-casing that we do. It does mean that `dart` is not available as a package name for user packages.

Examples:

* `import built_value;` means `import "package:built_value/built_value.dart";`
* `import built_value:serializer;` means `import "package:built_value/serializer.dart";`.
* `import dart:async;` means `import "dart:async";`.
* `import ./src/helper;` means `import "src/helper.dart";`.
* `import /src/helper;` means `import "package:current_package/src/helper.dart";`.
  The leading `./` for relative files in the same directory, is needed because otherwise we cannot distinguish whether `import foo;` means import the `foo` package or the local `foo` file.
* `import hide hide hide;` is valid and means `import "package:hide/hide.dart" hide hide;`.
* `import pkg1 if (dart.libraries.io) pkg2;` works too, each URI is expanded individually.

## Consequences

Programmers can write less code.

There will be some paths which cannot be written in the shorthand syntax, perhaps because they contain non-identifier characters or path segments starting with a digit. Those will still have to be written the old way, inside delimited strings.

The parser needs to be a little clever. If it tokenizes identifiers, reserved words, dots, colons and slashes first, then it has to combine them back into a single shorthand URI and check for separating whitespace.

The reason this proposal does not allow even more complicated shorthand URIs is that it would make parsing even more problematic. The chosen design attempts a trade-off between allowing most existing package URIs to be written with the new syntax and allow the syntax to be parsed without too much overhead.