Relative scoping for hierarchical ADT arguments

tarsa · May 8, 2024, 11:44am

in the solution proposed in this thread, one of the gains is that the companion object you’re selecting from is chosen based on type inference. in case of non-trivial type inference, the computed types can vary a lot depending on particular call site, so to emulate it with imports you would need to do the type inference manually and import from all inferred objects. anyway, after some deliberation, this argument looks somewhat weak (but still valid) to me, and other arguments (which revolve aroud uncluttering code) should be discussed instead.

however, if you go by the import ...{something as $} route then you probably need to introduce many short symbols sometimes and that’s a bit of ugliness already, e.g.

import html.{Tags as <, Attrs as ^, Colors as C} // ... and so on

you can bundle all of it together under one scope (that’s probably often the case with html constructs), but often you woudn’t want to.

Ichoran:

def environment(light: Light, water: Water) {}
sealed trait Light {}
object Light {
  object Off extends Light {}
  object On extends Light {}
}
sealed trait Water {}
object Water {
  object Off extends Water {}
  object On extends Water {}
}
then environment(Off, On) throws away all the safety beyond just having a boolean. So linters could catch that and say something like, “ambiguity in inferred import: Light.Off imported, but not distinguishable from Water.Off”.

should there be ambiguity at all? for me, the natural choice would be to inject scope separately per argument, so there would be no ambiguity:

// example invocation, copied from quote above, but with my comments
environment(
  // here the target type of this argument is `Light`, so we have injected `import Light._` just for this argument
  Off,
  // here the target type of this argument is `Water`, so we have injected `import Water._` just for this argument
  On
)

could you elaborate on that? the notation without leading dot still looks convincing to me.

the compatibility is on the source code level, not binary compatibility.

to minimize suprises when using notation without leading dot, we could require language import to enable the feature. compiler would always work as if the scope injection is enabled, but then the language import would be checked and if it’s not in scope, then throw compilation problem (waring or error).

example where it works:

import scala.language.scopeInjection

foo(injected) // works

example where compilation fails:

// no import from scala.language

foo(injected) // compilation fails. injection detected, but not allowed.

note that the above stuff is just a suggestion, food for thought. i haven’t deliberated much on it. maybe the notation with leading dot would be way easier to implement and maintain.

soronpo · May 9, 2024, 3:59am

Without a leading dot you need to set precedence from outside scoping to relational scoping. So theoretically could be a code somewhere that accepts an argument Foo from external scope but once this feature is enabled and Foo exists in the relational scope then you have an unexpected error or worse, a silent change that isn’t picked up.

Language feature flag are a big no-no. Scala is trying to get rid of them, not add more (aside from things that are explicitly experimental).

soronpo · May 9, 2024, 4:42am

I also wish to add a use-case that wasn’t discussed here thus far (and didn’t exist for me when I initiated the idea several years ago).

I call it “by type” argument assignment (instead of by order or by name), and it’s using opaque types + givens + implicit conversion to do something which I think is cool.

What I do is that I have a configuration context object where each value gets its own unique type

import wvlet.log.LogLevel //using an external logger library
import CompilerOptions.* //importing the new configuration types defined in the companion

final case class CompilerOptions(
    parserLogLevel: ParserLogLevel,
    linterLogLevel: LinterLogLevel,
    backendLogLevel: BackendLogLevel
)
object CompilerOptions:
  given default(using
      parserLogLevel: ParserLogLevel = LogLevel.WARN,
      linterLogLevel: LinterLogLevel = LogLevel.WARN,
      backendLogLevel: BackendLogLevel = LogLevel.WARN
  ): CompilerOptions =
    CompilerOptions(
      parserLogLevel = parserLogLevel, linterLogLevel = linterLogLevel, backendLogLevel = backendLogLevel,
    )

  ///////////////////////////
  // New Types
  ///////////////////////////
  opaque type ParserLogLevel <: LogLevel = LogLevel
  given Conversion[LogLevel, ParserLogLevel] = identity
  object ParserLogLevel:
    export  LogLevel.*

  opaque type LinterLogLevel <: LogLevel = LogLevel
  given Conversion[LogLevel, LinterLogLevel] = identity
  object LinterLogLevel:
    export  LogLevel.*

  opaque type BackendLogLevel <: LogLevel = LogLevel
  given Conversion[LogLevel, BackendLogLevel] = identity
  object BackendLogLevel:
    export  LogLevel.*

So when users run the compile command def compile()(using CompilerOptions)..., they can easily define non-default configuration options like so:

import lib.*

given options.CompilerOptions.ParserLogLevel  = options.CompilerOptions.ParserLogLevel.INFO
given options.CompilerOptions.BackendLogLevel = options.CompilerOptions.BackendLogLevel.DEBUG

compile()

I wish to enable the user just do:

given options.CompilerOptions.ParserLogLevel  = .INFO
given options.CompilerOptions.BackendLogLevel = .DEBUG

So in this “by-type” argument passing use-case we have:

An enumeration from an external library
Assigned values that are not essentially the same type as the destination type.

Without a leading dot, or some alternative explicit relational scoping syntax, it can easily bring unexpected naming collisions. And I think this is a very worthy use-case. I started implementing it in my library and it’s extremely convenient, especially since more than one command requires these options as a context. I can have dozens of different options and sub-option objects and the user just doesn’t need to care what goes where. The type-system takes care of everything.

Ichoran · May 11, 2024, 3:21pm

This is true either way.

case class Head(h: Int) {}
object Head {
  val head = Head(0)
}

def thing: Head =
  List(Head(1), Head(3))
  .head

What do we get? Head(1) or Head(0)?

Sporarum · May 11, 2024, 3:48pm

I would say Head(1), just like with:

val x =
  1
  - 1 // {1 - 1} = {0}

vs

val x =
  1
  
  - 1 // {1; -1} = {-1}

Ichoran · May 11, 2024, 6:03pm

So would I, but it’s not a different issue from the dotless case. You have to decide what takes precedence. Having thus decided, it’s not an issue any longer.

The point remains that you can change things far away (e.g. import an extension method) and behavior changes whether it’s dotless or dotted notation. The question is thus about the frequency and non-obviousness of collisions, not a qualitative this-collides-that-doesn’t. Or it suggests that another symbol or keyword is needed.

soronpo · May 11, 2024, 6:05pm

It is. You’re referring to grammar rules. I was referring to scoping rules.

odersky · May 11, 2024, 9:03pm

That’s indeed pretty bad. Sequences with leading . are very common. It’s a pitfall waiting to happen that any of these could accidentally become a constructor. It looks like relative scoping interacts badly with semicolon inference.

lihaoyi · May 11, 2024, 10:22pm

We could restrict the feature to avoid this ambiguity.

The only place where it appeasrs to be ambiguous would be in a block, where you dont know if the consecutive .foos are dot-prefixed standalone statements or chained method calls. We could limit the .-prefix syntax to disallow that

Note that Swift also has semicolon inference, and the feature works fine. Definitely a concern, but IMO not a blocker. We just need to be careful in exactly how we spec the syntax

soronpo · May 12, 2024, 12:01am

We can also choose $. as the relative scope placeholder. This removes any ambiguity.

lihaoyi · May 12, 2024, 5:58am

That’s definitely an option, but my gut feeling is that it makes it sufficiently ugly that it becomes unattractive as a language feature

Swift manages to make it work, with very similar syntactic constraints as Scala has, so I think we should be able to make it work without user-facing sacrifices in its syntax. Just need to figure out how to tweak the grammar properly.

e.g. we could limit dot companion shorthand to only apply in cases where it is unambiguous: immediately after a (, ,, =, or in the case of pattern matching immediately after case or |. That would cover all the useful scenarios while avoiding any ambiguity with fluent method chains brought up earlier in this thread

soronpo · May 12, 2024, 1:19pm

I agree. I also think it should not be available in infix positioning (aside from the | under pattern matching).

Ichoran · May 12, 2024, 6:44pm

We could also choose .. as the lookup signifier. This has three advantages that I can think of:

It’s not a valid pattern in the language, so it’s completely unambiguous.
It’s still very fast and easy to type
.. already looks like “you know the start, it’s just the end bit that we need tell you about”, so the form helps suggest meaning. Usually in pseudocode you’d write ... but .. is close enough.

It also might be a useful hint to the typer about where to start searching for alternatives (I’m not familiar enough with the internals to know).

It has two disadvantages that I can think of:

.. is used in most languages for ranges.
It is more clunky than . (or nothing).

I agree with @lihaoyi that this could work syntactically, but my above suggestion might be worth thinking about instead. The reason is that if we have to add a bunch of ad-hoc rules to prevent syntactic ambiguity, we also implicitly require programmers to learn all those rules. Failing that, they’d have to resort to disambiguating patterns like (.Green) so that they don’t have to remember the rules.

Minimizing the number of places where things “just work except when they don’t” I think should be a high-priority goal.

val redCircle = Shape(..Circle, ..Red)

def isRed(s: Shape) = s match
  case Shape(_, ..Red) => true
  case _ => false

seems not too bad to me, or at least closer to the . form than the . form is to the bare form.

jducoeur · May 12, 2024, 6:58pm

Oh, I like that: not only is it unambiguous, it is intuitively sensible. The ..Red form very much reads as “I’m eliding the stuff that goes here” in a way that IMO leads one to ask the right questions.

soronpo · May 12, 2024, 9:06pm

That’s also fine with me.

lihaoyi · May 12, 2024, 10:33pm

Just to push on this a bit more, I don’t think it’s as bad as you make it seem. We already have ad hoc rules for semicolon inference, operator precedence/binding, and underscore shorthand, which dont cause too much confusion in practice

There are certainly edge cases in Scala semicolon inference where e.g. two newlines behaves differently from one newline, but by and large it is not a real problem for users. We even had breaking syntax changes in Scala 3 around this (e.g. Surprising line continuations in Scala 3)
Precedence/binding confusion does happen sometimes, but not more than any other programming language. And adding parens to adjust precedence issues, similar to what you described users would need to do for dot companion shorthand, is something people have been doing since they were 8 years old in math class
We have a similar bunch of ad hoc rules for _.foo shorthand, which binds to the nearest enclosing (), ,, or = sign. Again, there are edge cases, and people do hit them sometimes e.g. how { println("hello"); _.foo } desugars, but only very rarely

These features could easily have been made unambiguous by adding more syntax - explicit semicolons, explicit parens around every expression, explicitly named/scoped parameters for every lambda - but I think they are better off being concise despite the edge cases.

I think dot companion shorthand falls in a similar category of language feature, and would benefit from the single-dot making it as concise as possible while still being semantically unambiguous

It’s arguable that these shorthands can cause compounding confusion more than the sum of their parts, hence the caution of adding one more shorthand despite the existing ones being OK. Coffeescript may be an example of that. But Swift has a very similar syntax as Scala: semicolon inference, method chaining, operator precedence, and dot companion shorthand. The ambiguity in theory does not turn out to be a problem in practice, and people seem to read and write code like the snippet below including both of these language features without issue

let newButton = UIButton(type: .custom)
    .backgroundColor(.blue)
    .title("Just a button").titleStyle(font: .systemFont(ofSize: 12), textColor: .white)
    .touchUpInside(target: self, selector: #selector(buttonAction))

som-snytt · May 13, 2024, 1:01am

The rule is not ad hoc. The informal explanations are ad hoc.

But I think “leading infix” from point 1 deserves its own line item separate from semicolon inference.

As with optional braces, “proper formatting” makes everything just work, but deviation suddenly requires rules you can’t remember (and wouldn’t want to if you could).

Maybe the test is: What are the unintuitive ways people will bend or break the syntax?

Ichoran · May 13, 2024, 3:41am

Well, one can certainly exaggerate how bad it is, though underscore shorthand was bad enough to get substantial changes from 2 to 3; and operator precedence is an ongoing pain point when it gets too ad-hoc–the situation with : in extension methods vs regular methods isn’t great.

So, yes, it’s not necessarily a disaster. Wouldn’t be the first time.

However, it’s bad enough that I think alternatives are worth thinking about.

foo(Color.Red)  // works

foo(
  println("Hi")
  Color.Red
) // works

foo(.Red) // works

foo(
  println("Hi")
  .Red
)  // FAILS

foo(
  println("Hi")
  (.Red)
)  // works

This isn’t obviously a rare use-case.

So, anyway, I agree that it’s not a showstopper. But given that (1) .Foo is kinda clunky anyway and (2) there seems to be a pretty good solution, I think it’s worth carefully assessing whether ad-hoc rules are worth it for enabling .Foo.

SethTisue · May 14, 2024, 7:00pm

2 posts were split to a new topic: Status of experimental numeric literals

odersky · May 20, 2024, 8:32am

After thinking about it for a while, I have come to the conclusion that the .Red syntax is not a good fit for Scala. It causes syntactic ambiguities and I find it an eyesore, since . is so entrenched as an infix operator. Even if we make an analogy with path separators /, the prefix . is still different since prefix / indicates the global scope but prefix . indicates a very specific local scope.

That said, we could come back to the alternative without the .. Why was that dismissed? Ambiguities could be resolved by ordering, i.e. Red as a member of the companion of the target type would be considered only if it does not resolve to anything by other means. We do lots of disambiguation rules like that for selections. So far, it would be the first for simple identifiers, but there’s no hard rule why identifiers could not have fall-back resolvers.

The main reason I can see against is that it would be fragile. An identifier like Red in a program would be OK or give a “not found” error, depending where it appears. If you see lots of code that uses Red without qualification, you might be surprised if your use does not pass. And the reasons for this could be subtle. For instance, adding an overloaded variant to a method would mean that the method arguments now need full qualification since no target typing is available.