Proposal: Main methods (`@main`)

Instead of

@main def run(xs: String*) = processArgs(xs).mapN(add)

one could do:

@main def run(xs: String*) = parseAndRun(xs, add _)

where parseAndRun is a macro that recognizes whether the function simply forwards to a method and, in such cases, analyzes the parameters of that method. No need to duplicate anything.

Syntax like:

is pretty limiting. A CLI often has many dependent parameters. Take an area calculator, for example: the first parameter is the shape, and the following parameters depend on it:

./area_calculator circle --radius 5
./area_calculator square --side 3
./area_calculator rectangle --height 3 --width 8
...

It is a similar story with git, openssl, and other programs. The syntax is:
git|openssl|something_else command --parameters-dependent-on-command
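
The shape-dependent parameters map naturally onto a sum type; a minimal sketch of such a model (the names here are mine, not part of any proposal):

sealed trait Shape
case class Circle(radius: Double) extends Shape
case class Square(side: Double) extends Shape
case class Rectangle(height: Double, width: Double) extends Shape

// Whatever parsing scheme is chosen has to be able to produce one of these cases
// from "circle --radius 5", "square --side 3", and so on.
def area(shape: Shape): Double = shape match
  case Circle(r)       => math.Pi * r * r
  case Square(s)       => s * s
  case Rectangle(h, w) => h * w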


Ammonite has the ability to do this:

I think the same could be done here.


I think this is the best solution, especially after being convinced by @lihaoyi on the following point:

Overall, I think the idea is a good one, but I do not think the current proposal passes the bar: I think it is too narrow and too incomplete to be worthy of including in the Scala standard library, where once the user passes “hello world” they will find it immediately inadequate and need to discard it. This risks it becoming a “good for slides and tutorials and nothing else” feature which we have to warn people against using: scala.util.parsing all over again.

Automatic command-line parsing based on the method signature has one problem that puts beginners on the wrong path: you have to repeat all the arguments when passing them to a helper method and it’s easy to screw up the order if several parameters have the same type.

@main def happyBirthday(age: Int, name: String, uppercase: Boolean = false) =
  // ...
  birthdayMessage(age, name, uppercase) // need to repeat all parameters

def birthdayMessage(age: Int, name: String, uppercase: Boolean) =
  val message = s"Happy $age birthday, $name"
  if uppercase then message.toUpperCase else message

Compare that to an implementation that uses a case class with a hypothetical CommandLine library instead:


case class HappyBirthday(age: Int, name: String, uppercase: Boolean = false) derives CommandLine {
  def message =
    val m = s"Happy $age birthday, $name"
    if uppercase then m.toUpperCase else m
}
@main def happyBirthday(args: String*) =
  val birthday = CommandLine[HappyBirthday].parse(args)
  // ...
  birthday.message // no risk of screwing up parameter order

At the cost of a little additional verbosity, the second solution lends itself to easier extensibility and testability.
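
For instance, the parsing-free core can be unit-tested directly (a hypothetical test line):

// No CLI parsing involved; construct the case class directly.
assert(HappyBirthday(23, "Lisa").message == "Happy 23 birthday, Lisa")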


@odersky

Options

I think there are basically two local maxima here:

  1. We make the @main method dumb: @main def main(args: Array[String]): Unit, and then assume the user will need to delegate to some runtime parseArgsAndDoSomething(args) call to parse the arguments and do what they want.

    • We may or may not want to bundle a default parseArgs implementation with the standard library. Python bundles one, and while not perfect it is definitely a great convenience to have built in. And if someone needs something fancier, they can write their own or pull in a third-party library (e.g. in Python, there's click, fire, and others).
  2. We make the @main method smart, using the method signature (argument names, types, defaults, doc-comments, annotations, return type) to perform the argument parsing.

    • In this case, we definitely want to make the “compiler magic” as thin as possible, and delegate as much logic as we can to a user-land library
    • The Scala standard library can provide a basic implementation of the user-land logic, but it should be swappable so people can provide alternate implementations without having to throw out the feature entirely.
    • We could do this via a method signature, or a case class constructor as @olafurpg has suggested. I don’t mind either way: method signatures and constructor signatures have about the same data model in either case

Case (1.) is trivial: if we want to do this, the solution is obvious. It would already be an improvement over the status quo, though a small one. Let’s consider case (2.)

Significant Method Signature Requirements

In this case, we want the method signature to be significant, and used as part of the argument parsing logic. Despite the fact that we can already do this in “user-land” as @julienrf has noted, I must say that it is extremely convenient to use the method signature. Ammonite scripts, Mill commands, and Cask endpoints all use this feature very heavily. It is very, very, very convenient compared to the user-land implementation where we construct the data model of the method specification manually.

Looking at your second trait MainAnnotation, it is missing a few things:

  • We need to be able to list out all the arguments and all their metadata at once, rather than just fetching values from them one at a time. This is required for useful --help messages

  • We want to perform the argument validation applicatively, rather than monadically/imperatively. This is required if we want error messages to be useful and tell us everything we did wrong rather than trickling in one error after another

  • We want to be able to support multiple @main methods! Maybe this is not a hard requirement, but a lot of people really like this feature in Ammonite (as we can see from this thread!) and I make great use of this ability in Mill and Cask.

  • We want to be able to return the “remaining” un-parsed arguments to match existing command line conventions. e.g. ssh you pass in a bunch of flags and then the remaining tokens get treated as a command to run, or python you pass in a bunch of args, then the remaining tokens get treated as the script name and then script arguments

I’ll need to think a bit more about a possible API that can satisfy this, without the cruft that has accreted onto the Ammonite/Mill/Cask copy-pasta implementations. But these are the non-obvious, non-trivial requirements that are not satisfied by your proposed interfaces above.
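
To make the applicative-validation requirement concrete: each argument should be checked independently and every failure reported together, instead of bailing out at the first one. A tiny sketch, not tied to any API proposed in this thread:

// Errors accumulate across arguments instead of short-circuiting.
type Validated[T] = Either[List[String], T]

def validate[T](name: String, raw: String, parse: String => Option[T]): Validated[T] =
  parse(raw).toRight(List(s"invalid value for --$name: '$raw'"))

def zipValidated[A, B](a: Validated[A], b: Validated[B]): Validated[(A, B)] = (a, b) match
  case (Right(x), Right(y)) => Right((x, y))
  case (Left(e1), Left(e2)) => Left(e1 ++ e2)   // both error lists are kept
  case (Left(e), _)         => Left(e)
  case (_, Left(e))         => Left(e)

For example, zipValidated(validate("num", "abc", _.toIntOption), validate("inc", "x", _.toIntOption)) reports both bad values at once.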


@odersky Here’s a sketch of a potential solution that accommodates all the requirements, inspired by the Ammonite/Cask/Mill implementations with a bunch of crufty crusty cruft removed:

// General framework

// Extend this to get magic compiler expansion
trait MainAnnotation extends StaticAnnotation{
  type Parser[T]
  type MainAnnotationWrapper
}

// This models an `@main` annotated method: all the metadata about it,
// along with an `invoke` handle to actually invoke the damn thing.
//
// Note we explicitly pass in `Self` to `EntryPoint#invoke` and `ArgSig#default`. This ensures
// we aren't relying on an enclosing "this", and the EntryPoint can be inspected in a vacuum
// without needing to evaluate any enclosing code
case class EntryPoint[Parser[_], Self, Result](
  name: String,
  argSignatures: Seq[ArgSig[Self, Parser, _]],
  doc: Option[String],
  varargs: Boolean,
  invoke: (Self, Map[String, Any], Seq[String]) => Result
)

// Models the metadata for a single argument: name, typeString and docs,
// default value factory, and a Parser[T]
case class ArgSig[Self, Parser[_], T](
  name: String,
  typeString: String, // This could be a richer data structure if we want
  doc: Option[String], // This could be either an `@doc` annotation or pulled from the method scaladoc
  default: Option[Self => T]
)(implicit val parser: Parser[T])
// "user-land" @main annotation definition.

// Simply specifies the `Parser[_]` typeclass, then
// delegates most heavy lifting to `type MainAnnotationWrapper = MainWrapper`
class main extends MainAnnotation{
  type Parser[T] = FromString[T]
  type MainAnnotationWrapper = MainWrapper
}

// This will be extended by our final wrapper object. Here we provide a JVM-style 
// main method entry point with some default parsing logic, but a user could easily
// define their own main method that defines their argument parsing logic, or doesn't
// define a JVM-style main method at all! 
class MainWrapper(ep: EntryPoint[FromString, Self, Unit]){
  def main(args: Array[String]): Unit = {
    val (parsedArgs: Map[String, Any], remaining: Seq[String]) = 
      parseArgsAndThrowIfInvalid(args, ep)

    ep.invoke(self, parsedArgs, remaining)
  }
}

// Use Site
@main
def myMain(s: String, i: Int = 0): Unit = {
  ???
}
// Generated code
object MyMain extends main#MainAnnotationWrapper(
  EntryPoint[FromString, Self, Unit](
    name = "main",
    argSignatures = Seq(
      ArgSig[Self, main#Parser, String](name = "s", typeString = "String", doc = None, default = None),
      ArgSig[Self, main#Parser, Int](name = "i", typeString = "Int", doc = None, default = Some(_ => 0))
    ),
    doc = None,
    varargs = false,
    invoke = (self, args, remaining) => 
      myMain(s = args("s").asInstanceOf[String], i = args("i").asInstanceOf[Int])
  )
)

The goal is to make the compiler generate code containing all the metadata + a reference to invoke the method with the parsed arguments, but we leave it up to the MainAnnotationWrapper to:

  1. Define the def main(args: Array[String]): Unit method, or not
  2. Decide how they want to parse the arguments before running the invoke method. Or not! Maybe there are multiple invalid arguments and we want to print some error messages, or maybe the input was --help and we want to reflect on the EntryPoint to print the help message; in both these cases we don’t run invoke at all

The above API is roughly equivalent to taking the user-land API that @julienrf described earlier, generating it from the signature of an annotated method, but still leaving the rest of the implementation details to the user to provide in the type MainAnnotationWrapper of their @main annotation. This could provide the JVM entrypoint def main(args: Array[String]): Unit, but it could also just generate the object MyMain extends main#MainAnnotationWrapper in other contexts as well that we could use for e.g. doing HTTP routing in Cask, or for the script runner to dispatch to multiple @main methods in Ammonite.

In this proposal, the compiler knows nothing about program entry points or CLI arg parsing. All it knows how to do is reflect on method signatures and make the metadata available; using that metadata to provide a JVM main method with argument parsing is a library concern (whether standard library or third party). Other libraries could use this metadata for other things.

Note that a bunch of APIs here return Any or _ and are untyped; this is due to the variadic nature of this API. Fiddling around with shapeless or typed HLists is orthogonal to the core logic and API, and can be layered on top of this later if desired.

There’s a bit of subtlety that I haven’t covered:

  • Do we need the Self parameter for the @main implementation of MainAnnotation?
  • Could we make this useful for something like Cask which allows stackable annotations?
  • How to support effect-typed main methods that return IO[Unit] or similar?
  • How do we handle passing arguments to the @main annotation? (Cask does this!)
  • Do we need concrete case classes for EntryPoint and ArgSig, or would traits + factory methods be good enough? Does it matter?

And of course, there is tons of room for bikeshedding the exact method signatures above. Nevertheless, I’m confident that given this fundamental design, those details can be worked out to satisfaction.


Thank you @lihaoyi for this detailed post!

Yeah, I agree that the experience is not the same and being able to just write def main(s: String, i: Int = 0): Unit is really nice. Yet, I want to challenge again the footprint that the feature you described would add to the language. It seems that if we switch to class-based program arguments we could reuse the existing typeclass derivation infrastructure instead of re-inventing another type-directed derivation mechanism:

@main def main(args: MyArgs): Unit = ???

case class MyArgs(s: String, i: Int = 0) derives Args

And we could reuse the existing typeclass derivation system to synthesize the code that does the argument parsing.
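
For illustration, here is a rough sketch of what such an Args typeclass and its derived method could look like on top of the standard Mirror machinery. All names are hypothetical, and it only does positional parsing; defaults, named flags, --help, and error reporting are left out:

import scala.deriving.Mirror
import scala.compiletime.{erasedValue, summonInline}

// How to turn one command-line token into a value of a given field type
trait Read[T]:
  def read(s: String): T

object Read:
  given Read[String] = s => s
  given Read[Int]    = _.toInt

trait Args[T]:
  def parse(args: Seq[String]): T

object Args:
  // Collect one Read instance per field type of the case class
  inline def readers[Elems <: Tuple]: List[Read[?]] =
    inline erasedValue[Elems] match
      case _: EmptyTuple => Nil
      case _: (h *: t)   => summonInline[Read[h]] :: readers[t]

  inline def derived[T](using m: Mirror.ProductOf[T]): Args[T] =
    val rs = readers[m.MirroredElemTypes]
    (args: Seq[String]) =>
      val values: Seq[Any] = rs.zip(args).map((r, a) => r.read(a))
      m.fromProduct(Tuple.fromArray(values.toArray))

With something along these lines, case class MyArgs(s: String, i: Int = 0) derives Args would get an Args[MyArgs] instance, and the generated wrapper would only need to call summon[Args[MyArgs]].parse(args).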


I like your requirements, but it turns out that the proposed 2nd version of the MainAnnotation already fulfils all of them.

We need to be able to list out all the arguments and all their metadata at once, rather than just fetching values from them one at a time. This is required for useful --help messages

That functionality is provided by the second version of MainAnnotation. The annotation can choose to simply collect all getArg calls and store the meta-information. Then, when it encounters --help as an actual argument, it can print out all the stored info.

We want to perform the argument validation applicatively, rather than monadically/imperatively. This is required if we want error messages to be useful and tell us everything we did wrong rather than trickling in one error after another

That’s also possible with the second version of MainAnnotation. The annotation can choose to keep all validation errors in a buffer and print them out together when done is called. Alternatively, it can record all getArg calls as meta-data and validate everything together when done is called.

We want to be able to support multiple @main methods! Maybe this is not a hard requirement, but a lot of people really like this feature in Ammonite (as we can see from this thread!) and I make great use of this ability in Mill and Cask.

Since each main function generates its own wrapper class, I don’t see a problem with that. I believe that’s already supported in the current implementation.

We want to be able to return the “remaining” un-parsed arguments to match existing command line conventions. e.g. ssh you pass in a bunch of flags and then the remaining tokens get treated as a command to run, or python you pass in a bunch of args, then the remaining tokens get treated as the script name and then script arguments

I don’t see a problem with that either. The main annotation gets all the actual arguments. So it can do whatever it wants with the arguments that were not requested by the method.

Of course, it’s possible and probably desirable to reify all important info relating to a main method as data, which is what your proposal does. But one does not need to, and I would argue that this reification should not be part of the compiler contract but should be done in a library (maybe the standard library, that would be OK). To give some perspective: I think that even the compiler’s reliance on Seq is a mistake. A compiled program should not demand anything fancy in terms of interfaces or (even more so) classes. Requiring a MainAnnotation interface with three methods, all taking simply typed arguments, is about as fancy as it should get.

My MainAnnotation proposal is arguably a minimalistic way to describe a main method: the compiler-generated code simply issues calls, one for each argument, that contain the info relevant to that argument. The main annotation responds for each argument with a closure that will produce the argument value if all arguments validate, and that is allowed to fail otherwise. Validation is handled with a simple done call. Nevertheless, I believe one can implement with this contract a MainAnnotation that then generates the EntryPoint, ArgSig and MainWrapper classes that you sketched out.

There are two things I am not yet clear about.

First, there’s currently no way in my proposal to handle results of main methods. It’s assumed that the result is Unit. For Java that looks OK since if one wants an exit value, one can simply call System.exit. But I am not sure about the general case. Are there important use cases that demand a free choice of return type? What’s the best way to abstract over that? [I guess: Using something like the `ResultHandler` that you had in your earlier proposal].

Second, the current design produces Java main methods in the end, so the whole proposal is Java-specific. It would probably also work on Native, since the main methods for Java and Unix are basically the same. So is there a need to generalize this further? And, if yes, what’s the simplest way of doing this?

For reference, here’s the latest tweaked MainAnnotation design:

The MainAnnotation class defines the contract for the compiler:

  trait MainAnnotation extends StaticAnnotation:

    type ArgumentParser[T]

    // get single argument
    def getArg[T](argName: String, fromString: ArgumentParser[T], defaultValue: => Option[T] = None): () => T

    // get varargs argument
    def getArgs[T](argName: String, fromString: ArgumentParser[T]): () => List[T]

    // check that everything is parsed
    def done(): Boolean

A sample main class, which can be freely implemented:

  class main(progName: String, args: Array[String], docComment: String) extends MainAnnotation

A sample main method:

object myProgram:

  /** Adds two numbers */
  @main def add(num: Int, inc: Int = 1) = println(num + inc)

Compiler generated code:

  class add:
    def main(args: Array[String]) = 
      val cmd = new main("add", args, "Adds two numbers")
      val arg1 = cmd.getArg[Int]("num", summon[cmd.ArgumentParser[Int]])
      val arg2 = cmd.getArg[Int]("inc", summon[cmd.ArgumentParser[Int]], Some(1))
      if cmd.done() then myProgram.add(arg1(), arg2())
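
For illustration, here is one way a concrete main class could satisfy the --help and error-accumulation requirements within this contract. It is only a sketch: ArgumentParser is stood in by a plain function, varargs are ignored, and the thunks returned for invalid arguments are never called because done() then returns false.

class main(progName: String, args: Array[String], docComment: String) extends MainAnnotation:
  // Stand-in for a real FromString-style typeclass
  type ArgumentParser[T] = String => Either[String, T]

  private val errors = scala.collection.mutable.ListBuffer.empty[String]
  private val usage  = scala.collection.mutable.ListBuffer.empty[String]

  private def lookup(argName: String): Option[String] =
    val i = args.indexOf(s"--$argName")
    if i >= 0 && i + 1 < args.length then Some(args(i + 1)) else None

  def getArg[T](argName: String, fromString: ArgumentParser[T], defaultValue: => Option[T]): () => T =
    usage += s"[--$argName <value>]"              // metadata recorded for --help
    lookup(argName) match
      case Some(raw) =>
        fromString(raw) match
          case Right(value) => () => value
          case Left(msg) =>
            errors += s"--$argName: $msg"         // recorded, not thrown
            () => throw new AssertionError("argument was not validated")
      case None =>
        defaultValue match
          case Some(value) => () => value
          case None =>
            errors += s"missing argument --$argName"
            () => throw new AssertionError("argument was not validated")

  def getArgs[T](argName: String, fromString: ArgumentParser[T]): () => List[T] =
    () => Nil                                     // varargs handling omitted in this sketch

  def done(): Boolean =
    if args.contains("--help") then
      println(s"Usage: $progName ${usage.mkString(" ")}")
      println(docComment)
      false
    else if errors.nonEmpty then
      errors.foreach(e => println(s"Error: $e"))  // all errors reported together
      false
    else
      true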

IIUC, it is not so. Consider:

trait Cmd
case class Add(url: String, isForced: Boolean) extends Cmd
class DeleteAll extends Cmd

@myMain def main(cmd: Cmd, others: String*) = ... 

It can be implemented with lihaoyi’s proposal.
But I do not understand how it can be implemented with the 2nd version.

  • summon[cmd.ArgumentParser[...]] makes no sense here
  • there is no information about the type Cmd

Typeclass derivation does not work for this use case. Typeclass derivation generates typeclass instances in the companion object of a sum or product class. But for main methods, we need to generate a new global class that forwards to the actual main method (which can be anywhere as long as it is accessible statically). This exceeds what typeclass derivations should be allowed to do.

If we have macro annotations (let’s say in a Tasty-based code generation framework) that can inspect methods and generate new top-level classes, we can do it, and relegate main method generation to a metaprogramming library. But we are not nearly there yet, and we need to do something now, since App will go away. We can solve the specific problem in the compiler now. If we get the right kind of macro annotations later, it’s an implementation detail to take the whole thing out of the compiler and move it to a standard library.

The fundamental question is whether we should do a minimalistic solution now, i.e. main taking Array[String] or do we want to propose a standard that’s actually usable without resorting to 3rd party libraries and duplication of information. My tendency would be to go for that, if we can do it in a simple way, and I think we are very close.

I have no idea what you are trying to achieve here. If you want to make a point, it would be good to follow the same schema I showed. Explain what is the meta-trait, what is the annotation class, what should the compiler generate, how is that info communicated to the actual main method?

I want to be able to work with complex types and annotations. I am not sure whether it is possible to make summon[cmd.ArgumentParser[…]] optional, so it is cut out in the example.
How to work with annotations is also not shown; if it is important I can show it later.

The most common use case for me is when there are several commands with different options.
It is a common use case in other libraries, for example:
https://jcommander.org/#_more_complex_syntaxes_commands
I can describe a model in the following way:

class Cmd
@Command(names = "--add", description = "add some file")
class Add extends Cmd {
  @Parameter(names = "--url", description = "some url")
  var url: String = _
}
@Command(names = "--deleteAll", description = "delete all files")
class DeleteAll extends Cmd {
}

IIUC: the schema can be something like:

trait MainAnnotation extends StaticAnnotation:

    // get single argument
    def getArg(argName: String, defaultValue: => Option[AnyRef] = None): () => AnyRef

    // get varargs argument
    def getArgs(argName: String): () => List[AnyRef]

    // check that everything is parsed
    def done(): Boolean
    

class main(progName: String, args: Array[String], types: Array[String], docComment: String) extends MainAnnotation
    
    
object myProgram:

  /** Run command */
  @main def run(cmd: Cmd) = cmd match {
     ...
  }
  
  
class run:
    def main(args: Array[String]) = 
      val cmd = new main("run", args, Array("Cmd"), "Run command")
      val arg1 = cmd.getArg("cmd")().asInstanceOf[Cmd]
      if cmd.done() then myProgram.run(arg1)  

The main differences are:

  • there is no need to have an ArgumentParser, so there will not be errors like:
    No implicit argument of type util.FromString[Cmd] was found
  • there is information about the class, so I can use a class loader to parse annotations and load a meta model.

cmd.getArg("cmd")().asInstanceOf[Cmd]

How would getArg of a general purpose main method know that you want to convert a string to a Cmd? So how can this cast not fail?

@odersky I think we’re using different terminology here. When I refer to the “API” that we expose to users, I am referring to both the trait interfaces as well as the exact shape of the generated code. After all, these are both going to need to be “hardcoded” into the compiler, and both will constrain the flexibility of user-land developers trying to make use of this feature.

Although your trait MainAnnotation could theoretically be used in a variety of ways, your Compiler generated code locks it down to only be used in a single hard-coded way!

My proposal above moves much of the logic for your Compiler generated code into a configurable user-land implementation trait. This allows the logic to be customized or swapped out by the user, and having the compiler only generate the minimal metadata that is truly necessary for the user-land implementation trait to do its work. Perhaps that is possible by changing the Compiler generated code in your proposal, but as currently written your last proposal above does not allow a user to change that.

@odersky Here’s one concrete thing I can’t figure out from your proposal: how does one tweak your class main implementation to allow applicative validation? It seems the compiler generated code only allows the validation of each argument to be done sequentially.

I don’t see that. What kind of lock downs do you observe? What the compiler code does is

  • pass the program name, doc comments, and actual command line arguments to the main annotation
  • explain each of the arguments of the main method to the main annotation, giving its name, default, whether it’s a vararg, and how the argument string should be parsed (which is under control of the main annotation).

From an information-theoretic point of view, that’s exactly what there is! Or otherwise put, we convey the maximal amount of info available to the main annotation. I am intentionally ignoring @doc annotations since I want to re-use the doc comment for that, which is passed along.

I am about to write up a strawman main that shows how applicative validation is done.

[EDIT] Here is what I quickly threw together. It can be improved, I am sure, but it fleshes out the
principle. The last scenario at the bottom shows that multiple errors can be reported.


I am not sure I understand the question, so I have made a test case.
It is a simple wrapper over JCommander:

package ru.bitec


import com.beust.jcommander.JCommander
import com.beust.jcommander.Parameter
import com.beust.jcommander.Parameters

class Cmd

@CommandName("add")
@Parameters(commandDescription = "Add some files")
class Add extends Cmd {
  @Parameter(names = Array("--url"),
    description = "some url",
    required = true)
  var url: String = _
}

object Example {
  // parameters: add --url test
  def main(args: Array[String]): Unit = {
    val m = new Main("ProdName", args, Array("ru.bitec.Cmd"), "")
    val arg0 = m.getArg("cmd")().asInstanceOf[Cmd]
    println(arg0)
    println(arg0.asInstanceOf[Add].url)
  }

  class Main(progName: String, args: Array[String], types: Array[String], docComment: String) {
    // for the test case I just write the subclass map manually
    // val map = readByClass(types(0))
    val map = Map("add" -> "ru.bitec.Add")
    val instanceMap = map.transform { case (_, v) =>
      map.getClass.getClassLoader.loadClass(v).newInstance()
    }
    val jcb = JCommander.newBuilder
    instanceMap.foreach { case (k, v) =>
      jcb.addCommand(k, v)
    }
    val jc = jcb.build
    jc.parse(args: _*)
    val result = instanceMap(jc.getParsedCommand)

    def getArg(argName: String, defaultValue: => Option[AnyRef] = None): () => Any = {
      // assume we always have only one parameter
      () => {
        result
      }
    }
  }

}

There are no problems with the cast at all.

[EDIT] There is a need to add a list of arg names:

val m = new Main("ProdName", args, Array("cmd"), Array("ru.bitec.Cmd"), "")

@odersky got it, I think I understand. Essentially the two APIs are mostly isomorphic, except yours is a visitor-based API rather than a data-structure-based API.

That makes yours more minimal: whoever needs a data structure can define a Visitor to assemble the data structure themselves using mutable state. Although the () => ??? looks a bit awkward, such code is common when working with visitors. I do this in uPickle all over the place. That approach looks good to me overall!

There are two remaining differences in our two APIs:

  • Yours only allows control of executing the method or not via a Boolean, whereas mine allows the user-land code to call the method arbitrarily. That allows you to e.g. wrap the method call in a try-catch-finally, place setup/teardown code around the method call (e.g. Cask uses this to set up threadlocal database connections), or execute the method call more than once (e.g. Cask uses a @retry annotation that does that).

  • Yours still is hardcoded to generate a def main(args: Array[String]): Unit Java-style entrypoint, whereas mine allows the user to simply specify a class for the wrapper to inherit from. This allows mine to e.g. generate a HTTP endpoint that can be discovered later (whether by registration or by reflection) and executed via some other code.

For both of these areas, it is straightforward to adjust your Visitor-based API to allow that flexibility if we so desire.

To answer your earlier questions, having a typeclass and/or handler function for the return types would definitely be useful:

  • That would allow @main methods to be used in places where the return type is an Int we want to pass to System.exit, places where the return type is some JSON-serializable data type to return from an HTTP endpoint, or something else (e.g. Ammonite just uses println on it if I remember right).

  • It would provide a hook for people who want to return Future[Unit] or IO[Unit] or ZIO[Unit] to handle their IO monad and do their unsafeRunSync thing
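
A minimal sketch of what such a result-handling typeclass could look like (my own illustration, not something from the current proposal):

// Hypothetical: one given instance per supported main-method result type.
trait MainResult[R]:
  def handle(result: R): Unit

object MainResult:
  given MainResult[Unit] with
    def handle(result: Unit): Unit = ()
  given MainResult[Int] with
    def handle(code: Int): Unit = if code != 0 then sys.exit(code)
  // Frameworks could add instances for Future[Unit], IO[Unit], JSON-encodable results, ...

The generated wrapper would then end with something like summon[MainResult[R]].handle(myMain(...)).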

In terms of Java/Unix specificity, I think it is a very small amount of work to allow it to be more general. As mentioned earlier, it is trivial to code-gen an object inheriting a class, rather than code-gen-ing the def main method directly:

  • The def main JVM/Unix entrypoint becomes just one of many possible wrapper objects:

  • Ammonite would not use the JVM entrypoint directly, but instead call it indirectly through the script running infrastructure

  • Cask would not use a JVM entrypoint at all and instead use this to define HTTP endpoints that get/receive JSON, form-encoded POSTs, and other things

  • Test frameworks could define @test annotations that register the method call to be called by the surrounding test harness

  • Sjsonnet would use this as a way of defining JVM-intrinsic functions which are implemented in Scala as functions or methods, but exposed to the Jsonnet code to be called.

Rather than being Java/Unix specific, we can look at this as a mechanism for reifying a method definition to be usefully manipulated at runtime. The 5 example use cases above, 3 of them already in wide usage, have almost identical requirements:

  • We want to resolve a typeclass for the type of each parameter of the method, and a (potentially different) typeclass for the return type

  • We want to reify the method signature metadata for programmatic usage: argument names, default values, types & scaladoc/doc-annotations (for help messages, if nothing else)

  • We want to wrap the method call somehow: try-catches, setting threadlocal context, retries, etc.

  • We want to register the reified method somewhere for someone (JVM, Ammonite script runner, Cask webserver, Sjsonnet interpreter, Test framework, etc.) to inspect and invoke at runtime

Rather than being a hard-coded mechanism for defining JVM entrypoints, this method-reification mechanism has the potential to be as fundamental and widely applicable as Python’s @decorators: to register tests, CLI entrypoints, web endpoints, and others, but using typeclasses and code-generation we can do so in a much more type-safe and high-performance fashion!


I like where this is going! Let me try to address these additional requirements, always following the Principle of Least Power :wink:.

The first set of requirements is

(1) We want to wrap the call to the main method, potentially invoking it several times, and also with the possibility of post-processing.
(2) The main method should be able to return a result with a framework-defined type
(3) The wrapper class should inherit from a framework-defined parent so that it can implement additional functionality.

(by framework-defined I mean: defined by the concrete @main annotation class).

A variant that addresses these requirements is tests/pos/main-method-scheme-class-based.scala. In fact, I believe the new design is not only more powerful, but also cleaner than the previous one, so this is nice progress!

In that design I did not cater for a typeclass to handle the result of the main method. Instead, there’s a fixed type MainResultType that can be instantiated by concrete @main annotations. That’s simpler and has the potential for better error messages. I am not completely opposed to investing in a second type class, but according to PLP, we should not do it unless there’s a clear need for it. My main hesitation about adding this typeclass is that we’d have to invent it from scratch. For argument passing, there’s the standard FromString class that would usually instantiate ArgumentParser. But for result parsing I believe we will have to use a do-nothing dummy class by default, which is ugly and points to possible over-engineering. So maybe a framework-defined result type is better.

Abstracting over Wrapper Class Generation

The remaining requirement is that we would like more control over what wrapper class is generated. So far, we have arguably abstracted the main method faithfully, but the wrapper class generation scheme is fixed. Here’s an example of what a wrapper class looks like in the latest iteration:

object add extends main:
  def main(args: Array[String]) =
    val cmd = command(args)
    val arg1 = cmd.argGetter[Int]("num", summon[ArgumentParser[Int]])
    val arg2 = cmd.argGetter[Int]("inc", summon[ArgumentParser[Int]], Some(1))
    cmd.run(myProgram.add(arg1(), arg2()), "add", "Adds two numbers")
end add

What additional parts of this wrapper should we leave open for customisation? I argue not the body of main since that embodies the essential protocol that we are defining. But everything else is fair game:

  • the name and location of the wrapper class itself. Instead of add it could be something else.
  • the name of the wrapper method. Instead of main it could be something else.
  • the argument type of the wrapper method. Instead of Array[String] it could be something else.
  • the result type of the wrapper method. Instead of Unit, it could be something else.
  • the question whether a wrapper class contains a single wrapper method or whether there could
    be several.
  • possibly, an annotation to add to the wrapper method, e.g. one which can be used for registering the method.

These customisations are enabled by tests/pos/main-method-scheme-generic.scala. Compared to the previous iteration tests/pos/main-method-scheme-class-based.scala, where arguably the added flexibility was free since it led to a cleaner design, the new customisations do have a price in footprint. Essentially, we need three new type members and an inline method in class MainAnnotation:

  /** The type of the command line arguments. E.g., for Java main methods: `Array[String]` */
  type CommandLineArgs

  /** The return type of the generated command. E.g., for Java main methods: `Unit` */
  type CommandResult

  /** An annotation type with which the wrapper method is decorated.
   *  No annotation is generated if the type is left abstract.
   */
  type WrapperAnnotation <: Annotation

  /** The fully qualified name to use for the static wrapper method
   *  @param mainName the fully qualified name of the user-defined main method
   */
  inline def wrapperName(mainName: String): String

So it’s not free. But it looks like a reasonable price to pay for the added flexibility.


As far as the SIP process is concerned my recommendations would be the following:

  1. It makes no sense to put a restricted @main over Array[String] in the language spec. We have seen that much nicer functionality can be had with modest conceptual cost.
  2. We should have an implementation of wrapper methods satisfying @lihaoyi’s criteria in the compiler and language runtime. But at present any solution looks too detailed to fit comfortably in a language spec.
  3. In light of this, I think it’s best if the language spec does not talk about main methods at all. There will be an ergonomic and flexible implementation of @main in the Scala 3 distribution, and tutorials will likely use that. But we can relegate all this to the question of host system interop, which means that the language spec and the SIP process need not deal with it.

This latest scheme looks really powerful, elegant, and simple. I hope very much that something like this goes in! Together with a good default subclass of MainAnnotation, this would have saved me dozens of hours of futzing with command-line parsing.
