Better number literals

but what if you need 10_200_531 instead? 10.million + 200.thousand + 531? This can become quite verbose.
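For comparison, the verbose spelling can be made to compile today with a pair of extension methods. This is only a sketch: `IntMagnitudes`, `thousand` and `million` are hypothetical names and not part of the standard library.

```scala
object Magnitudes {
  // Hypothetical helpers, not part of the standard library.
  implicit class IntMagnitudes(private val n: Int) extends AnyVal {
    def thousand: Int = n * 1000
    def million: Int = n * 1000000
  }

  // The verbose form from the post; it evaluates to the same value
  // that a literal written as 10_200_531 would denote.
  val verbose: Int = 10.million + 200.thousand + 531
}
```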

I think the main objection to a compile-time string interpolator was macros’ future. Maybe we could solve that while avoiding a language change by adding the string interpolator(s) to the standard library. We already have the f interpolator.

Incidentally, how does Dotty use the standard library when it has macros?

If you try to use any Scala 2 macro like the f"" interpolator in Dotty, you’ll get a MethodNotFoundException at runtime (we should probably detect this at compile time).

Lots of good suggestions here.

I am completely in favor of allowing _ in numeric literals. If someone wants to open an issue laying down the desired syntax, we can act on this quickly.

Regarding string interpolators: I am not sure why they need to be whitebox macros. The type of a string interpolator such as f could well be (Any*)String. It is then the job of the interpolator to refuse any argument that does not conform to its format string. So f is still a macro, but not a whitebox macro.
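A rough sketch of that shape is below. The interpolator is declared with the type (Any*)String and rejects arguments that do not fit the format specifier following them. It is a runtime stand-in only: the name `fmt` and the tiny `%d`/`%s` subset are assumptions for illustration, and a real (blackbox) macro would report the mismatch at compile time instead of throwing.

```scala
object CheckedInterpolator {
  implicit class CheckedFormat(private val sc: StringContext) extends AnyVal {
    // Declared as (Any*)String; all checking happens inside the body.
    def fmt(args: Any*): String = {
      val sb = new StringBuilder(sc.parts.head)
      args.zip(sc.parts.tail).foreach { case (arg, part) =>
        if (part.startsWith("%d")) {
          arg match {
            case _: Int | _: Long | _: BigInt => sb.append(arg.toString)
            case other =>
              throw new IllegalArgumentException(s"%d expects an integer, got: $other")
          }
          sb.append(part.drop(2))
        } else if (part.startsWith("%s")) {
          sb.append(arg.toString)
          sb.append(part.drop(2))
        } else {
          throw new IllegalArgumentException("missing format specifier after argument")
        }
      }
      sb.toString
    }
  }

  val n = 3
  val word = "three"
  val ok: String = fmt"count: ${n}%d items"
  // A mismatching argument is refused (here at runtime, in a macro at compile time).
  val refused = scala.util.Try(fmt"count: ${word}%d items")
}
```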

Concerning BigDecimal literals themselves, I think only extensible solutions are worth considering. Don’t stop at BigDecimals. Can we have a scheme where we can have arbitrary user-defined types that allow some form of numeric literal? One way to do it is to say that a numeric literal N is of type T if

  • T is the expected type at the point where the literal is defined
  • There is an instance of typeclass Literally[T] defined for T where
trait Literally[A] {
  def fromString(s: String): A
}

So, the following would both work, assuming BigDecimal and BigInt define Literally instances that
convert strings of digits to the type itself.

val x: BigDecimal = 1234567890.9
(123: BigInt)

They would expand to

val x: BigDecimal = BigDecimal.literally.fromString("1234567890.9")
(BigInt.literally.fromString("123"): BigInt)

This assumes that BigDecimal has a companion object like

object BigDecimal {
  implicit def literally: Literally[BigDecimal] = new {
    def fromString(digits: String) = new BigDecimal(digits)
  }
}

We can improve on this solution by making fromString a macro that checks the digits string
for syntactic validity.
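The expansion can be simulated in current Scala without compiler support. In the sketch below, the `literal` helper is a hypothetical stand-in for what the compiler would do with a literal's digits, and the instances live in the `Literally` companion because the real BigDecimal and BigInt companions define no `literally` member.

```scala
object LiterallySketch {
  trait Literally[A] {
    def fromString(s: String): A
  }

  object Literally {
    // Hypothetical instances for illustration only.
    implicit val bigDecimalLiterally: Literally[BigDecimal] =
      new Literally[BigDecimal] {
        def fromString(s: String): BigDecimal = BigDecimal(s)
      }
    implicit val bigIntLiterally: Literally[BigInt] =
      new Literally[BigInt] {
        def fromString(s: String): BigInt = BigInt(s)
      }
  }

  // Stand-in for the compiler: the digits arrive as a string and the
  // expected type T selects the Literally instance.
  def literal[T](digits: String)(implicit ev: Literally[T]): T =
    ev.fromString(digits)

  val x: BigDecimal = literal("1234567890.9")
  val y: BigInt = literal[BigInt]("123")
}
```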

One issue is how to define a syntax for general numeric literals N without mentioning the result type. One candidate syntax would be:

  • a digit 0-9
  • followed by a sequence of digits or letters,
  • which can also contain one or more '_'s, if followed by a digit or letter,
  • which can also contain one or more '.'s if followed by a digit.

This would allow for many different ways to write numbers (even IP addresses would be covered!). It does not cover floating point numbers with exponents, but I am not sure these are worth generalizing.
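One possible reading of the candidate syntax as a regular expression is sketched below. The interpretation is an assumption: a run of '_'s must be followed by a digit or letter, and each '.' must be followed directly by a digit. The names `Candidate` and `isCandidate` are hypothetical.

```scala
object CandidateSyntax {
  // A digit, then digits or letters, with '_' runs allowed before a digit
  // or letter and '.' allowed only directly before a digit.
  private val Candidate =
    """[0-9](?:[0-9A-Za-z]|_+(?=[0-9A-Za-z])|\.(?=[0-9]))*""".r

  def isCandidate(s: String): Boolean =
    Candidate.pattern.matcher(s).matches()
}
```

Under this reading, 10_200_531, 1234567890.9 and 192.168.0.1 are accepted, while 123_ (trailing underscore), 1.x ('.' before a letter) and 2e-20 (contains '-') are rejected.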

I like this proposal, but having to rely on the expected type makes for verbose expressions, e.g:

(1232432432: BigDecimal) + (3432432: BigDecimal)

I would rather write:

1232432432bd + 3432432bd

This would also be consistent with the fact that today I can write 3432432l. We could allow this by rewriting:

123foo

as:

NumericalContext("123").foo

That would nicely mirror the string interpolator syntax.
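The target of that rewrite can be written out by hand today. In this sketch `NumericalContext` is a hypothetical class modelled on StringContext, and the suffixes `bd` and `bi` are ordinary extension methods; only the literal-to-call rewriting itself would need compiler support.

```scala
object NumericalContextSketch {
  // Hypothetical counterpart of StringContext for numeric literals.
  final case class NumericalContext(digits: String)

  // Suffixes are plain extension methods, so libraries could add their own.
  implicit class BigSuffixes(private val nc: NumericalContext) extends AnyVal {
    def bd: BigDecimal = BigDecimal(nc.digits)
    def bi: BigInt = BigInt(nc.digits)
  }

  // What the compiler would emit for: 1232432432bd + 3432432bd
  val sum: BigDecimal =
    NumericalContext("1232432432").bd + NumericalContext("3432432").bd
}
```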

It looks very useful.
It would be great if it also worked with pattern matching,
for example if NumericalContext provided an unapply method:

n match {
   case 1232432432bd => println("great")
}

If I understand the syntax summary at
https://scala-lang.org/files/archive/spec/2.12/13-syntax-summary.html
correctly, then in the code:

    1 match {
      case -1 => println("-1")
      case _ => println("_")
    }

the '-' is part of the literal.

So maybe we could use '+' or '-' together with a type alias:

type Bd = BigDecimal
val a = Bd+1234567890.9
val b = Bd-1234567890.9
a + b match {
  case Bd+0 => 
  case _ => 
}

This would work with any letters.
For example:

val date = Dd+1d.2m.2004y_00h.5min.24sec

me too.

Seth (SIP committee member)

None of the proposals so far give me octal literals back.

Unless I can use leading underscore for that purpose?

val permissions = _755

Given that the prefix 0x means hex, can we do 0o for octal and 0b for binary literals? This would give

val permissions = 0o755

This is in accordance with ECMAScript 6 and Julia, both of which have 0b and 0o literals besides the standard 0x.
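To make the intended meaning of the prefixes concrete, here is a small runtime sketch. `parsePrefixed` is a hypothetical helper; actual literal support would live in the compiler's lexer rather than in a function like this.

```scala
object RadixPrefixes {
  // Maps the prefix to a radix, then parses the remaining digits.
  def parsePrefixed(literal: String): Int = {
    val (radix, digits) = literal.take(2).toLowerCase match {
      case "0x" => (16, literal.drop(2))
      case "0o" => (8, literal.drop(2))
      case "0b" => (2, literal.drop(2))
      case _    => (10, literal)
    }
    Integer.parseInt(digits, radix)
  }

  // The permissions example: octal 755 is decimal 493.
  val permissions: Int = parsePrefixed("0o755")
}
```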

@smarter I feel generalized suffix strings and NumericLiteralContexts mostly generalize a wart. I always found 123l very bad, already from a typographic standpoint. 123L is a bit better, but still ugly. Types are good. Writing

   val x: BigInt = 1234567890

is longer than

    val x = 1234567890bi

but also much clearer. Furthermore, in many cases types are known from the context, in which case the suffix-less syntax is shorter anyway.

@odersky I was planning to submit a small SIP to deprecate a lower-case l to mark Longs. I’ve mistaken l for 1 enough times that I think the benefit of the change outweighs the incompatibility cost: I would call it a bug that it’s possible at all, and I really can’t think of any good reason why there is this one single instance of case-insensitivity in the language; why the choice?

I’ve already done the trivial code changes in Dotty, which was a fun first experiment.

But… I prefer your more general proposal!

@som-snytt If we decide to perpetuate the madness of allowing l to indicate Longs, then maybe we could give you an O suffix for octal literals? :wink:

There are cases where there is no 1:1 correspondence between lexical representations and types. For example the octal representation of a long. How would that fit into the picture?

I like this, and I even said why in a parallel topic.

What you suggest is in effect (though probably not technically) that a literal is something like a function taking an implicit typeclass argument (such as Numeric or Fractional) and returning a value of that type, e.g. 1234567890: (implicit ev: Numeric[A]) => A.
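That reading can be approximated with the existing Numeric typeclass. The sketch below models the single literal 1234567890 as a method with an implicit Numeric parameter; the name `literal` is hypothetical, and a real design would need a typeclass that is not limited to values that fit in an Int.

```scala
object PolymorphicLiteral {
  // The literal 1234567890 as a function from a Numeric instance to a value.
  def literal[A](implicit ev: Numeric[A]): A = ev.fromInt(1234567890)

  val asLong: Long = literal[Long]
  val asBigInt: BigInt = literal[BigInt]
  val asDouble: Double = literal[Double]
}
```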

When we do this, why special-case this for numbers? The proposal doesn’t define numbers, just syntax, and the syntax is not particularly number-like. It starts with a digit, but after that we have a sequence of digits or letters, so it’s a (runtime, modulo macros and suitable datatypes) literal for any alphanumeric string that starts with a digit and can contain '_'s and '.'s, convertible to any target type that defines a parser.

This would allow for many different ways to write numbers (even IP addresses would be covered!).

Would it though? As far as I can see, this covers IPv4 addresses only. We could easily allow “:” as well to cover IPv6 addresses, but the allowed character set starts to look pretty arbitrary.

Also, should the candidate syntax start with an optional leading + or -?

EDIT:

Some examples you could do with the proposal, which are neat and scary and I’m not sure whether they’re more neat than scary:

val coffeetime: LocalTime = 2.20PM
val birthdate: LocalDate = 20.06.1982
val lausanne: WGS84 = 46.519962N6.633597E495A

or even declare some arbitrary binary encoding for any datatype, base64 encoded, and prefixed with a zero.

Some examples of things you can’t do with this proposal

val quarter: Fraction = 1/4
val complex: Complex = 2+2i
val notalot: Double = 2e-20
val lausanneLat: Latitude = 46°31'11.8632''N

I can’t entirely see the justification for the distinction of what should and/or shouldn’t be allowed.

I’m not sure this is generally true for arithmetic expressions, since arithmetic operators are often overloaded for convenience (e.g. Long#+).

The former looks like an implicit conversion but isn’t really, which could be confusing. I would argue the latter has better discoverability because in an IDE I can just do “Go to definition” on bi to get to def bi: BigInt somewhere.

Relying on the expected type also seems insufficient to do pattern-matching on literals as suggested in Better number literals

I actually did not realize Scala doesn’t support 0b; that would be nice for Java parity. 0o would also be nice while we’re at it (though Java doesn’t support that).

I fully agree with you.
I think it is important to be able to write math expressions concisely.

Maybe this is an obvious question:
if it is easy to add '_'s to number literals, might it also be easy to implement something like:

val a = 1.5$bd+5$bd*115$bd

Or with any other non-letter symbol?

I think that would be more comfortable, but it’s not very important. Failing that, we can use string interpolation:

val a = bd"1.5"+bd"5"*bd"115"
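The interpolator version already works with a few lines of library code. A minimal sketch follows, assuming a user-defined `bd` interpolator (the name `BigDecimalInterpolator` is hypothetical); note that malformed digits fail at runtime with a NumberFormatException unless the interpolator is turned into a macro.

```scala
object BdInterpolator {
  implicit class BigDecimalInterpolator(private val sc: StringContext) extends AnyVal {
    // Builds the string as s"..." would, then parses it as a BigDecimal.
    def bd(args: Any*): BigDecimal = BigDecimal(sc.s(args: _*))
  }

  // 1.5 + 5 * 115 = 576.5
  val a: BigDecimal = bd"1.5" + bd"5" * bd"115"
}
```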