Better number literals


#18

Lots of good suggestions here.

I am completely in favor of allowing _ in numeric literals. If someone wants to open an issue laying down the desired syntax, we can act on this quickly.

Regarding string interpolators: I am not sure why they need to be whitebox macros. The type of a string interpolator such as f could well be (Any*)String. It is then the job of the interpolator to refuse any argument that does not conform to its format string. So f is still a macro, but not a whitebox macro.

Concerning BigDecimal literals themselves, I think only extensible solutions are worth considering. Don’t stop at BigDecimals. Can we have a scheme where we can have arbitrary user-defined types that allow some form of numeric literal? One way to do it is to say that a numeric literal N is of type T if

  • T is the expected type at the point where the literal is defined
  • There is an instance of typeclass Literally[T] defined for T where
trait Literally[A] {
  def fromString(s: String): A
}

So, the following would both work, assuming BigDecimal and BigInt define Literally instances that
convert strings of digits to the type itself.

val x: BigDecimal = 1234567890.9
(123: BigInt)

They would expand to

val x: BigDecimal = BigDecimal.literally.fromString("1234567890.9")
(BigInt.literally.fromString("123"): BigInt)

This assumes that BigDecimal has a companion object like

object BigDecimal {
  implicit def literally: Literally[BigDecimal] = new {
    def fromString(digits: String) = new BigDecimal(digits)
  }
}

We can improve on this solution by making fromString a macro that checks the digits string
for syntactic validity.

One issue is how to define a syntax for general numeric literals N without mentioning the result type. One candidate syntax would be:

  • a digit 0-9
  • followed by a sequence of digits or letters,
  • which can also contain one or more '_'s, if followed by a digit or letter,
  • which can also contain one or more '.'s if followed by a digit.

This would allow for many different ways to write numbers (even IP addresses would be covered!). It does not cover floating point numbers with exponents, but I am not sure these are worth generalizing.


#19

I like this proposal, but having to rely on the expected type makes for verbose expressions, e.g:

(1232432432: BigDecimal) + (3432432: BigDecimal)

I would rather write:

1232432432bd + 3432432bd

This would also be consistent with the fact that today I can write 3432432l. We could allow this by rewriting:

123foo

as:

NumericalContext("123").foo

That would nicely mirror the string interpolator syntax.


#20

It looks very useful.
It will be great if it works with pattern matching.
If NumericalContext provide unapply method:

n match {
   case 1232432432bd => println("great")
}

#21

If I understand correctly
https://scala-lang.org/files/archive/spec/2.12/13-syntax-summary.html
In the code:

    1 match {
      case -1 => println("-1")
      case _ => println("_")
    }

The '-'s is part of literal.

So may be, we can use the '+'s or '-'s to define type aliace:

type Bd = BigDecimal
val a = Bd+1234567890.9
val b = Bd-1234567890.9
a + b match {
  case Bd+0 => 
  case _ => 
}

It will be work with any letters:
For example

val date = Dd+1d.2m.2004y_00h.5min.24sec

#22

me too.

Seth (SIP committee member)


#23

None of the proposals so far give me octal literals back.

Unless I can use leading underscore for that purpose?

val permissions = _755


#24

Given that the prefix 0x means hex, can we do 0o for octal and 0b for binary literals? This would give

val permissions = 0o755

This is in accordance with ECMAScript 6 and Julia, both of which has 0b and 0o literals besides the standard 0x.


#25

@smarter I feel generalized suffix strings and NumericLiteralContexts mostly generalize a wart. I always found 123l very bad, already from a typographic standpoint. 123L is a bit better, but still ugly. Types are good. Writing

   val x: BigInt = 1234567890

is longer than

    val x = 1234567890bi

but also much clearer. Furthermore, in many cases types are known from the context, in which case the suffix-less syntax is shorter anyway.


#26

@odersky I was planning to submit a small SIP to deprecate a lower-case l to mark Longs. I’ve mistaken l for 1 enough times that I think the change outweighs the incompatibility cost: I would call it a bug that it’s possible at all, and I really can’t think of any good reason why there is this one single incidence of case-insensitivity in the language; why the choice?

I’ve already done the trivial code changes in Dotty, which was a fun first experiment.

But… I prefer your more general proposal!


#27

@som-snytt If we decide to perpetuate the madness of allowing l to indicate Longs, then maybe we could give you an O suffix for octal literals? :wink:


#28

There are cases where there is no 1:1 correspondence between lexical representations and types. For example the octal representation of a long. How would that fit into the picture?


#29

I like this and I even said why in parallel topic.

What you suggest is in fact (probably, not technically) is that a literal is somewhat a function taking (probably) an implicit typeclass argument (like Numeric or Fractional or so) and returning a value of this type, e.g. 1234567890: (implicit ev: Numeric[A]) => A.


#30

When we do this, why special-case this for numbers? The proposal doesn’t define numbers, just syntax, and the syntax is not particularly number-like. It starts with a digit, but after that, we have a sequence of numbers or letters, so it’s a (runtime, modulo macro and suitable datatypes) literal for any alpha-numeric string that starts with a digit, and can contain '-''s and '.''s to any target type that defines a parser.

This would allow for many different ways to write numbers (even IP addresses would be covered!).

Would it though? I can see how this covers IPv4 addresses only. We could quickly also allow “:” to also be able to cover IPv6 address, but the allowed character set starts to look pretty arbitrary.

Also, should the candiate syntax start with a leading optional + or -?

EDIT:

Some examples you could do with the proposal, which are neat and scary and I’m not sure whether they’re more neat than scary:

val coffeetime: LocalTime = 2.20PM
val birthdate: LocalDate = 20.06.1982
val lausanne: WGS84 = 46.519962N6.633597E495A

or even declare some arbitrary binary encoding for any datatype, base64 encoded, and prefixed with a zero.

Some examples of things you can’t do with this proposal

val quarter: Fraction = 1/4
val complex: Complex = 2+2i
val notalot: Double = 2e-20
val lausanneLat: Latitude = 46°31'11.8632''N

I can’t entirely see the justification for the distinction of what should and/or shouldn’t be allowed.


#31

I’m not sure this is generally true for arithmetic expressions, since arithmetic operators are often overloaded for convenience (e.g. Long#+).

The former looks like an implicit conversion but isn’t really, that could be confusing. I would argue the latter has better discoverability because in an IDE I can just do “Go to definition” on bi to get to def bi: BigInt somewhere.


#32

Relying on the expected type also seems insufficient to do pattern-matching on literals as suggested in Better number literals


#33

I actually did not realize Scala doesn’t support 0b; that would be nice for Java parity. 0o would also be nice while we’re at it (though Java doesn’t support that).


#34

I fully agree with you.
I think it is imortant to have the abilit to write math expressions shortly:

May be it is obvious question:
If it is easy to add ‘_’ s to number literal it may be easy to implement somethink like:

val a = 1.5$bd+5$bd*115$bd

Or with any other no letter symbol?

I think it will be more comfortable but it’s not very important. In such case we can use string interpolation:

val a = bd"1.5"+bd"5"*bd"115"

#35

Another unused syntactical niche is leading single quote as used for Symbol, especially if they get rid of Symbol.

val x: Long = '42 ; val y: BigDecimal = ''1.5 or 'B1.5

IDEs could render literals by type, using upper and lowercase (“lining”, “old-style”) for Long and Int, and increased font size for BigDecimal.


#36

Today, I’m looking at val n = 1234 and it looks like the numbers are SHOUTING at me.

I don’t think my rendering proposal would work because lining numbers look like JAVA constants; someday only old-style numbers will look normal.


#37

I prefer the type-driven parsing that you suggest here! Having types is almost always a good thing, and from types I think people can get almost everything they want.

If someone really wants a bi suffix, then they can get almost all the way there with a BigIntLiteral companion that has a single bi method returning a BigInt. Then they can 1923847189571892375618923798471985.bi; the search for the .bi method will find only BigIntLiteral, which will then parse correctly and return a BigInt as desired. (Implicit search might have to be tweaked a bit to get this to work right.)

Alternatively, import math.{BigInt => BI} and use 1239857189716:BI. Not too bad.

If someone wants prefixes like 0x, there can be a desugaring rule that 0x98145718923751 desugars to 98145718923751.prefix_x or somesuch.

If people want random other stuff in the string, they can either call the method directly, or we can provide a string interpolator version, e.g. lit"fe80::d55e:d7b:14d6:50d9" which then goes by type, or can have helper class+extension method to allow a short suffix to determine the type.