Better number literals

@som-snytt If we decide to perpetuate the madness of allowing l to indicate Longs, then maybe we could give you an O suffix for octal literals? :wink:

There are cases where there is no 1:1 correspondence between lexical representations and types. For example the octal representation of a long. How would that fit into the picture?

I like this, and I even said why in a parallel topic.

What you suggest is in fact (probably, if not technically) that a literal is something like a function taking an implicit typeclass argument (such as Numeric or Fractional) and returning a value of that type, e.g. 1234567890: (implicit ev: Numeric[A]) => A.
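
A minimal sketch of that reading, for illustration only. The LiteralParser typeclass and the literal helper are hypothetical names, not part of the proposal or of the standard library, and the literal text is passed as a String because the compiler does not offer this for real literals here:

```scala
object TypedLiteralSketch {
  // Hypothetical typeclass: turns the raw text of a literal into a value of type A.
  trait LiteralParser[A] {
    def parse(digits: String): A
  }

  object LiteralParser {
    implicit val intParser: LiteralParser[Int] = new LiteralParser[Int] {
      def parse(digits: String): Int = digits.toInt
    }
    implicit val longParser: LiteralParser[Long] = new LiteralParser[Long] {
      def parse(digits: String): Long = digits.toLong
    }
    implicit val bigIntParser: LiteralParser[BigInt] = new LiteralParser[BigInt] {
      def parse(digits: String): BigInt = BigInt(digits)
    }
  }

  // Stand-in for the literal itself: the expected type selects the parser.
  def literal[A](digits: String)(implicit ev: LiteralParser[A]): A = ev.parse(digits)

  val asLong: Long = literal("1234567890")
  val asBigInt: BigInt = literal("12345678901234567890") // too large for a Long
}
```

Note that the choice of parser depends entirely on the expected type, so a bare val x = literal("1") with no annotation would have nothing to go on.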

If we do this, why special-case numbers? The proposal doesn’t define numbers, just syntax, and the syntax is not particularly number-like. It starts with a digit, but after that comes a sequence of digits or letters, so it is a literal (parsed at runtime, modulo macros and suitable datatypes) for any alphanumeric string that starts with a digit and may contain '-' and '.' characters, for any target type that defines a parser.

This would allow for many different ways to write numbers (even IP addresses would be covered!).

Would it though? As far as I can see, this covers IPv4 addresses only. We could quickly allow “:” as well to cover IPv6 addresses, but the allowed character set then starts to look pretty arbitrary.

Also, should the candidate syntax allow an optional leading + or -?

EDIT:

Some examples you could do with the proposal, which are neat and scary and I’m not sure whether they’re more neat than scary:

val coffeetime: LocalTime = 2.20PM
val birthdate: LocalDate = 20.06.1982
val lausanne: WGS84 = 46.519962N6.633597E495A

or even declare some arbitrary binary encoding for any datatype, base64 encoded, and prefixed with a zero.

Some examples of things you can’t do with this proposal

val quarter: Fraction = 1/4
val complex: Complex = 2+2i
val notalot: Double = 2e-20
val lausanneLat: Latitude = 46°31'11.8632''N

I can’t entirely see the justification for the distinction of what should and/or shouldn’t be allowed.

I’m not sure this is generally true for arithmetic expressions, since arithmetic operators are often overloaded for convenience (e.g. Long#+).

The former looks like an implicit conversion but isn’t really, that could be confusing. I would argue the latter has better discoverability because in an IDE I can just do “Go to definition” on bi to get to def bi: BigInt somewhere.

Relying on the expected type also seems insufficient to do pattern-matching on literals as suggested in Better number literals

I actually did not realize Scala doesn’t support 0b; that would be nice for Java parity. 0o would also be nice while we’re at it (though Java doesn’t support that).

I fully agree with you.
I think it is important to have the ability to write math expressions concisely:

Maybe this is an obvious question:
If it is easy to add ‘_’s to number literals, it may also be easy to implement something like:

val a = 1.5$bd+5$bd*115$bd

Or with any other non-letter symbol?

I think that would be more comfortable, but it’s not very important. In that case we can use string interpolation:

val a = bd"1.5"+bd"5"*bd"115"
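
A bd interpolator along these lines can already be written in user code. The following is only a sketch: the BdInterpolator class is a hypothetical name, and it accepts only literals without ${...} holes.

```scala
object BdInterpolatorSketch {
  // Hypothetical interpolator: bd"1.5" is rewritten to StringContext("1.5").bd()
  implicit class BdInterpolator(sc: StringContext) {
    def bd(): BigDecimal = BigDecimal(sc.parts.mkString)
  }

  // Usual precedence applies: 1.5 + (5 * 115)
  val a: BigDecimal = bd"1.5" + bd"5" * bd"115"
}
```

The string is parsed at runtime here, so a malformed literal such as bd"1.5x" would only fail when the expression is evaluated.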

Another unused syntactical niche is leading single quote as used for Symbol, especially if they get rid of Symbol.

val x: Long = '42 ; val y: BigDecimal = ''1.5 or 'B1.5

IDEs could render literals by type, using upper and lowercase (“lining”, “old-style”) for Long and Int, and increased font size for BigDecimal.

Today, I’m looking at val n = 1234 and it looks like the numbers are SHOUTING at me.

I don’t think my rendering proposal would work because lining numbers look like JAVA constants; someday only old-style numbers will look normal.

I prefer the type-driven parsing that you suggest here! Having types is almost always a good thing, and from types I think people can get almost everything they want.

If someone really wants a bi suffix, then they can get almost all the way there with a BigIntLiteral companion that has a single bi method returning a BigInt. Then they can write 1923847189571892375618923798471985.bi; the search for the .bi method will find only BigIntLiteral, which will then parse correctly and return a BigInt as desired. (Implicit search might have to be tweaked a bit to get this to work right.)

Alternatively, import math.{BigInt => BI} and use 1239857189716:BI. Not too bad.

If someone wants prefixes like 0x, there can be a desugaring rule that 0x98145718923751 desugars to 98145718923751.prefix_x or somesuch.

If people want random other stuff in the string, they can either call the method directly, or we can provide a string interpolator version, e.g. lit"fe80::d55e:d7b:14d6:50d9" which then goes by type, or can have helper class+extension method to allow a short suffix to determine the type.

Baby step of taking underscore and single quote for separators, as proposed previously, at this PR. Separators must be internal, so no trailing underscore, sorry @som-snytt.

For me the current state is more or less OK.

underscore separator:

  • has hardly any benefit in business applications (in my current codebase (~80K LOC) I don’t see ANY place where I would want to use it)
  • could be handy in the scientific field (not sure)
  • could be implemented to stay in sync with Java (but that is a rather minor thing and I guess not necessary for most use cases)

other ideas:

  • All the ideas with 123bi look odd to me. It is hard to parse at first glance. 123.bi or (123:BigInt) looks better.
  • these crazy ideas introduce too much complexity into the Scala parser and I’m against them:
val quarter: Fraction = 1/4 
val complex: Complex = 2+2i
val lausanneLat: Latitude = 46°31'11.8632''N
  • disallowing the 123l notation in favor of 123L looks good, but maybe we should also disallow lowercase d and f for consistency? Not sure. In Scala 2.13 we could warn when 123l is used.

About those crazy ideas: they were mostly asking why it would be disallowed to have

val lausanneLat: Latitude = 46°31'11.8632''N

but allowed to have

val lausanne: WGS84 = 46.519962N6.633597E495A

Why would

val ip: IPv6 = 2001:db8:85a3::8a2e:370:7334

be disallowed, but

val ip: IPv4 = 192.168.12.16

be allowed?

Why rule out

val notalot: Double = 2e-20

but be OK with

val coffeetime: LocalTime = 2.20PM

Just to note, I find the current syntax fine for latitude and longitude:

val penzance = 50.06 ll -5.68
val trevoseHead = 50.55 ll -5.03
val nwDevon = 51.18 ll -4.19
val parrettMouth = 51.21 ll -3.01
val chepstow = 51.61 ll -2.68
val stDavids = 51.88 ll -5.31   
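
For readers wondering how ll can work today: it only needs an extension method on Double used in infix position. A minimal sketch, assuming hypothetical LatLong and LatOps definitions (the poster's actual library is not shown in the thread):

```scala
object LatLongSketch {
  // Hypothetical result type; the real library may look quite different.
  final case class LatLong(latitude: Double, longitude: Double)

  // Adds an `ll` method to Double so that `lat ll long` reads like a literal.
  implicit class LatOps(latitude: Double) {
    def ll(longitude: Double): LatLong = LatLong(latitude, longitude)
  }

  val penzance: LatLong = 50.06 ll -5.68
  val chepstow: LatLong = 51.61 ll -2.68
}
```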

All of these suggestions are nice, but at the end of the day, they’re asking a lot from what the language accepts as a literal. My proposal is just to allow _ within literals, with no extra syntactic or semantic meaning. This is not meant to make it possible to write BigDecimal or IP literals; just to make it easier to read existing number literals.

If you’re writing a library that uses a lot of binary or hex constants for fancy bitwise magic, or numeric constants for mathematical witchcraft, adding an _ periodically in the literal can drastically improve readability.

val magicNumber = 0b10000001010001100000000000100001
// vs
val magicNumber = 0b_1000_0001_0100_0110_0000_0000_0010_0001

The latter separates the 32bit number into 4bit chunks, making it easier to process the number mentally without accidentally skipping a bit or losing track of whether you’re at the 17th or 18th bit.

You might even add extra _s in the middle to break up the sections even more visibly (since there are 8 of them, which is a decent number)

val magicNumber = 0b_1000_0001_0100_0110__0000_0000_0010_0001
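
To make the "no extra meaning" point concrete, here is a sketch of what the separators amount to, written as an ordinary runtime helper. The parseBinary function is a hypothetical stand-in for what the lexer would do; it simply drops the prefix and every underscore before parsing.

```scala
object BinaryLiteralSketch {
  // Hypothetical helper: underscores carry no meaning and are removed before parsing.
  def parseBinary(text: String): Int = {
    val digits = text.stripPrefix("0b").replace("_", "")
    // parseUnsignedInt accepts all 32 bits, including a set top bit.
    java.lang.Integer.parseUnsignedInt(digits, 2)
  }

  val grouped: Int = parseBinary("0b_1000_0001_0100_0110_0000_0000_0010_0001")
  val doubleGrouped: Int = parseBinary("0b_1000_0001_0100_0110__0000_0000_0010_0001")
}
```

Both spellings denote the same bit pattern, which is 0x81460021 when written in hex.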

I would like to keep this proposal to the bare minimum of improving readability of literals, without attempting to allow defining new literals for arbitrary types. I think it could be valuable to allow the definition of arbitrary literals as well, but that requires significantly more work and is a major language change, while being slightly more flexible with existing literals requires less design discussion and bikeshedding.

That sounds fine, so long as 0x_ desugars to Integer.parseInt(_, 16).

That sounds right to me (at compile time, I assume).
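
One detail worth keeping in mind with such a desugaring, shown as a small sketch (the object and value names are illustrative only): Integer.parseInt is signed, so it rejects eight-digit hex values with the top bit set, while Scala's existing hex literals accept them as negative Ints. Integer.parseUnsignedInt matches the literal behaviour more closely.

```scala
object HexDesugarSketch {
  import scala.util.Try

  // Small values behave the same either way.
  val small: Int = Integer.parseInt("7f", 16)

  // parseInt is signed: ffffffff is out of range and throws NumberFormatException.
  val signedAttempt: Try[Int] = Try(Integer.parseInt("ffffffff", 16))

  // parseUnsignedInt returns the same bit pattern as the literal 0xffffffff.
  val unsigned: Int = Integer.parseUnsignedInt("ffffffff", 16)
}
```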

I agree. We still can use string interpolation for the other stuff, which is good enough I think.

I disagree :))
We often write something like:

val a = 10.nn  /* returns BigDecimal with null support */

If we need a real number we must write:

val a = "10.1".nn 
or 
val a = nn"10.1"

But in pattern matching we must always use string interpolation.

We have a lot of debate about it, like:
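
For context, this is roughly what the interpolation route looks like in a pattern. It is a sketch under assumptions: the NnInterpolator class is hypothetical, it uses plain BigDecimal rather than the poster's null-aware type, and it only handles literals without holes.

```scala
object NnPatternSketch {
  implicit class NnInterpolator(sc: StringContext) {
    // An object rather than a method, so that nn"..." works both as an
    // expression (via apply) and as a pattern (via unapply).
    object nn {
      def apply(): BigDecimal = BigDecimal(sc.parts.mkString)
      def unapply(value: BigDecimal): Boolean = value == apply()
    }
  }

  def describe(amount: BigDecimal): String = amount match {
    case nn"10.1" => "ten point one"
    case nn"0"    => "zero"
    case _        => "something else"
  }
}
```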

I think it is great when the language can make such debates redundant.

When 80% of the numeric constants in your business logic are BigDecimal, such debates are really annoying.