Inline conversion in a pattern match's `if` guard

I encountered this problem when using fastparse to parse the JSONPath RFC.

  private def `int`[_: P]: P[Int] = P("0" | ("-".? ~ `DIGIT1` ~ `DIGIT`.rep)).!
    .map(_.toLong)
    .collect {
      case value if MIN_INTEGER < value && value < MAX_INTEGER =>
        Math.min(Int.MaxValue, Math.max(Int.MinValue, value)).toInt
    }

Currently, I have to do the toLong conversion first, in a separate map step. Is it possible to avoid that, e.g.:

  private def `int`[_: P]: P[Int] = P("0" | ("-".? ~ `DIGIT1` ~ `DIGIT`.rep)).!
    .collect {
      case x <- value.toLong if MIN_INTEGER < x && x < MAX_INTEGER =>
        Math.min(Int.MaxValue, Math.max(Int.MinValue, x)).toInt
    }

Otherwise, there would be two toLong conversions: one in the guard and one in the case body.
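To make the duplication concrete, here is a minimal sketch on a plain List[String] instead of a fastparse parser. The object name, the method names and the bounds are assumptions for illustration; the post does not show how MIN_INTEGER and MAX_INTEGER are defined.

```scala
object DoubleConversionDemo {
  // Illustrative bounds only; not the values used in the original parser.
  val MinInteger: Long = Int.MinValue.toLong
  val MaxInteger: Long = Int.MaxValue.toLong

  // Without a preceding map, the guard converts once and the body converts again,
  // because a value bound inside the guard is not visible after the =>.
  def twice(raw: List[String]): List[Int] =
    raw.collect {
      case s if { val v = s.toLong; MinInteger < v && v < MaxInteger } => s.toLong.toInt
    }

  // Converting up front binds the Long once, at the cost of an extra map step.
  def once(raw: List[String]): List[Int] =
    raw.map(_.toLong).collect {
      case v if MinInteger < v && v < MaxInteger => v.toInt
    }
}
```

Both methods return the same result; the difference is only in how often toLong runs per element.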

List(42).collect { case x if { val y = x.toLong; y > Int.MaxValue } => false }

I would use an extractor to use a bound variable:

List(42).map { case X(y) => y }

where the extractor performs arbitrary tests and conversions.
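The post above does not define X, so the following is only a sketch of what such an extractor could look like. The non-negativity test is an assumption chosen for illustration, and this Option-based version allocates a Some for every match.

```scala
// Hypothetical extractor: the name X comes from the post, the body is an assumption.
object X {
  // Convert once and test once; the pattern then binds the converted Long.
  def unapply(i: Int): Option[Long] = {
    val y = i.toLong
    if (y >= 0) Some(y) else None
  }
}

object ExtractorDemo {
  // The bound y is available both in the guard and on the right-hand side.
  val kept: List[Long] = List(42, -1, 7).collect { case X(y) if y < 10 => y }
}
```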

Thanks. It seems the block in the if guard works, but y can't be used on the right-hand side of the =>. And an extractor introduces an allocation.

val p: PartialFunction[Long, Boolean] = {
  case (y: Long) if y > Int.MaxValue => false
}

List(42).collect(p.compose(_.toLong)): List[Boolean]

Another way.

A name-based extractor doesn't help either, because it takes two steps (isEmpty, then get).

This seems more appropriate for https://users.scala-lang.org than for the contributors forum?

Some people might need answers during their AoC window.

I noticed this works:

object X:
  inline def unapply(inline i: Int) = X(i.toLong)
class X(x: Long) extends AnyVal:
  def get = x
  def isEmpty = x < 0

class C:
  def f(i: Int) =
    i match
    case X(j) => j
    case _ => -1L

yields

  public long f(int);
    descriptor: (I)J
    flags: (0x0001) ACC_PUBLIC
    Code:
      stack=3, locals=9, args_size=2
         0: iload_1
         1: istore_2
         2: iload_2
         3: i2l
         4: lstore_3
         5: getstatic     #22                 // Field X$.MODULE$:LX$;
         8: lload_3
         9: invokevirtual #26                 // Method X$.isEmpty$extension:(J)Z
        12: ifne          31
        15: getstatic     #22                 // Field X$.MODULE$:LX$;
        18: lload_3
        19: invokevirtual #30                 // Method X$.get$extension:(J)J
        22: lstore        5
        24: lload         5
        26: lstore        7
        28: lload         7
        30: lreturn
        31: ldc2_w        #31                 // long -1l
        34: lreturn

and similarly in Scala 2 with -opt:inline:'<sources>'

object X {
  def unapply(i: Int): X = new X(i.toLong)
}
class X(val x: Long) extends AnyVal {
  def get = x
  def isEmpty = x < 0
}

class C {
  def f(i: Int) =
    i match {
      case X(j) => j
      case _ => -1L
    }
  def g(ns: List[Int]) =
    ns.collect {
      case X(j) if j < 10 => j
    }
}

yields similar

  public long f(int);
    descriptor: (I)J
    flags: (0x0001) ACC_PUBLIC
    Code:
      stack=3, locals=4, args_size=2
         0: getstatic     #17                 // Field X$.MODULE$:LX$;
         3: pop
         4: iload_1
         5: i2l
         6: lstore_2
         7: getstatic     #17                 // Field X$.MODULE$:LX$;
        10: lload_2
        11: invokevirtual #21                 // Method X$.isEmpty$extension:(J)Z
        14: ifne          23
        17: getstatic     #17                 // Field X$.MODULE$:LX$;
        20: pop
        21: lload_2
        22: lreturn
        23: ldc2_w        #22                 // long -1l
        26: lreturn

but the partial function for collect is less savory. Well, I guess applyOrElse takes an int but returns a boxed Long.

  public final <A1 extends java.lang.Object, B1 extends java.lang.Object> B1 applyOrElse(A1, scala.Function1<A1, B1>);
    descriptor: (ILscala/Function1;)Ljava/lang/Object;
    flags: (0x0011) ACC_PUBLIC, ACC_FINAL
    Code:
      stack=4, locals=5, args_size=3
         0: getstatic     #27                 // Field X$.MODULE$:LX$;
         3: pop
         4: iload_1
         5: i2l
         6: lstore_3
         7: getstatic     #27                 // Field X$.MODULE$:LX$;
        10: lload_3
        11: invokevirtual #31                 // Method X$.isEmpty$extension:(J)Z
        14: ifne          34
        17: getstatic     #27                 // Field X$.MODULE$:LX$;
        20: pop
        21: lload_3
        22: ldc2_w        #32                 // long 10l
        25: lcmp
        26: ifge          34
        29: lload_3
        30: invokestatic  #39                 // Method java/lang/Long.valueOf:(J)Ljava/lang/Long;
        33: areturn
        34: aload_2
        35: iload_1
        36: invokestatic  #44                 // Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
        39: invokeinterface #50,  2           // InterfaceMethod scala/Function1.apply:(Ljava/lang/Object;)Ljava/lang/Object;
        44: areturn

Thanks, that's why I want it to be supported directly.

I think this is a language problem. Even if I can use a name-based extractor, e.g. the way Akka's OptionVal works, it is still boilerplate.

I see. It’s not obvious to me what a solution would look like, though.

I will say that “an extractor introduces allocation” is probably not going to change minds here. I believe we consider it normal in Scala that a lot of common idioms allocate a short-lived object; doing so is incredibly cheap on the JVM. “Incredibly cheap” is not zero of course, but at that level of micro-optimization, I think a tradeoff between expressiveness and nanoseconds is often inevitable.

Personally, the .map(_.toLong) seems fine to me in isolation; or if you're doing it over and over again, an extractor seems like an appropriate solution.

The fastparse macros will fuse all of these into a single code block, as lihaoyi once said on his blog.
But what if this is not a fastparse macro? Then the inlining will not happen.

I would like to see this because Scala is targeting Wasm and LLVM nowadays, and a common optimizer would help all of these backends. Yes, this is a micro-optimization, but it would help high-performance code where every bit matters :)

import java.util.concurrent.atomic.AtomicReference

object RefExtractor extends App {

  class Ref[T](val ref: AtomicReference[T]) extends AnyVal {
    def get(): T = {
      println("call ref.get")
      ref.get()
    }
    def isEmpty: Boolean = {
      println("call ref.get")
      ref.get() == null
    }
    def set(value: T): Unit = ref.set(value)
  }

  object Ref {
    def unapply[T](ref: AtomicReference[T]): Ref[T] = new Ref(ref)
  }

  val ref = new AtomicReference("hello")

  ref match {
    case Ref(value) => println(value)
  }

}

But what if the x is not directly available?

then you will see:

call ref.get
call ref.get
hello

It would be nice to save one of the ref.get calls. @odersky, is that possible? I know this is possible at the bytecode level, but I'm not sure how to express it in the language.
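One way this might be expressed with today's name-based extractors is to wrap the value that was read rather than the reference, so that isEmpty and get share a single ref.get(). This is only a sketch: Snapshot, ReadOnce and ReadOnceDemo are hypothetical names, and whether the wrapper is fully elided depends on the compiler and its options, so it is worth checking the output with javap.

```scala
import java.util.concurrent.atomic.AtomicReference

// Hypothetical value class over the value that was read, not over the reference.
final class Snapshot[T](val value: T) extends AnyVal {
  def isEmpty: Boolean = value == null
  def get: T = value
}

object ReadOnce {
  // The only ref.get() happens here; isEmpty and get reuse its result.
  def unapply[T](ref: AtomicReference[T]): Snapshot[T] = new Snapshot(ref.get())
}

object ReadOnceDemo {
  def describe(ref: AtomicReference[String]): String =
    ref match {
      case ReadOnce(value) => value
      case _               => "not yet"
    }
}
```

Besides saving a call, reading once means the emptiness test and the bound value refer to the same read even if another thread updates the reference in between.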

object Ref {
  def unapply[T](ref: AtomicReference[T]): Option[T] =
    Option(ref.get())
}

Thanks, but then there is an additional allocation of Some().

I want something like:

ref.get()  match {
 case null => println("not yet")
 case value if  ... => println("done")
}

which has no additional allocation of Some(value).

But JMH reports an allocation rate of < 0.001 MB/s, so the allocation seems to have been optimized away.

C# has out parameters, which would help in some cases.