Algebras of opaque types and pattern matching

bblfish · August 31, 2021, 2:07pm

I have a use case where pattern matching on opaque types when done via the unapply method of the companion would make a lot of sense. This is a use case where we do not want to pattern match on the underlying opaque object directly though. We only want to give users of the library a way to have a view on the opaque object that can be used in pattern matching.

The general use case comes to light when writing libraries that abstract between many different implementations of a particular standard or Mathematical abstraction.

In our use case we have a large and evolving suite of standards from the W3C that have been implemented in Java in at least 4 ways, in JavaScript the same, in WASM in C and Rust (not to mention Ruby, and many other languages) over the past 20 years. When I worked at Sun Microsystems in 2005 I asked the Java communities if they wanted to agree on a common interface for their RDF libs so that their frameworks could be interoperable, but they felt that the time was not ripe. Later they felt there was too much invested in the frameworks. The usual Java Solution is to use the Adapter Pattern, (two java libs from Apache do that) but that means every object ends up needing to be wrapped, producing double the number of objects, which is a lot when you move from every relation consisting of 4 objects, to a situation where every relation consists of 8 objects.

In 2011 I discovered that Scala allows us to sidestep that whole problem and that one could write a library that would just abstract these differences, but without adding those indirection layers that end up being expensive in terms of memory and cpu.
The answer in Scala3 is to use Opaque Types, which amazingly ends up incurring no cost at all!

But there is a problem as we wish to design a developer friendly library, in that opaque types don’t allow pattern matching.

So if one looks at the test case:

bKt.asMatchable match
	case Triple(sub,URI(rel),obj) => assertEquals(sub,bbl)

we see two problems:

I need to use asMatchable for the Triple.unapply to work
The code won’t compile because it can’f find that URI is a matchable, i.e. it cannot find the URI.unapply

In both those cases I don’t want the unapply or pattern matching to give me the underlying structure of each of the libraries, the one hidden by the opaque keyword. Rather we want to be able to have a coherent view that is the same over all implementations. This would be given by the unapply function in the companion objects.

So here we have an algebra of opaque data structures that fit together in a coherent way, and we want to abstract over the implementations so that people can write generic scala RDF code that will compile to use any of the 10 libraries mentioned and more. E.g. starting from a generic class MyProgram[Rdf<:RDF](...) we should be able to produce one version for HP’s Jena object ProgHp extends MyProgram[Jena] and one for IBM’s object ProgIBM extends MyProgram[Rdf4J], etc…

I have written projects using banana-rdf that work just like that, and it is a real pleasure not to have to worry about implementation issues. That was written in Scala 2, used type projections, which were leaky (one could easily come to end up being confused with methods from the underlying types), and also did not work well with pattern matching. Now we have to rewrite them for Scala 3 and having pattern matching work correctly would make for a very good library.

PS. I hope this is the right forum to post this idea to.

kai · August 31, 2021, 9:58pm

@bblfish
You can use TypeTest (TypeTest | Scala 3 Language Reference | Scala Documentation) to defeat Matchable like so:

import scala.reflect.TypeTest

trait RDF {
  type Triple
  implicit def typeTest: TypeTest[Any, Triple]
  object Triple {
    def unapply(t: Triple): Option[Triple] = typeTest.unapply(t)
  }
}
object RDF {
  type Triple[R <: RDF] = R match {
    case GetTriple[t] => t
  }
  type GetTriple[T] = RDF { type Triple = T }
}

object Jena extends RDF { 
  override opaque type Triple = String
  given typeTest: TypeTest[Any, Triple] with {
    def unapply(s: Any): Option[s.type & String] = s match
      case s: (String & s.type) => Some(s)
      case _ => None
  }
}

def typeMatch[Rdf <: RDF](a: Any)(using tt: TypeTest[Any, RDF.Triple[Rdf]]): Option[RDF.Triple[Rdf]] = {
  type T = RDF.Triple[Rdf] // workaround for https://github.com/lampepfl/dotty/issues/13433
  a match { 
    case res: T => Some(res)
    case _ => None
  }
}

def unapplyMatch(a: Any) = {
  a match {
    case Jena.Triple(t) => println(s"Is Triple: $a")
    case _ => println(s"Is NOT Triple: $a")
  }
}

@main def main = {
  println(typeMatch[Jena.type]("abc"))
  println(typeMatch[Jena.type](1))
  unapplyMatch("abc")
  unapplyMatch(1)
}

The reason you need .asMatchable is because Scala 3 is now more strict towards checking the input type of unapply. Applying the unapply is a two step process - first .asInstanceOf check to ensure the input type is in fact Triple, then application and checking of resulting Option.

In Scala 2 the first step would be skipped for abstract types - that have no known runtime class or instance of ClassTag. Inside that function it’s impossible to know the JVM class, because you could pass any RDF to it.

In Scala 3 the step is only skipped explicitly by using .asMatchable, otherwise if there’s no known JVM class, it will try to find an implicit instance of TypeTest (or ClassTag) and use it instead of .asInstanceOf

bblfish · September 6, 2021, 10:32am

Thanks @kai for the idea of using pattern matching on types and the TypeTest idea.
TypeTests do allow me to now pattern match on types as shown here

bKt.asMatchable match
	case Triple(sub: URI,rel,obj: URI) => ...

For simple types such as URIs it could be well argued that this is all that is needed.
For more complex types such as RDF literals, which are elements of the disjunctive type String + String⨉Lang + String⨉URI we may want to pattern match inside the pattern. But if I try to change the type test here to this

t.asMatchable match
   case Triple(Bbl,Name,Literal(lit)) => ...

I get the error message

case Triple(Bbl,Name,Literal(lit)) =>
      			     ^^^^^^^^^^^^

pattern selector should be an instance of Matchable, but it has unmatchable type rdf.Node instead

Ideally I would want to go one step further with something functionally equivalent to this, but more elegant:

t.asMatchable match
   case Triple(Bbl,Name,Literal(LangLit("Henry",en))) => ...

but that does not compile either.

kai · September 6, 2021, 10:57am

I disabled -source:future in the build and then it works as expected.

bblfish · September 6, 2021, 11:58am

Ah you are right: it compiles after this commit. And it then also compiles without the .asMatchable statement.

Does that mean I am less future proof now?

It also made a minor commit to see if I could use RDF.Graph[Rdf] instead of givn.Graph to specify a type. Both seem to work.

It feels like I am close to being able to rewrite the banana-rdf for Scala 3 now.

kai · September 6, 2021, 1:41pm

I wouldn’t worry about diminished future-proofness without -source:future - once something is real/y slated for removal or change, you’ll get deprecation warnings in the default mode first. Yet there are more experimental changes included in source:future that aren’t as polished as the default mode.

smarter · September 6, 2021, 3:27pm

If you find some usability issues with -source future it’s a good idea to report them (this flag is probably not very well tested currently), in this case it seems like we shouldn’t emit an error about Matchable if a TypeTest instance is present?

mbloms · September 6, 2021, 3:54pm

I have to admit I’m not completely sure I understand what you are trying to accomplish here.

Are you simply trying to use

bKt match
	case Triple(sub,URI(rel),obj) => assertEquals(sub,bbl)

as sugar for

val (sub,x,obj) = Triple.unapply(bKt)
val rel = URI.unapply(x)
assertEquals(sub,bbl)

Or are you actually trying to do a real type test on runtime?

Matchable probably shouldn’t be required when the scrutinee conforms to the argument type of the unapply method.

I’m also wondering if in this case, maybe you actually want to give your type aliases Matchable as upper bound, thereby permitting pattern matching. But I’m not sure I understand the code well enough to know if that would be problematic or not.

bblfish · September 7, 2021, 7:45am

@smarter it is a bit difficult for me to know if there is a bug somewhere or if my understanding is partial, or if the behaviour is as intended. It seems that -source:future does not allow one to have nested pattern matches on an opaque type, even when there are the relevant .unapply methods available that would make the pattern match workable. Ie. we are not trying to look inside the opaque type illegitimately here. In point 3 below, the LiteralI is just a view on an opaque type.
What do you think? Is this worth a bug report?

You have it right @mbloms, with your example. But in order for the use case to be realistic your example needs the following adaptation.

the relation (a.k.a Predicate) is always a URI in RDF, so that type test on the relation would not be interesting. The subject and object on the other hand can be blank nodes (existential quantifiers) or Literals (Strings, Ints, floats, etc.) as well as URIs. So if we just shift your type test to the object we would get something that makes more sense for this example

bKt.asMatchable match
	case Triple(sub,rel,URI(obj)) => ...

That had trouble compiling with -source:future.

(Note: Folk may be interested to know that the best model of RDF is not in the category of Set and functions but in the bicategory of relations. See Knowledge Representation in Bicategories of Relations.)

But there is not much interesting information in a URI, so that example would be better written as pointed out above as the simple type test

bKt.asMatchable match
	case Triple(sub, rel, obj: URI) => ...

and that compiled with -source:future and the TypeTest givens. So that is fine.

To get a more realistic example of the problem, I turned to look at literals, which have some interesting structure that programs may want to know about. To make that closer to the notation of the popular Turtle syntax, I made a commit that allows me to write the test as

bblName.asMatchable match
   case Triple(s,r,Literal("Henry" `@` lang)) => ???

So this compiles without the -source:future flag and even without .asMatchable, but it does not compile with -source:future on. So what seems to happen is that -source:future does not allow patterns within patterns on opaque types in case clauses that would be the equivalent of

val (sub,rel,obj) = Triple.unapply(bblName)
val Some("Henry" `@` lang) = Literal.unapply(obj)
assertEquals(lang,"en")

Indeed in this case the LiteralI type inside the literal is actually an enum. So Literal.unapply(node) returns an Option[LiteralI] where LiteralI is an enum thought of as an Interface for presenting literals to any object instantiating the RDF trait, ie for any library implementing the RDF standards of the W3C.

mbloms · September 8, 2021, 4:32pm

Ok, so there are some nuances here.

I minimized all REPL snippets.
TL;DR: TypeTest isn’t the issue, extractor patterns are.

Currently the compiler (using -source future-migration or higher) always requires the selector type to conform to `Matchable.

In the simplest case, where pattern matching is just sugar for calling the unapply on an extractor object, this is clearly excessive. It’s actually quite unfortunate, since it disallows the popular newtype-pattern.

Minimized REPL example

scala> object lit {
     |   opaque type Literal = String
     |   
     |   object Literal {
     |     def apply(str: String): Literal = str
     |     def unapply(lit: Literal): Some[String] = Some(lit)
     |   }
     | }
// defined object lit

scala> import lit.*

scala> Literal("x") match {case Literal(str) => str}
-- Warning:
1 |Literal("x") match {case Literal(str) => str}
  |                         ^^^^^^^^^^^^
  |                         pattern selector should be an instance of Matchable,,
  |                         but it has unmatchable type lit.Literal instead
val res0: String = x

scala> val Literal(str) = Literal("y")
-- Warning:
1 |val Literal(str) = Literal("y")
  |    ^^^^^^^^^^^^
  |    pattern selector should be an instance of Matchable,,
  |    but it has unmatchable type lit.Literal instead
val str: String = y

These warnings are completely unneccecary, extractor patterns should be allowed in this case.
@smarter, do you have any idea how hard that would be to special case?

bblfish:

bKt.asMatchable match
	case Triple(sub,rel,URI(obj)) => ...
That had trouble compiling with -source:future.

(Note: Folk may be interested to know that the best model of RDF is not in the category of Set and functions but in the bicategory of relations. See Knowledge Representation in Bicategories of Relations.)

But there is not much interesting information in a URI, so that example would be better written as pointed out above as the simple type test
bKt.asMatchable match
	case Triple(sub, rel, obj: URI) => ...
and that compiled with -source:future and the TypeTest givens.

This case is not as easy. obj has type Node, which does not conform to URI.unapply. Just try rewriting it as URI.unapply(obj), it won’t compile. Because of this, the compiler must do a type test first to make sure obj is indeed of type URI. But there is actually no way to do that. That’s why you need a TypeTest, the fact that it’s an opaque type is irrelevant:

scala> "x" match {case lit: Literal => lit}                                                                                         
-- Warning:
1 |"x" match {case lit: Literal => lit}
  |                ^
  |                the type test for lit.Literal cannot be checked at runtime
val res1: String & lit.Literal = x

If I add a `TypeTest`, this works:

scala> object lit {
     |   import scala.compiletime.asMatchable
     |   import scala.reflect.TypeTest
     |  
     |   opaque type Literal = String
     |   
     |   object Literal {
     |     def apply(str: String): Literal = str
     |     def unapply(lit: Literal): Some[String] = Some(lit)
     |   }
     |   given TypeTest[Any,Literal] with {
     |     def unapply(x: Any): Option[x.type & Literal] = x.asMatchable match {
     |       case str: (x.type & String) => Some(str)
     |       case _ => None
     |     }
     |   }
     | }
// defined object lit

scala> import lit.*

scala> "x" match {case lit: Literal => lit}
val res2: lit.Literal = x

scala> "x" match {case Literal(str) => str}                                                                             
val res3: String = x

Not even a warning. Both unapply and type pattern works.

Regarding the TypeTest, IMHO the TypeTest[Any,URI] is bad practice. Use TypeTest[Node,URI] instead, otherwise you allow coercion from java.net.URI to URI by pattern matching, as seen above.

If you don’t need the security offered by Matchable, the best way to get around it is to simply make Matchable the upper bound of all opaque types. This will save you from writing asMatchable everywhere.

It will however not save you from the need to use `TypeTest`.

scala> object lit {
     |   opaque type Literal <: Matchable= String
     |   
     |   object Literal {
     |     def apply(str: String): Literal = str
     |     def unapply(lit: Literal): Some[String] = Some(lit)
     |   }
     | }
// defined object lit

scala> import lit.*

scala> "x" match {case Literal(str) => str}
-- Warning:
1 |"x" match {case Literal(str) => str}
  |                ^
  |                the type test for lit.Literal cannot be checked at runtime
val res0: String = x

scala> "x" match {case lit: Literal => lit}                                                                                         
-- Warning:
1 |"x" match {case lit: Literal => lit}
  |                ^
  |                the type test for lit.Literal cannot be checked at runtime
val res1: String & lit.Literal = x

scala> Literal("x") match {case lit: Literal => lit}
val res2: lit.Literal = x

scala> Literal("x") match {case Literal(str) => str}                                                                       
val res3: String = x

scala> Literal("x") match {case str: String => str}                                                                                
val res4: lit.Literal & String = x

mbloms · September 8, 2021, 4:44pm

Wouldn’t it defeat the purpose of Matchable if you were allowed to match on anything with a Typeable instance? If it worked like this, you could convert a IArray to an Array.

smarter · September 8, 2021, 4:59pm

I don’t really have much opinion or expertise on TypeTest/Typeable and Matchable so I’ll let others answer :).

bblfish · September 10, 2021, 3:32pm

Thanks @mbloms for the detailed response.
I made two commits based on your comments:

I narrowed the TypeTests down away from TypeTest[Any,URI] to TypeTest[Node,URI] in this commit. It compiled and tests were successful. That is running without -source:future on Scala3.1.0-RC1
I added back -source:future and added type Node <: Matchable in the right places and I removed all .asMatchable uses in this commit and it compiled. I added a test to check that calling a method from the underlying opaque object would fail at compile time.

So now I think that everything is actually ok with Scala3. It just requires everything to be just right like clockwork. I guess in Switzerland that is just how everything works

bblfish · September 15, 2021, 6:09pm

Kai mentioned a very interesting idea in chat and above which I have now implemented. It uses
the following type of construct to get the effect we previously got with type projections:

I have written a pretty complete version of it in line 16 where the trait RDF is defined, and later on line 190 where the match types are defined.. This allows me to write generically a PointedGraph independently of the final implementation

class PG[Rdf <: RDF](val pointer: Node[Rdf], val graph: Graph[Rdf])

object PG:
	def apply[Rdf <: RDF](pointer: Node[Rdf], graph: Graph[Rdf]): PG[Rdf] =
		new PG[Rdf](pointer, graph)
	def apply[Rdf <: RDF](node: Node[Rdf])(using ops: Ops[Rdf]): PG[Rdf] =
		new PG[Rdf](node, ops.Graph.empty)

The Ops are defined as a trait, which are then implemented by say Jena and rd4j objects independently.

There are a couple of inconveniences to this approach that I have not yet resolved:

because the RDF object defining Graph[Rdf], Triple[Rdf],… use match types and these are evaluated only in the objects somehow (not in traits), I need to implement the Ops[R] givens nearly identically in Jena and Rdf4j.
Another inconvenience is that I can’t specify inheritance in matche types, so that I have to use implicit conversions instead

	// todo: this transformation should really be automatically handled by compiler. Report back.
	implicit def lit2Node(lit: Literal[Rdf]): Node[Rdf] = lit.asInstanceOf[Node[Rdf]]
	implicit def uri2Node(uri: URI[Rdf]): Node[Rdf] = uri.asInstanceOf[Node[Rdf]]
	implicit def uri2rUri(uri: URI[Rdf]): rURI[Rdf] = uri.asInstanceOf[rURI[Rdf]]
	implicit def rUri2rNode(uri: rURI[Rdf]): rNode[Rdf] = uri.asInstanceOf[rNode[Rdf]]

For this project that is ok, but I guess it might be problematic in other situations.

LPTK · September 16, 2021, 2:01am

Not sure it would help with your problem (maybe I misunderstood it), but you can provide bounds on match types, as in:

trait S
trait T[A <: S]

class Test:
  type P[X] <: S = X match
    case T[a] => a
  
  (??? : P[Int]): S // ok

bblfish · September 16, 2021, 12:30pm

Thanks @LPTK for making me think about this some more. I went to write up a simplified version of the library which is in this Scastie, but it does not seem to show anything when I click on that link, so I put the full code here:

/* two imaginary RDF frameworks very differently implemented
 a J-framework and an O-framework. We show how to write generic code 
 over these and any other RDF framework. See https://github.com/bblfish/banana-play */
package J:
  enum Node:
    case BNode(n: Int)
    case Uri(u: String)
    case Literal(s: String)
  case class Triple(subj: Node, rel: Node.Uri, obj: Node)
  type Graph = Set[Triple]

package O:
  type Node = Long | java.net.URI | String
  type Triple = (Node, java.net.URI, Node)
  type Graph = Set[Triple]

package banana:
  trait Ops[R<:RDF]:
    def empty: RDF.Graph[R]
    def mkTriple(s: RDF.Node[R], r: RDF.URI[R], o: RDF.Node[R]): RDF.Triple[R]
    def mkURI(u: String): RDF.URI[R]
    def mkLiteral(lit: String): RDF.Literal[R]
    import scala.language.implicitConversions
    implicit def lit2Node(lit: RDF.Literal[R]): RDF.Node[R] = lit.asInstanceOf[RDF.Node[R]]
    implicit def uri2Node(uri: RDF.URI[R]): RDF.Node[R] = uri.asInstanceOf[RDF.Node[R]]

  trait RDF:
    rdf =>
    type R = rdf.type
    type Node
    type BNode <: Node
    type Literal <: Node
    type URI <: Node
    type Triple
    type Graph
    given ops: Ops[R]

  object RDF:
    type Graph[R <: RDF] = R match
      case GetGraph[g] => g
    type Triple[R <: RDF] = R match
      case GetTriple[t] => t
    type Node[R <: RDF] = R match
      case GetNode[n] => n
    type URI[R <: RDF]  = R match
      case GetURI[u] => u
    type Literal[R <: RDF] = R match
      case GetLiteral[l] => l

    type GetURI[U]  = RDF { type URI = U }
    type GetNode[N] = RDF { type Node = N }
    type GetLiteral[L] = RDF { type Literal = L }
    type GetTriple[T] = RDF { type Triple = T }
    type GetGraph[T] = RDF { type Graph = T }

  case class PG[Rdf <: RDF](node: RDF.Node[Rdf], graph: RDF.Graph[Rdf])
  object PG:
    def apply[R<:RDF](node: RDF.Node[R])(using ops: Ops[R]): PG[R] = new PG(node, ops.empty)

package banana.JRDF:

  import banana.RDF

  object JRdf extends banana.RDF:
    opaque type Node = J.Node
    opaque type BNode <: Node = J.Node.BNode
    opaque type Literal <: Node = J.Node.Literal
    opaque type URI <: Node = J.Node.Uri
    opaque type Triple = J.Triple
    opaque type Graph = Set[Triple]

    given ops: banana.Ops[JRdf.type] with
      def empty: banana.RDF.Graph[JRdf.type] = Set()
      def mkTriple(s: RDF.Node[R], r: RDF.URI[R], o: RDF.Node[R]): RDF.Triple[R] =
        J.Triple(s,r,o)
      def mkURI(u: String): RDF.URI[R] = J.Node.Uri(u)
      def mkLiteral(lit: String): RDF.Literal[R] = J.Node.Literal(lit)

package banana.ORDF:

  import banana.JRDF.JRdf.R
  import banana.RDF

  object ORdf extends banana.RDF:
    opaque type Node = Long | String | java.net.URI
    opaque type BNode <: Node = Long
    opaque type Literal <: Node = String
    opaque type URI <: Node = java.net.URI
    opaque type Triple = O.Triple
    opaque type Graph = O.Graph

    given ops: banana.Ops[ORdf.type] with
      def empty: banana.RDF.Graph[ORdf.type] = Set()
      def mkTriple(s: RDF.Node[R], r: RDF.URI[R], o: RDF.Node[R]): RDF.Triple[R] =
        (s,r,o)
      def mkURI(u: String): RDF.URI[R] = java.net.URI(u)
      def mkLiteral(lit: String): RDF.Literal[R] = lit


def main(args: Array[String]): Unit =
  println("jrdf="+simpleTest(using banana.JRDF.JRdf.ops))
  println("ordf="+simpleTest(using banana.ORDF.ORdf.ops))

def simpleTest[Rdf<:banana.RDF](using ops: banana.Ops[Rdf]) =
  import ops.*
  val bbl = ops.mkURI("https://bblfish.net/#i")
  val name = ops.mkURI("https://xmlns.com/foaf/0.1/name")
  val henry = ops.mkLiteral("Henry")
  ops.mkTriple(bbl,name,henry)

(It turns out that the scastie problem is only related to Safari from the current beta of Apple OSX Monterey)

Working with this simplified version seems to have helped me avoid the code duplication problem I mentioned earlier… Here I only seem to have to give an Ops implementation once per RDF singleton, and in each case the implementation is unique.

I tried a few things with match type inheritance, but none of those worked.

LPTK · September 17, 2021, 2:53am

Can you explain what you’re trying to achieve exactly? The code you provided seems to compile and work.

bblfish · September 18, 2021, 11:32am

yes, the code that works requires the two implicit to turn a RDF.Literal[R] and a RDF.URI[R] into RDF.Node[R]. If those are commented out, then the code no longer compiles.

In my use case that can be an ok fallback position, but I was wondering if there was a better way that directly expresses that in the match type definitions.

I tried the following, which hopefully gives an idea of what I want to do.

My attempt at writing something like Literal2 does not compile.

bblfish · September 18, 2021, 5:37pm

I found the following test may help explain the problem

def failingTest[R<:banana.RDF](using ops: banana.Ops[R]) =
	import ops.given TypeTest[Matchable, banana.RDF.Literal[R]]
	import ops.lit2Node
	val timNode: banana.RDF.Node[R] = ops.mkLiteral("Tim")
	timNode match
		case t : banana.RDF.Literal[R] => "success matching "+t
		case _ => "failed to match literal "+timNode

I get a warning

 the type test for ...banana.RDF.Literal[R] cannot be checked at runtime

even though I have 2 very reasonable type tests I think and have imported them.
With “-Xfatal-warnings” flag that turns not surprisingly into an error, not just a warning.
That is a pretty strange warning.

I found this problem writing a test suite for my library, and duplicated it in the following
simplified version:

This might have brought us full circle to the problem discussed earlier on, but the use case is much further along. Should I report this warning?

bblfish · October 19, 2022, 6:12am

I did get the code to compile without inheritance using implicit conversion. But that is not very satisfactory. As I was starting to add DSLs this week, the inheritance problem came back to bite me.

So I wrote up the problem carefully and hopefully clearly in the Scala Users forum in this post on Match Type inheritance. I have simplified the code to make it easy to get an overview. Hopefully someone can help me solve this, as this is one of the most useful constructs I have come across in Scala. I think unique even.