Question about reasoning about type equality

A pattern I have run into in a few of the libraries I work with is something like having a list of type aligned pairs:

class AlignedPair[K[_], V[_]] {
  type A
  def key: K[A]
  def value: V[A]
}
So, say I have a List[AlignedPair[Key, Value]]. Now I come along with a Key[B] and I want an Option[Value[B]].

I can’t see a way to implement this in Scala without casts. And with the casts added, I wonder whether it is actually sound.

So, my question is:

if F[_] is invariant and we know that (fa: F[a]) eq (fb: F[b]) can we infer that a =:= b?

This is clearly not true for a covariant F[+_], since you can make a counterexample: a single value of type F[Nothing] would imply all types are equal. I guess the same is true of contravariant F[-_] if you can create an F[Any].
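The covariant counterexample is easy to make concrete with List (a quick sketch):

```scala
// List is covariant, so the single value Nil inhabits List[A] for every A.
val n: List[Nothing] = Nil
val a: List[Int] = n
val b: List[String] = n

// a and b are reference-equal, yet Int and String are unrelated types,
// so (fa eq fb) cannot justify A =:= B for a covariant F.
val same: Boolean = a eq b
```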

It “feels” true for a strictly invariant type, but I can’t see a way to encode this in Scala, since if I write def foo[F[_], A](fa: F[A]) = ... it will accept covariant types.
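For instance, nothing stops a caller from instantiating the F[_] parameter with a covariant type constructor (a minimal sketch):

```scala
// F[_] is written invariantly here, but that only means foo doesn't *use*
// variance; the caller can still pick F = List, which is declared List[+A].
def foo[F[_], A](fa: F[A]): F[A] = fa

val accepted: List[Int] = foo[List, Int](List(1, 2, 3)) // compiles fine
```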

It would be cool if dotty knew this, e.g, if we have:

val xa: SomeInvType[A] = ...
val xb: SomeInvType[B] = ...

xa match {
  case yb if yb eq xb =>
    // the type inference understands that A =:= B in this branch
}

It occurs to me, maybe what I want is something like:

def maybeEq[A, B](ta: TypeTag[A], tb: TypeTag[B]): Option[A =:= B] =
  if (ta == tb) Some(implicitly[A =:= A].asInstanceOf[A =:= B])
  else None

Which seems like it should be sound. Is this correct? If so, are there any blockers to a method like this being added to the standard library?
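As a hedged, runnable analogue of the same trick, here is a version keyed on ClassTag instead of TypeTag (note that ClassTag equality is much coarser than real type equality because of erasure, so this sketch is only sound up to erased classes):

```scala
import scala.reflect.{ClassTag, classTag}

// Same shape as maybeEq above, but comparing runtime classes.
// Erasure caveat: classTag[List[Int]] == classTag[List[String]], so this
// must only be used with types that erasure actually distinguishes.
def maybeEqCt[A, B](ta: ClassTag[A], tb: ClassTag[B]): Option[A =:= B] =
  if (ta == tb) Some(implicitly[A =:= A].asInstanceOf[A =:= B])
  else None

val hit  = maybeEqCt(classTag[Int], classTag[Int])
val miss = maybeEqCt(classTag[Int], classTag[String])
```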

As for where this comes up concretely: consider a directory of files that can be deserialized into objects (e.g. for Spark or similar systems). We want a Directory that knows about all of these mappings, so that if we come along with some Identifier[A] we can get the Dataset[A] out. The user is expected to know the type associated with the identifier.

Any insight into this design space would be appreciated.


Not in general (think of type F[T] = Any). For this property to hold, you’d need F to be injective.

There is no way to ask Scala to make sure F is injective. However, you can use a wrapper class over the key value to achieve something similar:

case class KeyHolder[K[_], A](key: K[A])

case class AlignedPairOf[A, K[_], V[_]](key: KeyHolder[K, A], value: V[A])

type AlignedPair[K[_], V[_]] = AlignedPairOf[?, K, V]

def get[K[_], V[_], B](key: KeyHolder[K, B]): List[AlignedPair[K, V]] => Option[V[B]] = {
  case AlignedPairOf(`key`, v) :: _ => Some(v)
  case _ :: xs => get(key)(xs)
  case Nil => None
}

@main def test = {
  val p = AlignedPairOf(KeyHolder(Some(42)), List(1, 2, 3))
  val q = AlignedPairOf(KeyHolder(Some("hi")), List("hey"))
  val ls = p :: Nil
  println(get(p.key)(ls): Option[List[Int]]) // Some(List(1, 2, 3))
  println(get(q.key)(ls): Option[List[String]]) // None
}


This works thanks to Dotty’s new GADT reasoning capabilities (cc @AleksanderBG).

However, I’m wondering if you didn’t have something else in mind — i.e., something where the AlignedPair stores a K (not some K[A]) along with a V[K]. This would also be possible.


To the more general question of how to safely get an Option[V[B]] out, I think you can accomplish this by moving more stuff into type members.

In particular, you can write

class AlignedPair[K <: { type Out }, V[_]] {
  def key: K
  def value: V[key.Out]
  def getValue(k: K): Option[V[k.Out]] =
    if (k == key) Some(value.asInstanceOf[V[k.Out]]) else None
}

The soundness requirement here is easy to verify for the user: if two Ks are equal, then their Out members must be equal. (I think you’d have to do something very wrong to violate this.)

I have some utilities for this purpose here under the Dot* names. The most useful one is the function analog which I call DotKleisli. (My naming conventions ain’t perfect…) I’ve thought about releasing something more neatly packaged for general use but never got around to it. But anyway I think DotKleisli is the general form of what you’re looking for: a function whose output type can depend on type information about its input (in possibly arbitrary ways, hence the Kleisli part).

I find it generally easier to work with this style of representation—this approach lets you encode an entire API in a single type without any higher-kinded parameters (unlike with the Interpreter Pattern). This proves very useful for things like serialization—you can launder the type member over the wire since it provides a proper existential. For example, you can write a single function which converts any API implementation into an HTTP server or client. In the past I couldn’t get this kind of thing to work properly because of how the type parameter (i.e., the A in your K[A]) would end up as an unmentionable Skolem with the traditional Interpreter Pattern approach.
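For a rough idea of the shape being described (my own hedged paraphrase, not the actual library code; all names here are illustrative):

```scala
// A function whose result type depends on a type member of its argument.
trait DotKleisli[F[_], A <: { type Out }] {
  def apply(a: A): F[a.Out]
}

// A trivial request type where the input fixes Out = Int.
class CountRequest { type Out = Int }

val handler = new DotKleisli[Option, CountRequest] {
  def apply(a: CountRequest): Option[a.Out] = Some(42)
}

val result: Option[Int] = handler(new CountRequest)
```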

Hope that helps. :slight_smile:

Except this is false, and it’s easy to come up with a reasonable counter example where nothing wrong is done. For instance:

case class MyKey(k: Int)(val otherStuff: Any) {
  type Out = otherStuff.type
}
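Spelled out (definition repeated so the sketch is self-contained), two such keys compare equal while their Out members differ:

```scala
case class MyKey(k: Int)(val otherStuff: Any) {
  type Out = otherStuff.type
}

// Case-class equality only considers the first parameter list,
// so these are == even though their Out singleton types differ.
val k1 = MyKey(1)("foo") // Out is the singleton type of "foo"
val k2 = MyKey(1)(42)    // Out is the singleton type of 42
val equalButDifferentOut = k1 == k2
```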

Why not stick with a technique which avoids unsafe casts altogether, like the one I’ve shown above?

Ah sorry, I should have used clearer language. I didn’t mean it’s sound in general, but that the condition for soundness is easy for the user to verifiably satisfy when defining K. And that example is … exactly what I feel qualifies as “something very wrong” :smile:. I was speaking morally. In my many convoluted type-member-wrangling escapades I have thankfully never needed something like that, and I would be perfectly happy in a world where value equality implied type member equality everywhere. But that seems a little beyond what the Scala compiler is made for.

Your approach seems valid as well! But now that I’m looking carefully, I’m not seeing how it necessarily fixes the problem. Adding onto your scastie:

  type C[A] = Int
  type Id[A] = A
  val u = AlignedPairOf[String, C, Id](KeyHolder[C, String](1), "omg")
  val v = AlignedPairOf[Double, C, Id](KeyHolder[C, Double](1), 9000.1)
  val xs = u :: Nil
  println(get(u.key)(xs): Option[String]) // Some(omg)
  println(get(v.key)(xs): Option[Double]) // Some(omg) --- Silently wrong!
  println(get(v.key)(xs).map(math.round): Option[Double]) // java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Double

Funny, nowhere did we write any casts. Seems to me like there may be a more general soundness issue here :thinking:

Good catch! This is definitely a bug in the Dotty compiler. I have opened an issue for it.

Surprisingly, matching with `key` uses == instead of eq. I remember reading about this, and that changing it was considered but rejected because it would have broken too much existing code.
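A quick illustration of that pitfall:

```scala
case class Box(n: Int)
val a = Box(1)

// A stable-identifier pattern like `a` compiles to an == comparison,
// not a reference check.
def matchesA(x: Box): Boolean = x match {
  case `a` => true
  case _   => false
}

val b = Box(1) // distinct object, same value
val matchedByValue  = matchesA(b)
val distinctObjects = !(a eq b)
```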

So ironically, my code fails for the very same reason as yours (except that in my case it’s because of a bug in Dotty) — a usage of == leads to the unsound assumption that the type members are equal.

The correct way to implement the function is to use a _: key.type pattern, which does use eq behind the scene (ensuring object identity):

def get[K[_], V[_], B](key: KeyHolder[K, B]): List[AlignedPair[K, V]] => Option[V[B]] = {
  case AlignedPairOf(_: key.type, v) :: _ => Some(v)
  case _ :: xs => get(key)(xs)
  case Nil => None
}

When you’re dealing with more advanced type-level mechanisms, it very quickly becomes harder to discern what is reasonable (for some definition of the word) from what is not. How are users to know whether they are doing things you would have considered “morally wrong”, when they look perfectly OK and blessed by the language, requiring no unsafe features?

The unsound implementation of get I showed above is a perfect example: had the compiler correctly rejected my code and had I used an unsafe cast, following the same sort of reasoning as you, the reasonable-looking code you wrote above would have crashed.

IMO using the type system as the source of truth on whether what users are doing on the type level is “wrong” is the main distinguishing feature of a strong static type system, compared to more dynamic ones like Java where casts are normal, arrays are covariant, etc. and you can never be sure you didn’t do something wrong which will later cause a ClassCastException without a clear place to blame.

In my case, it’s very easy for the user to know: for the type K, value equality must imply type member equality on K#Out. This would be part of the API contract for e.g., getValue to work as expected without error. By “morally wrong” I really mean “a clearly bad idea which I suspect should never be needed, but is unfortunately impossible for the compiler to rule out.”

Okay, but now if I use

val p = AlignedPairOf(KeyHolder(Some(42)), List(1, 2, 3))
val q = AlignedPairOf(KeyHolder(Some(42)), List(1, 2, 3))

in your example, I get the same result, where p's lookup succeeds and q's fails—I’ve lost referential transparency. In the (≥99%) case that my type members do respect value equality (==), I now have an extremely treacherous situation where my lookup function’s runtime behavior depends on aliasing. In the (overwhelmingly common in practice) case that the client constructs their own key object to index into the list, their lookups will always silently fail. So what your proposal does is:

  • Exchanges a precondition which is almost always satisfied at use sites for one that rarely is.
  • Exchanges an easy, locally verifiable precondition (that value equality implies type member equality for K) for a program-global precondition (that for all k1: K, k2: K, k1 == k2 implies k1 eq k2 at all execution points in the program) which is a very difficult static analysis problem.
  • Exchanges a very loud error in case that precondition is violated (ClassCastException) for a completely silent one (false None lookups).

Seems to me like a hard lose. You could require the user to manage all instances of K to make sure the aliasing condition is satisfied, but that is a huge burden which I’m sure you want to avoid.

Is there a way around this issue? I suspect not. The only way for the construction we’re discussing to have reasonable, predictable behavior is if value equality on the keys implies type equality on the values. I agree that using types is great and should be done where possible, but it’s not always possible. (You don’t tell people that Monoid is bad because the compiler can’t check associativity.) In this case, it’s especially not possible: trying to incorporate value equality into the type system is a big step toward full-blown dependent type theory, and I would assume it would be very difficult to do reasonably in Scala due to its fundamentally impure execution model. I love dependent type theory, but this seems way out of scope for Scala today. So just make it part of the API contract. It is easy for the user to verify (it can very easily be made a local property of K!), will be true in nearly all cases, and if the user violates it there will be wailing and gnashing of ClassCastExceptions at runtime until they reevaluate their life and fix the problem.

You make a good point that my solution requires referential equality. You could intern the KeyHolder instances, but you’d need an unsafe cast in the interning logic now, which could also fail if used in the wrong way.
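An interning helper might look roughly like this (hypothetical sketch; the cast is needed because the cache cannot track the A parameter, so it relies on the very precondition under discussion):

```scala
import scala.collection.mutable

case class KeyHolder[K[_], A](key: K[A])

object KeyInterner {
  // The map forgets the A in KeyHolder[K, A]; the cast recovers it,
  // assuming == on keys implies equality of the A types.
  private val cache = mutable.Map.empty[Any, Any]
  def intern[K[_], A](k: KeyHolder[K, A]): KeyHolder[K, A] =
    cache.getOrElseUpdate(k, k).asInstanceOf[KeyHolder[K, A]]
}

val first  = KeyInterner.intern(KeyHolder(Option(42)))
val second = KeyInterner.intern(KeyHolder(Option(42)))
val canonical = first eq second // value-equal inputs yield one canonical instance
```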

So I agree with you that there doesn’t seem to be a good solution if you require equals-based key equality. Though it would be possible to put the onus on the user and ask them to provide an implementation for an equality function with a more precise type (but as you said, the dependent typing would get hairy).

But note that the author of this post was asking for an eq-based solution (see his case yb if yb eq xb => suggestion), not an equals-based one.

Well, that’s a bit of an unfair characterization. Value equality implying type member equality is not a “locally verifiable precondition” by any means! For one thing, it depends on how downstream types implement their own equality, and this may not even be known if your key type is polymorphic. And you don’t need the global static analysis condition you laid out above to use my solution soundly; you just have to keep in mind that it relies on reference equality.

You’re not actually guaranteed to have a ClassCastException (let alone have one at the place where the static type violation occurs!) as this depends on the erasure of your domain types — and on ScalaJS it’s actually undefined behavior. I’ll take an error which the type system forces me to handle over an unpredictable VM exception which will likely manifest at the wrong place, if at all.
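A small demonstration of how erasure defers (or hides) the failure:

```scala
// The bad cast itself succeeds: List's element type is erased at runtime.
val xs: List[Any] = List("boom")
val ys: List[Int] = xs.asInstanceOf[List[Int]] // no exception here

// The ClassCastException only surfaces later, when an element is used
// at its (wrong) static type -- far from where the cast happened.
val blewUpAtUseSite: Boolean =
  try { ys.head + 1; false }
  catch { case _: ClassCastException => true }
```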

Sure, but breaking associativity won’t get you an unsound type system.
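For comparison, a lawless Monoid instance typechecks without complaint; only the runtime results betray it (sketch):

```scala
trait Monoid[A] {
  def empty: A
  def combine(x: A, y: A): A
}

// Subtraction is not associative, but nothing in the types rules this out.
val badMonoid: Monoid[Int] = new Monoid[Int] {
  def empty = 0
  def combine(x: Int, y: Int) = x - y
}

val leftAssoc  = badMonoid.combine(badMonoid.combine(8, 4), 2) // (8 - 4) - 2 = 2
val rightAssoc = badMonoid.combine(8, badMonoid.combine(4, 2)) // 8 - (4 - 2) = 6
```

The law is broken, yet the type system never complained—and no unsoundness resulted.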

As I said, the user may not even know where the exception is coming from. For instance, if their stored type was of the form List[Int => Int], receiving a List[String => String] instead could lead to a crash in a completely different part of the application, with unclear cause. If not only your library, but other libraries too start to be liberal in their use of unsoundness, then the user may have an even harder time figuring out where this particular unsoundness comes from.

Fair point, an eq-based solution is the right answer to OP’s technical question. I totally missed the issue in my first post and it would have been better if I had recognized the difference earlier. But to the more general design question, it’s not obvious that reference equality is the right thing to use. I think it’s absolutely not. At the very least, you lose referential transparency and the user can’t usefully create queries on the fly. Consider:
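(The quoted snippet didn’t survive, but the failure mode is reproduced by any key constructed on the fly; this stand-in reuses the KeyHolder shape from earlier in the thread and is purely illustrative:)

```scala
case class KeyHolder[K[_], A](key: K[A])

val stored = KeyHolder(Option(42)) // the key held inside the list
val query  = KeyHolder(Option(42)) // a key freshly constructed at the call site

// Value-equal, but not reference-equal: an eq-based lookup will always miss.
val valueEqual = stored == query
val refEqual   = stored eq query
```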


This query will always return None under your suggested implementation. How will real programs, in practice, write queries that can actually succeed and find their values? I don’t know what that would even look like except in toy examples like the one you’ve given or simple cases where there is a small number of globally defined instances of the key type. These problems are symptoms of the inherent difficulties of reasoning about reference equality. Such an index built necessarily on reference equality is just not a terribly useful or usable abstraction. That’s why == is a thing in the first place, and used, e.g., for normal Maps and Sets. Also note that, in cases where you do want reference equality (perhaps there are only a few global instances or for some other reason it is easy to avoid such issues), then you could just use reference equality directly as value equality on your key type. Assuming that your approach with reference equality is sound, my precondition will always be satisfied in these cases.

I don’t mean that it is always going to be true, or always going to be verifiable (since it’s not always true), or even that it’s always locally verifiable when it is verifiable. I mean in the vast majority of actual use cases, it is locally verifiable, or easy for the user to choose to structure their type so that it is. For example:

sealed trait Key { type Out }
case class IntKey(value: String) extends Key { type Out = Int }
case class StringKey(value: String) extends Key { type Out = String }

Here it is immediately obvious that value equality on Key will always imply type equality on Out. I actually use the pattern we’re discussing fairly regularly in practice, and pretty much all use cases look roughly like this. If your type member is path-dependent on some parameters, or you have type parameters on your Key subtype and the type member depends on those, then it’s pretty simple to work out how the constraint propagates down, producing another fairly straightforward precondition which must be verified for the instantiating type. This is what I mean by “local.”

Sure, to use it soundly, but to use it usefully? As I argued above, reference equality is not useful (and perhaps is actively adversarial) to real use cases for this kind of abstraction. If you want reasonable behavior—which obeys value equality—then you need to start thinking about aliasing, which is a whole-program analysis problem. And as you mention, if you try to mitigate the issue by interning based on value equality, you still need my precondition to be satisfied.

The point of it being “local” is that once you look at the definition of K and verify that this is the case, you’re done and don’t have to worry about it anymore. You know at all call sites it will always be correctly typed and do the reasonable thing.

When providing such an abstraction to the user, you can clearly state the API contract and warn them of undefined behavior if it’s violated. In practice, they should never dare to use the system without meeting the API precondition, and in case of undefined behavior they should know exactly where to look—the definitions of all of the key types they use in such a way—and the solution will be to redesign them so that value equality implies type member equality.

Sure, it’s not good for lots of libraries to be liberal with unsoundness, but here’s the problem for our case: relying on value equality over keys is necessary for the pattern we’re discussing to produce reasonable, usable behavior. But the condition in order for such a thing to be sound, while it can be easily checked in many cases by the user, cannot be checked by the compiler. So our options are:

  1. Give up and do something much less useful instead (i.e., use reference equality)
  2. Rely on value equality but put very loud, clear warnings in the API contract that program behavior is undefined if the precondition isn’t met.

When it happens that for many (and I suspect all) real, practical use cases, the precondition is trivially met and verifiably so, 2 seems like not such a bad trade. I can safely tell you that for all my use cases of type-dependent pairs, maps, etc., the sort that you’re describing (i.e., relying on reference equality) would be totally useless (plus potentially infuriating) to me unless I was guaranteeing that value equality implied aliasing (e.g., using interning, which as you said relies on my precondition for soundness anyway). So I would be stuck with nothing instead of something.

I like this idea actually. You could approach it roughly as follows:

// Instead use, e.g., Is (as in cats):
abstract class Is[A, B] extends Serializable {
  def substitute[F[_]](fa: F[A]): F[B]
}

trait TypeMemberEq[A <: { type Out }] {
  def teq(x: A, y: A): Option[Is[x.Out, y.Out]]
}

class AlignedPair[K <: { type Out }, V[_]] {
  def key: K
  def value: V[key.Out]
  def getValue(k: K)(implicit ev: TypeMemberEq[K]): Option[V[k.Out]] =
    if (k == key) ev.teq(key, k).map(_.substitute(value)) else None
}

It seems that in theory the user might be able to entirely avoid unsafe casts, at least in some cases, as well:

sealed trait Key { type Out }
case class IntKey(value: String) extends Key { type Out = Int }
case class StringKey(value: String) extends Key { type Out = String }
object Key {
  implicit val keyTypeMemberEq: TypeMemberEq[Key] = new TypeMemberEq[Key] {
    def teq(x: Key, y: Key): Option[Is[x.Out, y.Out]] = (x, y) match {
      case (IntKey(_), IntKey(_)) => Some(Is[Int])
      case (StringKey(_), StringKey(_)) => Some(Is[String])
      case _ => None
    }
  }
}

But this doesn’t compile:

      case (IntKey(_), IntKey(_)) => Some(Is[Int])
      // Found:    Is[Int, Int]
      // Required: Is[x.Out, y.Out]
      case (StringKey(_), StringKey(_)) => Some(Is[String])
      // Found:    Is[String, String]
      // Required: Is[x.Out, y.Out]

This seems like a shortcoming of type inference, which I guess is what you were pointing out has been improved for GADTs. But it seems like type member information doesn’t get fully propagated through / constrained by a match. This issue has been a pretty common source of annoyance for me. Would be really nice if this situation were improved, though I don’t know if there are any technical or theoretical blockers.

But anyway, using this approach, I can adapt your example to work with no casts while ensuring value equality:

// use this from e.g. cats
abstract class Is[A, B] extends Serializable {
  def substitute[F[_]](fa: F[A]): F[B]
}
object Is {
  def apply[A]: Is[A, A] = new Is[A, A] {
    def substitute[F[_]](fa: F[A]): F[A] = fa
  }
}

trait InnerTypeEq[F[_]] {
  def getTypeEquality[A, B](fa: F[A], y: F[B]): Option[Is[A, B]]
}

case class AlignedPairOf[A, K[_], V[_]](key: K[A], value: V[A])

type AlignedPair[K[_], V[_]] = AlignedPairOf[?, K, V]

def get[K[_], V[_], B](ls: List[AlignedPair[K, V]], key: K[B])(implicit ev: InnerTypeEq[K]): List[V[B]] =
  ls.flatMap { case AlignedPairOf(k, v) =>
    if (key == k) ev.getTypeEquality(k, key).map(_.substitute(v))
    else None
  }

sealed trait Key[A]
case class IntKey(value: String) extends Key[Int]
case class StringKey(value: String) extends Key[String]
object Key {
  implicit val keyInnerTypeEq: InnerTypeEq[Key] = new InnerTypeEq[Key] {
    def getTypeEquality[A, B](x: Key[A], y: Key[B]): Option[Is[A, B]] = (x, y) match {
      case (IntKey(_), IntKey(_)) => Some(Is[Int])
      case (StringKey(_), StringKey(_)) => Some(Is[String])
      case _ => None
    }
  }
}

@main def test = {
  val p = AlignedPairOf(IntKey("42"), List(1, 2, 3))
  val q = AlignedPairOf(StringKey("42"), List("1", "2", "3"))
  val r = AlignedPairOf(IntKey("42"), List(1, 2, 3))
  val ls = p :: Nil
  println(get(ls, p.key): List[List[Int]]) // List(List(1, 2, 3))
  println(get(ls, q.key): List[List[String]]) // List()
  println(get(ls, r.key): List[List[Int]]) // List(List(1, 2, 3))
}

So I guess yeah, with the right API design we can get value equality to work for the user without any unsafe casts. It’d perhaps be trickier to do this using generic type constructors like Option without having the same unsoundness issue you’re pointing out. But this design does at least allow for the opportunity to use value equality with full safety and leaves unsafe casts to the user.

Open question what the best API contract is for InnerTypeEq—whether it should respect value equality or use a coarser relation, etc.

Anyway, thanks for spurring this discussion! Hadn’t thought of this way of doing it.

(Also sorry for Scala 2 syntax, have not been keeping up with Dotty really)

I don’t find this attitude very constructive. If you had really thought about it for a minute, I’m sure you would have found plenty of real use cases. Beyond the case you describe, where there is a finite number of keys (which need not be small), implemented as global objects in some libraries — which is a completely valid use case; think of something like SBT — here is how you could define your running example, in my approach, without any casts:

import scala.collection.mutable.WeakHashMap

class KeyBase[K] extends KeyHolder[K] {
  private val cache: WeakHashMap[K, K] = WeakHashMap.empty
  def apply(k: K): K = cache.getOrElseUpdate(k, k)
}
object IntKey extends KeyBase[Int]
object StringKey extends KeyBase[String]
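Usage would look roughly like this (a standalone sketch: the KeyHolder supertype is dropped for brevity, and the Name/NameKey names are illustrative):

```scala
import scala.collection.mutable.WeakHashMap

class KeyBase[K] {
  // Weakly cache each key by value, returning the canonical instance.
  private val cache: WeakHashMap[K, K] = WeakHashMap.empty
  def apply(k: K): K = cache.getOrElseUpdate(k, k)
}

case class Name(s: String)
object NameKey extends KeyBase[Name]

// Two value-equal keys now resolve to the same canonical instance, so
// a reference-equality lookup behaves like a value-equality lookup.
val a = NameKey(Name("x"))
val b = NameKey(Name("x"))
val interned = a eq b
```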



Remember that the entire point is to avoid having to make assumptions, and to let the type checker sort things out for you.

It’s always simple at first, until it’s not, because the code evolves, and some other programmer misses or misunderstands the requirement and changes your definition. By contrast, it’s so convenient to have a theorem prover (in the form of a type checker) automatically verify your code against such changes, as long as you’re not subverting it with unsafe casts!
I’m not saying that’s always possible, but as I’ve demonstrated here and as you found out below in this case it is, and IMHO it’s something worth striving for.

Right, because that approach worked so well in other languages, like C++… (In fact, it’s responsible for the majority of security vulnerabilities in the wild.)

That does not make sense. By definition, undefined behavior is undefined, and you can’t consistently detect it (let alone detect it locally). So there is no such thing as “in case of undefined behavior, [do this]” — you only get an exception if you’re lucky. Most likely, your application starts failing silently and corrupting its data.


Nice approach. Now I’m starting to like it :^)

We can actually make that more principled and general by using an Invariant type class.

EDIT: the code I provided accidentally used a new soundness bug in Dotty, which I have reported here.

Here is how to do it cleanly in Dotty. It type checks successfully:

enum =:=[A, B]:
  case Refl[X]() extends (X =:= X)

trait TypedEq[K[_]]:
  def apply[A, B]: (K[A], K[B]) => Option[A =:= B]

enum Key[T]:
  val id: String
  case OfInt(id: String) extends Key[Int]
  case OfString(id: String) extends Key[String]

given TypedEq[Key]:
  def apply[A, B] =
    case (k0, k1) if k0.id != k1.id => None
    case (Key.OfInt(id0), Key.OfInt(id1)) => Some(=:=.Refl())
    case (Key.OfString(id0), Key.OfString(id1)) => Some(=:=.Refl())
    case _ => None

case class AlignedPair[A, K[_], V[_]](key: K[A], value: V[A])

def get[K[_], V[_], B](key: K[B])(using eq: TypedEq[K])
    : List[AlignedPair[?, K, V]] => Option[V[B]] = {
  case (p: AlignedPair[x, K, V]) :: xs =>
    eq(p.key, key) match
      case Some(=:=.Refl()) => Some(p.value)
      case _ => get(key).apply(xs)
  case Nil => None
}

@main def test = {
  val p = AlignedPair(Key.OfInt("n"), List(1, 2, 3))
  val q = AlignedPair(Key.OfString("s"), List("hello"))
  val ls = p :: q :: Nil
  println(get(Key.OfInt("n"))(ls): Option[List[Int]]) // Some(List(1, 2, 3))
  println(get(Key.OfString("s"))(ls): Option[List[String]]) // Some(List(hello))
  println(get(Key.OfString("nope"))(ls): Option[List[String]]) // None
}




I don’t follow the exact details of this (it seems like you’re just passing an Int into get), but I think I see what you’re getting at. This is a case of interning, which can only be done properly if the precondition is satisfied, but I guess the point here is that “there exists a sound implementation of interning” is a programmatic way of verifying the precondition (in at least some cases). It still seems to me like an extra burden on the user and surface area for logic errors, but there’s certainly an argument for this over surface area for soundness errors.

Yeah this is a comical statement. “In case of undefined behavior” doesn’t make any sense. Sorry about that. I think lockdown is getting to me.

Right, I’m not saying this is a good situation. I agree it should be avoided where possible, and leveraging the type system and maintaining soundness is worth striving for as you say—though in our case, the precondition is so easy to verify in so many use cases, I think it’s possible to overstate the practical problem this poses. And your clean example of how to do it soundly is quite nice—these new GADT type inference capabilities seem great. So I really think we’re mostly on the same page.

But the perspective I want to highlight is the following. I’ve spent a lot of time fooling around with type members trying to write safe, generic code for various purposes. And they’ve been really great at providing powerful, safe APIs for generic programs. But sometimes, due for example to limitations in type inference like the one I related above, or other issues related to equality, I’ve had a great deal of difficulty getting obviously correct implementations of these APIs to pass the Scala compiler. In this case, it seems we can safely capture many use-cases with a single, simple typeclass—awesome. But in other cases, getting the right constraints to propagate through the type system may be tedious and require a lot of boilerplate and syntactic noise. Overall, I have found that it is sometimes less error-prone to use a technically-unsound feature internally, carefully verify it once, and have a safe, simple API, than the alternative, which is using something which is technically sound but requires much more ceremony and makes it harder to understand the code’s business logic, leaving greater room for error in practice.

The balance of these concerns greatly depends on the use case IMO. If you’re designing a language, soundness is paramount. If you’re designing a library, it’s crucial, since you need to handle arbitrary users. If you’re developing an application, it’s important, but may need to be balanced with other concerns, depending on the size of your team as well. The approach I explained has worked well for me because in practice I’m pretty much always in the “developing an application” bucket: using this bit of unsoundness has saved me quite a bit of time, effort, and errors, while it hasn’t bitten me once. As we discovered in this thread, we can do even better and solve the problem (at least in some cases) without unsoundness, which is great, and I might change to the new approach once I update my working Scala version (lol). But if such a solution weren’t available, I think there’d still be value in my approach, for the reasons I outlined, and I certainly think that as long as the compiler and language aren’t perfect (or fully dependently typed), there’s room to argue such things.