Enforcing immutability

I wonder if it would be possible to have a mechanism (either in scala or in dotty) that will enforce/guarantee immutability of data structures and/or functions?

I’m no compiler expert but it could be something like a new keyword or implicit argument required for doing reassignments. From what I understand on the lowest level reassignment is the only way to mutate something, so this should be enough.

I can recall that C++ has some pretty sophisticated const semantics similar to this, but I remember as well that it was pretty cumbersome to use it (I have not used C++ for more than a few years now, so don’t know whats the current status).

Not quite. Assuming a fully “const correct” (to borrow C++ parlance) Scala ecosystem, you could still call functions from Java, unless you treat everything unknown as “not const”. I’d assume that to be very painful, though.

1 Like

As you said, I would assume everything unknown as mutable, but this shouldn’t be that painful. It only matters if you pass to the java method call a mutable object or you inherit from java trait and have mutable internals. If you pass something immutable to inpure code it cannot do any harm, and most of the code in Scala is rather immutable by default.

So, I think this is an interesting idea, but it gets complex fast.

Consider these two different kinds of mutation that might occur in your program:

var x: Map[Int, Int] = Map.empty
x.updated(2, 3) // we've mutated the variable m.
// however, the previous and current *values* are both immutable.

import java.util.HashMap
val y: HashMap[Int, Int] = new HashMap
y.put(2, 3) // we've mutable the object y references.
// y is an immutable reference to a mutable object.

You’re quite correct that forbidding the first type of mutation is relatively straightforward: forbid var. However, the second type of mutation (nested mutation coming from an inner object) is more difficult to detect and prevent.

Furthermore, consider the following:

final class IArray[A] private (array: Array[A]) {
  def length: Int = array.length
  def apply(i: Int): A = array(i)
}

object IArray {
  def wrap[A](arr: Array[A]): IArray[A] =
    new IArray(arr.clone)
}

Is IArray immutable? It has a reference to an array, which can be mutated. However, it does not expose this array or mutate it itself. Nor does it store an array reference which could be mutated by an outside third-party. So it seems say to say that IArray is immutable in the sense that (outside of reflection) it can’t be mutated, and doesn’t reference anything which could be mutated by an outsider (assuming that A is also immutable).

Would you want to allow or forbid IArray to be eligible for this enforced immutability?

Forbidding IArray is tough but fair. However, many collections we currently consider immutable (e.g. List or Vector) have these kinds of issues (vectors use internal arrays, and both have hidden var fields which are only intended to be used internally in hidden ways). Would lazy val qualify? The subset of the Scala ecosystem that satisfies this is (potentially) low.

On the other hand, allowing IArray opens up a can of worms: we’d need a way to audit internal uses of potentially-mutable references of objects to ensure that we don’t expect them to be observable. This is probably intractable in at least some cases and would need to be backstopped by explicit user-annotations of what to consider safe or unsafe.

Anyway, I hope these examples illustrate why the issue is a bit thornier than it might seem.

5 Likes

Is Dotty implementing an effect system? You could enforce a function be pure with the effect system.

1 Like

In the first example, we have not mutaded anything, we just created a new object and done nothing with it. In the second case, we have a mutable object which makes is unsafe automatically.

Here we have two problems:

  1. Is length mutating or not? As it comes from Java, we have to assume it is mutating.
  2. the array can be mutated from the outside of our class. It is properly cloned in the object but its probably too far away to know it from the compiler point of view.

If we could by any way conclude that “length” is safe, then we can rewrite the class as follows:

final class IArray[A] (array: Array[A]) {
  private val arrCpy = array.clone
  def length: Int = arrCpy.length
  def apply(i: Int): A = arrCpy(i)
}

So now, we know that the internal state is not mutated anywhere.

I have no idea how to make it safe and convenient at the same time, sadly. User-placed annotations are an option, but it can give us a false sense of safety. Maybe its the best we can get.

I would love to hear/read more about that.

On the one hand that’s true; OTOH, I suspect that that’s liveable for some use cases.

Putting my Lead Engineer hat on for a second, I could totally see some value in having just the simplistic, incomplete version. That is, enforcing that my engineers can’t build new mutable structures, coupled with some limits on what structures and functions they are allowed to use, could impose some very helpful discipline. (Heck, enforcing that discipline on myself might be useful.)

It’s incomplete, but that’s okay: indeed, being able to whitelist trusted data structures that have internally-mutable constructors seems quite useful, as is an annotation to say “this function is trusted to play with vars”.

But as you say, that’s not terribly hard – it seems like something you could totally build with some straightforward scalafix rules. Now I’m wondering whether anybody’s done that yet…

Playing around with Scalafix rules for some compromise version might be worthwhile, to see if you do find something “good enough”.

With my PhD student hat on, finding something “really good” with low enough annotation overhead is not easy. Lukas Rytz’s papers (and PhD thesis) describe a few issues and one possible approach. There’s Dotty-related research for an approach with even lower annotation overhead but IIUC further research is needed — information is scattered among a few mentions in a few different papers. One with the most details is https://www.cs.purdue.edu/homes/rompf/papers/osvald-oopsla16.pdf, though effects are still a side topic (and I’m not sure what’s described is the currently favored approach).

I’m sure that when results are in, they’ll be made public—not much we can do to help.

This is a very interesting topic. Immutability is an important property in OO programs, evidence:

  • Popularity of Scala immutable collections
  • Google Guava
  • Effective Java (item 15): minimize mutability

As far as I know no practical OO languages can enforce and track immutability reliably (Rust?), despite there are active research in academia (1, 2, 3, 4).

There are many different concepts of immutability ([5]):

  • deep VS shallow
  • class VS object
  • reference VS object
  • abstract VS concrete

The study in [4] suggests that transitive class immutability leads to simpler system and easier learning, but in [5] it claims transitive class immutability is too restrictive and inflexible. Interestingly, the empirical study in [6] shows that in large Scala projects, 45%-60% classes in large Scala projects are transitively immutable.

We are considering an effect and immutability system for Dotty. It’s not an easy problem, given the constraints from Java interop and legacy Scala code base, not mention the complexity of the problem itself. I had a tentative idea [7] that may do better than transitive class immutability. Nothing is decided yet, we’d like to hear more from the community about what are the use cases and requirements for such a system.

5 Likes

BTW relevant: https://youtu.be/IiCt4nZfQfg

1 Like

Rust does not track immutability in anything like the sense you mean. It doesn’t even have an idea of an “immutable data structure”–mutability is a property of your permissions/ownership, not a tag that gets carried around with the data. That means you can give a mutable data structure to a function with the assurance that the function cannot modify it (because it isn’t permitted to).

It works well, but it’s different enough from how Scala might do it that I’m not sure we can draw any conclusions (except that smart compilers can really help one generate safe code, but I don’t think we were doubting that).

2 Likes

The D language has an ‘immutable’ keyword that’s used to enforce immutability: https://dlang.org/spec/const3.html

1 Like

The presentation pointed by @nafg is extremely valuable here. Guys have done some broad research on the topic and I will just list few most important points from it here:

  • Detecting immutability is possible and was already implemented as compiler plugin https://github.com/luax/scala-immutability-plugin
  • Based on the results presented, majority of scala codebase is immutable in some way
  • There is a chance that even bigger part of codebase is immutable because the plugin has quite simple notion of immutability implemented(e.g. it treats private vals of mutable type as symptom of mutability, which is not always true)

So enforcing mutability should be quite easily achievable by reusing rules implemented in the mentioned plugin. Unfortunately, I don’t think it is a good idea to have this on a plugin level.

I strongly believe that this feature has to be implemented on a language level for the same reason we have vals and vars. It is possible to have functional/immutable code in java, but using additional final everywhere is too cumbersome to be widely adopted. If we could make something similar to type definitions(for example enforce immutability of case classes by default and require an additional keyword to allow mutability) we would make scala language even more reliable and safe not only hypothetically but also effectively because people tend to use defaults all the time.

I thought about previously, and while I don’t see real technical difficulty - immutability can be defined on class level, not only method level - I do see issues which might make it much less usable than one would expect. For example, while one could initially assume that creating and using a mutable class is fine as long as it doesn’t escape from the method, but in reality it can do countless of things like checking thread id or reading from the console, so the only choice is to prohibit any kind of use of code not verified as pure.

It is quite similar conceptually to throws() in C++ - it just turned out too limiting in practice, as at one point or another you need to call existing code which isn’t declared as such, and created a chicken and egg problem. Furthermore, Scala is a multiparadigm language and it does take advantage of that even when writing otherwise functional style. Iterable.map (and just about any Iterable method) are implemented with mutable state, even if they do not introduce mutability to the semantics, so one would have to not only provide verified wrappers for all standard Java classes which people routinely rely on, but effectively duplicate a lot of Scala’s own standard library. One can’t make exceptions for ‘standard’ features, because people are typically using interface types like Seq rather than List most of the time, and we don’t know what kind of implementation, possibly mutable, stands behind it. Even immutable.ArraySeq has a method unsafeWrapArray which creates an ‘immutable’ collection not only with mutable state, but possibly externally mutable state.

Scala allows to implement ‘effective immutability’ - classes which technically are mutable, or have methods implemented imperatively (parsing always felt more natural that way for me, for example), even though outside it appears as if it were immutable, and I see it as a strength, not a weakness: 100% pure Haskell with no type casting is just much more painful to use, even compared to heavily typed Scala code written 95% in the functional style. C++ has a keyword mutable exactly for that purpose: a const method can mutate any class field with that modifier. This is per C++ philosophy ‘the programmer knows best’, but I think that, given Scala’s expressibility and speed of writing code, such a feature would be less then useless if we just allowed people to declare it without strictly enforcing it by a compiler.

I think the same applies to the idea of prohibiting casting: because Java is not type safe and allows it, it would prevent one from using the Java ecosystem totally, and it is throwing out the baby with the bath water.

Overall, I think it would be better delegated to IDE - JetBrains adds automatically ‘virtual’ annotations @Pure to methods it verifies as such in Java.