Automatic Resource Management Library

newhoggy · February 24, 2017, 1:29pm

It is actually fairly difficult in languages like Java and Scala to ensure that resources are always cleaned up in a timely manner, and more difficult still to ensure that all resources are cleaned up should an attempt to create a composite resource fail mid-construction.

This library would aim to simplify resource management by applying RAII (Resource Acquisition Is Initialisation) principles borrowed from C++.

The library needs to function well under the many environments it might be reasonably used in, so thread-safety, extensibility, ease-of-use, and performance are important concerns.

I provide an example library to invite discussion on this topic:

https://github.com/packetloop/pico-disposal/blob/develop/pico-disposal/src/main/tut/tutorial.md


## Using the dispose method instead of close methods to close resources
In order to dispose a resource, call the `dispose` method on the resource:

```scala
resource.dispose()
```

## How is Disposable different from Closeable and AutoCloseable?
The dispose pattern is different from `Closeable` and `AutoCloseable` in a number of ways:

The `Disposable` pattern:

* means objects may be disposed at a scope level regardless of whether exceptions are thrown
* can be made to work on arbitrary types
* can be composed
* provides `for` syntax support for automatic disposal
* allows disposal to be generically delegated to another object via ownership

### Treatment of exceptions

This file has been truncated.

pathikrit · February 24, 2017, 2:55pm

Hi, a small FYI - better-files (currently accepted as a Scala platform library) also comes with its own micro ARM built-in. See:
https://github.com/pathikrit/better-files#lightweight-arm

Here’s the gist of how it works for Java’s Closeables:

type Closeable = {
  def close(): Unit
}

type ManagedResource[A <: Closeable] = Traversable[A]

implicit class CloseableOps[A <: Closeable](resource: A) {
  // Wraps the resource in a single-element Traversable that closes it after use
  def autoClosed: ManagedResource[A] = new Traversable[A] {
    var isClosed = false
    override def foreach[U](f: A => U): Unit = try {
      f(resource)
    } finally {
      if (!isClosed) {
        resource.close()
        isClosed = true
      }
    }
  }
}

for {
  in <- file1.newInputStream.autoClosed
  out <- file2.newOutputStream.autoClosed
} in.pipeTo(out)

There are also special auto-closing iterators for things like Java 8’s streams.

tpolecat · February 24, 2017, 8:30pm

I think what is provided by better-files is probably sufficient. There’s only so much you can do here and additional machinery has diminishing returns.

The only way to truly guarantee resource safety is to make the resource itself inaccessible, which is a common use case for free monads.

newhoggy · February 24, 2017, 11:24pm

It’s important to note that auto-closing iterators by themselves don’t solve the problem. For example, with iterator.take(5) you may never reach the end, so the iterator never closes. An exception thrown mid-iteration has the same effect.
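To make the take(n) problem concrete, here is a minimal sketch. `FakeReader` and `autoClosing` are hypothetical names invented for this illustration; they are not the better-files implementation. The iterator closes its reader only when `hasNext` returns false, which never happens if the caller stops early.

```scala
object TakeLeakDemo {
  // Hypothetical resource that records whether close() was called.
  final class FakeReader(lines: List[String]) extends AutoCloseable {
    var closed = false
    val iterator: Iterator[String] = lines.iterator
    override def close(): Unit = { closed = true }
  }

  // An iterator that closes its underlying reader only once it is exhausted.
  def autoClosing(reader: FakeReader): Iterator[String] = new Iterator[String] {
    override def hasNext: Boolean = {
      val more = reader.iterator.hasNext
      if (!more) reader.close() // only reached when the caller drains the iterator
      more
    }
    override def next(): String = reader.iterator.next()
  }

  // Consumes only the first two of four elements, so the reader stays open.
  def leaks(): Boolean = {
    val reader = new FakeReader(List("a", "b", "c", "d"))
    autoClosing(reader).take(2).toList
    !reader.closed
  }

  // Consumes everything, so the reader is closed.
  def drained(): Boolean = {
    val reader = new FakeReader(List("a", "b"))
    autoClosing(reader).toList
    reader.closed
  }
}
```

Wrapping the iteration in a scope that closes the reader unconditionally would fix `leaks()`, but then the reader has to tolerate a second close.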

The autoClosed in the better-files library is a good start and is exactly the kind of thing I’m looking for, but I don’t think it goes far enough.

For example, what if I want to use auto closed iterators, but want them in combination with autoClosed to cover cases like take(5) and exceptions? Some resources do not like to be closed twice and will throw an exception on the attempt.

What if I want automatic resource management for a thread pool? Thread pools aren’t Closeable.
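One way to cover types that are not Closeable is a type class. The sketch below is an assumption-laden illustration: the `Disposable` trait and the `using` helper are hypothetical and only borrow the name used in this thread, and the five-second timeout is arbitrary. Only the `java.util.concurrent` calls are real APIs.

```scala
import java.util.concurrent.{Callable, ExecutorService, Executors, TimeUnit}

object DisposableDemo {
  // Hypothetical type class describing how to dispose a value of type A.
  trait Disposable[A] {
    def dispose(a: A): Unit
  }

  object Disposable {
    // Instance for thread pools: stop accepting work, then wait briefly for tasks to finish.
    implicit val executorService: Disposable[ExecutorService] =
      new Disposable[ExecutorService] {
        def dispose(pool: ExecutorService): Unit = {
          pool.shutdown()
          pool.awaitTermination(5, TimeUnit.SECONDS)
          ()
        }
      }
  }

  // Runs `body` and disposes the resource afterwards, whether or not `body` throws.
  def using[A, B](resource: A)(body: A => B)(implicit d: Disposable[A]): B =
    try body(resource) finally d.dispose(resource)

  // Example: run one task on a pool, then report the result and whether the pool was shut down.
  def sumOnPool(): (Int, Boolean) = {
    val pool = Executors.newFixedThreadPool(2)
    val result = using(pool) { p =>
      p.submit(new Callable[Int] { def call(): Int = 1 + 2 }).get()
    }
    (result, pool.isShutdown)
  }
}
```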

Then there are more complicated situations like S3 multipart uploads, where I may not actually want to close in the case of an exception. I could be leaving lots of garbage around if I don’t close or abort. If I close when there is an error or exception the file will be written to S3 and appear as a file, whereas what I really want is to abort the upload to avoid writing an incomplete corrupt file. Also if I close the upload after an abort, the close will throw an exception.

Just using a commonly used library like aws-s3 shows that what’s available in better-files is inadequate.
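The multipart-upload case needs cleanup that depends on the outcome: complete on success, abort on failure, never both. A rough sketch of such a helper follows. `FakeUpload` is a hypothetical stand-in and not the aws-s3 API, and `bracket` is a name chosen for this illustration.

```scala
import scala.util.control.NonFatal

object OutcomeAwareDemo {
  // Hypothetical stand-in for a multipart upload.
  final class FakeUpload {
    var state = "open"
    def write(part: String): Unit =
      if (part.isEmpty) throw new IllegalArgumentException("empty part")
    def complete(): Unit = { state = "completed" }
    def abort(): Unit = { state = "aborted" }
  }

  // Runs `body`. On success runs `onSuccess`; on failure runs `onFailure` and rethrows.
  // A failure inside `onFailure` is attached to the original exception as suppressed.
  def bracket[R, A](resource: R)(onSuccess: R => Unit, onFailure: R => Unit)(body: R => A): A = {
    val result =
      try body(resource)
      catch {
        case t: Throwable =>
          try onFailure(resource)
          catch { case NonFatal(e) => t.addSuppressed(e) }
          throw t
      }
    onSuccess(resource)
    result
  }
}
```

With this shape, `bracket(upload)(_.complete(), _.abort())(_.write(part))` never calls `complete()` after an abort, which avoids the close-after-abort exception described above.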

tpolecat · February 25, 2017, 12:01am

You are correct that Iterator is inadequate for this kind of thing. I suggest looking at fs2 which handles resource safety with streaming I/O in a very robust way. In general these things are much easier with pure functional IO but I realize most Scala programmers aren’t interested in that approach.

pathikrit · February 25, 2017, 12:12am

Some resources do not like to be closed twice and will throw an exception on the attempt.

The ones in better-files do handle multiple closing but you are completely right about the use cases you mentioned. I can work with you to incorporate these cases into better-files itself if you want.

newhoggy · February 25, 2017, 2:33am

better-files actually looks like a nice library and I wouldn’t mind using it. My comment isn’t a criticism of better-files as a files library; light-weight ARM looks like it serves that use case well.

I’m more pointing out that light-weight ARM isn’t sufficient in the general case and rolling your own more complete ARM solution is fairly error-prone.

As long as better-files itself doesn’t need more than light-weight ARM to implement its functionality, it doesn’t need to implement one. A comprehensive ARM library could be used in conjunction with better-files for the cases where users need it.

One of the more annoying things about Scala is the question of how to define a class that owns two or more resources in a safe way. In Scala, constructor code and field declarations are intermingled, which is nice from a DRY perspective, but I find it makes ARM of composite resources difficult:

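class Composite extends Closeable {
  val resource1 = allocate1()
  val resource2 = allocate2(resource1) // if this throws, resource1 leaks
  val resource3 = allocate3()          // if this throws, resource1 and resource2 leak

  override def close(): Unit = {
    resource1.close() // wrong order: resource2 may still be using resource1
    resource2.close()
    resource3.close()
  }
}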

Here, if allocate2 throws, resource1 is leaked. If allocate3 throws, resource1 and resource2 are leaked. Also, the closes happen in the wrong order: I would be closing resource1 before resource2, which could be bad because resource2 might still be using resource1. Even if I fix the ordering, that’s still not right because close can throw, so if an earlier close() fails, later calls to close() won’t happen, and there is another leak.

A more correct way to do this might be to handle all the exceptions and fix the order in which resources are closed:

import java.io.Closeable

class Composite extends Closeable {
  val resource1 = allocate1()
  val resource2 = try {
    allocate2(resource1)
  } catch {
    case t: Throwable =>
      resource1.close()
      throw t
  }
  val resource3 = try {
    allocate3()
  } catch {
    case t: Throwable =>
      // Close what was already allocated, newest first
      try {
        resource2.close()
      } finally {
        resource1.close()
      }
      throw t
  }

  // Close in reverse order of allocation; finally ensures the remaining
  // resources are closed even if an earlier close() throws
  override def close(): Unit = {
    try {
      resource3.close()
    } finally {
      try {
        resource2.close()
      } finally {
        resource1.close()
      }
    }
  }
}

I’d rather not write this kind of code though.
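Part of that boilerplate can be factored into a helper. The sketch below is only one possible shape: `closeAll` and `Logged` are hypothetical names, not part of any library mentioned in this thread. It closes every resource in the order given, keeps going when a close() throws, and rethrows the first failure with later ones attached as suppressed exceptions.

```scala
import scala.collection.mutable

object CloseAllDemo {
  // Closes every resource in the given order, even if some close() calls throw.
  def closeAll(resources: AutoCloseable*): Unit = {
    var first: Throwable = null
    resources.foreach { r =>
      try r.close()
      catch {
        case t: Throwable =>
          if (first == null) first = t else first.addSuppressed(t)
      }
    }
    if (first != null) throw first
  }

  // Hypothetical resource that logs its close() call and can be told to fail.
  final class Logged(name: String, log: mutable.Buffer[String], fail: Boolean = false)
      extends AutoCloseable {
    override def close(): Unit = {
      log += name
      if (fail) throw new IllegalStateException(s"$name failed to close")
    }
  }
}
```

A composite's close method could then be written as `closeAll(resource3, resource2, resource1)`. This does not help with failures during construction, which still need separate handling.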

nafg · February 26, 2017, 12:35am

Out of curiosity, what would Scala need to have ownership tracking like Rust, and would that help for this?

newhoggy · February 26, 2017, 2:30am

Language support could help a lot.

If there were language support, there would need to be a way to handle exceptions easily in the constructor without interfering with how fields are declared.

I try to deal with ownership from a library, but I still have issues working with the language. For example:

trait Disposer {
   def disposes[A: Disposable](a: A): A = ???
   def close(): Unit = ???
}

class Composite extends Disposer {
   val resource1 = this.disposes(allocate1())
   val resource2 = this.disposes(allocate2())
}

Above, Disposer maintains state that stores every resource with ownership declared with the disposes method. Disposer.close then disposes the resources in reverse order.
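For readers who want something concrete behind the `???` stubs, here is a minimal sketch of that idea. It is not pico-disposal's implementation: it is not thread-safe, and it accepts only `AutoCloseable` values instead of using a `Disposable` type class. `Named` is a hypothetical test resource.

```scala
object DisposerDemo {
  class Disposer extends AutoCloseable {
    // Most recently registered cleanup is at the head of the list.
    private var cleanups: List[() => Unit] = Nil

    // Registers ownership of `a` and returns it, so it can be used in a field initializer.
    def disposes[A <: AutoCloseable](a: A): A = {
      cleanups = (() => a.close()) :: cleanups
      a
    }

    // Runs all cleanups in reverse order of registration, continuing past failures.
    override def close(): Unit = {
      val pending = cleanups
      cleanups = Nil
      var first: Throwable = null
      pending.foreach { cleanup =>
        try cleanup()
        catch {
          case t: Throwable =>
            if (first == null) first = t else first.addSuppressed(t)
        }
      }
      if (first != null) throw first
    }
  }

  // Hypothetical resource that appends its name to a log when closed.
  final class Named(name: String, log: StringBuilder) extends AutoCloseable {
    override def close(): Unit = { log.append(name); () }
  }

  class Composite(log: StringBuilder) extends Disposer {
    val resource1 = disposes(new Named("1", log))
    val resource2 = disposes(new Named("2", log))
  }
}
```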

But this is not enough. What happens when the constructor throws an exception?

One of the more vexing issues with the way constructors work in Scala is the fact that the syntax for declaring local variables and fields is the same:

class Composite extends Disposer {
   val resource1 = this.disposes(allocate1()) // this is a field
   val resource2 = this.disposes(allocate2()) // this is a field
}   

class Composite extends Disposer {
   try {
     val resource1 = this.disposes(allocate1()) // this is NOT a field
     val resource2 = this.disposes(allocate2()) // this is NOT a field
   } catch {
     case t: Throwable => // Exceptions thrown by constructor
       this.close()
       throw t
   }
}

I currently deal with this in the library in a less than elegant way:

def Construct[A <: Disposer](f: Disposer => A): A = {
    val disposer = Disposer()

    try {
      val resource = f(disposer)

      // Transfer ownership from the disposer to the object
      resource.disposes(disposer.release())
      resource
    } finally {
      disposer.dispose()
    }
}

class Composite(resource1: Resource, resource2: Resource) extends Disposer

object Composite {
   def apply(): Composite = {
     Construct { disposer =>
       // Register with the temporary disposer, not with the companion object
       val resource1 = disposer.disposes(allocate1())
       val resource2 = disposer.disposes(allocate2())
       new Composite(resource1, resource2)
     }
   }
}

The Construct syntax transfers ownership from the disposer to the resource, but only if no exceptions are thrown. If an exception is thrown, the disposer frees all resources. In essence, I do all of the construction outside the constructor and, only after it has all succeeded, transfer ownership to the Composite object via its constructor, so that the constructor itself never throws.
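A self-contained sketch of this ownership hand-off is shown below. It simplifies heavily and should not be read as the library's code: `Disposer` tracks only `AutoCloseable` values, `close()` stops at the first failure, and `Named`, `construct`, and `build` are hypothetical names.

```scala
object ConstructDemo {
  class Disposer extends AutoCloseable {
    private var owned: List[AutoCloseable] = Nil
    def disposes[A <: AutoCloseable](a: A): A = { owned = a :: owned; a }
    // Gives up ownership: returns the owned resources (newest first) without closing them.
    def release(): List[AutoCloseable] = { val r = owned; owned = Nil; r }
    // Simplified: closes newest first and stops at the first failure.
    override def close(): Unit = release().foreach(_.close())
  }

  // Builds an A with a temporary Disposer and hands the resources over only if `f` succeeds.
  def construct[A <: Disposer](f: Disposer => A): A = {
    val temp = new Disposer
    try {
      val result = f(temp)
      // Re-register oldest first so that `result` also closes newest first.
      temp.release().reverse.foreach(r => result.disposes(r))
      result
    } finally temp.close() // nothing left to close after a successful release
  }

  final class Named(name: String, log: StringBuilder) extends AutoCloseable {
    override def close(): Unit = { log.append(name); () }
  }

  final class Composite(val resource1: Named, val resource2: Named) extends Disposer

  def build(log: StringBuilder, failSecond: Boolean): Composite =
    construct { d =>
      val r1 = d.disposes(new Named("1", log))
      val r2 = d.disposes(
        if (failSecond) throw new RuntimeException("allocate2 failed")
        else new Named("2", log)
      )
      new Composite(r1, r2)
    }
}
```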

This is where the language can help. If I could intercept any exceptions thrown from the constructor(s), I could delegate to close and ensure I don’t leak resources in this case.

For example:

trait Disposer {
   def disposes[A: Disposable](a: A): A = ???
   def close(): Unit = ???
}

class Composite extends Disposer {
   val resource1 = this.disposes(allocate1())
   val resource2 = this.disposes(allocate2())
} catch {
   case t: Throwable => // Exceptions thrown by constructor
     this.close()
     throw t
}

newhoggy · February 26, 2017, 3:08am

I think the following language improvements could help:

@auto val resource1 = allocate1()
@auto val resource2 = allocate2()
doSomething()

The compiler could generate the following code:

val resource1 = allocate1()
try {
  val resource2 = allocate2()
  try {
    doSomething()
  } finally {
    resource2.dispose()
  }
} finally {
  resource1.dispose()
}
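For comparison, the standard library gained something close to this for local scopes after this thread was written: `scala.util.Using.Manager`, available from Scala 2.13. The sketch below assumes Scala 2.13 or later; `Named` is a hypothetical test resource. Resources registered with `use` are released in reverse order when the block ends, and the result is wrapped in a `Try`.

```scala
import scala.util.{Try, Using}

object UsingManagerDemo {
  // Hypothetical resource that appends its name to a log when closed.
  final class Named(name: String, log: StringBuilder) extends AutoCloseable {
    override def close(): Unit = { log.append(name); () }
  }

  def run(log: StringBuilder): Try[String] =
    Using.Manager { use =>
      val resource1 = use(new Named("1", log))
      val resource2 = use(new Named("2", log))
      "done" // stands in for doSomething()
    }
}
```

This covers the scoped case only. It does not address fields whose lifetime is tied to an enclosing object.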

It could be extended to var with C++ auto_ptr semantics:

@auto var resource1 = allocate1()
resource1 = allocate1()

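The compiler could generate:

var resource1 = allocate1()
try {
  var tmp = allocate1()
  try {
    swap(resource1, tmp) // swaps value of two variables
  } finally {
    tmp.dispose() // after the swap, tmp holds the previous value
  }
} finally {
  resource1.dispose()
}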

Then for fields:

class Resource extends Closeable {
  @auto val resource1 = allocate1()
  @auto val resource2 = allocate2()
   
  def close(): Unit = {
    this.finally() // Cleans up auto allocated resources.
  }
}

This expands to:

class Resource extends Closeable {
  // Pseudocode: fields are declared first and assigned below
  val resource1
  val resource2

  resource1 = allocate1()
  try {
    resource2 = allocate2()
    try {
      doSomething() // remainder of the constructor body, if any
    } catch {
      case t: Throwable => resource2.dispose(); throw t
    }
  } catch {
    case t: Throwable => resource1.dispose(); throw t
  }

  def close(): Unit = {
    try {
      resource2.dispose()
    } finally {
      resource1.dispose()
    }
  }
}

jatcwang · March 4, 2017, 12:38am

It looks like what you’re suggesting is possible with macros, @newhoggy?

jvican · March 7, 2017, 10:44pm

Hey @newhoggy!

@densh (Denys Shabalin) wrote a library and gave a talk on this topic at ScalaWorld 2016 (https://www.youtube.com/watch?v=MV2eJkwarT4). I invite you to watch the video; I think it’s quite similar to what you propose.

@jsuereth (Josh Suereth) did some work on this topic too; you can have a look at the scala-arm library.

Language-level support could be provided via macros, and it should be relatively easy to code. I think that there’s good room for experimentation here. We have recently decided to add better-files to the Scala Platform, so how about providing a prototype of this working on top of that library? I would imagine this being an independent module inside better-files that people could optionally use. This could be a very good opportunity to solve the hassle of resource management once and for all.

newhoggy · March 8, 2017, 9:13pm

Macros are an area of the language I haven’t explored yet, but perhaps I could have a stab at it if it seems possible.

jvican · March 8, 2017, 9:54pm

I encourage you to have a look at that; you’ll have fun.

This is the go-to tutorial for macros. All the instructions are in the README; the tutorial is split into commits, and the commit messages describe each step. It’s an authoritative and very approachable guide to macros. From there, you have links to more detailed tutorials. I suggest that you don’t spend too much time reading those, though; get your hands dirty as soon as possible. That’s the best way to learn.

If you’re stuck at any point, let me know. I’m happy to help you out.

tpolecat · March 8, 2017, 10:13pm

Auto-closing is a straightforward monadic effect, as demonstrated with scala-arm. I don’t see the benefit of a rewrite system.
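As a rough illustration of what "monadic" means here, the sketch below defines a tiny `Managed` type with `map` and `flatMap`, so that resources compose in a `for` comprehension and are released in reverse order. This is an independent sketch under simplifying assumptions: it is not scala-arm's API, `Managed` and `Named` are hypothetical names, and nothing stops the caller from leaking a resource out of `use`.

```scala
object ManagedDemo {
  // Describes a resource; nothing is acquired until `use` is called.
  trait Managed[A] { self =>
    def use[B](f: A => B): B

    def map[B](g: A => B): Managed[B] = new Managed[B] {
      def use[C](f: B => C): C = self.use(a => f(g(a)))
    }

    // The inner resource is acquired and released while the outer one is still open.
    def flatMap[B](g: A => Managed[B]): Managed[B] = new Managed[B] {
      def use[C](f: B => C): C = self.use(a => g(a).use(f))
    }
  }

  object Managed {
    def apply[A <: AutoCloseable](acquire: => A): Managed[A] = new Managed[A] {
      def use[B](f: A => B): B = {
        val a = acquire
        try f(a) finally a.close()
      }
    }
  }

  // Hypothetical resource that logs when it is closed.
  final class Named(val name: String, log: StringBuilder) extends AutoCloseable {
    override def close(): Unit = { log.append(s"close-$name;"); () }
  }

  def run(log: StringBuilder): Int = {
    val both = for {
      in  <- Managed(new Named("in", log))
      out <- Managed(new Named("out", log))
    } yield (in, out)

    both.use { case (in, out) =>
      log.append("use;")
      in.name.length + out.name.length
    }
  }
}
```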

newhoggy · March 9, 2017, 8:22am

Do you have a reference I could read up on monadic auto-closing?

I’ve used conduits before without ever having to worry about closing resources, but never really understood how it works underneath.

Also, I’m interested to know how monadic effects can help with automatic resource closing in problems outside of stream processing.

tpolecat · March 10, 2017, 12:08am

@newhoggy Here is one way to do it. There are better ways but this should be easy to follow.

newhoggy · March 10, 2017, 12:51am

I appreciate you putting this together.

I’m looking for resource management techniques that deal with trickier situations, for example:

* How do I implement a thread or connection pool?
* How do I safely implement objects that are composed of multiple objects?

For example, let’s say I want to encrypt something from Scala by invoking the gpg command line program. I need to create a process, get its input and output streams, create a thread to handle bytes, etc.

def encrypted(out: OutputStream): OutputStream = ???

val out: OutputStream = ???
val encryptedOutputStream = encrypted(out)

encryptedOutputStream would hold a composite resource. When I call encryptedOutputStream.close(), I expect to not only close that stream, but also ensure that the thread and process are cleaned up and all the internal input/output streams are cleaned up and that the gpg process has definitely exited under all situations.

tpolecat · March 10, 2017, 12:57am

For that kind of thing you need fs2, which also subsumes the case above.

natrixia · January 17, 2018, 2:22pm

You can use Choppy’s Lazy TryClose monad for things that need to be closed, flushed, etc…

val output = for {
  outputStream      <- TryClose(new ByteArrayOutputStream())
  gzipOutputStream  <- TryClose(new GZIPOutputStream(outputStream))
  _                 <- TryClose.wrap(gzipOutputStream.write(content))
} yield wrap({gzipOutputStream.flush(); outputStream.toByteArray})
  
output.resolve.unwrap match {
  case Success(bytes) => // process result
  case Failure(e) => // handle exception
}

More info here: https://github.com/choppythelumberjack/tryclose