Disposable pattern (Disposable Design Principle) pt.1



    I guess almost any .NET programmer will say this pattern is a piece of cake and the best-known pattern on the platform. However, even the simplest and most familiar problem domain has hidden corners you have never looked into. So let's describe the whole thing from the beginning, both for first-timers and for everyone else (so that each of you can refresh the basics). Don't skip these paragraphs: I am watching you!


    If I ask what IDisposable is, you will surely say that it is:


    public interface IDisposable
    {
        void Dispose();
    }

    What is the purpose of the interface? I mean, why do we need to clean up memory at all if a smart Garbage Collector clears it for us, so that we don't even have to think about it? However, there are some small details.


    This chapter was translated from Russian jointly by the author and professional translators. You can help us with translation from Russian or English into any other language, primarily into Chinese or German.

    Also, if you want to thank us, the best way to do that is to give us a star on github or to fork the repository github/sidristij/dotnetbook.

    There is a misconception that IDisposable serves to release unmanaged resources. This is only partially true, and to see why, recall a few examples of unmanaged resources. Is the FileStream class an unmanaged resource? No. Maybe DbContext is an unmanaged resource? No again. An unmanaged resource is something that doesn't belong to the .NET type system: something the platform didn't create and that exists outside its scope. A simple example is an open file handle in an operating system. A handle is a number that uniquely identifies a file opened (no, not by you) by the operating system. That is, all control structures (e.g. the position of the file in the file system, the file fragments in case of fragmentation and other service information, the cylinder, head or sector numbers of an HDD) live inside the OS, not the .NET platform. The only part of the unmanaged resource that is passed to the .NET platform is an IntPtr number. This number is wrapped by SafeFileHandle, which is in turn wrapped by the FileStream class. It means FileStream is not an unmanaged resource on its own, but it holds one, the handle of an open file, through the additional layer around IntPtr. How do you read that file? Through a set of WinAPI functions or Linux system calls.
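
    The layering described above can be observed directly. The following is a minimal sketch, not part of the original text: HandleDemo and Inspect are hypothetical names, and the sketch assumes a writable temp directory. It opens a file through FileStream, reads the raw IntPtr from the SafeFileHandle wrapper and checks that disposing the stream closes the handle.

```csharp
using System;
using System.IO;
using Microsoft.Win32.SafeHandles;

public static class HandleDemo
{
    // Hypothetical helper: reports whether the OS handle is open while the
    // stream is in use and whether it is closed after the stream is disposed.
    public static (bool openWhileInUse, bool closedAfterDispose) Inspect()
    {
        string path = Path.GetTempFileName(); // creates an empty temporary file
        try
        {
            SafeFileHandle safeHandle;
            bool openWhileInUse;
            using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read))
            {
                safeHandle = stream.SafeFileHandle;            // managed wrapper around the handle
                IntPtr raw = safeHandle.DangerousGetHandle();  // the number issued by the OS
                Console.WriteLine($"Raw OS handle: {raw}");
                openWhileInUse = !safeHandle.IsInvalid && !safeHandle.IsClosed;
            }
            // Disposing the stream also closes the handle it owns.
            return (openWhileInUse, safeHandle.IsClosed);
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

    The managed objects (FileStream, SafeFileHandle) are ordinary .NET instances; only the number returned by DangerousGetHandle() refers to something that lives inside the OS.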


    Synchronization primitives in multithreaded or multiprocessor programs, such as mutexes or semaphores, are the second example of unmanaged resources. Data arrays that are passed through P/Invoke also belong here.


    Note that the OS doesn't simply pass the handle of an unmanaged resource to an application. It also saves that handle in the table of handles opened by the process. Thus, the OS can correctly close the resources after the application terminates, which ensures they will be closed anyway once you exit the application. However, an application can run for an arbitrarily long time, so a resource that is not released explicitly can stay locked for just as long.

    OK, now we have covered unmanaged resources. Why do we need IDisposable in these cases? Because .NET has no idea what's going on outside its territory. If you open a file using an OS API, .NET will know nothing about it. If you allocate a memory range for your own needs (for example using VirtualAlloc), .NET will also know nothing. And if it doesn't know, it will not release the memory obtained by a VirtualAlloc call, nor will it close a file opened directly via an OS API call. This can have various unexpected consequences. You can get OutOfMemoryException if you allocate too much memory without releasing it (e.g. by just setting a pointer to null). Or, if you open a file on a file share through the OS without closing it, you will lock the file on that share for a long time. The file share example is especially good because the lock will remain on the IIS side even after you close the connection to the server. You don't have the rights to release the lock, so you will have to ask administrators to perform iisreset or to close the resource manually using special software.
    On a remote server this can become a complex task to solve.
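
    To make the memory case concrete, here is a small sketch. It is an illustration only: it uses Marshal.AllocHGlobal as a portable stand-in for the VirtualAlloc call mentioned above, and UnmanagedMemoryDemo and RoundTrip are hypothetical names. The block it allocates is invisible to the garbage collector, so the explicit release in finally is the only thing that frees it.

```csharp
using System;
using System.Runtime.InteropServices;

public static class UnmanagedMemoryDemo
{
    // Writes a byte into unmanaged memory, reads it back and frees the block.
    public static byte RoundTrip()
    {
        // 1 KiB that the GC neither tracks nor frees.
        IntPtr block = Marshal.AllocHGlobal(1024);
        try
        {
            Marshal.WriteByte(block, 0, 42);
            return Marshal.ReadByte(block, 0);
        }
        finally
        {
            // Without this call the block stays allocated until the process exits.
            // Assigning IntPtr.Zero to 'block' would only lose the address.
            Marshal.FreeHGlobal(block);
        }
    }
}
```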


    All these cases need a universal and familiar protocol for interaction between a type system and a programmer. It should clearly identify the types that require explicit closing. The IDisposable interface serves exactly this purpose. It works as follows: if a type implements the IDisposable interface, you must call Dispose() once you have finished working with an instance of that type.


    There are two standard ways to call it, because you usually create an instance either to use it briefly within one method or to keep it for the whole lifetime of the object that owns it.


    The first way is to wrap the instance in using(...){ ... }. It means you instruct the compiler to destroy the object, i.e. to call Dispose(), when the using block is over. The second way is to destroy the object when the lifetime of the object that holds a reference to it is over. But .NET has nothing but a finalization method that implies automatic destruction of an object, right? However, finalization is not suitable at all as we don't know when it will be called. Meanwhile, we need to release the object at a definite moment, for example right after we finish working with an open file. That is why the owner also needs to implement IDisposable and call Dispose() on all the resources it owns. Thus, we follow the protocol, and this is very important: if somebody follows it, all the participants should do the same to avoid problems.
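
    Both ways can be sketched in a few lines. The types below (TrackedResource, Owner, UsageDemo) are hypothetical and exist only for illustration. WithTryFinally shows roughly what the compiler generates for a using block, which is why Dispose() still runs when the body throws an ordinary exception.

```csharp
using System;

// Hypothetical resource that only records whether it has been released.
public sealed class TrackedResource : IDisposable
{
    public bool IsDisposed { get; private set; }
    public void Dispose() => IsDisposed = true;
}

// Second way: the owner keeps the resource for its whole lifetime
// and releases it from its own Dispose().
public sealed class Owner : IDisposable
{
    public TrackedResource Resource { get; } = new TrackedResource();
    public void Dispose() => Resource.Dispose();
}

public static class UsageDemo
{
    // First way: 'using' scopes the resource to a block.
    public static TrackedResource WithUsing()
    {
        var resource = new TrackedResource();
        using (resource)
        {
            // work with the resource here
        }
        return resource; // Dispose() has already been called
    }

    // Roughly what the compiler emits for the 'using' block above.
    public static TrackedResource WithTryFinally()
    {
        var resource = new TrackedResource();
        try
        {
            // work with the resource here
        }
        finally
        {
            if (resource != null) resource.Dispose();
        }
        return resource;
    }
}
```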


    Different ways to implement IDisposable


    Let’s look at the implementations of IDisposable from simple to complicated. The first and the simplest is to use IDisposable as it is:


    public class ResourceHolder : IDisposable
    {
        // A managed resource owned by this instance.
        private readonly DisposableResource _anotherResource = new DisposableResource();

        public void Dispose()
        {
            // Release everything we own.
            _anotherResource.Dispose();
        }
    }

    Here, we create an instance of a resource that is later released by Dispose(). The only thing that makes this implementation inconsistent is that you can still work with the instance after it has been destroyed by Dispose(). The next version guards against that:


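    using System;
    using System.Runtime.CompilerServices;

    public class ResourceHolder : IDisposable
    {
        private DisposableResource _anotherResource = new DisposableResource();
        private bool _disposed;

        public void Dispose()
        {
            // A repeated call is a no-op.
            if(_disposed) return;
            _anotherResource.Dispose();
            _disposed = true;
        }

        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        private void CheckDisposed()
        {
            if(_disposed) {
                // ObjectDisposedException has no public parameterless constructor;
                // it needs the name of the disposed object.
                throw new ObjectDisposedException(nameof(ResourceHolder));
            }
        }
    }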

    CheckDisposed() must be called as the first statement in all public methods of the class. The resulting ResourceHolder structure is fine for destroying a managed resource, which DisposableResource is. However, it is not suitable for a type that wraps an unmanaged resource directly. Let's look at an example with an unmanaged resource.


    using System;
    using System.Runtime.InteropServices;

    public class FileWrapper : IDisposable
    {
        IntPtr _handle;

        public FileWrapper(string name)
        {
            // Open an existing file for reading.
            _handle = CreateFile(name,
                0x80000000 /* GENERIC_READ */, 1 /* FILE_SHARE_READ */,
                IntPtr.Zero, 3 /* OPEN_EXISTING */, 0, IntPtr.Zero);
        }

        public void Dispose()
        {
            CloseHandle(_handle);
        }

        [DllImport("kernel32.dll", EntryPoint = "CreateFile", SetLastError = true)]
        private static extern IntPtr CreateFile(String lpFileName,
            UInt32 dwDesiredAccess, UInt32 dwShareMode,
            IntPtr lpSecurityAttributes, UInt32 dwCreationDisposition,
            UInt32 dwFlagsAndAttributes,
            IntPtr hTemplateFile);

        [DllImport("kernel32.dll", SetLastError=true)]
        private static extern bool CloseHandle(IntPtr hObject);
    }

    What is the difference in the behavior of the last two examples? The first one describes the interaction of two managed resources. This means that if the program works correctly, the resource will be released anyway. Since DisposableResource is managed, the .NET CLR knows about it and will reclaim its memory even if our code misbehaves. Note that I deliberately make no assumptions about what the DisposableResource type encapsulates. It can have any kind of logic and structure, and can contain both managed and unmanaged resources. This shouldn't concern us at all: nobody asks us to decompile third-party libraries each time to see whether they use managed or unmanaged resources. But if our own type uses an unmanaged resource, we cannot be unaware of it. This is what we do in the FileWrapper class. So, what happens in this case? If we use unmanaged resources, we have two scenarios. The first one is when everything is OK and Dispose() is called. The second one is when something goes wrong and Dispose() is never called.


    Let’s say straight away why this may go wrong:


    • If we use using(obj) { ... }, an exception may be thrown inside the block. It is handled by a finally block that we cannot see (this is C# syntactic sugar), and that block calls Dispose() implicitly. However, there are cases when this doesn't happen. For example, neither catch nor finally handles StackOverflowException. You should always remember this, because if some thread goes into deep recursion and StackOverflowException occurs at some point, .NET will forget about the resources it used but did not release. It doesn't know how to release unmanaged resources. They will stay allocated until the OS releases them, i.e. when you exit the program, or even some time after the application terminates.
    • If we call Dispose() from another Dispose(), we may again fail to reach it. This is not the case of an absent-minded developer who forgot to call Dispose(); it is a question of exceptions. And not only of the exceptions that crash an application thread: we are talking about any exception that prevents the algorithm from calling the outer Dispose() that would call ours.

    All these cases will leave unmanaged resources hanging. That is because the Garbage Collector doesn't know it should collect them. All it can do on its next pass is discover that the last reference to the object graph containing our FileWrapper instance is lost. The memory of these objects will then be reused for objects that are still referenced, but the handle itself will stay open. How can we prevent this?


    We must implement the finalizer of the object. The 'finalizer' is named this way on purpose. It is not a destructor, as it may seem because of the similar syntax of finalizers in C# and destructors in C++. The difference is that a finalizer will be called anyway, contrary to a destructor (and to Dispose()). A finalizer is called when garbage collection is initiated (this is enough to know for now, although things are a bit more complicated). It serves as a guaranteed release of resources if something goes wrong. So we must implement a finalizer to release unmanaged resources. And again, because a finalizer is called when GC is initiated, in general we don't know when this will happen.


    Let’s expand our code:


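    public class FileWrapper : IDisposable
    {
        IntPtr _handle;

        public FileWrapper(string name)
        {
            // Open an existing file for reading.
            _handle = CreateFile(name,
                0x80000000 /* GENERIC_READ */, 1 /* FILE_SHARE_READ */,
                IntPtr.Zero, 3 /* OPEN_EXISTING */, 0, IntPtr.Zero);
        }

        public void Dispose()
        {
            InternalDispose();
            // The handle is already closed, so the finalizer is no longer needed.
            GC.SuppressFinalize(this);
        }

        private void InternalDispose()
        {
            CloseHandle(_handle);
        }

        // Safety net: runs only if Dispose() was never called.
        ~FileWrapper()
        {
            InternalDispose();
        }

        /// other methods
    }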

    We enhanced the example with our knowledge of the finalization process and secured the application against losing resources if Dispose() is not called. We also call GC.SuppressFinalize to disable finalization of the instance if Dispose() has been called successfully. There is no need to release the same resource twice, right? This also shortens the finalization queue: we get rid of a piece of code that would otherwise run on the finalizer thread at some arbitrary later time, in parallel with the rest of the application. Now, let's enhance the example even more.
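
    As a side check before that next step, the effect of GC.SuppressFinalize can be observed with a small sketch. It is not the article's FileWrapper: NativeBuffer and FinalizerDemo are hypothetical names, and the unmanaged resource is a block from Marshal.AllocHGlobal so that the sketch does not depend on kernel32. A static counter records how many times the finalizer has run.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using System.Threading;

public sealed class NativeBuffer : IDisposable
{
    private static int _finalizerRuns;

    // Number of times any NativeBuffer finalizer has executed.
    public static int FinalizerRuns => Volatile.Read(ref _finalizerRuns);

    private IntPtr _memory = Marshal.AllocHGlobal(64);

    public void Dispose()
    {
        Release();
        GC.SuppressFinalize(this); // the finalizer has nothing left to do
    }

    ~NativeBuffer()
    {
        Interlocked.Increment(ref _finalizerRuns);
        Release();
    }

    private void Release()
    {
        Marshal.FreeHGlobal(_memory); // freeing IntPtr.Zero is a no-op
        _memory = IntPtr.Zero;
    }
}

public static class FinalizerDemo
{
    // Kept out of line so that the local does not keep the instance alive.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void CreateAndDispose()
    {
        var buffer = new NativeBuffer();
        buffer.Dispose();
    }

    // Returns the finalizer count after a disposed instance has been collected.
    public static int Run()
    {
        CreateAndDispose();
        GC.Collect();
        GC.WaitForPendingFinalizers();
        return NativeBuffer.FinalizerRuns;
    }
}
```

    Because the only instance was disposed, Run() is expected to report that no finalizer ran. If the Dispose() call in CreateAndDispose is removed, the finalizer becomes the only path that frees the block. The enhanced FileWrapper announced above comes next.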


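    public class FileWrapper : IDisposable
    {
        IntPtr _handle;
        bool _disposed;

        public FileWrapper(string name)
        {
            // Open an existing file for reading.
            _handle = CreateFile(name,
                0x80000000 /* GENERIC_READ */, 1 /* FILE_SHARE_READ */,
                IntPtr.Zero, 3 /* OPEN_EXISTING */, 0, IntPtr.Zero);
        }

        public void Dispose()
        {
            // A repeated call is a no-op.
            if(_disposed) return;
            _disposed = true;
            InternalDispose();
            GC.SuppressFinalize(this);
        }

        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        private void CheckDisposed()
        {
            if(_disposed) {
                // ObjectDisposedException needs the name of the disposed object.
                throw new ObjectDisposedException(nameof(FileWrapper));
            }
        }

        private void InternalDispose()
        {
            CloseHandle(_handle);
        }

        // Safety net: runs only if Dispose() was never called.
        ~FileWrapper()
        {
            InternalDispose();
        }

        /// other methods
    }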

    Now our example of a type that encapsulates an unmanaged resource looks complete. Unfortunately, tolerating a second Dispose() call is in fact a standard of the platform, so we allow it. Note that people often allow a second call of Dispose() just to avoid problems in the calling code, and in my view this is wrong. However, a user of your library who reads the MS documentation may think otherwise and will call Dispose() several times. Calling any other public method on a disposed object, on the other hand, would break the integrity of the object: once we have destroyed it, we cannot work with it anymore. This means we must call CheckDisposed() at the beginning of each public method.
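
    The resulting contract (a repeated Dispose() is a no-op, any other call on a disposed object throws) can be exercised with a short sketch. GuardedBuffer and GuardDemo are hypothetical names, and Marshal.AllocHGlobal again stands in for the file handle so that the sketch runs without P/Invoke declarations.

```csharp
using System;
using System.Runtime.InteropServices;

public sealed class GuardedBuffer : IDisposable
{
    private IntPtr _memory = Marshal.AllocHGlobal(16);
    private bool _disposed;

    // A public method: the disposed check is its first statement.
    public void Write(byte value)
    {
        CheckDisposed();
        Marshal.WriteByte(_memory, value);
    }

    public void Dispose()
    {
        if (_disposed) return; // a second call does nothing
        _disposed = true;
        Marshal.FreeHGlobal(_memory);
        GC.SuppressFinalize(this);
    }

    // Safety net for the case when Dispose() was never called.
    ~GuardedBuffer() => Marshal.FreeHGlobal(_memory);

    private void CheckDisposed()
    {
        if (_disposed) throw new ObjectDisposedException(nameof(GuardedBuffer));
    }
}

public static class GuardDemo
{
    // Returns true if using the buffer after Dispose() is rejected.
    public static bool ThrowsAfterDispose()
    {
        var buffer = new GuardedBuffer();
        buffer.Write(1);
        buffer.Dispose();
        buffer.Dispose(); // allowed, no effect
        try
        {
            buffer.Write(2);
            return false;
        }
        catch (ObjectDisposedException)
        {
            return true;
        }
    }
}
```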


    However, this code contains a severe problem that prevents it from working as we intended. If we remember how garbage collection works, we will notice one feature. When collecting garbage, the GC first finalizes everything inherited directly from Object. Only then does it deal with objects derived from CriticalFinalizerObject. This becomes a problem because both classes we designed inherit from Object, so we don't know in which order they will reach the “last mile”. Yet a higher-level object may use its finalizer to finalize an object that holds an unmanaged resource (although this doesn't sound like a great idea). A defined order of finalization would be very helpful here. To set it, the lower-level type that encapsulates the unmanaged resource must inherit from CriticalFinalizerObject.


    The second reason is more profound. Imagine that you dared to write an application that doesn't take much care of memory. It allocates memory in huge quantities, without caching or other subtleties. One day this application will crash with OutOfMemoryException. When that happens, code runs under special conditions: it cannot allocate anything, since an allocation would lead to a repeated exception even if the first one is caught. This is not only about creating new object instances. Even a simple method call, e.g. the call of a finalizer, can throw this exception. Remember that methods are compiled when you call them for the first time, and this is the usual behavior. How can we prevent this problem? Quite easily. If your type inherits from CriticalFinalizerObject, all its methods will be compiled straight away when the type is loaded into memory. Moreover, if you mark methods with the [PrePrepareMethod] attribute, they will also be pre-compiled and will be safe to call in a low-resource situation.
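
    A lower-level holder derived from CriticalFinalizerObject might look like the sketch below. CriticalBuffer is a hypothetical type, and the memory block from Marshal.AllocHGlobal stands in for any unmanaged resource. The finalization order and pre-compilation described above are the runtime's responsibility and are not demonstrated by the code itself; whether they hold on a particular runtime version is an assumption to verify.

```csharp
using System;
using System.Runtime.ConstrainedExecution;
using System.Runtime.InteropServices;

// Low-level owner of an unmanaged block; derives from CriticalFinalizerObject
// so that its finalizer is treated as critical by the runtime.
public sealed class CriticalBuffer : CriticalFinalizerObject, IDisposable
{
    private IntPtr _memory;

    public CriticalBuffer(int size) => _memory = Marshal.AllocHGlobal(size);

    public bool IsReleased => _memory == IntPtr.Zero;

    public void Dispose()
    {
        Release();
        GC.SuppressFinalize(this);
    }

    // Kept trivial on purpose: no allocations and nothing that is expected to throw.
    ~CriticalBuffer() => Release();

    private void Release()
    {
        if (_memory == IntPtr.Zero) return;
        Marshal.FreeHGlobal(_memory);
        _memory = IntPtr.Zero;
    }
}
```

    A higher-level type would then hold a CriticalBuffer field and call its Dispose() from its own Dispose(), without touching the raw pointer.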


    Why is that important? Why spend so much effort on those that pass away? Because unmanaged resources can stay suspended in a system for a long time, even after you restart the computer. If a user opens a file from a file share in your application, the file will be locked by the remote host and released either on a timeout or when you release the resource by closing the file. If your application crashes while the file is open, the file won't be released even after a reboot, and you will have to wait a long time until the remote host releases it. Also, you shouldn't allow exceptions in finalizers. They lead to an accelerated crash of the CLR and of the application, as you cannot wrap the call of a finalizer in try ... catch. I mean, when you try to release a resource, you must be sure it can be released. Last but not least: if the CLR unloads a domain abnormally, the finalizers of types derived from CriticalFinalizerObject will still be called, unlike those inherited directly from Object.

