Writing a ZLib-based Archiver in .NET

    Why write your own

    • because it is convenient to have your own tool that lets you intervene in the archiving process at any stage
    • because it's interesting
    • because many archivers that expose an API are paid; for the free ones, see the first point

    Technology and libraries

    • Library: zlib.net.dll (available from the official site)
    • Development environment: Visual Studio 2010
    • Language: C#
    • Framework: .NET Framework 3.5

    Requirements

    The archiver should be able to:
    • compress files and directories
    • build an archive without compression
    • encrypt data (with and without compression)
    • exclude specified paths
    • delete files after they are compressed
    • unpack an archive


    Archive format

    After some optimization, I settled on the following layout:

    Field                                              Size
    Archive type                                       1 byte
    Header length (after compression and encryption)   4 bytes
    Header (described in more detail below)            N bytes
    Content block of the first file                    N bytes
    Content block of the second file                   N bytes
    Content block of the K-th file                     N bytes
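
    A minimal sketch of how this top-level layout could be written and read back with BinaryWriter and BinaryReader. The ArchiveType flag values and the ArchiveLayout class are hypothetical names used for illustration; the actual values of the type byte are not given here. BinaryWriter stores the 4-byte length as a little-endian Int32, which is also an assumption about the format.

```csharp
using System;
using System.IO;

[Flags]
public enum ArchiveType : byte   // hypothetical flag values, not taken from the project
{
    Plain = 0,
    Compressed = 1,
    Encrypted = 2
}

public static class ArchiveLayout
{
    // Writes: type (1 byte), header length (4 bytes), processed header, content blocks.
    public static void Write(Stream output, ArchiveType type, byte[] header, byte[][] blocks)
    {
        BinaryWriter w = new BinaryWriter(output);
        w.Write((byte)type);
        w.Write(header.Length);      // Int32, little-endian
        w.Write(header);
        foreach (byte[] block in blocks)
            w.Write(block);          // content blocks follow one another directly
        w.Flush();
    }

    // Reads the type byte and the processed header; afterwards the stream
    // is positioned at the first content block.
    public static byte[] ReadHeader(Stream input, out ArchiveType type)
    {
        BinaryReader r = new BinaryReader(input);
        type = (ArchiveType)r.ReadByte();
        int headerLength = r.ReadInt32();
        byte[] header = r.ReadBytes(headerLength);
        if (header.Length != headerLength)
            throw new EndOfStreamException("Archive is truncated inside the header.");
        return header;
    }
}
```

    The header returned by ReadHeader is still in its processed form, so it would have to be decrypted and decompressed before its blocks can be parsed.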

    Archive header format

    Field              Size
    Raw header size    4 bytes
    Block 1            N bytes
    Block 2            N bytes
    Block K            N bytes

    Archive header block format

    Field                                   Size
    Block size                              4 bytes
    Absolute path length                    4 bytes
    Absolute path                           N bytes
    Relative path length                    4 bytes
    Relative path                           N bytes
    Size of the object after processing     8 bytes

    A short explanation. The archive file begins with a header that collects the metadata for all archived objects. The header itself goes through the same compression and encryption stages as the archived files. The header is followed by the blocks that store the processed file contents; the blocks follow one another directly, with no separators between them. Block boundaries are determined from the header, which stores the size of each block.
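
    A sketch of how one header block could be serialized and how content block offsets follow from the stored sizes. The HeaderBlock and ContentOffsets names are hypothetical. Two details are assumptions, since the format description does not state them: paths are encoded as UTF-8, and the "block size" field counts only the bytes that come after it.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public sealed class HeaderBlock
{
    public string AbsolutePath;
    public string RelativePath;
    public long ProcessedSize;   // size of the object's content block in the archive

    public void Write(BinaryWriter w)
    {
        byte[] abs = Encoding.UTF8.GetBytes(AbsolutePath);   // assumption: UTF-8 paths
        byte[] rel = Encoding.UTF8.GetBytes(RelativePath);
        // Assumption: block size excludes the 4-byte size field itself.
        int blockSize = 4 + abs.Length + 4 + rel.Length + 8;
        w.Write(blockSize);
        w.Write(abs.Length);
        w.Write(abs);
        w.Write(rel.Length);
        w.Write(rel);
        w.Write(ProcessedSize);
    }

    public static HeaderBlock Read(BinaryReader r)
    {
        r.ReadInt32();   // block size; not needed when fields are read in order
        HeaderBlock block = new HeaderBlock();
        int absLength = r.ReadInt32();
        block.AbsolutePath = Encoding.UTF8.GetString(r.ReadBytes(absLength));
        int relLength = r.ReadInt32();
        block.RelativePath = Encoding.UTF8.GetString(r.ReadBytes(relLength));
        block.ProcessedSize = r.ReadInt64();
        return block;
    }
}

public static class ContentOffsets
{
    // Content blocks have no separators, so each one starts where the previous one ends.
    public static long[] Compute(IList<HeaderBlock> blocks, long firstBlockOffset)
    {
        long[] offsets = new long[blocks.Count];
        long position = firstBlockOffset;
        for (int i = 0; i < blocks.Count; i++)
        {
            offsets[i] = position;
            position += blocks[i].ProcessedSize;
        }
        return offsets;
    }
}
```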

    General principles of operation

    The user sets the compression options, and based on them the necessary file handlers (compressor, encryptor) are attached. Each handler has two methods, Execute and BackExecute. When archiving, we call Execute; when unpacking, we call BackExecute and apply the handlers in reverse order. This structure makes it very easy to extend the program with any number of new handlers (for example, ones implementing other encryption or compression methods).
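
    The handler idea can be sketched roughly as follows. Only the method names Execute and BackExecute come from the description above; the IHandler interface, the byte-array signatures, and the Pipeline class are assumptions. To keep the sketch self-contained, DeflateStream from System.IO.Compression stands in for zlib.net, and XorHandler is a toy placeholder for an encryption handler that provides no real security.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

public interface IHandler            // hypothetical interface
{
    byte[] Execute(byte[] data);     // applied when archiving
    byte[] BackExecute(byte[] data); // applied when unpacking
}

public sealed class DeflateHandler : IHandler   // stand-in for a zlib.net-based handler
{
    public byte[] Execute(byte[] data)
    {
        using (MemoryStream ms = new MemoryStream())
        {
            using (DeflateStream ds = new DeflateStream(ms, CompressionMode.Compress, true))
                ds.Write(data, 0, data.Length);
            return ms.ToArray();
        }
    }

    public byte[] BackExecute(byte[] data)
    {
        using (MemoryStream input = new MemoryStream(data))
        using (DeflateStream ds = new DeflateStream(input, CompressionMode.Decompress))
        using (MemoryStream output = new MemoryStream())
        {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = ds.Read(buffer, 0, buffer.Length)) > 0)
                output.Write(buffer, 0, read);
            return output.ToArray();
        }
    }
}

public sealed class XorHandler : IHandler   // toy placeholder, NOT real encryption
{
    private readonly byte key;
    public XorHandler(byte key) { this.key = key; }
    public byte[] Execute(byte[] data) { return Xor(data); }
    public byte[] BackExecute(byte[] data) { return Xor(data); }

    private byte[] Xor(byte[] data)
    {
        byte[] result = new byte[data.Length];
        for (int i = 0; i < data.Length; i++)
            result[i] = (byte)(data[i] ^ key);
        return result;
    }
}

public static class Pipeline
{
    // Archiving: handlers run in the order they were attached.
    public static byte[] Pack(IList<IHandler> handlers, byte[] data)
    {
        for (int i = 0; i < handlers.Count; i++)
            data = handlers[i].Execute(data);
        return data;
    }

    // Unpacking: the same handlers run in reverse order.
    public static byte[] Unpack(IList<IHandler> handlers, byte[] data)
    {
        for (int i = handlers.Count - 1; i >= 0; i--)
            data = handlers[i].BackExecute(data);
        return data;
    }
}
```

    Compression is placed before encryption in such a chain because encrypted data looks random and compresses poorly.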

    Algorithm

    1. Determine the archive type (compressed, encrypted)
    2. Read the list of objects to archive
    3. Build the complete list of objects to archive from that list and the list of exclusions
    4. Create the archive header (as an in-memory object)
    5. Iterate over the complete list of objects in the header
    6. Process each object, update its post-processing size in the header, and write the processed content to a temporary file
    7. Save the header to a file
    8. Process the header (compression, encryption)
    9. Build the resulting archive file
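
    Step 3 can be sketched as follows, under a few assumptions: the ObjectList class is a hypothetical name, only files are collected (empty directories are ignored), an excluded directory excludes everything beneath it, and paths are compared case-insensitively as on Windows.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ObjectList   // hypothetical helper
{
    // Expands included files and directories into a flat file list,
    // skipping anything that matches or lies under an excluded path.
    public static List<string> Build(IEnumerable<string> includes, IEnumerable<string> excludes)
    {
        List<string> excluded = new List<string>();
        foreach (string e in excludes)
            excluded.Add(Path.GetFullPath(e).TrimEnd(Path.DirectorySeparatorChar));

        List<string> result = new List<string>();
        foreach (string include in includes)
        {
            string full = Path.GetFullPath(include);
            if (Directory.Exists(full))
            {
                foreach (string file in Directory.GetFiles(full, "*", SearchOption.AllDirectories))
                    AddIfAllowed(file, excluded, result);
            }
            else if (File.Exists(full))
            {
                AddIfAllowed(full, excluded, result);
            }
        }
        return result;
    }

    private static void AddIfAllowed(string path, List<string> excluded, List<string> result)
    {
        foreach (string e in excluded)
        {
            // Match the excluded path itself or anything inside an excluded directory.
            if (path.Equals(e, StringComparison.OrdinalIgnoreCase) ||
                path.StartsWith(e + Path.DirectorySeparatorChar, StringComparison.OrdinalIgnoreCase))
                return;
        }
        if (!result.Contains(path))
            result.Add(path);
    }
}
```

    Appending the directory separator before the prefix check keeps an exclusion such as "skip" from also removing a sibling file named "skipper.txt".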


    ZLib can compress and decompress data passed to it as a byte array. That is all we need from it and all we will use. It cannot encrypt data; for that we use the standard .NET Framework library, System.Security.Cryptography.
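
    As an illustration of the encryption side, here is a minimal sketch of password-based encryption of a byte array with System.Security.Cryptography. The choice of AES in its default CBC mode, PBKDF2 key derivation via Rfc2898DeriveBytes, the iteration count, and the layout of the output (salt, then IV, then ciphertext) are all assumptions made for this example; the text above does not say which algorithm the archiver uses. PasswordCrypto is a hypothetical class name.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class PasswordCrypto   // hypothetical helper
{
    private const int SaltSize = 16;
    private const int IvSize = 16;        // AES block size in bytes
    private const int Iterations = 10000; // assumed PBKDF2 iteration count

    // Output layout (an assumption): salt (16 bytes) + IV (16 bytes) + ciphertext.
    public static byte[] Encrypt(byte[] data, string password)
    {
        byte[] salt = new byte[SaltSize];
        RandomNumberGenerator.Create().GetBytes(salt);
        Rfc2898DeriveBytes kdf = new Rfc2898DeriveBytes(password, salt, Iterations);
        using (Aes aes = Aes.Create())
        {
            aes.Key = kdf.GetBytes(32);   // 256-bit key derived from the password
            aes.GenerateIV();
            using (MemoryStream ms = new MemoryStream())
            {
                ms.Write(salt, 0, salt.Length);
                ms.Write(aes.IV, 0, aes.IV.Length);
                using (ICryptoTransform encryptor = aes.CreateEncryptor())
                using (CryptoStream cs = new CryptoStream(ms, encryptor, CryptoStreamMode.Write))
                {
                    cs.Write(data, 0, data.Length);
                    cs.FlushFinalBlock();
                }
                return ms.ToArray();
            }
        }
    }

    public static byte[] Decrypt(byte[] blob, string password)
    {
        byte[] salt = new byte[SaltSize];
        byte[] iv = new byte[IvSize];
        Buffer.BlockCopy(blob, 0, salt, 0, SaltSize);
        Buffer.BlockCopy(blob, SaltSize, iv, 0, IvSize);
        Rfc2898DeriveBytes kdf = new Rfc2898DeriveBytes(password, salt, Iterations);
        using (Aes aes = Aes.Create())
        {
            aes.Key = kdf.GetBytes(32);
            aes.IV = iv;
            using (MemoryStream ms = new MemoryStream())
            {
                using (ICryptoTransform decryptor = aes.CreateDecryptor())
                using (CryptoStream cs = new CryptoStream(ms, decryptor, CryptoStreamMode.Write))
                {
                    int offset = SaltSize + IvSize;
                    cs.Write(blob, offset, blob.Length - offset);
                    cs.FlushFinalBlock();
                }
                return ms.ToArray();
            }
        }
    }
}
```

    Storing the random salt and IV next to the ciphertext means the password is the only secret needed to unpack.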
    During archiving and unpacking, you can obtain information about the object currently being processed, as well as about any errors that occurred.
    If an error occurs while processing the file, the user is offered a choice of 4 actions:
    • abort
    • ignore error
    • ignore all errors
    • retry

    The prompt can be disabled simply by commenting out the ErrorProcessing event subscription; in that case, program execution is aborted when an error occurs.
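
    One way such an error prompt could be wired up is sketched below. Only the ErrorProcessing event name comes from the text; the ErrorAction enum, the event arguments class, and FileProcessor are hypothetical. The default action is Abort, so if nobody subscribes to the event, the original exception is rethrown and processing stops.

```csharp
using System;

public enum ErrorAction { Abort, Ignore, IgnoreAll, Retry }   // hypothetical names

public sealed class ErrorProcessingEventArgs : EventArgs
{
    public readonly string ObjectPath;
    public readonly Exception Error;
    public ErrorAction Action = ErrorAction.Abort;   // used when no handler changes it

    public ErrorProcessingEventArgs(string objectPath, Exception error)
    {
        ObjectPath = objectPath;
        Error = error;
    }
}

public sealed class FileProcessor   // hypothetical class
{
    public event EventHandler<ErrorProcessingEventArgs> ErrorProcessing;
    private bool ignoreAll;

    // Returns true if the object was processed, false if it was skipped.
    // Rethrows the original exception when the chosen action is Abort.
    public bool Process(string path, Action<string> work)
    {
        while (true)
        {
            try
            {
                work(path);
                return true;
            }
            catch (Exception ex)
            {
                if (ignoreAll)
                    return false;

                ErrorProcessingEventArgs args = new ErrorProcessingEventArgs(path, ex);
                EventHandler<ErrorProcessingEventArgs> handler = ErrorProcessing;
                if (handler != null)
                    handler(this, args);   // the subscriber picks one of the four actions

                switch (args.Action)
                {
                    case ErrorAction.Retry:
                        continue;          // run the same object again
                    case ErrorAction.IgnoreAll:
                        ignoreAll = true;  // skip this and all later failures silently
                        return false;
                    case ErrorAction.Ignore:
                        return false;
                    default:
                        throw;             // Abort
                }
            }
        }
    }
}
```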
    I will not reproduce the program code here; instead, here are links to the sources.

    As DLLs

    svn://svn.code.sf.net/p/yark/code-0/trunk


    And an example of use:

    // Compression
    ArchiveProvider compressor = new ArchiveProvider();
    using (SaveFileDialog sfd = new SaveFileDialog())
    {
        if (sfd.ShowDialog() == System.Windows.Forms.DialogResult.OK)
        {
            CompressorOption option = new CompressorOption()
            {
                Password = passwordIfEncrypting,
                WithoutCompress = trueIfWithoutCompression,
                RemoveSource = trueIfDeletingSourceFiles,
                Output = sfd.FileName
            };
            // Files and directories to compress
            foreach (string line in lbIncludes.Items)
            {
                // add line to the option's list of paths to compress
            }
            // Files and directories to exclude
            foreach (string line in lbExclude.Items)
            {
                // add line to the option's list of paths to exclude
            }
            // pass option to compressor to start archiving
        }
    }

    // Decompression
    ArchiveProvider decompressor = new ArchiveProvider();
    using (FolderBrowserDialog fbd = new FolderBrowserDialog())
    {
        if (fbd.ShowDialog() == System.Windows.Forms.DialogResult.OK)
            decompressor.Decompress(pathToArchive, fbd.SelectedPath, passwordIfEncrypted);
    }

    Comparison of results

    I did not measure the timing; it was approximately the same.
    Initial data:
    • directory with text files (1 430 Kb)
    • directory with mixed data (18 893 Kb)

    Text files    Mixed data
    613           8 045
    638           8 709
    588           8 655

    For the rar and zip formats, the normal compression level was selected, which is also the level the program uses.
    The current archive format stores the absolute paths of files and directories; dropping them would slightly improve compression.

    Possible improvements

    • save information about each file (creation and modification dates, access rights)
    • add multithreading (simply parallelize the creation of temporary files)
    • add comments to the archive
    • associate archive files with the program
