How ProGuard works


    If you have ever thought about the security of your application, or wanted to optimize your code in some way, then you certainly know what ProGuard is. Perhaps you have already suffered from it, or you managed to get through the documentation and a couple of articles online and figured out what's what.

    In this article I will not talk about how to set keep rules or about particular useful options. In my opinion, the tutorials that ship with ProGuard are quite enough to pull it into a project. Instead, I will look at how ProGuard works at the code level. If that sounds interesting, read on.

    Many people are mistaken about ProGuard and believe that it is an obfuscator. It is not, for several reasons:

    • It does not scramble the code.
    • It does not encrypt the code.

    The main task of ProGuard is to rename classes, fields and methods, which makes the code harder for a reverse engineer to analyze. In addition, it optimizes the code and removes unused parts of the program. But in the classical sense it cannot be called an obfuscator.

    So what are we dealing with?


    In general, ProGuard is an open-source utility that works with Java code, and it is itself written in Java. The people behind it also develop DexGuard, copies of which can be found online if you dig around, since they leak from time to time. Officially, though, DexGuard is a paid product and is essentially a more hardcore version of the same ProGuard.

    So we can conclude that ProGuard is a jar file that renames the identifiers in our code, optimizes it and, supposedly, increases security. By default, ProGuard builds the new names from the 26 letters of the English alphabet, in lower and upper case.

    The ProGuard rules get dragged from project to project and are treated as untouchable: don't touch it, it works, otherwise some hellish red lines may appear that nobody knows how to fix, and nobody wants to know. And these rules were written by some kind of oracle; to reach him you need to spin around seven times, turn into a bird and fly south-west for two hours and forty-three minutes.

    Well, since nobody wants to go there, and nobody needs to, let's dig into the weeds.

    If you look at the directories of the ProGuard project, you can immediately identify its main functions.


    So far everything seems to be clear. So let's look at the main class.

    public class ProGuard {
        //...
        private final MultiValueMap<String, String> injectedClassNameMap = new MultiValueMap<String, String>();
        //...
        /**
         * The main method for ProGuard.
         */
        public static void main(String[] args) {
            if (args.length == 0)
            {
                System.out.println(VERSION);
                System.out.println("Usage: java proguard.ProGuard [options ...]");
                System.exit(1);
            }
            // Create the default options.
            Configuration configuration = new Configuration();
            try
            {
                // Parse the options specified in the command line arguments.
                ConfigurationParser parser = new ConfigurationParser(args,
                                                                     System.getProperties());
                //...

    Let's focus on the obvious main method, where we can see the default configuration being created and then the options supplied by the developer being parsed.

    In addition, there is the quite expected injectedClassNameMap object. The result of the renaming is recorded in the build/outputs/proguard/release/mapping.txt file, which looks like this:


    So if we ever need to turn our own renamed code back into a readable form, we can do it with mapping.txt. To make that possible, when publishing an apk file you need to upload the matching version of mapping.txt to the Google Play Console.
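
    To make the idea concrete, here is a minimal sketch of reading such a file. It assumes the usual mapping.txt layout, where a class line looks like "original.Name -> obfuscated.name:" and the member lines below it are indented. The MappingSketch class and the sample class names are invented for illustration and are not part of ProGuard; for real stack traces ProGuard ships its own ReTrace tool.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of ProGuard: it reads only the class lines
// of a mapping file and ignores the indented member lines.
class MappingSketch {

    /** Returns a map from obfuscated class name to original class name. */
    static Map<String, String> parseClassMappings(String mappingText) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String line : mappingText.split("\\R")) {
            // Skip blank lines, comments and indented (member) lines.
            if (line.isEmpty() || line.startsWith("#")
                    || Character.isWhitespace(line.charAt(0))) {
                continue;
            }
            int arrow = line.indexOf(" -> ");
            if (arrow < 0 || !line.endsWith(":")) {
                continue; // not a class line in the assumed layout
            }
            String original = line.substring(0, arrow);
            String obfuscated = line.substring(arrow + 4, line.length() - 1);
            result.put(obfuscated, original);
        }
        return result;
    }

    public static void main(String[] args) {
        // Invented sample data in the assumed mapping.txt layout.
        String sample =
            "com.example.app.LoginActivity -> com.example.app.a:\n" +
            "    java.lang.String userName -> a\n" +
            "    void onLoginClicked() -> b\n" +
            "com.example.app.SessionStore -> com.example.app.b:\n";
        Map<String, String> classes = parseClassMappings(sample);
        System.out.println(classes.get("com.example.app.a"));
    }
}
```

    The sketch only restores class names; restoring fields and methods needs the indented lines as well, which is exactly the part where a ready-made tool saves time.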

    Now let's look at the parser for the configuration that the developer sets.

    public class ConfigurationParser {
        //...
        /**
         * Parses and returns the configuration.
         * @param configuration the configuration that is updated as a side-effect.
         * @throws ParseException if the any of the configuration settings contains
         *                        a syntax error.
         * @throws IOException if an IO error occurs while reading a configuration.
         */
        public void parse(Configuration configuration) throws ParseException, IOException
        {
            while (nextWord != null)
            {
                lastComments = reader.lastComments();
                // First include directives.
                if      (ConfigurationConstants.AT_DIRECTIVE.startsWith(nextWord) ||
                         ConfigurationConstants.INCLUDE_DIRECTIVE.startsWith(nextWord)) configuration.lastModified = parseIncludeArgument(configuration.lastModified);
                else if (ConfigurationConstants.BASE_DIRECTORY_DIRECTIVE.startsWith(nextWord)) parseBaseDirectoryArgument();
                // Then configuration options with or without arguments.
                else if (ConfigurationConstants.INJARS_OPTION.startsWith(nextWord)) configuration.programJars = parseClassPathArgument(configuration.programJars, false);
                else if (ConfigurationConstants.OUTJARS_OPTION.startsWith(nextWord)) configuration.programJars = parseClassPathArgument(configuration.programJars, true);
                //...
                else if (ConfigurationConstants.KEEP_CLASSES_WITH_MEMBER_NAMES_OPTION.startsWith(nextWord)) configuration.keep = parseKeepClassSpecificationArguments(configuration.keep, false, true, true, null);
                else if (ConfigurationConstants.PRINT_SEEDS_OPTION.startsWith(nextWord)) configuration.printSeeds = parseOptionalFile();
                // After '-keep'.
                else if (ConfigurationConstants.KEEP_DIRECTORIES_OPTION.startsWith(nextWord)) configuration.keepDirectories = parseCommaSeparatedList("directory name", true, true, false, true, false, true, true, false, false, configuration.keepDirectories);
        //...
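
    One detail of this excerpt is easy to miss: each branch asks whether the option constant starts with the word that was read, not the other way round. As written, a word that is only a prefix of an option name also matches, and the order of the branches decides which option wins (hence comments such as "After '-keep'"). The sketch below imitates that chain with a handful of option names; the OptionMatchSketch class and its match method are hypothetical and only illustrate the comparison, they are not ProGuard code.

```java
// Hypothetical illustration of a "constant.startsWith(word)" dispatch chain.
class OptionMatchSketch {

    // A few ProGuard option names, in the order the excerpt checks them.
    static final String[] OPTIONS = {
        "-include",
        "-basedirectory",
        "-injars",
        "-outjars",
        "-keepclasseswithmembernames",
        "-printseeds",
        "-keepdirectories"
    };

    /** Returns the first option that starts with the given word, or null. */
    static String match(String word) {
        for (String option : OPTIONS) {
            if (option.startsWith(word)) {
                return option; // first match wins, as in an if / else-if chain
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(match("-injars")); // full name
        System.out.println(match("-inj"));    // unambiguous prefix
        System.out.println(match("-in"));     // ambiguous prefix: the earlier branch wins
        System.out.println(match("-keepd"));  // skips -keepclasseswithmembernames
    }
}
```

    In this sketch "-in" resolves to "-include" simply because that branch comes first, which shows why the ordering inside the real parser matters.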

    Wait, I kept saying "not an obfuscator", and yet the project has a whole obfuscate directory. How so?


    In that directory you can easily find the classes that are responsible for renaming (SimpleNameFactory, ClassRenamer ...). As I said above, the 26 Latin letters are used by default.

    public class SimpleNameFactory implements NameFactory {
        private static final int CHARACTER_COUNT = 26;
        private static final List cachedMixedCaseNames = new ArrayList();
        private static final List cachedLowerCaseNames = new ArrayList();
        private final boolean generateMixedCaseNames;
        private int index = 0;
        //...

    The SimpleNameFactory class has a small built-in check: its main method calls printNameSamples(), which prints quite predictable values.

        public static void main(String[] args) {
            System.out.println("Some mixed-case names:");
            printNameSamples(new SimpleNameFactory(true), 60);
            System.out.println("Some lower-case names:");
            printNameSamples(new SimpleNameFactory(false), 60);
            System.out.println("Some more mixed-case names:");
            printNameSamples(new SimpleNameFactory(true), 80);
            System.out.println("Some more lower-case names:");
            printNameSamples(new SimpleNameFactory(false), 80);
        }

        private static void printNameSamples(SimpleNameFactory factory, int count) {
            for (int counter = 0; counter < count; counter++)
            {
                System.out.println("  ["+factory.nextName()+"]");
            }
        }

    Some mixed-case names:
    [a]
    [b]
    [c]
    [d]
    [e]
    [f]
    [g]
    [h]
    [i]
    [j]
    [k]
    ...
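
    A sequence like this is easy to reproduce. The sketch below is my own minimal version of such a name generator, not the actual SimpleNameFactory code: it counts through single letters first and then moves on to two-letter names, optionally adding upper-case letters after the lower-case ones. The class name NameSequenceSketch, the method nameFor and the exact ordering beyond the first few names are assumptions made for illustration.

```java
// Hypothetical sketch of a short-name generator: a, b, ..., z, aa, ab, ...
class NameSequenceSketch {
    private static final int CHARACTER_COUNT = 26;

    /** Returns the name for a zero-based index. */
    static String nameFor(int index, boolean mixedCase) {
        int radix = mixedCase ? 2 * CHARACTER_COUNT : CHARACTER_COUNT;
        StringBuilder name = new StringBuilder();
        int remaining = index;
        while (remaining >= 0) {
            // The last character is decided first, so the result is reversed below.
            name.append(charAt(remaining % radix));
            // Subtracting 1 makes "aa" follow the last single letter directly.
            remaining = remaining / radix - 1;
        }
        return name.reverse().toString();
    }

    /** Offsets 0..25 are lower-case letters, 26..51 are upper-case letters. */
    private static char charAt(int offset) {
        return (char) (offset < CHARACTER_COUNT
                ? 'a' + offset
                : 'A' + (offset - CHARACTER_COUNT));
    }

    public static void main(String[] args) {
        for (int i = 0; i < 30; i++) {
            System.out.println("  [" + nameFor(i, false) + "]");
        }
    }
}
```

    With only lower-case letters, 26 classes fit into one-character names and another 676 into two-character names, which is why renamed code is so compact.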

    The Obfuscator class is responsible for the "obfuscation". It has a single execute method, which receives the whole pool of classes of the project itself and of all the libraries added to it.

    public class Obfuscator {
        private final Configuration configuration;
        //...
        public void execute(ClassPool programClassPool,
                            ClassPool libraryClassPool) throws IOException
        {
            // Check if we have at least some keep commands.
            if (configuration.keep         == null &&
                configuration.applyMapping == null &&
                configuration.printMapping == null)
            {
                throw new IOException("You have to specify '-keep' options for the obfuscation step.");
            }
        //...

    Besides renaming, ProGuard also has an optimization step. It is run by the Optimizer class and performs the very important job of cleaning out unused code. It also takes the developer's settings into account, so if you want to be sure that a piece of code is left untouched, you can always write rules for it. Optimization is launched from the ProGuard class.

        /**
         * Performs the optimization step.
         */
        private boolean optimize(int currentPass,
                                 int maxPasses) throws IOException
        {
            if (configuration.verbose)
            {
                System.out.println("Optimizing (pass " + currentPass + "/" + maxPasses + ")...");
            }
            // Perform the actual optimization.
            return new Optimizer(configuration).execute(programClassPool,
                                                        libraryClassPool,
                                                        injectedClassNameMap);
        }
    

    The work of ProGuard can be divided into several stages:

    1. Reading the configured rules
    2. Optimization
    3. Removing the code flagged during optimization
    4. Renaming classes, fields and methods
    5. Writing the project in its revised form to the specified directory

    You can start ProGuard manually with the command:

    java -jar proguard.jar @android.pro

    Here proguard.jar is the built ProGuard project, and android.pro is the file with the rules and the input and output parameters.

Why writing your own ProGuard is too painful


    While I was digging through the ProGuard code, I saw only one name in the author field: Eric Lafortune. A quick search turned up his personal website; if you are interested, you can take a look at it.

    Google itself pushes ProGuard on us as the only solution for optimizing and protecting your code, and in fact it really is the only one. All the other solutions are either paid or sit on GitHub gathering dust, and I personally would not advise you to use them in your projects. The main problem with minification, name mangling, repackaging and optimization is that a conflict can arise at any moment, because it is hard to foresee everything that can happen to the code. In addition, such a utility should be covered with tests as tightly as possible, and who likes doing that? :) Unfortunately, everyone likes to talk about tests, but not to write them.

Why using ProGuard is useless


    ProGuard works according to well-known rules. If you do not set any rules of your own and simply plug it into your project, an attacker will have no trouble getting to the code, because tools that reverse the transformation were written long ago and are publicly available. Of course, if you study the topic in more detail and add rules, it becomes harder for the attacker, but only slightly. So what is the way out?

    Companies for which hiding their code is a priority fork ProGuard and modify it to fit their needs, thereby getting a unique solution.

    Why bother? Don't you have anything better to do?


    In general, ProGuard is not a huge utility and nothing supernatural happens inside it, so it is quite possible to study the sources in a couple of evenings with a cup of tea and a cat on your lap. Why would you need that? To know the tools you work with in more detail, to understand what they do to your code, and to decide whether you really need them that much. This applies not only to ProGuard, but to any other third-party code you use in your project. Your code is your area of responsibility, so keep it clean; otherwise, what is the point of doing development at all?

    The article was written about six months ago, and ProGuard is constantly evolving, so some fragments may no longer match the current sources.

    P.S. As always, I publish all my collections in the @paradisecurity Telegram channel; the link is in my profile, or you can find the channel by name in Telegram search.
