SWAYER August 9, 2011 at 06:23

Organization of code with introspection in the context of obfuscation and refactoring


The .NET platform provides a rich API for accessing metadata at run time. However, introspection relies on late binding: program elements are looked up by name and signature, which are passed in as strings and type arrays. Such code can silently start behaving incorrectly after refactoring (renaming a member, reordering parameters) or after metadata obfuscation. One solution is the syntactic sugar that C# offers through expression trees.

To keep things concrete, let us start with a simple class.


public class SimpleClass
{
    private readonly string theValue;

    // ratio is not used; it only gives the constructor a two-parameter signature.
    public SimpleClass(string value, int ratio)
    {
        theValue = value;
    }

    public string Value
    {
        get { return theValue; }
    }
}

Add properties to this class that provide access to its metadata through the introspection mechanism:


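    // Looked up by name: breaks if the property is renamed.
    internal static PropertyInfo ValueProperty
    {
        get { return typeof(SimpleClass).GetProperty("Value"); }
    }

    // Looked up by name and binding flags: breaks if the field is renamed.
    internal static FieldInfo ValueField
    {
        get { return typeof(SimpleClass).GetField("theValue", BindingFlags.NonPublic | BindingFlags.Instance); }
    }

    // Looked up by parameter types: breaks if the parameters are reordered.
    internal static ConstructorInfo Constructor
    {
        get { return typeof(SimpleClass).GetConstructor(new[] { typeof(string), typeof(int) }); }
    }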

If we print these properties to the console, we get string representations of the corresponding program elements. However, if you reorder the constructor parameters with the IDE's refactoring tools (in Visual Studio, for example, the Reorder Parameters command on the Refactor menu), the property that reflects the constructor will return null. And if the assembly is also obfuscated, looking up the field and the property by name will stop working altogether (unless renaming is suppressed with ObfuscationAttribute).
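The failure mode described above can be reproduced without an IDE. The following is a minimal sketch, assuming the constructor parameters of SimpleClass have already been reordered while the reflection code was left untouched; the class FragileReflectionDemo and its method names are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Reflection;

public class SimpleClass
{
    private readonly string theValue;

    // State after a hypothetical "reorder parameters" refactoring:
    // the original signature was (string value, int ratio).
    public SimpleClass(int ratio, string value)
    {
        theValue = value;
    }

    public string Value
    {
        get { return theValue; }
    }
}

// Hypothetical demo class, not part of the original article.
public static class FragileReflectionDemo
{
    public static ConstructorInfo FindConstructorByOldSignature()
    {
        // The type array still describes the old (string, int) order,
        // so no public constructor matches and the lookup yields null.
        return typeof(SimpleClass).GetConstructor(new[] { typeof(string), typeof(int) });
    }

    public static PropertyInfo FindPropertyByName(string name)
    {
        // Returns null when no public property has this name,
        // for example after an obfuscator has renamed it.
        return typeof(SimpleClass).GetProperty(name);
    }

    public static void Main()
    {
        Console.WriteLine(FindConstructorByOldSignature() == null); // True
        Console.WriteLine(FindPropertyByName("Value") != null);     // True
        Console.WriteLine(FindPropertyByName("a") == null);         // True
    }
}
```

Note that neither lookup fails at compile time or throws at run time; the caller simply receives null, which is why this kind of breakage is easy to miss.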
To get out of this situation, we arm ourselves with expression trees, lambda expressions, and generics, and assemble a helper method from them:


    static TMember Reflect<TDelegate, TMember>(Expression<TDelegate> memberAccess)
        where TDelegate : class
        where TMember : MemberInfo
    {
        // Property or field access
        if (memberAccess.Body is MemberExpression)
            return ((MemberExpression)memberAccess.Body).Member as TMember;
        // Call of the new operator
        else if (memberAccess.Body is NewExpression)
            return ((NewExpression)memberAccess.Body).Constructor as TMember;
        // Method call
        else if (memberAccess.Body is MethodCallExpression)
            return ((MethodCallExpression)memberAccess.Body).Method as TMember;
        else
            return null;
    }

The method takes two type parameters. The first is the delegate type used to resolve the signature of the lambda expression passed as the argument; the second is the type of the reflected member (for example, FieldInfo for a field). The lambda expression describes access to the class member we are interested in. Inside the method we look at the lambda's body, which is represented as an expression tree. Since the body represents access to a class member, we check its type against the possible options:

Access to a property or field;
Calling the new operator (this expression contains a reference to the constructor);
Method call.
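To make the three cases tangible, here is a small sketch that prints the node kind of a lambda body for each of them. It uses only standard types (string, StringBuilder); the class BodyShapesDemo and the method BodyKind are hypothetical names introduced for this illustration.

```csharp
using System;
using System.Linq.Expressions;
using System.Text;

// Hypothetical demo class, not part of the original article.
public static class BodyShapesDemo
{
    // Returns the node kind of the root of the lambda body.
    public static ExpressionType BodyKind(LambdaExpression lambda)
    {
        return lambda.Body.NodeType;
    }

    public static void Main()
    {
        // The compiler builds an expression tree instead of a delegate
        // because the target type is Expression<TDelegate>.
        Expression<Func<string, int>> memberAccess = s => s.Length;
        Expression<Func<string, StringBuilder>> construction = s => new StringBuilder(s);
        Expression<Func<string, string>> methodCall = s => s.Trim();

        Console.WriteLine(BodyKind(memberAccess)); // MemberAccess (a MemberExpression)
        Console.WriteLine(BodyKind(construction)); // New (a NewExpression)
        Console.WriteLine(BodyKind(methodCall));   // Call (a MethodCallExpression)

        // Caveat: a boxing conversion wraps the member access in a Convert node.
        Expression<Func<string, object>> boxed = s => s.Length;
        Console.WriteLine(BodyKind(boxed));        // Convert (a UnaryExpression)
    }
}
```

The last case is worth keeping in mind: a helper that checks only the three body types listed above returns null for such a lambda, so the delegate type should match the member's type exactly.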

The main goal of this solution is to avoid the classic reflection calls that rely on binding flags, names of program elements, and arrays describing method and constructor signatures, since refactoring tools and obfuscators cannot analyze such calls.
Now everything is ready for writing reflection code that is updated automatically during refactoring and survives obfuscation.


    internal static PropertyInfo ValueProperty
    {
        get { return Reflect<Func<SimpleClass, string>, PropertyInfo>(v => v.Value); }
    }

    internal static FieldInfo ValueField
    {
        get { return Reflect<Func<SimpleClass, string>, FieldInfo>(v => v.theValue); }
    }

    internal static ConstructorInfo Constructor
    {
        get { return Reflect<Func<string, int, SimpleClass>, ConstructorInfo>((a1, a2) => new SimpleClass(a1, a2)); }
    }

This approach uses neither string literals for member names nor type arrays for constructor signatures.
Why does this code keep working after an obfuscator has processed it? The answer lies in how the C# compiler generates code for expression trees. If you open the assembly in ILDASM, you can see that member metadata is loaded with the ldtoken instruction, which operates on a numeric token pointing to the member's row in the corresponding metadata table inside the PE file. An obfuscator changes the names stored in those tables, but the token still refers to the same member, so no name lookup happens at run time.
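The numeric token can also be observed from managed code. The sketch below is an illustration under stated assumptions rather than a description of what any particular obfuscator does: the class Sample, the class TokenDemo, and the helper MethodOf are hypothetical, and the example relies on the standard MemberInfo.MetadataToken property and Module.ResolveMethod method. It also exercises the method-call case, which the properties above do not cover.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical class used only for this illustration.
public class Sample
{
    public string Describe()
    {
        return "sample";
    }
}

// Hypothetical demo class, not part of the original article.
public static class TokenDemo
{
    // Extracts the MethodInfo that the compiler bound when it built the tree.
    public static MethodInfo MethodOf<T>(Expression<Action<T>> call)
    {
        return ((MethodCallExpression)call.Body).Method;
    }

    public static void Main()
    {
        // No string with the method name appears in this lookup.
        MethodInfo method = MethodOf<Sample>(s => s.Describe());

        // The token identifies the method's row in the metadata of its module.
        int token = method.MetadataToken;
        Console.WriteLine("0x{0:X8}", token);

        // The high byte is the table index; 0x06 is the MethodDef table.
        Console.WriteLine((token >> 24) == 0x06); // True

        // The module can map the token back to the method without using its name.
        MethodBase resolved = method.Module.ResolveMethod(token);
        Console.WriteLine(resolved.Name); // Describe
    }
}
```

If the assembly were obfuscated, the last line would be expected to print whatever name the obfuscator assigned, while the token-based lookup itself would keep working.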

There is always a BUT

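This technique can only reflect program elements that are accessible in the current lexical scope. For example, the private field theValue can be reflected this way only from code inside SimpleClass itself.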
