Convert string to number

    The other day I was helping an acquaintance learn programming. Along the way we wrote a practice program that converts a string to a number (int). I then got curious how the speed of my own little creation compares with the standard tools (Convert.ToInt32 and Int32.Parse). At first glance, the result of this comparison turned out to be somewhat unexpected.

    I expect any reader could write such a function without difficulty, so I see no reason to explain how it works.
        class MyConvert
        {
            private static int CharToInt(char c)
            {
                return c - '0';
            }
            public static int ToInt(string s)
            {
                if (s == null) throw new ArgumentException();
                if (s.Length == 0) throw new ArgumentException();
                bool isNegative = false;
                int start = 0;
                switch (s[0])
                {
                    case '-':
                        if (s.Length == 1) throw new ArgumentException();
                        start = 1;
                        isNegative = true;
                        break;
                    case '+':
                        if (s.Length == 1) throw new ArgumentException();
                        start = 1;
                        break;
                }
                int result = 0;
                for (int i = start; i < s.Length; i++)
                {
                    // Reject anything that is not an ASCII decimal digit.
                    if (s[i] < '0' || s[i] > '9') throw new ArgumentException();
                    result = checked(result * 10 + CharToInt(s[i]));
                }
                return (isNegative) ? -result : result;
            }
        }
    


    We will compare the speed of the following functions:
            static void MyConvertTest(int numbersCount)
            {
                for (int i = - numbersCount; i < numbersCount; i++)
                {
                    if (i != MyConvert.ToInt(i.ToString()))
                    {
                        throw new ArgumentException();
                    }
                }
            }
            static void ConvertTest(int numbersCount)
            {
                for (int i = - numbersCount; i < numbersCount; i++)
                {
                    if (i != Int32.Parse(i.ToString()))
                    {
                        throw new ArgumentException();
                    }
                }
            }
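
Both test functions above also pay for i.ToString() on every iteration, so part of the measured time is not parsing at all. A minimal sketch of a harness that builds the input strings up front and times only the parsing loop might look like this. The names ParseBenchmark and Measure are hypothetical and not part of the original test code, and the numbers above were not produced with it.

```csharp
using System;
using System.Diagnostics;
using System.Globalization;

static class ParseBenchmark
{
    // Hypothetical helper: times `parse` over pre-built inputs so that
    // the cost of ToString() stays outside the measured interval.
    public static TimeSpan Measure(Func<string, int> parse, int numbersCount)
    {
        if (numbersCount <= 0) throw new ArgumentOutOfRangeException("numbersCount");

        // One string per value in [-numbersCount, numbersCount).
        var inputs = new string[2 * numbersCount];
        for (int i = 0; i < inputs.Length; i++)
        {
            inputs[i] = (i - numbersCount).ToString(CultureInfo.InvariantCulture);
        }

        parse(inputs[0]); // warm-up call so JIT compilation is not timed

        var watch = Stopwatch.StartNew();
        for (int i = 0; i < inputs.Length; i++)
        {
            if (parse(inputs[i]) != i - numbersCount)
            {
                throw new InvalidOperationException("Round trip failed for " + inputs[i]);
            }
        }
        watch.Stop();
        return watch.Elapsed;
    }
}
```

It could be called as ParseBenchmark.Measure(s => int.Parse(s, CultureInfo.InvariantCulture), 1000000). Holding all strings in memory is a trade-off of this sketch, so a smaller numbersCount than 16,000,000 may be needed.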
    


    On my machine (Windows 7, .NET 4.0, Intel i5) with numbersCount = 16,000,000, the average results are:
    ConvertTest: 08.5859994 seconds
    MyConvertTest: 07.0505985 seconds

    If you use Convert.ToInt32 instead of Int32.Parse, the result does not change. That is understandable, since Convert.ToInt32 itself calls Int32.Parse. So our homegrown reinvention of the wheel runs in about 18% less time than the standard function.
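
One visible difference between the two standard functions is how they treat null: Convert.ToInt32 returns 0 for a null string, while Int32.Parse throws ArgumentNullException. A small sketch illustrating this (the class and method names are made up for the example):

```csharp
using System;

static class ConvertVsParse
{
    // Convert.ToInt32(string) returns 0 for null and otherwise parses the string.
    public static int ViaConvert(string s)
    {
        return Convert.ToInt32(s);
    }

    // Returns true if Int32.Parse rejects a null string with ArgumentNullException.
    public static bool ParseRejectsNull()
    {
        string s = null;
        try
        {
            int.Parse(s);
            return false;
        }
        catch (ArgumentNullException)
        {
            return true;
        }
    }
}
```

For ordinary input such as "123" both calls produce the same value, which is consistent with the identical timings.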

    If you look at the documentation, it becomes clear that Int32.Parse is a fairly complex function. In particular, it can parse the string representation of a number according to regional formats, although in practice I have never needed that functionality.
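
To give an idea of what that extra functionality covers, here is a minimal sketch that parses a number with "." as the thousands separator. It builds a NumberFormatInfo by hand instead of relying on a particular installed culture. The class and method names are hypothetical.

```csharp
using System;
using System.Globalization;

static class CultureParsingDemo
{
    // Parses integers that use "." as the group (thousands) separator.
    public static int ParseGrouped(string s)
    {
        var format = new NumberFormatInfo
        {
            NumberGroupSeparator = ".",
            NumberDecimalSeparator = ","
        };
        // NumberStyles.Integer allows surrounding whitespace and a leading sign;
        // AllowThousands additionally accepts group separators between digits.
        return int.Parse(s, NumberStyles.Integer | NumberStyles.AllowThousands, format);
    }
}
```

With this format, ParseGrouped("1.234.567") yields 1234567 and ParseGrouped(" -42 ") yields -42. The handwritten MyConvert.ToInt rejects both inputs.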

    Let's try to make our creation a little faster still. To do this, replace the loop in the ToInt function
        for (int i = start; i <  s.Length; i++)
    

    with
        int length = s.Length;
        for (int i = start; i < length; i++)
    


    In this case, we get:
    MyConvertTest: 06.2629928 seconds
    That is, our function now takes ~27% less time than the standard one. This is rather unexpected: I assumed that the compiler (or the CLR) would recognize that s is not modified inside the loop and would read s.Length only once.

    Now, instead of calling the CharToInt function, let's embed its body directly in the ToInt function. In this case:
    MyConvertTest: 05.5496214 seconds.
    So the advantage over the standard function grows to ~35%. This, too, is rather unexpected, since the compiler (or the CLR) can in some cases inline such calls on its own.
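
As an alternative to inlining by hand, newer runtimes let you ask the JIT to inline a method. The sketch below uses MethodImplOptions.AggressiveInlining, which exists since .NET Framework 4.5 and therefore was not available on the .NET 4.0 setup measured here. It is only a hint to the JIT, and its effect on these timings is an untested assumption. The InlineHint class and the SumDigits helper are made up for the illustration.

```csharp
using System;
using System.Runtime.CompilerServices;

static class InlineHint
{
    // Hint to the JIT that this tiny method should be inlined at call sites.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static int CharToInt(char c)
    {
        return c - '0';
    }

    // Hypothetical caller: sums the decimal digits of a string of digits.
    public static int SumDigits(string digits)
    {
        int sum = 0;
        for (int i = 0; i < digits.Length; i++)
        {
            sum += CharToInt(digits[i]);
        }
        return sum;
    }
}
```

The source code stays as readable as the original CharToInt version, and the decision to inline is left to the runtime.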

    Almost everything has been tried. The only thing left is to abandon the for loop altogether. This can be done, for example, as follows:
            unsafe public static int ToInt(string s)
            {
                if (s == null)
                {
                    throw new ArgumentException();
                }
                int result = 0;
                bool isNegative = false;
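                // Pin the string. .NET strings are null-terminated, so the loop below
                // can stop at '\0' (an embedded '\0' would also end parsing early).
                fixed(char* p = s)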
                {
                    char* chPtr = p;
                    char ch = *(chPtr++);
                    switch (ch)
                    {
                        case '-':
                            isNegative = true;
                            ch = *(chPtr++);
                            break;
                        case '+':
                            ch = *(chPtr++);
                            break;
                    }
                    do
                    {
                        if (ch < '0' || ch > '9')
                        {
                            throw new ArgumentException();
                        }
                        result = result * 10 + (ch - '0');
                        ch = *(chPtr++);
                    }while (ch != '\0');
                }
                return (isNegative) ? -result : result;
            }
    

    Result:
    MyConvertTest: 05.2410683 seconds.
    This takes ~39% less time than the standard function, but only about 6% less than the for version.

    Conclusions


    Of course, attempts to speed up string-to-number conversion have little practical value. I cannot think of a single situation in which such a gain would have a significant impact. Nevertheless, some fairly obvious conclusions can be drawn:
    • In some cases, your own code may run faster than the standard solutions.
    • In some cases, storing the loop bound in a separate variable (that is, replacing “i < s.Length” with “i < length”) can give a noticeable speedup (about 11% in this case), although this depends heavily on the mood of the CLR.
    • In some cases, embedding a method's body in the caller instead of calling the method also matters, although this too depends on the mood of the CLR.


    The literature quite often makes the fair claim that all low-level optimization should be left to the compiler and the runtime, which are supposedly smart enough to know best. However, as these tests show, manual loop optimization and function inlining can yield a speedup of roughly 20% (compared with the first version).

    Of course, optimization should be done wisely. In real projects, saving a few seconds rarely justifies making the code harder to maintain.

    UPD: fixed the problem with the plus sign, thanks.
    UPD 2: it4_kp correctly pointed out that none of the algorithms I proposed handles the following call:
    MyConvert.ToInt( int.MinValue.ToString() )
    

    Indeed, the absolute value of int.MinValue is one greater than int.MaxValue. Since the intermediate result is accumulated in the positive range of int, the absolute value of int.MinValue does not fit into it.
    One possible solution is to accumulate the intermediate result in the negative range instead:
        unsafe public static int ToInt(string s)
        {
                ....
                result = result * 10 - (ch - '0');
             .....
            return (isNegative) ? result : checked(-result);
        }
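
For completeness, here is a minimal sketch of the same idea written out in full, using the safe for-loop variant rather than pointers and adding checked to the accumulation. The class name NegativeRangeConvert is hypothetical, and this version was not part of the timings above.

```csharp
using System;

static class NegativeRangeConvert
{
    // Parses a decimal int by accumulating the value in the negative range,
    // so that "-2147483648" (int.MinValue) fits without overflow.
    public static int ToInt(string s)
    {
        if (string.IsNullOrEmpty(s)) throw new ArgumentException("Empty input.");
        bool isNegative = false;
        int start = 0;
        if (s[0] == '-' || s[0] == '+')
        {
            if (s.Length == 1) throw new ArgumentException("Sign without digits.");
            isNegative = s[0] == '-';
            start = 1;
        }
        int result = 0;
        for (int i = start; i < s.Length; i++)
        {
            char c = s[i];
            if (c < '0' || c > '9') throw new ArgumentException("Not a digit.");
            // Subtract each digit so the running value stays <= 0;
            // checked turns overflow below int.MinValue into OverflowException.
            result = checked(result * 10 - (c - '0'));
        }
        // Negating int.MinValue overflows, which correctly rejects "2147483648".
        return isNegative ? result : checked(-result);
    }
}
```

With this sketch, "-2147483648" and "2147483647" both parse, while "2147483648" and "-2147483649" raise OverflowException.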
    


    That makes two errors already in seemingly primitive code: one more reason to think twice before abandoning the standard libraries.
