Karatsuba algorithm for multiplying two numbers

    Somehow I had to implement the multiplication of long numbers, through the halves of their representation in C ++. 128-bit numbers were presented as a pair of 64-bit numbers. It turned out that to multiply two 128-bit numbers and get all 256 bits of the result in complexity is comparable to 4 products of 64-bit halves. How does it work ...



    For multiplication, the method of the domestic mathematician Anatoly Alekseevich Karatsuba was used . First, I’ll give you my comment on the function. He clarifies a lot about the algorithm.

    //
    // Karatsuba multiplication algorithm
    //
    //            +------+------+
    //            |  x1  |  x0  |
    //            +------+------+
    //                   *
    //            +------+------+
    //            |  y1  |  y0  |
    //            +------+------+
    //                   =
    //     +-------------+-------------+
    //  +  |    x1*y1    |    x0*y0    |
    //     +----+-+------+------+------+
    //          . .             .
    //          . .             .
    //          +-+------+------+
    //       +  | x0*y1 + x1*y0 |
    //          +-+------+------+
    //
    

    In principle, the commentary makes it clear how the result is calculated.
    The essence of the method can be understood if the following formula is derived:



    T symbolizes the indentation, respectively T ^ 2 - double indentation.
    Here is the code that performs these operations:

    //
    // Params:
    //     T  - is type of x0, x1, y0 and y1 halves
    //     T2 - is type of x, y and half of res
    //
    template
    inline void Karatsuba_multiply(T * const x, T * const y, T2 * res)
    {
        // Define vars (depending from byte order)
        #define ptrRes ((T*)res)
        T2 &  lowWord = (T2&)(ptrRes[0]);
        T2 &  midWord = (T2&)(ptrRes[1]);
        T2 & highWord = (T2&)(ptrRes[2]);
        T  & highByte = (T &)(ptrRes[3]);
        #undef ptrRes
        const T & x0 = x[0];
        const T & x1 = x[1];
        const T & y0 = y[0];
        const T & y1 = y[1];
        // Multiply action
        lowWord  = x0 * y0;
        highWord = x1 * y1;
        T2 m1 = x0 * y1;
        T2 m2 = x1 * y0;
        midWord += m1;
        if (midWord < m1) highByte++;
        midWord += m2;
        if (midWord < m2) highByte++;
    }
    

    Type T is a type of half, type x0, x1, y0, y1.
    Type N2 is the type of half the result, 2 times that of type T.

    Example of using the function:

    int main()
    {
        typedef unsigned char   u8;
        typedef unsigned short u16;
        typedef unsigned int   u32;
        u16 a = 1000;
        u16 b = 2000;
        u32 r = 0;
        u8  * a_ptr = (u8*)&a;
        u8  * b_ptr = (u8*)&b;
        u16 * r_ptr = (u16*)(void*)&r;
        Karatsuba_multiply(a_ptr, b_ptr, r_ptr);
        cout << r;
    }
    

    The code with the result of execution can be found here: codepad.org/1OTeGqhA
    Code with an algorithm for different orders of bytes (LE + BE) here:codepad.org/f5Pxtiq1

    UPDATE1:
    User mark_ablov noticed that the code lacks #undef.
    Corrected Code:codepad.org/13U4fuTp
    Corrected full code (LE + BE):codepad.org/kBazqo8f

    UPDATE2: The
    user ttim noticed that the above algorithm is not exactly the Karatsuba method and pointed to the possibility of using 3 multiplications instead of four.

    Clarity formula:



    Thus, only 3 products need to be calculated:
    1. a * x
    2. b * y
    3. (Ta + b) * (Tx + y)

    Unfortunately, I will not be able to use this formula, because . I do not have the ability to multiply 128-bit numbers. Actually, my task was to implement the multiplication of 128-bit numbers. The fact is that the numbers of (Ta + b) and (Tx + y) are 128 bits ...

    UPDATE3: The
    user susl continued the discussion of the algorithm and showedthat you don’t need to multiply 128-bit numbers at all.

    Another formula:



    The function can be rewritten as follows:

    //
    // Params:
    //     T  - is type of x0, x1, y0 and y1 halves
    //     T2 - is type of x, y and half of res
    //
    template
    inline void Karatsuba_multiply(T * const x, T * const y, T2 * res)
    {
        // Define vars (depending from byte order)
        #define ptrRes ((T*)res)
        T2 &  lowWord = (T2&)(ptrRes[0]);
        T2 &  midWord = (T2&)(ptrRes[1]);
        T2 & highWord = (T2&)(ptrRes[2]);
        T  & highByte = (T &)(ptrRes[3]);
        #undef ptrRes
        const T & x0 = x[0];
        const T & x1 = x[1];
        const T & y0 = y[0];
        const T & y1 = y[1];
        // Multiply action
        lowWord  = x0 * y0;
        highWord = x1 * y1;
        //T2 m1 = x0 * y1;
        //T2 m2 = x1 * y0;
        T2 m = (x0+x1)*(y0+y1) - (lowWord + highWord);
        //midWord += m1;
        //if (midWord < m1) highByte++;
        //midWord += m2;
        //if (midWord < m2) highByte++;
        midWord += m;
        if (midWord < m) highByte++;
    }
    

    Corrected example: http://codepad.org/w0INBD77
    Corrected example for LE and BE: http://codepad.org/nB9HqWt1

    Also popular now: