Butcher's even-odd merge sort
Introduction
The odd-even mergesort algorithm was developed by Batcher in 1968. The algorithm is not too popular and not too famous. However, it is fairly easy to parallelize and its implementation is not too complicated. Personally, I found out about him when I figured out MPI and saw a test task on coursera: write a Butcher sort.
Basic operations
The above code is not ideal in terms of performance, but it is the easiest to understand and understand the algorithm.
In the algorithm, we will need the following three abstract operations:
compare-exchange - we interchange elements if they go out of order.
template <classT>
voidcompexch(T& a, T&b)
{if (b < a)
std::swap(a, b);
}
perfect shuffle - divide the array in half and then the first element of the first half is the first as a result, the first element of the second half is the second as a result, the second of the first half is the third as a result, etc. The array is necessarily an even length. In fact, we place the elements of the first half in even positions, and of the second in odd positions.
template <classT>
voidshuffle(std::vector<T>& a, unsignedint l, unsignedint r)
{
auto half = (unsignedint) (l + r) / 2;
std::vector<T> tmp(a.size());
unsignedint i, j;
for (i = l, j = 0; i <= r; i += 2, j++)
{
tmp[i] = a[l + j];
tmp[i + 1] = a[half + j + 1];
}
for (auto i = 0; i < tmp.size(); i++)
a[i] = tmp[i];
}
perfect unshuffle - operation opposite to the previous one. Elements that take even positions are sent to the first half of the result array, odd ones to the second.
template <classT>
voidunshuffle(std::vector<T>& a, unsignedint l, unsignedint r)
{
auto half = (unsignedint) (l + r) / 2;
std::vector<T> tmp(a.size());
unsignedint i, j;
for (i = l, j =0; i<=r; i += 2, j++)
{
tmp[l + j] = a[i];
tmp[half + j + 1] = a[i + 1];
}
for (auto i = 0; i < tmp.size(); i++)
a[i] = tmp[i];
}
The algorithm itself
As often happens in posts / articles / books about sorting, we assume that the array that comes to our input has a size equal to a power of two. This will simplify the description of the algorithm and the proof of its correctness. This restriction will be removed later.
Using the introduced operations, the algorithm is formulated quite simply. Using the unshuffle operation, we split the array into two halves. Next, you need to sort each of these halves and then merge it back using the shuffle operation. The algorithm is not just called even-odd sort merge - an approach similar to the well-known The merge The sort , except on the part of the partitioning logic is the other - on an index parity, rather than just two.
The simplest implementation using the entered operations:
template <classT>
voidOddEvenMergeSort(std::vector<T>& a, unsignedint l, unsignedint r)
{
if (r == l + 1) compexch(a[l], a[r]); //мы дошли до подмассива размера 2 - теперь просто сравним элементыif (r < l + 2) return; //дошли до подмассива размера 1 - выходим, такой подмассив априори отсортирован
unshuffle(a, l, r); //делим подмассив на две частиauto half = (unsignedint) (l + r) / 2;
OddEvenMergeSort(a, l, half);
OddEvenMergeSort(a, half + 1, r); //вызываемся рекурсивно для половинок
shuffle(a, l, r); //сливаем частиfor (auto i = l + 1; i < r; i += 2)
compexch(a[i], a[i + 1]);
auto halfSize = (r - l + 1) / 2 - 1; //*for (int i = l + 1; i + halfSize < r; i++) //*
compexch(a[i], a[i + halfSize]); //*
}
Comment
Если вы, как и я, читали про этот алгоритм у Сэджвика в «Фундаментальных алгоритмах на С++», то можете заметить, что у него в функции OddEvenMergeSort нет строк, помеченных "*". Уж опечатка это или что, не знаю. Однако алгоритм, приведенный в его книге, ошибается, например, на строке «ABABABAB».
После первого же вызова unshuffle мы получим «AAAABBBB». Далее мы вызываемся рекурсивно для частей «AAAA» и «BBBB». Будем считать, что алгоритм работает верно. Тогда после вызовов мы так и получим части «AAAA» и «BBBB». Сделаем shuffle, получим «ABABABAB». Попарное сравнение выродится в 4-х кратный вызов compexch(«A», «B»), которые ничего не изменят.
Три добавленные строки решают эту проблему. В будущем, если будет время, опишу, почему.
После первого же вызова unshuffle мы получим «AAAABBBB». Далее мы вызываемся рекурсивно для частей «AAAA» и «BBBB». Будем считать, что алгоритм работает верно. Тогда после вызовов мы так и получим части «AAAA» и «BBBB». Сделаем shuffle, получим «ABABABAB». Попарное сравнение выродится в 4-х кратный вызов compexch(«A», «B»), которые ничего не изменят.
Три добавленные строки решают эту проблему. В будущем, если будет время, опишу, почему.
Description
The principle of operation itself is practically no different from merge sort , however, merge operations are performed in completely different ways. If in merge sort we get two indexes - in the first and second half of the array, where the halves are already sorted, and at each step we simply put the smallest of the current in each half as a result, then here we just do the shuffle operation, and then compare the resulting pairwise array.
How to start?
Just call
OddEvenMergeSort(a, 0, a.size() - 1);
What about arrays of length not being a power of two?
The easiest way is to add the required number of elements to the power of two, which are a priori more (or less) of any element in the original array.
The second approach is this.
Because any number can be represented as the sum of powers of two, then divide the array into such subarrays, sort them individually by Butcher sorting and combine using an operation similar to merge
sorting in merge sort. In this case, small subarrays can be sorted by another algorithm, which, for example, does not require recursive calls .
Work example
AGINORSTAEELMPXY
AIOSAEMXGNRTELPY
AOAMISEX
AAOM
AA
MO
AMAO
AAMO
IESX
EI
SX
ESIX
EISX
AEAIMSOX
AAEIMOSX
GREPNTLY
GERP
EG
PR
EPGR
EGPR
NLTY
LN
TY
LTNY
LNTY
ELGNPTRY
EGLNPRTY
AEAGELINMPORSTXY
AAEEGILMNOPRSTXY