Optimizing C++/General optimization techniques/Sorting
The C++ Standard Template Library (STL) provides the template function sort
that implements a comparison sort algorithm. Because sort
is templatized, it can be used for various types of sequences holding any type of key, as long as the keys are comparable (implement the < operator). A good compiler can generate code optimized for the various kinds of sequence/key combinations.
The reference implementation of the STL uses the introsort algorithm (since the 2000 release; the GNU C++ library uses the reference implementation). This algorithm is a very fast combination of quicksort and heapsort with a specially designed selection algorithm.
The sort
template function is not guaranteed to be stable. When a stable sort is required, use the stable_sort
template function instead.
This section suggests alternatives to the sort
and stable_sort
template functions that may be faster in specific cases.
Sorting with small ranges of keys
[edit | edit source]To sort a data set according an integer key having a small range, use the counting sort algorithm.
The counting sort algorithm has O(N+M) complexity, where N is the number of elements to sort and M is the range of the sort keys, that is the difference between the highest key and the lowest key.
In case N elements are to be sorted whose key is an integer number belonging to an interval containing no more than two times N values (i.e when M <= 2 * N
holds), this algorithm may be quite faster than sort
.
In some cases it is faster even with larger ranges.
This algorithm may be used also for a partial ordering; for example, if the keys are integers between zero and one billion, you can still sort them using only the most significant byte, so to get an order for which the formula holds.
Example: sorting 8-bit integers
[edit | edit source]Say you want to sort an array of arbitrary unsigned char
elements. <climits>
defines constant limits for integer and char types for a specific implementation. CHAR_BIT is the number of bits in a char
object. ISO C++ requires CHAR_BIT to be 8 or greater. An unsigned char
may have a value in the range between 0 and UCHAR_MAX. ISO C++ requires UCHAR_MAX to be 255 (2^8-1) or greater.
Note: unsigned char
must be used because char
can be signed or unsigned depending on the implementation.
#include <climits>
void count_sort(unsigned char *a, unsigned char *const end)
{
unsigned char freq[UCHAR_MAX+1] = {0};
unsigned char *p, c;
for (p = a; p < end; ++p) {
freq[*p] += 1;
}
for (c = 0, p = a; c < UCHAR_MAX; ++c) {
while (freq[c]-- > 0) {
*p++ = c;
}
}
while (freq[c]-- > 0) {
*p++ = c;
}
}
The counting_sort
function implements the pigeonhole sort algorithm. It takes a pointer to the first element of the input array and a pointer that points one element beyond the end of the array. Why? Because we don't have to stop here.
We can generalize counting_sort
to a template function that also works for string
, vector<unsigned char>
and other sequence types, without loss of efficiency. When doing so, we need to work with iterators rather than pointers.
#include <iterator>
#include <limits>
template <typename iterator>
void counting_sort(iterator const &begin, iterator const &end)
{
typedef std::iterator_traits<iterator>::value_type T;
T max = std::numeric_limits<T>::max();
T freq[max+1] = {0};
iterator it;
T c;
for (it = begin; it < end; ++it) {
freq[*it] += 1;
}
for (c = 0, it = begin; c < max; ++c)
while (freq[c]-- > 0) {
*it++ = c;
}
}
while (freq[c]-- > 0) {
*it++ = c;
}
}
Partial sorting
[edit | edit source]Partitioning
[edit | edit source]If you have to split a sequence according a criterion, use a partitioning algorithm, instead of a sorting one.
In STL there is the std::partition
algorithm, that is faster than the std::sort
algorithm, as it has O(N) complexity, instead of O(N log(N)).
Stable partitioning and sorting
[edit | edit source]If you have to partition or sort a sequence for which equivalent entities may be swapped, don't use a stable algorithm.
In STL there is the std::stable_partition
partitioning algorithm, that is slightly slower than the std::partition
algorithm; and there is the std::stable_sort
sorting algorithm, that is slightly slower than the std::sort
algorithm.
Order partitioning
[edit | edit source]If you have to pick out the first N elements from a sequence, or the Nth element in a sequence, use an order partitioning algorithm, instead of a sorting one.
In STL there is the std::nth_element
algorithm, that, although slightly slower than the std::stable_partition
algorithm, is quite much faster than the std::sort
algorithm, as it has O(N) complexity, instead of O(N log(N)).
Sorting only the first N elements
[edit | edit source]If you have to sort the first N elements of a much longer sequence, use an order statistic algorithm, instead of a sorting one.
In STL there are the std::partial_sort
and std::partial_sort_copy
algorithms, that, although slower than the std::nth_element
algorithm, are so much faster than the std::sort
algorithm as the partial sequence to sort is shorter than the whole sequence.