- Key point: Lower bound on comparison-based sorting is Ω(nlogn)
-
How to get around it?
- Take advantage of constant time to access in arrays (leave the comparison models)
- Restrict the allowed inputs
Direct Access Array Sort
- Items must have unique keys from a bounded positive range {0,...,u−1} (so n≤u)
- Construct a direct access array of size u, insert each item x into x.key, then read array left to right
- Inserting takes Θ(n) time, initializing and scanning take Θ(u) time, so overall Θ(n+u). If u=O(n), then this algorithm is linear
- Drawbacks: Cannot handle duplicate keys and large key ranges
Counting Sort
- Link a chain to each direct acess array index; when multiple items have the same key, store them both in the chain associated with their key
- For the algorithm to be stable (items with duplicate keys appear in same order in input and output), choose chains that support a sequence queue interface, inserting to end of queue and returning items back in order of insertion
- O(u) to initialize chains, O(n) to insert all the elements, O(u) to scan back through; overall O(n+u), which is linear when u=O(n)
- Cumulative sums: keep track of how many of each key map to each index, then move each item only once (instead of moving item into chain and then back into place)
Tuple Sort
- To handle large key ranges, we'll break up integer keys into parts, then sort each part. To do hat, we'll need a strategy to sort multiple parts, i.e. tuples
- Use a stable sorting algorithm as a subroutine, work from least to most important key
Radix Sort
- Break each integer into its multiples of powers of n, representing each item key as its sequence of digits when represented in base n
- If the integers are non-negative and the largest integer in the set is u, then this base n number will have ⌈lognu⌉ digits
- Run tuple sort from least to most significant digit
- If the largest integer in the set u≤nc, then radix sort runs in O(nc) time
Exercises