← Back to mit.katmh.com

# Linear Sorting

• Key point: Lower bound on comparison-based sorting is $\Omega(n \log n)$
• How to get around it?

• Take advantage of constant time to access in arrays (leave the comparison models)
• Restrict the allowed inputs

# Direct Access Array Sort

• Items must have unique keys from a bounded positive range $\{0, ..., u-1\}$ (so $n \leq u$)
• Construct a direct access array of size $u$, insert each item $x$ into $x.key$, then read array left to right
• Inserting takes $\Theta(n)$ time, initializing and scanning take $\Theta(u)$ time, so overall $\Theta(n+u)$. If $u = O(n)$, then this algorithm is linear
• Drawbacks: Cannot handle duplicate keys and large key ranges

# Counting Sort

• Link a chain to each direct acess array index; when multiple items have the same key, store them both in the chain associated with their key
• For the algorithm to be stable (items with duplicate keys appear in same order in input and output), choose chains that support a sequence queue interface, inserting to end of queue and returning items back in order of insertion
• $O(u)$ to initialize chains, $O(n)$ to insert all the elements, $O(u)$ to scan back through; overall $O(n+u)$, which is linear when $u = O(n)$
• Cumulative sums: keep track of how many of each key map to each index, then move each item only once (instead of moving item into chain and then back into place)

# Tuple Sort

• To handle large key ranges, we'll break up integer keys into parts, then sort each part. To do hat, we'll need a strategy to sort multiple parts, i.e. tuples
• Use a stable sorting algorithm as a subroutine, work from least to most important key

• Break each integer into its multiples of powers of $n$, representing each item key as its sequence of digits when represented in base $n$
• If the integers are non-negative and the largest integer in the set is $u$, then this base $n$ number will have $\lceil \log_n u \rceil$ digits
• If the largest integer in the set $u \leq n^c$, then radix sort runs in $O(nc)$ time