Merge Sort
Merge sort is a divide-and-conquer algorithm that splits an array in half, recursively sorts each half, then merges the two sorted halves. It guarantees O(n log n) in all cases and is stable, making it a top choice for sorting linked lists and for stable sort requirements.
Divide & Conquer
Array: [5, 3, 8, 1, 2, 7, 4, 6]
Divide:
[5,3,8,1] [2,7,4,6]
[5,3] [8,1] [2,7] [4,6]
[5][3] [8][1] [2][7] [4][6] ← base cases (single elements)
Conquer (merge bottom-up):
[3,5] [1,8] [2,7] [4,6]
[1,3,5,8] [2,4,6,7]
[1,2,3,4,5,6,7,8] ← fully sorted
Implementation
def merge_sort(arr):
if len(arr) <= 1:
return arr
mid = len(arr) // 2
left = merge_sort(arr[:mid])
right = merge_sort(arr[mid:])
return merge(left, right)
def merge(left, right):
result = []
i = j = 0
while i < len(left) and j < len(right):
if left[i] <= right[j]:
result.append(left[i])
i += 1
else:
result.append(right[j])
j += 1
result.extend(left[i:])
result.extend(right[j:])
return result
nums = [5, 3, 8, 1, 2, 7, 4, 6]
print(merge_sort(nums)) # [1, 2, 3, 4, 5, 6, 7, 8]
The Merge Step in Detail
The merge step is the heart of merge sort. It combines two sorted arrays into one in O(n) time by advancing two pointers and always picking the smaller front element.
left = [1, 3, 5, 8]
right = [2, 4, 6, 7]
# Trace:
# i=0,j=0: 1 < 2 → take 1 result=[1]
# i=1,j=0: 3 > 2 → take 2 result=[1,2]
# i=1,j=1: 3 < 4 → take 3 result=[1,2,3]
# i=2,j=1: 5 > 4 → take 4 result=[1,2,3,4]
# i=2,j=2: 5 < 6 → take 5 result=[1,2,3,4,5]
# i=3,j=2: 8 > 6 → take 6 result=[1,2,3,4,5,6]
# i=3,j=3: 8 > 7 → take 7 result=[1,2,3,4,5,6,7]
# i=3,j=4: j exhausted → extend with left[3:] = [8]
# result=[1,2,3,4,5,6,7,8]
Stability
Merge sort is stable because the merge step uses <= to prefer the left element on ties, preserving the original relative order of equal elements. This is critical when sorting records by one field while maintaining secondary ordering.
# Sort people by age; people with same age keep original order
people = [("Alice", 30), ("Bob", 25), ("Carol", 30), ("Dave", 25)]
sorted_people = merge_sort_by_age(people)
# Expected: [('Bob',25), ('Dave',25), ('Alice',30), ('Carol',30)]
# Bob before Dave (original order), Alice before Carol (original order)
Space Trade-off
Each merge creates a new list. Total auxiliary space is O(n) at any level of recursion (each level processes n elements in total). This is the main disadvantage vs quick sort, which is O(log n) extra space. For memory-constrained environments, quick sort is often preferred despite its O(n²) worst case.
Complexity Table
| Case | Time | Space | Note |
|---|---|---|---|
| Best case | O(n log n) | O(n) | Always splits and merges |
| Average case | O(n log n) | O(n) | |
| Worst case | O(n log n) | O(n) | Guaranteed — no bad pivot |
| Stable | Yes | — | Left element taken on ties |
| In-place | No | — | Requires O(n) auxiliary space |