🤑 Join the Treehouse affiliate program and earn 25% recurring commission!

New No-Code Track! 🚀start learning today!

🌟 Dreaming of a bright future? 🎓 Ask about the Treehouse Scholarship program! 🚀

✨ Earn college credits in Cybersecurity, JS, HTML, CSS and Python

Well done!

You have completed Algorithms: Sorting and Searching!

Sign up for Treehouse Back to Library

Preview

Sign up for Treehouse Continue

Merge Sort

7:10 with Jay McGavren

Let's review another sorting algorithm, Merge Sort, so we can compare it with Quick Sort.

Teacher's Notes
Questions?
Video Transcript
Downloads
Workspaces

If you'd like more details on Merge Sort, check this stage from our Introduction to Data Structures course.

Both Quicksort and Merge Sort are recursive. The difference between them is in the sorting mechanism itself. Whereas Quicksort sorts a list into two sub-lists that are less than or greater than a pivot value, Merge Sort simply splits the list in half recursively, and then sorts the halves as it merges them back together. (That's why it's called "Merge Sort".)

# You may recognize this code at the top by now; it just loads a file full
# of numbers into a list.
import sys
from load import load_numbers

numbers = load_numbers(sys.argv[1])

# Let's define a recursive merge sort function. As usual, it takes the
# list or sub-list that we want it to sort.
def merge_sort(values):
  # Our base case is the same as with Quicksort. If the list has zero or one
  # values, there's nothing to sort, so we return it as-is.
  if len(values) <= 1:
    return values
  # If we didn't return, it means we're in the recursive case. So first we need
  # to split the list in half. We need to know the index we should split on,
  # so we get the length of the list and divide it by 2. So for example if
  # there are 8 items in the list, we'll want an index of 4. But what if there
  # were an odd number of items in the list, like 7? We can't have an index of
  # 3.5, so we'll need to round down in that case. Since we're working in
  # Python currently, we can take advantage of a special Python operator that
  # divides and rounds the result down: the floor division operator. It
  # consists of a double slash.
  middle_index = len(values) // 2
  # Now we'll use the Python slice syntax to get the left half of the list.
  # We'll pass that list to a recursive call to the merge_sort function.
  left_values = merge_sort(values[:middle_index])
  # We'll also use slice syntax to get the right half of the list, and pass
  # that to merge_sort as well.
  right_values = merge_sort(values[middle_index:])
  # Now we need to merge the two halves together, and sort them as we do it.
  # We'll create a list to hold the sorted values.
  sorted_values = []
  # Now we get to the complicated part - merging the two halves together and
  # sorting them as we do it.
  # We'll be moving from left to right through the left half of the list,
  # copying values over to the sorted_values list as we go. This left_index
  # variable will help us keep track of our position.
  left_index = 0
  # At the same time, we'll also be moving from left to right through the right
  # half of the list and copying values over, so we need a separate
  # right_index variable to track that position as well.
  right_index = 0
  # We'll keep looping until we've processed all of the values in both halves
  # of the list.
  while left_index < len(left_values) and right_index < len(right_values):
    # We're looking to copy over the lowest values first. So first we test
    # whether the current value on the left side is less than the value on the
    # right side.
    if left_values[left_index] < right_values[right_index]:
      # If the left side value is less, that's what we'll copy to the sorted
      # list.
      sorted_values.append(left_values[left_index])
      # And then we'll move to the next value in the left half of the list.
      left_index += 1
    # Otherwise, the current value from the right half must have been lower.
    else:
      # So we'll copy that value to the sorted list instead.
      sorted_values.append(right_values[right_index])
      # And then we'll move to the next value in the right half of the list.
      right_index += 1
  # At this point one of the two unsorted halves still has a value remaining,
  # and the other is empty. We won't waste time checking which is which.
  # We'll just copy the remainder of both lists over to the sorted list.
  # The one with a value left will add that value, and the empty one will add
  # nothing.
  sorted_values += left_values[left_index:]
  sorted_values += right_values[right_index:]
  # All the numbers from both halves should now be copied to the sorted list,
  # so we can return it.
  return sorted_values

# Finally, we need to kick the whole process off. We'll call the merge_sort
# function with the list of numbers we loaded, and print the result.
print(merge_sort(numbers))

Save the above code to a file named merge_sort.py, in the same directory as load.py. Let's try this on our file with eight numbers.

python merge_sort.py numbers/8.txt

It prints out the sorted list.

[2, 3, 3, 4, 5, 6, 7, 9]

But again, this seems pretty magical. Let's add some print statements to get some insight into what it's doing.
First, we'll print the unsorted list so we can refer to it. We'll add a print statement right before we call the merge_sort function for the first time.

print(numbers)

Then we'll add another print statement within the merge_sort function, right after the recursive calls. This will show us the sorted left half and right half that it's returning. Again, don't worry about the fancy Python formatting string; it just keeps the values neatly aligned.

print("%15s %-15s" % (left_values, right_values))

Let's try running this again.

python merge_sort.py numbers/8.txt
[4, 6, 2, 3, 9, 7, 3, 5]
            [4] [6]            
            [2] [3]            
         [4, 6] [2, 3]         
            [9] [7]            
            [3] [5]            
         [7, 9] [3, 5]         
   [2, 3, 4, 6] [3, 5, 7, 9]   
[2, 3, 3, 4, 5, 6, 7, 9]

What we're seeing are the values being returned from the recursive merge_sort function calls, not the original calls to merge_sort. So what you see here is after we reach the base case with a list that's only one item in length, and the recursive calls start returning.
The original list gets split into two unsorted halves: [4, 6, 2, 3], and [9, 7, 3, 5].
The first half gets split in half again: [4, 6] and [2, 3], and each of those halves is halved again into single-element lists: [4], [6], [2], and [3].
There's nothing to sort in the single-element lists, so they're returned from the merge_sort function as-is.
Those single-element lists get merged into two sub-lists: the [4, 6] and [2, 3] sub-lists look the same after sorting as they did before sorting.
But the order is shifted when those two sub-lists get merged back into a single list: [2, 3, 4, 6].
Then we recursively sort the right half of the list: [9, 7, 3, 5].
It gets split in half again: [9, 7] and [3, 5].
And each of those halves gets broken into single-element lists.
There's nothing to sort there, so they're returned as-is: [9], [7], [3], and [5]
The first two are sorted as they're merged: [7, 9].
The order of the second two is unchanged: [3, 5].
All those elements are sorted as they're merged into the second half of the full list: [3, 5, 7, 9].
And then the two halves are sorted as they're merged back into the full list: [2, 3, 3, 4, 5, 6, 7, 9]

Related Discussions

Have questions about this video? Start a discussion with the community and Treehouse staff.

Sign up

Related Discussions

Have questions about this video? Start a discussion with the community and Treehouse staff.

Sign up

Let's review another sorting algorithm, merge_sort, so 0:00

that we can compare it with quicksort. 0:03

merge_sort is already covered elsewhere on the site, so 0:05

we won't go into as much detail about it. 0:08

But we'll have more info in the teacher's notes, if you want it. 0:10

Both quicksort and merge_sort are recursive. 0:14

The difference between them is in the sorting mechanism itself. 0:17

Whereas quicksort sorts a list into two sub-lists that are less than or greater 0:20

than a pivot value, merge_sort simply splits the lift in half recursively, and 0:25

then sorts the halves as it merges them back together. 0:30

That's why it's called merge_sort. 0:33

You may recognize this code at the top by now. 0:35

It just loads a file full of numbers into a list. 0:38

Let's define a recursive merge_sort function. 0:41

As usual, it'll take the list or sublist that we wanted to sort. 0:44

Our base case is the same as with quicksort. 0:48

If the list has 0 or 1 values, there's nothing to sort, so we return it as is. 0:51

If we didn't return, it means we're in the recursive case. 0:57

So first we need to split the list in half. 1:00

We need to know the index we should split on. 1:02

So we get the length of the list and divide it by 2. 1:05

So for example, if there are 8 items in the list, we'll want an index of 4. 1:08

But what if there were an odd number of items in the list, like 7. 1:13

We can't have an index of 3.5, so we'll need to round down in that case. 1:16

Since we're working in Python currently, we can take advantage of a special Python 1:21

operator that divides and rounds the result down, the floor division operator. 1:25

It consists of a double slash. 1:30

Now we'll use the Python slice syntax to get the left half of the list. 1:33

We'll pass that list to a recursive call to the merge_sort function. 1:38

We'll also use slice syntax to get the right half of the list, 1:43

and pass that to merge_sort as well. 1:46

Now we need to merge the two halves together and sort them as we do it. 1:50

We'll create a list to hold the sorted values. 1:54

And now we get to the complicated part, merging the two halves together and 1:57

sorting them as we do it. 2:01

We'll be moving from left to right through the left half of the list, 2:03

copying values over to the sorted values list as we go. 2:07

This left_index variable will help us keep track of our position. 2:10

At the same time, we'll also be moving from left to right through the right half 2:14

of the list and copying values over. 2:18

So we need a separate right_index variable to track that position as well. 2:20

We'll keep looping until we've processed all of the values in both 2:25

halves of the list. 2:30

We're looking to copy over the lowest values first. 2:38

So first, we test whether the current value on the left side is less 2:40

than the value on the right side. 2:45

If the left side value is less, that's what we'll copy over to the sorted list. 2:48

And then we'll move to the next value in the left half of the list. 2:57

Otherwise, the current value from the right half must have been lower, so 3:01

we'll copy that value to the sorted list instead. 3:06

And then, we'll move to the next value in the right half of the list. 3:13

That ends the loop. 3:17

At this point, one of the two unsorted halves still has a value remaining, and 3:19

the other is empty. 3:22

We won't waste time checking which is which. 3:23

We'll just copy the remainder of both lists over to the sorted list. 3:26

The one with a value left, we'll add that value, and the empty one, 3:30

we'll add nothing. 3:33

All the numbers from both halves should now be copied to the sorted list, so 3:34

we can return it. 3:38

Finally, we need to kick the whole process off. 3:39

We'll call the merge_sort function with the list 3:42

of numbers we loaded and print the result. 3:47

Let's save this. 3:54

And we'll try it out on our file with eight numbers. 4:01

Python merge_sort.py numbers/8.txt, 4:03

and it prints out the sorted list. 4:08

But again, this seems pretty magical. 4:12

Let's add some print statements to get some insight into what it's doing. 4:14

First, we'll print the unsorted list, so we can refer to it. 4:19

We'll add a print statement right before we call the merge_sort function for 4:23

the first time. 4:26

Then we'll add another print statement within the merge_sort function 4:30

right after the recursive calls. 4:34

This will show us the sorted left half and right half that it's returning. 4:36

Again, don't worry about the fancy Python formatting string, 4:39

it just keeps the values neatly aligned. 4:42

Let me resize my console, clear the screen, 4:45

and then we'll try running this again. 4:48

What we're seeing are the values being returned from the recursive merge_sort 4:53

function calls, not the original calls to merge_sort. 4:57

So what you see here is after we reach the base case with a list that's only one item 5:00

in length, and the recursive calls start returning. 5:04

The original list gets split into two unsorted halves, 4, 6, 5:08

3, and 2, and 9, 7, 3 and 5. 5:13

The first half gets split in half again, 4 and 6, and 3 and 2. 5:15

And each of those halves is halved again into single element lists. 5:21

There's nothing to sort in the single element list, so they'll return for 5:26

the merge_sort function as is. 5:30

Those single element lists get merged into two sublists, and sorted as they do so. 5:31

The 4 and 6 sublist looks the same after sorting as it did before sorting, but 5:37

the 3 and the 2 get sorted as they're combined into a sublist, 5:42

the new order is 2, 3. 5:46

The order is shifted again when those two sublists get combined back into a single 5:47

list, 2, 3, 4, 6. 5:52

Then we recursively sort the right half of the original list, 9, 7, 3, 5. 5:54

It gets split in half again, 9, 7 and 3, 5. 6:01

And each of those halves get broken into single element lists. 6:05

There's nothing to sort there, so the single element lists are returned as is. 6:09

The first two were sorted as they're merged, 7, 9. 6:14

And so are the second, 3, 5. 6:18

And then those two sublists get sorted as they're combined into another sublist, 6:21

3, 5, 7, 9. 6:26

And finally, everything is sorted as it's merged back into 6:28

the full sorted list, 2, 3, 3, 4, 5, 6, 7, 9. 6:32

That's how merge_sort works on a list of eight numbers. 6:37

Let's see if it works on a bigger list. 6:40

First I'll remove the two print statements, so 6:44

we don't get an overwhelming amount of debug output. 6:46

Then I'll run it on a list of 10,000 items. 6:53

python merge_sort.py numbers/10000.txt. 6:55

Not only did it work, it was pretty fast. 7:02

But which is faster, merge_sort or quicksort? 7:06

We'll look at that next. 7:08

You need to sign up for Treehouse in order to download course files.

Sign up

You need to sign up for Treehouse in order to set up Workspace

Sign up