Two Ways to Sort a List

A - compare

  1. Build a new list as the result
  2. Transfer the minimum value in original list into new list
  3. Repeat step 2
  4. End when the original list has length of 0 or 1

example code:

1
2
3
4
5
6
7
8
9
10
11
def sort1(iList):
if len(iList)<2: return(iList)
sList = []
for i in range(len(iList)):
min = iList[0]
for x in iList[1:]:
if x < min:
min = x
sList.append(min)
iList.remove(min)
return(sList)

B - recursion

  1. If the list has a length of 0 or 1, just return it.
  2. Otherwise, return the minimum value as a single-value list plus the output list of the remnant list.

example code:

1
2
3
4
5
6
7
def sort2(list):
if len(list) < 2:
return(list)
else:
mn = min(list)
list.remove(mn)
return([mn]+sort2(list))

Benchmark

Which one is the best?
Firstly I made 20 640-length-integer list to test two sorting methods.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
import random
import time
sort1_time = []
sort2_time = []
for x in range(20):
rList = random.sample(range(1280), 640)
tList = rList[:]
start_time = time.time()
sort1(tList)
sort1_time += [time.time() - start_time]
tList = rList[:]
start_time = time.time()
sort2(tList)
sort2_time += [time.time() - start_time]

Then I copy the result data into R and visualize it:

1
2
3
4
library(tidyr)
time <- gather(data.frame(round=factor(1:20),sort1,sort2), method, time, -round)
library(ggplot2)
ggplot(time, aes(x = round, y = time, fill = method)) + geom_bar(position = "dodge", stat="identity")



The reason

  • In sort1, we have two loops.

    • One is for i in range(len(iList)) and the other is for x in iList[1:] which is inside the first one.
    • For an i-th iteration, it has a sub-loop with n-i iteration (n=len(list)).
    • So each sorting has $\frac{1}{2}n^2+n+\frac{1}{2}$ times iterations.
    • Sort1’ order of growth is $O(n^2)$.
  • In sort2, however, we use recursive loop.

    • For the last recursion, it only have 2-time calculation. One for checking if’s condition and one for return value.
    • The second last one has 4-time calculation plus 2-time calculation from the last recursion.
    • The third last one also has 4-time calculation plus 6-time calculation from the second last recursion.
    • Overall, recursive method has $4*(n-1)+2=4n-2$ times of calculation.
    • Its order of growth is $O(n)$.

In conclusion, their calculation amount is not the same and this is the reason that sort1 spend much more time.
The advantage of recursion can be more obvious in big data.

0%