Algorithms


Quantifying complexity W1L7

Big-O

f(n) is O(g(n)) if 'after some time', g(n) dominates f(n) (up to a constant factor).

'After some time' in the sense that there exist constants n₀ and c such that

∀n ≥ n₀,
  f(n) ≤ c.g(n)

The relation between f(n) and g(n) doesn't matter when n<n₀.

For n≥n₀, f(n) always lies on or below c.g(n) in a graph plot.
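A worked example (my own, not from the lecture): take f(n) = 3n² + 10n and g(n) = n². For n ≥ 10 we have 10n ≤ n.n = n², so

  f(n) = 3n² + 10n ≤ 3n² + n² = 4n²

ie, n₀ = 10 and c = 4 witness f(n) = O(n²).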

A small proof

If:

f₁(n) = O(g₁(n)) and f₂(n) = O(g₂(n))

then,

f₁(n) + f₂(n) = O(max(g₁(n), g₂(n)))

ie, the least efficient phase in an algorithm determines the upper bound of the whole algorithm.

Proof:

We know that:

∃n₁, c₁, ∀n≥n₁, f₁(n) ≤ c₁.g₁(n)
∃n₂, c₂, ∀n≥n₂, f₂(n) ≤ c₂.g₂(n)

Let n₃ = max(n₁, n₂), c₃ = max(c₁, c₂). Then, for all n ≥ n₃, both inequalities hold, so

f₁(n) + f₂(n)
  ≤ c₁.g₁(n) + c₂.g₂(n)
  ≤ c₃.g₁(n) + c₃.g₂(n)
  = c₃[g₁(n) + g₂(n)]
  ≤ c₃(2*max[g₁(n), g₂(n)])
  = 2c₃(max[g₁(n), g₂(n)])

ie,

∀n≥n₃, f₁(n) + f₂(n) ≤ 2c₃(max[g₁(n), g₂(n)])

∴ f₁(n) + f₂(n) = O(max(g₁(n), g₂(n)))

Big-Ω

f = Ω(g)

means

∃n₀, ∃c>0, ∀n≥n₀,
  f(n) ≥ c.g(n)

All comparison-based sorting algorithms are Ω(n.log n).

No comparison-based algorithm can do better; algorithms like counting sort beat this bound only by exploiting extra structure in the input, not by comparing elements.
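Why (a standard decision-tree sketch, not from the lecture): a comparison sort must distinguish all n! possible orderings of the input, and each comparison has only 2 outcomes, so t comparisons distinguish at most 2ᵗ orderings:

  2ᵗ ≥ n!
   t ≥ log₂(n!) = Θ(n.log n)     (by Stirling's approximation)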

Θ: Tight bound

A Θ bound pins down an algorithm's growth rate exactly: the same g(n) bounds the running time from above and below, up to constant factors. When an algorithm's running time matches the problem's lower bound this tightly, it is asymptotically optimal; we can't do any better.

f = Θ(g)

means

∃n₀, cₗ, cₕ, ∀n≥n₀,
  cₗ.g(n) ≤ f(n) ≤ cₕ.g(n)

Examples:
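One worked example (my own): f(n) = n(n+1)/2 is Θ(n²), since for all n ≥ 1,

  (1/2).n² ≤ n(n+1)/2 ≤ n²

so n₀ = 1, cₗ = 1/2, cₕ = 1 work.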

Analysis of merge sort W2L6

Merge operation:

-- | Merges two lists
-- Input lists are assumed to be sorted
merge :: Ord a => [a] -> [a] -> [a]
merge l [] = l
merge [] l = l
merge l1@(a:aa) l2@(b:bb)
  | a < b = a : (merge aa l2)
  | otherwise = b : (merge l1 bb)

a = merge [11,22,33] [12,21,30]  
-- [11,12,21,22,30,33]

If length l1 == n and length l2 == m, then max number of steps = m + n

Obviously,

   n+m ≤ 2*max(n,m)
=> n+m = O(max(n,m))
       = O(n), writing n for the longer list's length

ie, merge is a linear time operation.

DBT: Time complexity O(n + m)? Yes: since max(n,m) ≤ n+m ≤ 2*max(n,m), O(n+m) and O(max(n,m)) describe the same class.

Merge sort itself:

T(n) = 2T(n/2) + n
       |     |   |
       +--+--+   |
          |      +------> Merge operation
          v       
  2 instances of same problems but with input size halved

For simpler calculation, assume n=2ᵏ for some k.

T(1) = 1   -- Only one element => element already sorted
T(n) = 2T(n/2) + n
     = 2[2T(n/4) + (n/2)] + n       = 2²T(n/2²) + 2n
     = 2²[2T(n/2³) + (n/2²)] + 2n   = 2³T(n/2³) + 3n
     =  ....
     =  ....
     = 2ʲT(n/2ʲ) + jn

Stopping at j = k = log₂n (so that n/2ʲ = 1):

T(n) = n.T(1) + n.log₂n = n + n.log₂n = O(n.log n)
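The recurrence describes top-down merge sort. A runnable sketch (repeating merge here so the block is self-contained):

```haskell
-- | Merge two sorted lists (same as the merge above)
merge :: Ord a => [a] -> [a] -> [a]
merge l [] = l
merge [] l = l
merge l1@(a:aa) l2@(b:bb)
  | a <= b    = a : merge aa l2
  | otherwise = b : merge l1 bb

-- | Top-down merge sort: T(n) = 2T(n/2) + n, ie O(n.log n)
mergeSort :: Ord a => [a] -> [a]
mergeSort []  = []
mergeSort [x] = [x]
mergeSort xs  = merge (mergeSort left) (mergeSort right)
  where (left, right) = splitAt (length xs `div` 2) xs
```

λ> mergeSort [11,102,13,1,0,10]
[0,1,10,11,13,102]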

TODO List difference

This version works (see the example below), but is O(n*m): myElem scans ys once for every element of xs.

-- hand-rolled `elem`
myElem :: Eq a => a -> [a] -> Bool
myElem a (x:xs)
  | a==x = True
  | otherwise = myElem a xs
myElem _ [] = False

-- | Find 'difference' between two lists
listDiff :: Ord a => [a] -> [a] -> [a]  
listDiff (x:xs) ys
  | myElem x ys = listDiff xs ys
  | otherwise = x:listDiff xs ys
listDiff xs _ = xs
--listDiff xs [] = xs
--listDiff [] ys = []

λ> listDiff [1,2,3] [2,3]  
[1]
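If both lists are sorted (an extra assumption; listDiffSorted is my own name), a merge-style walk avoids the repeated myElem scans and runs in O(n+m):

```haskell
-- | Elements of the first sorted list not present in the second sorted list
listDiffSorted :: Ord a => [a] -> [a] -> [a]
listDiffSorted l1@(x:xs) l2@(y:ys)
  | x < y     = x : listDiffSorted xs l2  -- x cannot appear later in l2
  | x == y    = listDiffSorted xs l2      -- drop x; keep y for possible repeats of x
  | otherwise = listDiffSorted l1 ys      -- y is too small to matter; skip it
listDiffSorted xs _ = xs                  -- second list exhausted: keep the rest
```

λ> listDiffSorted [1,2,3] [2,3]
[1]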

Merge unique

The first attempt failed in two ways: the x==y branch discarded the element entirely, and duplicates inside a single list survived through the base cases. Fixed by emitting each element once and dropping its remaining copies from both sides:

-- | Merge, but elements of result should all be unique
mergeUniq :: Ord a => [a] -> [a] -> [a]
mergeUniq l1@(x:xs) l2@(y:ys)
  | x < y     = x : mergeUniq (dropWhile (== x) xs) l2
  | x > y     = y : mergeUniq l1 (dropWhile (== y) ys)
  | otherwise = x : mergeUniq (dropWhile (== x) xs) (dropWhile (== x) ys)
mergeUniq xs ys = dedup (xs ++ ys)   -- at least one list is empty here
  where dedup (a:rest@(b:_)) | a == b    = dedup rest
                             | otherwise = a : dedup rest
        dedup rest = rest

a = mergeUniq [1,2,3] [2,3,3,3,4]
-- [1,2,3,4]

List intersection

-- | Find common elements of two lists
-- Input lists are assumed to be sorted
intersect :: Ord a => [a] -> [a] -> [a]
intersect l1@(x:xs) l2@(y:ys)
  | x<y = intersect xs l2
  | x==y = x:intersect xs ys
  | otherwise = intersect l1 ys  -- x>y: discard y only, x may still match later
intersect _ _ = []

λ> intersect [1,2,3] [2,3,3,3,4,5]  
[2,3]

O(n+m)

Quicksort

quick :: Ord a => [a] -> [a]
quick (pivot:lst) = left ++ [pivot] ++ right
  where left = quick $ filter (< pivot) lst
        right = quick $ filter (>= pivot) lst  -- >=, so duplicates of the pivot aren't lost
quick [] = []
-- Data.List has
--   partition :: (a -> Bool) -> [a] -> ([a], [a])

λ> quick [11,102,13, 1, 0, 10]
[0,1,10,11,13,102]
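The comment above mentions Data.List's partition, which splits the list in one traversal instead of two filters. A sketch using it (quickP is my own name; the (< pivot) predicate sends elements equal to the pivot to the right half, so duplicates survive):

```haskell
import Data.List (partition)

-- | Quicksort via partition: one pass over the list per level
quickP :: Ord a => [a] -> [a]
quickP [] = []
quickP (pivot:rest) = quickP smaller ++ [pivot] ++ quickP larger
  where (smaller, larger) = partition (< pivot) rest
        -- smaller: elements < pivot; larger: elements >= pivot
```

λ> quickP [3,1,2,3]
[1,2,3,3]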

P and NP W8L7

P: decision problems solvable in polynomial time.

NP: decision problems whose 'yes' instances can be verified in polynomial time, given a suitable certificate. P ⊆ NP; whether P = NP is an open problem.

Asymptotic analysis OLD

Running time of an algorithm is usually a complex expression, so it is more worthwhile to have an estimate.

We get one by considering only the highest order term of the expression, because the highest order term dominates for large inputs.

Conventions

Doubts