Algorithms


Quantifying complexity W1L7

Big-O

f(n) is O(g(n)) if 'after some time', g(n) dominates f(n) (up to a constant factor).

'After some time' in the sense that there exist constants n₀ and c such that

∀n ≥ n₀,
  f(n) ≤ c.g(n)

The relation between f(n) and g(n) doesn't matter when n<n₀.

For n≥n₀, f(n) always lies on or below c.g(n) in a graph plot.
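A worked example (my own, not from the lecture): take f(n) = 3n² + 10n and g(n) = n². For n ≥ 10 we have 10n ≤ n.n = n², so

  f(n) = 3n² + 10n ≤ 3n² + n² = 4n²

ie, n₀ = 10 and c = 4 witness f(n) = O(n²).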

A small proof

If:

f₁(n) = O(g₁(n)) and f₂(n) = O(g₂(n))

then,

f₁(n) + f₂(n) = O(max(g₁(n), g₂(n)))

ie, the least efficient phase in an algorithm determines the upper bound of the whole algorithm.

Proof:

We know that:

∃n₁, c₁, ∀n≥n₁, f₁(n) ≤ c₁.g₁(n)
∃n₂, c₂, ∀n≥n₂, f₂(n) ≤ c₂.g₂(n)

Let n₃ = max(n₁, n₂), c₃ = max(c₁, c₂). Then, for all n ≥ n₃, both inequalities hold, so

f₁(n) + f₂(n)
  ≤ c₁.g₁(n) + c₂.g₂(n)
  ≤ c₃.g₁(n) + c₃.g₂(n)
  = c₃[g₁(n) + g₂(n)]
  ≤ c₃(2*max[g₁(n), g₂(n)])
  = 2c₃(max[g₁(n), g₂(n)])

ie,

∀n≥n₃, f₁(n) + f₂(n) ≤ 2c₃(max[g₁(n), g₂(n)])

∴ f₁(n) + f₂(n) = O(max(g₁(n), g₂(n)))

Big-Ω

f = Ω(g)

means

∃n₀, ∃c>0, ∀n≥n₀,
  f(n) ≥ c.g(n)

All comparison-based sorting algorithms are Ω(n.log n).

No comparison-based algorithm can do better; algorithms like counting sort beat this bound only by exploiting extra structure in the input, not by comparing elements.
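Why (a standard decision-tree sketch, not from the lecture): a comparison sort must distinguish all n! possible orderings of the input, and each comparison has only 2 outcomes, so t comparisons distinguish at most 2ᵗ orderings:

  2ᵗ ≥ n!
   t ≥ log₂(n!) = Θ(n.log n)     (by Stirling's approximation)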

Θ: Tight bound

A Θ bound pins down an algorithm's growth rate exactly: the same g(n) bounds the running time from above and below, up to constant factors. When an algorithm's running time matches the problem's lower bound this tightly, it is asymptotically optimal; we can't do any better.

f = Θ(g)

means

∃n₀, cₗ, cₕ, ∀n≥n₀,
  cₗ.g(n) ≤ f(n) ≤ cₕ.g(n)

Examples:
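One worked example (my own): f(n) = n(n+1)/2 is Θ(n²), since for all n ≥ 1,

  (1/2).n² ≤ n(n+1)/2 ≤ n²

so n₀ = 1, cₗ = 1/2, cₕ = 1 work.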

Analysis of merge sort W2L6

Merge operation:

-- | Merges two lists
-- Input lists are assumed to be sorted
merge :: Ord a => [a] -> [a] -> [a]
merge l [] = l
merge [] l = l
merge l1@(a:aa) l2@(b:bb)
  | a < b = a : (merge aa l2)
  | otherwise = b : (merge l1 bb)

a = merge [11,22,33] [12,21,30]  
-- [11,12,21,22,30,33]

If length l1 == n and length l2 == m, then max number of steps = m + n

Obviously,

   n+m ≤ 2*max(n,m)
=> n+m = O(max(n,m))
       = O(n), writing n for the longer list's length

ie, merge is a linear time operation.

DBT: Time complexity O(n + m)? Yes: since max(n,m) ≤ n+m ≤ 2*max(n,m), O(n+m) and O(max(n,m)) describe the same class.

Merge sort itself:

T(n) = 2T(n/2) + n
       |     |   |
       +--+--+   |
          |      +------> Merge operation
          v       
  2 instances of same problems but with input size halved

For simpler calculation, assume n=2ᵏ for some k.

T(1) = 1   -- Only one element => element already sorted
T(n) = 2T(n/2) + n
     = 2[2T(n/4) + (n/2)] + n       = 2²T(n/2²) + 2n
     = 2²[2T(n/2³) + (n/2²)] + 2n   = 2³T(n/2³) + 3n
     =  ....
     =  ....
     = 2ʲT(n/2ʲ) + jn

Stopping at j = k = log₂n (so that n/2ʲ = 1):

T(n) = n.T(1) + n.log₂n = n + n.log₂n = O(n.log n)
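The recurrence describes top-down merge sort. A runnable sketch (repeating merge here so the block is self-contained):

```haskell
-- | Merge two sorted lists (same as the merge above)
merge :: Ord a => [a] -> [a] -> [a]
merge l [] = l
merge [] l = l
merge l1@(a:aa) l2@(b:bb)
  | a <= b    = a : merge aa l2
  | otherwise = b : merge l1 bb

-- | Top-down merge sort: T(n) = 2T(n/2) + n, ie O(n.log n)
mergeSort :: Ord a => [a] -> [a]
mergeSort []  = []
mergeSort [x] = [x]
mergeSort xs  = merge (mergeSort left) (mergeSort right)
  where (left, right) = splitAt (length xs `div` 2) xs
```

λ> mergeSort [11,102,13,1,0,10]
[0,1,10,11,13,102]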

TODO List difference

This version works (see the example below), but is O(n*m): myElem scans ys once for every element of xs.

-- hand-rolled `elem`
myElem :: Eq a => a -> [a] -> Bool
myElem a (x:xs)
  | a==x = True
  | otherwise = myElem a xs
myElem _ [] = False

-- | Find 'difference' between two lists
listDiff :: Ord a => [a] -> [a] -> [a]  
listDiff (x:xs) ys
  | myElem x ys = listDiff xs ys
  | otherwise = x:listDiff xs ys
listDiff xs _ = xs
--listDiff xs [] = xs
--listDiff [] ys = []

λ> listDiff [1,2,3] [2,3]  
[1]
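If both lists are sorted (an extra assumption; listDiffSorted is my own name), a merge-style walk avoids the repeated myElem scans and runs in O(n+m):

```haskell
-- | Elements of the first sorted list not present in the second sorted list
listDiffSorted :: Ord a => [a] -> [a] -> [a]
listDiffSorted l1@(x:xs) l2@(y:ys)
  | x < y     = x : listDiffSorted xs l2  -- x cannot appear later in l2
  | x == y    = listDiffSorted xs l2      -- drop x; keep y for possible repeats of x
  | otherwise = listDiffSorted l1 ys      -- y is too small to matter; skip it
listDiffSorted xs _ = xs                  -- second list exhausted: keep the rest
```

λ> listDiffSorted [1,2,3] [2,3]
[1]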

Merge unique

The first attempt failed in two ways: the x==y branch discarded the element entirely, and duplicates inside a single list survived through the base cases. Fixed by emitting each element once and dropping its remaining copies from both sides:

-- | Merge, but elements of result should all be unique
mergeUniq :: Ord a => [a] -> [a] -> [a]
mergeUniq l1@(x:xs) l2@(y:ys)
  | x < y     = x : mergeUniq (dropWhile (== x) xs) l2
  | x > y     = y : mergeUniq l1 (dropWhile (== y) ys)
  | otherwise = x : mergeUniq (dropWhile (== x) xs) (dropWhile (== x) ys)
mergeUniq xs ys = dedup (xs ++ ys)   -- at least one list is empty here
  where dedup (a:rest@(b:_)) | a == b    = dedup rest
                             | otherwise = a : dedup rest
        dedup rest = rest

a = mergeUniq [1,2,3] [2,3,3,3,4]
-- [1,2,3,4]

List intersection

-- | Find common elements of two lists
-- Input lists are assumed to be sorted
intersect :: Ord a => [a] -> [a] -> [a]
intersect l1@(x:xs) l2@(y:ys)
  | x<y = intersect xs l2
  | x==y = x:intersect xs ys
  | otherwise = intersect l1 ys  -- x>y: discard y only, x may still match later
intersect _ _ = []

λ> intersect [1,2,3] [2,3,3,3,4,5]  
[2,3]

O(n+m)

Quicksort

quick :: Ord a => [a] -> [a]
quick (pivot:lst) = left ++ [pivot] ++ right
  where left = quick $ filter (< pivot) lst
        right = quick $ filter (>= pivot) lst  -- >=, so duplicates of the pivot aren't lost
quick [] = []
-- Data.List has
--   partition :: (a -> Bool) -> [a] -> ([a], [a])

λ> quick [11,102,13, 1, 0, 10]
[0,1,10,11,13,102]
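The comment above mentions Data.List's partition, which splits the list in one traversal instead of two filters. A sketch using it (quickP is my own name; the (< pivot) predicate sends elements equal to the pivot to the right half, so duplicates survive):

```haskell
import Data.List (partition)

-- | Quicksort via partition: one pass over the list per level
quickP :: Ord a => [a] -> [a]
quickP [] = []
quickP (pivot:rest) = quickP smaller ++ [pivot] ++ quickP larger
  where (smaller, larger) = partition (< pivot) rest
        -- smaller: elements < pivot; larger: elements >= pivot
```

λ> quickP [3,1,2,3]
[1,2,3,3]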

P and NP W8L7

P: decision problems solvable in polynomial time.

NP: decision problems whose 'yes' instances can be verified in polynomial time, given a suitable certificate. P ⊆ NP; whether P = NP is an open problem.

Asymptotic analysis OLD

Running time of an algorithm is usually a complex expression, so it is more worthwhile to have an estimate.

We get one by considering only the highest order term of the expression, because the highest order term dominates for large inputs.

Conventions

Doubts