DSA - Chapter 2

March 7, 2023 6 minute read

Chapter 2. Algorithms

Programs

A program specifies a finite sequence of computer instructions which tells the machine what to do exactly (fine details).

A program consists of 2 parts:

Algorithm: the computational procedure which transforms an input into a desired output as specified by the problem statement.
Data Structures: the representation, organization and storage of information.

Algorithms

An algorithm is the programming language independent and machine independent strategy that a program uses.

We can consider an algorithm to be a finite sequence of well-defined operations that halts in a finite amount of time.

Purpose of an algorithm is to communicate a computation procedure not to computers but people. (We tend to skip find details)

Example 1: Generate an (n x n)-pixel wall paper pattern

Algorithm 1:

For each pixel "d" do
	decide whether "d" is "on" or "off"

Algorithm 2:

For each pixel "d" with coordinates(x, y){
	compute a function c = f(x,y)
	if (c <= threshold) then
		set "d" "on" }

Algorithm 3:

for i=1 to n
	for j=1 to n {
		c = (i^2 + j^2) mod 10;
		if c <= 3
		set (i,j) "on";
}

Algorithm 1 and 2 are too ambiguous, whereas algorithm 3 is a good, valid algorithm.

Data structure

Organise data using data structures for efficient retrieval.

Choosing an algorithm

To choose an algorithm to solve a problem and measure which algorithm is better, we need to pick a measure for:

Problem size
Computational resources required by the algorithm

Problem size

The measure of size should reflect the complexity of the problem instance

Resources

The measure of resources required should have at least the following properties:

It should measure the resources we care about (e.g., memory space, running time, … ).
It should be quantitative.
It should be a good predictor, so it is easy to predict when the algorithm is applied to a large set of problem instances.

Measuring time resource

Method 1: Write programs using the different algorithms and compare their running time
- Straightforward, but too much noise and hard to get a predictor function
Method 2: Discover a function of the problem’s size that reflects the work the algorithm must do to solve the problem of that size.

Plan to solve a computational problem

Recognise a problem
Build an abstract model
Design an algorithm to solve within the model
Analyze the algorithm to get an upper bound on the work needed to solve the problem
Analyze the algorithm to get an lower bound on the work needed to solve the problem
If not satisfied with the algorithm, find a better alternative
Build data structures that allow the algorithm to be implemented efficiently

The Towers of Hanoi

Problem: Given 3 pegs and n disks of different sizes placed in order of size on one peg, transfer the disks from the original peg to another peg with the constraints that:

Each disk is on a peg
No disk is ever on a smaller disk
Only one disk at a time is moved

Base case: If there is only one disk

Approach: If we know how to move n-1 disks, then we know how to move n disks

Algorithm:

Hanoi(A, B, C, n){
/* to move n disks from A to C using B as an intermediate peg */
	if (n==1)
		move A's top disk to C;
	else {
		Hanoi(A, C, B, n-1);
		move A's top disk to C;
		Hanoi(B, A, C, n-1);
	}
}

Analysis

Let $f(n)$ be the number of moves required by our algorithm to move n disks from a peg to another.

\[f(n) = \begin{cases}1 &\text{if }n=1 \\ 2f(n-1)+1 &\text{if } n >1 \end{cases}\]

Solving the recurrence gives $f(n) = 2^n -1$

Our algorithm thus takes $2^n-1$ moves and this is an upper bound of the problem.

Let $g(n)$ be the number of moves that are necessary to solve the problem with n disks.

\[g(n) \ge \begin{cases}1 &\text{if }n=1 \\ 2g(n-1)+1 &\text{if } n >1 \end{cases}\] \[\displaystyle\sum_{i=1}^n 2i = 2^n -1\] \[g(n) \ge 2^n-1\]

Thus it at least takes $2^n-1$ moves and this is a lower bound of the problem.

Example 2: Searching: Given an array of n $\ge$ 1 integers and a “target” integer x, determine if x is in the array

array_search(A,x,n){
	i = 1;
	while (x != A[i])
		if (i == n) return ("not found")
		else i++;
	return ("found");
}

Since the algorithm is a proof that we can solve the problem using at most that number of operations, even in the worst case, that gives an “upper bound” of the problem’s worst cost as a function of the input size.
So, given an algorithm, it makes sense to first find its worst cost; then, if the problem is important enough, its average cost.

Example 3: Matrix multiplication

Algorithm 1:

Alg1 {
	r = a*e + b*f;
	t = c*e + d*f;
	output (r,s,t,u);
}

Algorithm 1 takes 8 multiplications and 4 additions.

Algorithm 2:

Alg2 {
	p1 = a * (g-h);  p2 = (a+b) * h;
	p3 = (c+d) * e;  p4 = d * (f-e);
	p5 = (a+d) * (e+h);  p6 = (b-d) * (f+h);
	p7 = (a-c) * (e+g);
	u = p5 + p1 - p3 - p7;
	r = p5 + p4 -p2 + p6;
	s = p1 + p2;
	t = p3 + p4;
}

Algorithm 2 takes 7 multiplications an 18 additions.

Which algorithm is better?

The answer depends on which machine/computer you use. In other words, it would be difficult to compare algorithms without referring to a particular computer.

Complexity

To compare algorithms, we are interested only in the growth rate of the total number of operations as a function of the input size regardless of their types rather than counting the exact number of execution for each type of operation.

e.g. Growth rate of O(n^2) is better than O(n^3) because it grows slower!

Big Theta ($\theta$)

$\theta(g(n))$ = {${f(n): \exist c_1 > 0, c_2 > 0, n_0 >0 }$ s.t. $\forall n \ge n_0, 0 \le c_1 g(n) \le f(n) \le c_2g(n)$}

$g(n)$ is an asymptotically tight bound of $f(n)$ if $f(n) \in \theta(g(n))$

Big Oh $(O)$

$O$ gives an asymptotic upper bound

$O(g(n)) =$ {$f(n): \exists c>0, n_0 > 0$ s.t. $\forall n \ge n_0, 0 \le f(n) \le cg(n)$}

Big Omega $(\Omega)$

$\Omega$ gives an asymptotic lower bound

$\Omega(g(n)) =$ {$f(n): \exists c>0, n_0 >0$ s.t. $\forall n \ge n_0, 0 \le cg(n) \le f(n)$}

Calculating running time

Rule 1: Loops
- Complexity of a loop is at most the complexity of the statement inside the loop times the number of iterations
Rule 2: Consecutive statements:
- Complexity of 2 consecutive program segments (S1 and S2) = max {complexity of S1, complexity of S2}
Rule 3: If-then-else:
- Complexity of if (conditions) S1; else S2 = max {complexity of (S1 + conditions), complexity of (S2 + conditions)}
Rule 4: Function call:
- The complexity of a function should be analysed first before computing the complexity of the program fragment containing the call

Some typical growth rates

O(1): Constant time
O(log n): Logarithmic complexity
O(n): Linear complexity
O(n log n): Arises when algorithms solve a problem by breaking it up into smaller sub-problems, solving them independently, and then combining the solutions
O(n^2): Quadratic
O(n^3): Cubic
O(2^n): Exponential

Example 4: Fibonacci Numbers

A recurrence equation:

Let $f(n)$ be the number of pairs of rabbits you have in month $n \ge$ 1.

\[f(1) = f(2) = 1;\] \[f(n) = f(n-1) + f(n-2)\]

A recursive algorithm:

Fib(n){
	if (n <= 2) return (1);
	else return Fib(n-1) + Fib(n-2);
}

Time complexity: Exponential!

An iterative algorithm:

Fib(n){
	previous = present = 1;

	for (i = 3; i <= n; i++){
		past = previous;
		previous = present;
		present = previous + past;
	}
	return (present)
}

Time complexity: $\theta(n)$

Share on

Twitter Facebook LinkedIn