by Pang Yan Han
Big thanks to Calvin for giving me this chance
First time talking about a tech topic in a non-class setting
Algorithm (pronounced L-ger-rhythm)
A process or set of rules to be followed in calculations or other problem-solving operations, esp. by a computer.
Finding all pages where a keyword appears in a book
Motivation 1 - Finding all pages where a keyword appears in a book
Algorithm 1
number of occurrences = 0
while there are still pages in the book:
    while there are words on the page:
        if the current word is the keyword:
            increment number of occurrences by 1
            we are done with the page
        else:
            advance to the next word
    advance to the next page
Is that an algorithm?
Yes. But I think we all agree it is not a very good one.
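For concreteness, Algorithm 1 can be sketched in Python; the `book` here is a hypothetical list of pages, each a list of words:

```python
def count_occurrences(book, keyword):
    """Scan every word of every page, stopping at the first hit on a page.
    This counts the pages containing the keyword, as in the pseudocode."""
    occurrences = 0
    for page in book:          # while there are still pages in the book
        for word in page:      # while there are words on the page
            if word == keyword:
                occurrences += 1
                break          # we are done with the page
    return occurrences

book = [["the", "quick", "fox"], ["fox", "trot"], ["hello", "world"]]
print(count_occurrences(book, "fox"))  # 2 pages contain "fox"
```

Note that the running time grows with the total number of words in the book, which is what makes this algorithm tedious.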
Motivation - Finding all pages of a book where a keyword appears
Algorithm 2
Suppose I am a very rich person. So for each page of the book, I pay 1 person $10 to check if that page contains the keyword.
number of occurrences = 0
Hire 1 person for each page of the book to execute Algorithm 1
While there are still people counting:
    wait for someone to finish counting his/her page
    add to the number of occurrences
Motivation - Finding all pages of a book where a keyword appears
Algorithm 2 - The rich man's algorithm
Is that a good algorithm?
Yes and No.
Motivation - Finding all pages of a book where a keyword appears
Algorithm 2 - The rich man's algorithm
Tedium is removed. Time taken is roughly the time it takes for 1 person to go through 1 page of the book.
Motivation - Finding all pages of a book where a keyword appears
Algorithm 2 - The rich man's algorithm
If the book has 1000 pages, I have to pay $10,000. The amount of money paid depends on the number of pages in the book.
Motivation - Finding all pages of a book where a keyword appears
Algorithm 3
Go to the index of the book and look for the keyword
If we find the keyword, then we can retrieve all the pages it appears in
Otherwise, the keyword is probably not inside the book
Is this a good algorithm?
Yes
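Algorithm 3 maps naturally onto a dictionary lookup; a minimal sketch, with a hypothetical prebuilt index:

```python
def pages_for(index, keyword):
    """Look up the keyword in a prebuilt index: an average Theta(1) query,
    regardless of how many pages the book has."""
    return index.get(keyword, [])  # [] means the keyword is probably not in the book

# hypothetical prebuilt index: keyword -> list of pages it appears on
index = {"algorithm": [3, 17, 42], "graph": [58, 60]}
print(pages_for(index, "algorithm"))  # [3, 17, 42]
print(pages_for(index, "dragon"))     # []
```

Building the index costs time up front, but every query after that is cheap.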
(Or, Big O Notation in < 10 min)
NOTE: Not meant to make fun of how this is taught in a formal CS curriculum.
Just showing the most relevant stuff quickly.
We refer to such operations as having $O(1)$, or more accurately, $Θ(1)$ time complexity.
// N is assigned somewhere
int y = 4;
int x = 0;
for (int i = 0; i < N; i++) { // loop runs N times
    x += y; // addition is a Θ(1) operation
}
Time complexity?
$Θ(N)$
aka. Linear Time
// N is assigned somewhere
int y = 4;
int x = 0;
for (int i = 0; i < N; i++) {     // loop runs N times
    for (int j = 0; j < N; j++) { // loop runs N times
        x += y; // addition is a Θ(1) operation
    }
}
Time Complexity?
$Θ(N^2)$
aka. Quadratic Time
// A and N are declared above
for (int i = 0; i < N; i++) {
    for (int j = 0; j < N; j++) {
        for (int k = 0; k < N; k++) {
            A[i][k] = min(A[i][k], A[i][j] + A[j][k]);
        }
    }
}
Time Complexity?
$Θ(N^3)$
aka. Cubic Time
I guess the pattern is becoming clear
// N assigned somewhere
// we want to find `key` in the sorted array A
int low, high, mid, ans;
low = 0;
high = N;
ans = -1;
while (low < high) {
    mid = (low + high) / 2;
    if (A[mid] == key) {
        ans = mid;
        break;
    } else if (A[mid] < key) {
        low = mid + 1;
    } else {
        high = mid;
    }
}
Time Complexity?
$Θ(log(N))$
aka. Logarithmic Time
def merge(lst, start, mid, end):
    nr = end - start + 1
    aux = [0 for i in range(nr)]
    fp = start     # pointer into the first half
    sp = mid + 1   # pointer into the second half
    for i in range(nr):
        if fp > mid:               # first half exhausted
            aux[i] = lst[sp]
            sp += 1
        elif sp > end:             # second half exhausted
            aux[i] = lst[fp]
            fp += 1
        elif lst[sp] < lst[fp]:
            aux[i] = lst[sp]
            sp += 1
        else:
            aux[i] = lst[fp]
            fp += 1
    for i in range(nr):
        lst[start + i] = aux[i]

def mergesort(lst, start, end):
    if start >= end:
        return lst
    mid = (start + end) // 2
    mergesort(lst, start, mid)
    mergesort(lst, mid + 1, end)
    merge(lst, start, mid, end)

if __name__ == '__main__':
    lst = [7, 6, 5, 4, 3, 2, 1]
    mergesort(lst, 0, 6)
    print(lst)
Time Complexity?
$Θ(N log(N))$
aka. Linearithmic Time
NEGINF = -1000000000

def choose(i, N, value, weight, wtRem):
    if wtRem < 0:
        return NEGINF
    elif i >= N:
        return 0
    else:
        # either take item i or skip it
        return max(
            value[i] + choose(i + 1, N, value, weight, wtRem - weight[i]),
            choose(i + 1, N, value, weight, wtRem)
        )

if __name__ == '__main__':
    value = [5, 7, 8, 9, 10]
    weight = [1, 13, 2, 6, 16]
    N, weightRem = 5, 35
    print(choose(0, N, value, weight, weightRem))
Time Complexity?
$Θ(2^N)$
aka. Exponential Time
def permute(idx, nr, lst, chosen, p, allPerms):
    if idx >= nr:
        allPerms.append([x for x in p])
    else:
        for i in range(nr):
            if not chosen[i]:
                chosen[i] = True
                p[idx] = lst[i]
                permute(idx + 1, nr, lst, chosen, p, allPerms)
                chosen[i] = False

def perm(N):
    lst = [x for x in range(1, N + 1)]
    chosen = [False for i in range(N)]
    p = [0 for i in range(N)]
    allPerms = []
    permute(0, N, lst, chosen, p, allPerms)
    print(allPerms)

if __name__ == '__main__':
    perm(6)
Time Complexity?
$Θ(N!)$
aka. Factorial Time
Algorithm Analysis
Given a computer that can execute 1 billion ops per second
$N$ | 10 | 1000 | 10000 | 1 million | 1 billion |
---|---|---|---|---|---|
$Θ(1)$ | Constant | Constant | Constant | Constant | Constant |
$Θ(log(N))$ | 3.32 ns | 9.97 ns | 13.3 ns | 19.9 ns | 29.9 ns |
$Θ(N)$ | 10 ns | 1 us | 10 us | 0.001 s | 1 s |
$Θ(N log(N))$ | 33.2 ns | 9.97 us | 0.133 ms | 0.0199 s | 29.9 s |
$Θ(N^2)$ | 100 ns | 0.001 s | 0.1s | 16.67 min | 31.7 years |
$Θ(N^3)$ | 1 us | 1 s | 16.67 min | 31.7 years | 3.17e10 years |
$Θ(2^N)$ | 1 us | 3.4e284 years | 1.995e3001 s | 9.9e301020 s | Too big... |
$Θ(N!)$ | 3.63 ms | 4.02e2558 s | 2.85e35650 s | 8.26e5565699 s | Too big... |
NOTE: $2^N$ (from $N = 10000$) and $N!$ (from $N = 1000$) required Wolfram Alpha to compute. And it overflowed on $N=10^9$ for those 2.
Hopefully that should be sufficient to convince you that efficient algorithms are necessary
A graph, in this context, refers to this type of graph:
And not this type of graph:
Sidenote: Facebook Graph API
Probably not a coincidental name
And a whole lot more
In fact, anything with some kind of connection/relation
Data structures to represent a graph
Two major representations:
A mix of:
Problems designed by experts to be solved using a combination of well-studied algorithms and algorithmic techniques, ingenuity, and sometimes math.
From a source vertex $S$, traverse (walk through) the graph starting from vertices directly adjacent to $S$ (or 1 hop away from $S$), then vertices 2 hops away from $S$, then those 3 hops away from $S$, and so on.
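That hop-by-hop order is exactly what a queue gives us; a minimal BFS sketch in Python, with a hypothetical adjacency-list graph:

```python
from collections import deque

def bfs(adj, source):
    """Breadth-first search: visit vertices in order of hop count from source.
    adj is an adjacency list: adj[u] = list of neighbours of u."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:          # first visit = fewest hops
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
print(bfs(adj, 0))  # {0: 0, 1: 1, 2: 1, 3: 2, 4: 3}
```

Each vertex and edge is processed at most once, so this runs in $Θ(V+E)$ time.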
Given a map with the following characters:
When standing next to (left, right, up, down) a trap, the player knows there is a trap but not where the trap is.
Find the maximum number of gold pieces the player can retrieve.
Time Complexity analysis:
Krochanska is Here!
http://uva.onlinejudge.org/external/117/11792.html

Given some train lines containing train stations from $1$ to $N$, such as:
where an important train station is one which appears in more than 1 train line.
Determine the important train station with total minimal travel cost to all other important train stations.
Train lines:
Important stations: 1, 2, 3, 4, 5
IDEA: Build graph using adjacency list
Train lines:
Important stations: 1, 2, 3, 4, 5
Stn | 1 | 2 | 3 | 4 | 5 | Total |
---|---|---|---|---|---|---|
1 | 0 | 1 | 2 | 1 | 1 | 5 |
2 | 1 | 0 | 1 | 1 | 2 | 5 |
3 | 2 | 1 | 0 | 1 | 1 | 5 |
4 | 1 | 1 | 1 | 0 | 1 | 4 |
5 | 1 | 2 | 1 | 1 | 0 | 5 |
Distances computed by BFS, with source dist = 0
Minimum node = 4
Train lines
Important stations: 1, 3, 6
Stn | 1 | 3 | 6 | Total |
---|---|---|---|---|
1 | 0 | 2 | 1 | 3 |
3 | 2 | 0 | 1 | 3 |
6 | 1 | 1 | 0 | 2 |
Minimum node = 6
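Putting the idea together: a sketch that builds the adjacency list from the train lines, finds the important stations (those appearing in more than one line), and runs BFS from each. The actual train lines are shown as images in the slides, so the lines below are hypothetical ones chosen to reproduce the table above:

```python
from collections import deque, defaultdict

def best_station(lines):
    """Build an adjacency list from train lines, then BFS from every
    important station (one appearing in more than one line)."""
    adj = defaultdict(set)
    seen_in = defaultdict(set)
    for li, line in enumerate(lines):
        for a, b in zip(line, line[1:]):   # consecutive stations are adjacent
            adj[a].add(b)
            adj[b].add(a)
        for stn in line:
            seen_in[stn].add(li)
    important = sorted(s for s, ls in seen_in.items() if len(ls) > 1)

    def bfs(src):
        dist = {src: 0}
        q = deque([src])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        return dist

    totals = {}
    for s in important:
        d = bfs(s)
        totals[s] = sum(d[t] for t in important)
    return min(totals, key=lambda s: (totals[s], s)), totals

# hypothetical train lines chosen to reproduce the table above
best, totals = best_station([[1, 2, 3], [1, 6], [3, 6]])
print(best, totals)  # 6 {1: 3, 3: 3, 6: 2}
```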
Bipartite Graph = 2-colorable graph
Left graph is bipartite, Right graph is not
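2-colorability can be checked directly with a BFS that alternates colours; a minimal sketch (the two graphs below are hypothetical examples, an even cycle and an odd cycle):

```python
from collections import deque

def is_bipartite(adj):
    """Try to 2-colour the graph with BFS; a conflict means an odd cycle."""
    colour = {}
    for start in adj:              # handle disconnected graphs too
        if start in colour:
            continue
        colour[start] = 0
        q = deque([start])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in colour:
                    colour[v] = 1 - colour[u]
                    q.append(v)
                elif colour[v] == colour[u]:
                    return False   # two adjacent vertices share a colour
    return True

square = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}   # 4-cycle: bipartite
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}            # odd cycle: not
print(is_bipartite(square), is_bipartite(triangle))  # True False
```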
More efficient algorithms!
Example: Graph Matching
Microsoft Paint, Adobe Photoshop
Minesweeper Game
Clicking on an empty square reveals as many squares as possible, expanding until the revealed region is bordered by numbered squares or the edges of the puzzle.
Clicking on a numbered square or a mine reveals only that 1 square.
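That reveal is a flood fill; a minimal sketch using 4-directional adjacency for simplicity (real Minesweeper expands through all 8 neighbours), on a small hypothetical board:

```python
from collections import deque

def reveal(grid, r, c):
    """Flood fill from (r, c): keep expanding through blank squares,
    stopping at numbered squares (the border of the revealed region)."""
    rows, cols = len(grid), len(grid[0])
    revealed = set()
    q = deque([(r, c)])
    while q:
        r, c = q.popleft()
        if (r, c) in revealed:
            continue
        revealed.add((r, c))
        if grid[r][c] != 0:      # numbered square: reveal it but stop here
            continue
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                q.append((nr, nc))
    return revealed

# 0 = blank, >0 = number of adjacent mines (hypothetical small board)
board = [[0, 0, 1],
         [0, 1, 2],
         [1, 1, 9]]
print(len(reveal(board, 0, 0)))  # 6 squares revealed
```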
Given a weighted graph (with non-negative edge weights) and a source vertex $S$, find the shortest distance from $S$ to all other vertices in the graph.
Efficient algorithm: Dijkstra's Algorithm (pronounced dike-stra)
Edsger W. Dijkstra (1930-2002)
A Case against the GO TO Statement
The Linux Kernel today:
Edsger W. Dijkstra (1930-2002)
His work is still very relevant today!
Example Graph. Source = 0, Destination = 5
Shortest path
Initialize source distance and put it onto pq
Active vertex - vertex 0
Relaxation on vertex 0
Active vertex - vertex 1
Relaxation on vertex 1
Active vertex - vertex 3
Relaxation on vertex 3
Active vertex - vertex 2
Relaxation on vertex 2
Active vertex - vertex 4
Relaxation on vertex 4 (no changes)
Active vertex - vertex 4 [Case of obsolete distance]
Active vertex - vertex 5
Relaxation on vertex 5 (no changes)
Active vertex - vertex 5 [Case of obsolete distance]
Done =)
$$Θ((E+V)*log(V))$$ using a binary min-heap for priority queue implementation
Source: Wikipedia
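The relaxation steps above, including skipping obsolete distances popped from the priority queue, can be sketched with Python's `heapq` (the graph here is a hypothetical example, not the one in the figures):

```python
import heapq

def dijkstra(adj, source):
    """Dijkstra's algorithm with a binary min-heap: Theta((E+V) log V).
    adj[u] = list of (v, weight) pairs; weights must be non-negative."""
    dist = {u: float('inf') for u in adj}
    dist[source] = 0
    pq = [(0, source)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue                      # obsolete distance: skip
        for v, w in adj[u]:
            if d + w < dist[v]:           # relaxation
                dist[v] = d + w
                heapq.heappush(pq, (dist[v], v))
    return dist

# hypothetical weighted graph, source = 0
adj = {0: [(1, 2), (2, 5)], 1: [(2, 1), (3, 4)], 2: [(3, 1)], 3: []}
print(dijkstra(adj, 0))  # {0: 0, 1: 2, 2: 3, 3: 4}
```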
Abridged Problem Statement
Given a weighted directed graph with $N ≤ 500$ vertices, $M ≤ 10000$ edges where edge weights are positive, compute the distance of the almost shortest path from source vertex $S$ to destination vertex $D$, or output -1 if no such path exists.
An almost shortest path is a path from $S$ to $D$ which does not include any edges in any shortest path from $S$ to $D$.
Example 1
Blue edges = Shortest Path
Red edges = Almost Shortest Path
Example 2
Blue edges = Shortest Path
Example 3
Blue edges = Shortest Path
Red edges = Almost Shortest Path
Approach 1
Approach 2
Approach 2 (Continued)
Correctness aside (this algorithm is in fact wrong), it runs in approximately $$Θ(M*(M+N)*log(N))$$ time, which is about $$10000 * (10000 + 500) * log_2(500) \approx 941,407,350$$ operations per test case in the worst case, with $M = 10000$, $N = 500$
Verdict: TLE
Approach 3 (A correct approach)
Approach 3 (A correct approach)
Approach 3 (A correct approach) - Example
Approach 3 (A correct approach) - Time Complexity
Final words on this problem
Abridged Problem Statement
Abridged Problem Statement
Problem Constraints
Example 1 ($N = 7, K = 3$)
Legend: N = ninja, N = ninja previous pos, | = wall, X = spike
Initial, Water level = 0
Example 1 ($N = 7, K = 3$)
Legend: N = ninja, N = ninja previous pos, | = wall, X = spike
After Turn 1, Water level = 1
Ninja jumps up 3 metres to right wall
Example 1 ($N = 7, K = 3$)
Legend: N = ninja, N = ninja previous pos, | = wall, X = spike
After Turn 2, Water level = 2
Ninja goes down 1 metre
Example 1 ($N = 7, K = 3$)
Legend: N = ninja, N = ninja previous pos, | = wall, X = spike
After Turn 3, Water level = 3
Ninja jumps 3 metres up to left wall
Ninja successfully escapes the next turn
Example 2 ($N = 6, K = 2$)
Legend: N = ninja, N = ninja previous pos, | = wall, X = spike
Initial, Water Level = 0
Example 2 ($N = 6, K = 2$)
Legend: N = ninja, N = ninja previous pos, | = wall, X = spike
After turn 1, Water Level = 1
Ninja jumps 2 metres up to right wall
Example 2 ($N = 6, K = 2$)
Legend: N = ninja, N = ninja previous pos, | = wall, X = spike
After turn 2, Water Level = 2
Ninja climbs 1 metre down
Ninja perishes
Example 3 ($N = 3, K = 2$)
Legend: N = ninja, N = ninja previous pos, | = wall, X = spike
Initial, Water Level = 0
Example 3 ($N = 3, K = 2$)
Legend: N = ninja, N = ninja previous pos, | = wall, X = spike
After turn 1, Water Level = 1
Ninja climbs up 1 metre
Example 3 ($N = 3, K = 2$)
Legend: N = ninja, N = ninja previous pos, | = wall, X = spike
After turn 2, Water Level = 2
Ninja climbs up 1 metre
Ninja escapes the next turn
Approach
Approach
Example 1 Walkthrough ($N = 7, K = 3$), initial
Wall | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
---|---|---|---|---|---|---|---|
L | 0 | ∞ | ∞ | ∞ | ∞ | ∞ | ∞ |
R | ∞ | ∞ | ∞ | ∞ | ∞ | ∞ | ∞ |
Example 1 Walkthrough ($N = 7, K = 3$), after turn 1
Wall | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
---|---|---|---|---|---|---|---|
L | 0 | 1 | ∞ | ∞ | ∞ | ∞ | ∞ |
R | ∞ | ∞ | ∞ | 1 | ∞ | ∞ | ∞ |
Example 1 Walkthrough ($N = 7, K = 3$), after turn 2
Wall | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
---|---|---|---|---|---|---|---|
L | 0 | 1 | ∞ | ∞ | ∞ | ∞ | ∞ |
R | ∞ | ∞ | 2 | 1 | ∞ | ∞ | ∞ |
Example 1 Walkthrough ($N = 7, K = 3$), after turn 3
Wall | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
---|---|---|---|---|---|---|---|
L | 0 | 1 | ∞ | ∞ | ∞ | 3 | ∞ |
R | ∞ | ∞ | 2 | 1 | ∞ | ∞ | ∞ |
Example 1 Walkthrough ($N = 7, K = 3$), after turn 4
Wall | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
---|---|---|---|---|---|---|---|
L | 0 | 1 | ∞ | ∞ | ∞ | 3 | ∞ |
R | ∞ | ∞ | 2 | 1 | ∞ | ∞ | ∞ |
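The L/R tables above can be generated by a BFS over (wall, height) states, where the BFS depth is the turn number and hence the water level. A sketch, assuming the move set shown in the examples (climb up 1, climb down 1, or jump to the other wall $K$ metres higher); the spike layouts are hypothetical, since the actual maps are images in the slides:

```python
from collections import deque

def can_escape(left, right, K):
    """BFS over (wall, height) states; dist[(wall, h)] is the turn of arrival,
    which equals the water level at that moment. left/right are strings of
    '-' (safe) and 'X' (spike) for heights 1..N; the ninja starts at (L, 1)."""
    N = len(left)
    walls = {'L': left, 'R': right}
    dist = {('L', 1): 0}
    q = deque([('L', 1)])
    while q:
        w, h = q.popleft()
        t = dist[(w, h)]
        moves = [(w, h + 1), (w, h - 1), ('R' if w == 'L' else 'L', h + K)]
        for nw, nh in moves:
            if nh > N:
                return True               # jumped above the canyon: escaped
            # must land above the water (level = t + 1) on a safe square
            if nh > t + 1 and walls[nw][nh - 1] != 'X' and (nw, nh) not in dist:
                dist[(nw, nh)] = t + 1
                q.append((nw, nh))
    return False                          # the ninja perishes

# hypothetical layouts: the first escapes, the second does not
print(can_escape("--X---X", "----X--", 3))  # True
print(can_escape("-XX", "-XX", 1))          # False
```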
Time Complexity
Google Maps, and similar software
Actual algorithms are definitely much more complicated
Link-state routing protocols (Computer Networks)
OSPF (Open Shortest Path First)
Source: Wikipedia
A* algorithm
Source: Wikipedia
Problem Description
Given a set of $N$ cities and the distances between any pair of them, is there a way to travel through each of the $N$ cities exactly once with a total distance $≤ K$ (for some given $K$), while starting and ending at the same city?
Example
Question: Is there a TSP tour with total distance $≤ 50$?
Seems like we have to examine all possible tours to find out, and there are $N!$ of them
Small insight: fix the starting/ending city, leaving only $(N-1)!$ tours
Path 0-1-2-3-0
Total distance $= 4 + 20 + 60 + 15 = 99$
Path 0-1-3-2-0
Total distance $= 4 + 2 + 60 + 7 = 73$
Path 0-2-1-3-0
Total distance $= 7 + 20 + 2 + 15 = 44$
Path 0-2-3-1-0
Total distance $= 7 + 60 + 2 + 4 = 73$
Path 0-3-1-2-0
Total distance $= 15 + 2 + 20 + 7 = 44$
Path 0-3-2-1-0
Total distance $= 15 + 60 + 20 + 4 = 99$
2 tours with distance $≤ 50$
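The distance matrix can be recovered from the tour sums above, so the exhaustive search is easy to sketch:

```python
from itertools import permutations

# distance matrix recovered from the tour calculations above
d = [[0,  4,  7, 15],
     [4,  0, 20,  2],
     [7, 20,  0, 60],
     [15, 2, 60,  0]]

def tours_within(d, K):
    """Fix city 0 as the start/end and try all (N-1)! orderings."""
    n = len(d)
    good = 0
    for perm in permutations(range(1, n)):
        tour = (0,) + perm + (0,)
        total = sum(d[a][b] for a, b in zip(tour, tour[1:]))
        if total <= K:
            good += 1
    return good

print(tours_within(d, 50))  # 2 tours, matching the enumeration above
```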
Seems like exhaustive search is the only viable algorithm
Did you observe another aspect of the TSP?
Given any tour, for example $0 - 2 - 1 - 3 - 0$
Can I easily verify whether it is $≤ K$ ($50$ in this case)?
Yes!
The TSP, given in the form above, is called a decision problem
Notice how we only need to output a YES / NO answer
To get this answer, we just need to find any tour with distance $≤ K$
The TSP, given in this decision problem form, falls under a very important class of problems known as the NP-Complete problems
If we change the problem to that of finding the shortest tour, then TSP becomes NP-hard
To date, the most efficient exact algorithm for solving the optimization version of TSP has time complexity $Θ(N^2*2^{N})$
It was invented by Michael Held and Richard Karp in 1962
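A sketch of the Held-Karp dynamic programming idea, run on the 4-city example from earlier; the state is (set of visited cities, last city), giving $Θ(N^2*2^N)$ states and transitions overall:

```python
from functools import lru_cache

def held_karp(d):
    """Held-Karp DP over subsets: Theta(N^2 * 2^N) for the shortest tour.
    dp(mask, i) = cheapest path from city 0 that visits exactly the cities
    in `mask` (a bitmask) and ends at city i."""
    n = len(d)

    @lru_cache(maxsize=None)
    def dp(mask, i):
        if mask == (1 << 0) | (1 << i):    # only city 0 and city i visited
            return d[0][i]
        best = float('inf')
        for j in range(1, n):              # try every possible previous city
            if j != i and mask & (1 << j):
                best = min(best, dp(mask ^ (1 << i), j) + d[j][i])
        return best

    full = (1 << n) - 1
    return min(dp(full, i) + d[i][0] for i in range(1, n))

# the 4-city distance matrix from the example above
d = [[0,  4,  7, 15],
     [4,  0, 20,  2],
     [7, 20,  0, 60],
     [15, 2, 60,  0]]
print(held_karp(d))  # 44, the shortest tour found by enumeration
```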
So problems such as the TSP are strictly of theoretical interest, right?
You will be a millionaire if you prove $P = NP$ (or otherwise)
2 features
Other "features"
Polynomial Time reductions
This is also the reason you hear people saying that if there is an efficient algorithm for solving 1 specific NP-Complete problem, then there is an efficient algorithm for solving all of them
Problem Statement: Is there an algorithm which can tell whether an arbitrary computer program will terminate?
Suppose $P$ is a program that, given an input program $I$, does not halt if $I$ halts, and halts if $I$ does not halt
Notice that we can feed $P$ to itself as the input.
Question: Does $P$ halt when $P$ itself is the input?
Case 1: $P$ halts. Then by our assumption, the input to $P$ does not halt, so $P$ does not halt. Contradiction.
Case 2: $P$ does not halt. Then by our assumption, the input to $P$ must have halted, so $P$ halts. Contradiction.
So $P$ neither halts nor does not halt. This is a contradiction. Therefore, such a program/algorithm $P$ can never exist.
Thank you!
This presentation was created using
The math symbols are formatted using
which is incorporated as part of Reveal.js
Nice looking graphs like this one:
are drawn using:
Just Kidding!
Nice looking graphs like this one:
are drawn using: