Make sure to **REFRESH**
each time you visit this page.

Final exam posted (3-25). Typo in problem 5b corrected.

Third quiz posted (3-25).

Quiz-3 and cumulative quiz statistics posted.

Homework set #13 posted. Due Tuesday, 3-11.
(First batch of problems posted 3-8 4:30pm.
Second batch posted 3-8 5:40pm.) **Updated 3-13 7pm.**

Quiz-2 statistics posted (3-7 1:30pm)

Homework statistics posted (3-3 8:20pm).

Second quiz posted (3-3 8:20pm). Typos fixed.

First quiz posted (2-25 10:45am). Solve the problems on your own time before the next quiz.

Homework set #12 posted (2-25 1pm). Written homework due Tuesday, 3-04. Some of the "DO" exercises do Thu Feb 27.

Homework set #11 posted (2-24 9pm) (stated in class). Due Tuesday, 2-25. It is ok if you submit it after class but the same day. (Slide it under my door.)

Homework set #10: First batch of (problems 10.1-10.8) posted 2-16, 6:30pm.
Second batch **(problems 10.9-10.12) added 2-18, 3:45am**.
Due Thursday, February 20. (Originally the problem set was labeled "ninth."
Relabeled as "tenth" 2-18 11:30am,
after belatedly posting the actual problem set #9 that was due 2-18.)

**Text:** Kleinberg - Tardos: Algorithm Design. (Coop Seminary Bookstore)

Recommended reading: Cormen - Leiserson - Rivest - Stein: Introduction to Algorithms.

**Grading, tests.** There will be three short (10-minute) quizzes
and a final exam. The composition of the course grade:
quizzes: 12% (4% each) final exam 45% homework 35% class participation
(including tutorials) 8%

The tests are closed-book (no notes allowed). Schedule of tests:

Quiz 1 Thu Feb 20.

The quiz will be selected from all material covered so far except
for computability (the material discussed in class on Tue, Feb 18).
Please pay special attention to the "DO" exercises. Accurate
and elegant pseudocodes will be required in the test.
When writing pseudocode, please **explain the meaning of your
variables**.

Quiz 2 Thu Feb 27

Quiz 3 Tue Mar 11

Last class Thu Mar 13 (attendance mandatory)

Last tutorial Thu Mar 13

Final exam Tue Mar 18 8am - 10am

**Quiz-123 statistics** (25 students) (cumulative score of
three quizzes, posted 3-13):

120 points: Max

91 points: Best

69 points: Upper quartile (75%)

37 points: Median

28 points: Lower quartile (25%)

**Quiz-3 statistics** (25 students) (quiz date 3-11, stat posted 3-13):

40 points: Max

34 points: Best

20 points: Upper quartile (75%)

10 points: Median

05 points: Lower quartile (25%)

**Quiz-2 statistics** (25 students) (quiz date 2-27, stat posted 3-7):

40 points: Max

40 points: Best

30 points: Upper quartile (75%)

21 points: Median

12 points: Lower quartile (25%)

**HW statistics after 11th homework set** (25 students) (posted 3-3):

194 points: Max

186 points: Best

179 points: Upper quartile (75%)

168 points: Median

157 points: Lower quartile (25%)

**Quiz-1 statistics** (25 students) (quiz date 2-20, stat posted 2-27):

40 points: Max

28 points: Best

19 points: Upper quartile (75%)

13 points: Median

05 points: Lower quartile (25%)

13.0 DO: Review the proof that the language QBF (quantified Boolean formulas) is in PSPACE.

13.1 DO: Karp-reduce SAT to 3-SAT. (SAT is the set of satisfiable Boolean circuits; 3SAT is the set of satisfiable 3CNF formulas.)

13.2 CHALLENGE. Prove that QBF is PSPACE-complete (under Karp reductions).

**13.3 HW (MAX-3-SAT approximation):** Recall the MAX-3-SAT problem:

*Input:* a list of $m$ disjunctive 3-clauses over a given set of $n$
Boolean variables. (A disjunctive clause is an OR of literals.
A disjunctive 3-clause is an OR of 3 literals corresponding to three distinct
variables, e.g., $x_2\vee {\overline x_5} \vee x_8$. The disjunctions
$x_2\vee {\overline x_2} \vee x_8$ and $x_2\vee {\overline x_5} \vee x_2$
are not 3-clauses.)

*Question:* Find the maximum number $k$ such that $k$ of the clauses
are simultaneously satisfiable.

(a) Prove: $k\ge 7m/8$. (Hint: Use the probabilistic method.)
**(8 points)**

(b)
DO: Turn your proof into a probabilistic algorithm that finds
an assignment that satisfies $7m/8$ of the clauses. The algorithm
should work in polynomial expected time on all inputs. Warning:
if a positive random variable has large expected value, this does not
mean that its value is typically large. **Hint (added 3-13 7pm):**
prove that the
probability that a random assignment satisfies at least $7m/8$ clauses
is at least $1/(m+1)$.

(c) Derandomize the algorithm in (b) (turn it into a deterministic
algorithm). Here are hints for two distinct solutions.
**Updated 3-13 7pm.**

(c1) CHALLENGE: Hint 1: use small sample-spaces for 3-wise independent random variables.

(c2) DO: Hint 2: Successively evaluate the variable to zero or one so that after each evaluation, the expected number of satisfied clauses remains at least $7m/8$.

**13.4 HW (wrong NP-definitions):** Recall the definition of NP
(adding a needed existential quantifier on the alphabet of $L_1$):

(Correct definition) The language $L\subseteq\Sigma^*$ belongs to NP if

$(\exists$ finite alphabet $\Sigma_1)
(\exists L_1\in P, L_1\subseteq\Sigma_1^*)(\exists
c\in \nnn)(\forall x\in\Sigma^*)
(x\in L \Leftrightarrow
(\exists w\in\Sigma_1^*)
(|w|\le |x|^c$ AND $(x,w)\in L_1))$.

Kyle, Larry, and Mike gave incorrect definitions of the class NP.
Below we denote Kyle's class
"KNP", Larry's class "LNP", and Mike's class "MNP". Here are their
definitions.

(Kyle) The language $L\subseteq\Sigma^*$ belongs to KNP if

$(\exists$ finite alphabet $\Sigma_1)
(\exists L_1\in P, L_1\subseteq\Sigma_1^*)(\exists
c\in \nnn)(\forall x\in\Sigma^*)
(x\in L \Rightarrow (\exists w\in\Sigma_1^*)
(|w|\le |x|^c$ AND $(x,w)\in L_1))$.

(Larry) The language $L\subseteq\Sigma^*$ belongs to LNP if

$(\exists$ finite alphabet $\Sigma_1)
(\exists L_1\in P, L_1\subseteq\Sigma_1^*)(\exists
c\in \nnn)(\forall x\in\Sigma^*)(\exists w\in\Sigma_1^*)
(x\in L \Leftrightarrow (|w|\le |x|^c$ AND $(x,w)\in L_1))$.

(Mike) The language $L\subseteq\Sigma^*$ belongs to MNP if

$(\exists$ finite alphabet $\Sigma_1)
(\exists L_1\in P, L_1\subseteq\Sigma_1^*)(\exists
c\in \nnn)(\forall x\in L)
(\exists w\in\Sigma_1^*)
(|w|\le |x|^c$ AND $(x,w)\in L_1)$.

**(a) HW:** Determine the class KNP. Give a very simple answer.
**(6 points)**

(b,c) DO: Determine the classes LNP and MNP. Give very simple answers.

13.5 CHALLENGE (Savitch's Theorem): Prove: if $s(n)\ge \log n$ is a space-constructible function then NSPACE($s(n)$)$\subseteq$ SPACE($s(n)^2$). Corollary: NPSPACE = PSPACE. (NSPACE refers to nondeterministic space: the finite control is nondeterministic).

13.6 CHALLENGE (Immerman -- Szelepcsényi Theorem): Prove: if $s(n)\ge \log n$ is a space-constructible function then NSPACE($s(n)$) = coNSPACE($s(n)$).

13.7 DO: State the decision version of the "maximum independent set" problem in an undirected graph. Prove that this problem is NP-complete, via reduction from a probem shown NP-complete in class. (This is a very easy problem, the reduction is described by a single word.)

**13.8 HW (due Thu, 3-13):**
The 3-CLIQUE problem asks whether or not a given undirected
graph has a triangle. Decide this question in $O(n^{2.81})$
where $n$ is the number of vertices. (Hint: $2.81 > \log_2 7$.)
**(10 points)**

13.9 DO: The MAX-ACYCLIC-SUBGRAPH problem takes a directed graph $G = (V,E)$ as input and asks to find a largest acyclic subset $F \subseteq E$ (i.e., $(V,F)$ must be a DAG (directed acyclic graph)). This problem is NP-hard. Find a factor-2 approximation, i.e., an acyclic subset $F' \subseteq E$ that is at least half as large as the largest acyclic subset. Find $F'$ in polynomial time. (The solution is very simple.)

13.10 DEFINITION: A **linear program** is a system of linear inequalities
written as $Ax \le b$ where $A$ is a matrix, $x$ is an
unknown column vector, and $b$ is a given column vector.
An inequality between vectors is meant coordinatewise:
$(a_1,\dots,a_n)\le (b_1,\dots,b_n)$
means $(\forall j)(a_j \le b_j)$.
A linear program is *feasible* if it has a solution (in real numbers).
Let LP denote the set of feasible linear programs with integer
coefficients. Let BLP ("bounded LP") denote those linear programs
in LP which include the inequalities $0\le x_i\le 1$ for every
variable $x_i$. (These can be written as $-x_i\le 0$
and $x_i\le 1$ so we conform with the format $Ax\le b$.)

13.11 DO: Prove: if a bounded linear program in $n$ unknowns is feasible (i.e., it belongs to BLP) then there is a solution that satisifies $n$ linearly independent constraints with equality.

13.12 DO: Prove: BLP $\in$ NP. Do not use the known fact that
LP $\in$ P (Khachiyan, 1979). What is the
witness? Can you show that your witness is *short*?
What does "short" mean? Use 13.11.

13.13 CHALLENGE. Prove: LP $\in$ NP. (Again, do not use the fact that LP $\in$ P.)

13.14 DEFINITION: Let ILP denote the set of those linear programs with integer coefficients which have a solution in integers ("integer linear programs").

13.15 CHALLENGE. ILP $\in$ NP. (Difficult!) DO: Why is this problem difficult? What is a witness? What do we need to show about the witness?

**13.16 HW:** (a)
Let (0,1)-LP denote the set those linear programs with integer
coefficients which have a solution in zeros and ones (every variable
takes a value in {0,1}). Prove: (0,1)-LP $\in$ NP.
(b) Assuming that 3SAT $\in$ NPC, prove that (0,1)-LP $\in$ NPC.
(Give a Karp reduction. NPC denotes the class of NP-complete
languages.) **(3+10 points)**

**12.1 HW:**Recall the FIND-3-COL problem: Input: a graph $G$;
output: "NO" if $G$ is not 3-colorable, and a 3-coloring if $G$
is 3-colorable. Show that this problem is Cook-reducible
to the decision problem 3-COL (given a graph, is it 3-colorable?).
Recall that a Cook-reduction would use 3-COL as an oracle
(black box, subroutine)
and work in polynomial time (which may include a polynomial number
of calls to the oracle). Give your solution in pseudocode.
State the number of calls made to the oracle.
**(8 points)**

**12.2 HW:** Jimmy incorrectly copied the definition of NP:
he omitted the condition that the length of the witness is polynomially
bounded in the length of the input. Let us call the complexity class
defined by Jimmy's definition JNP. Determine JNP. (It is a class of
languages that has been discussed in class.)
**Added 3-3 7:30pm** Naturally, your solution should begin with
the statement of Jimmy's definition.
**(6 points)**

12.3 DO (due Thu, Feb 27): Count the Boolean functions in $n$ Boolean variables.

12.4 DO (due Thu Feb 27): A **Boolean circuit** is a DAG
of which the sinks (nodes of out-degree zero) are labeled
by literals and all other nodes are labeled "AND" or "OR."
Given an input assignment
to the variables, a value at each node is computed in the natural way
peforming the Boolean operations on each node once its
out-neighbors have been evaluated. The *depth* of the
circuit is the length of the longest path from the output
node to an input node. The *size* of the circuit is
the number of wires (edges). Prove: CNF formulas can be
represented as Boolean circuits of depth 2.

12.5 DO: Prove: PARITY$_n$ can be computed by a Boolean circuit of depth 3 and size $O(n 2^{\sqrt{n}})$. What can be said about depth 4?

12.6 DO (due Thu Feb 27): Prove: PARITY$_n$ can be computed by a Boolean circuit of depth $O(\log n)$ and polynomial size.

12.7 DO: Prove: Addition of $n$-bit integers can be done by Boolean circuits of depth 3 and polynomial size.

12.8 DO: A theorem of Ajtai and Furst - Saxe - Sipser (1982) says that PARITY$_n$ cannot be computed by circuits of bounded depth and polynomial size. Use this theorem to prove that multiplication of two $n$-digit integers cannot be done in bounded depth and polynomial size.

12.9 CHALLENGE (Claude Shannon): Prove: almost all Boolean functions in $n$ Boolean variables require Boolean circuits of size $\Omega(2^n/n)$.

12.10 DO (due Thu Feb 27): Complete the proof of the Karp reduction of 3-SAT to CLIQUE.

12.11 CHALLENGE (V. Pratt, 1970s): Prove that the factoring language "FACT"
discussed in class belongs to NP $\cap$ coNP, without using the AKS
theorem ("Primes $\in$ P"). Recall that what you need is to
produce a witness of primality. If your prime has $n$ digits,
estimate the lengths of your witness.
Hint: Use the theorem that for every prime there is a
primitive root (i.e., the multiplicative group of reduced
mod *p* residue classes is cyclic).

** Eleventh homework set. Due Tuesday, Feb 25.**

11.1 DO: "Boolean variables" take the value 0 or 1.
Recall that a Boolean function in $n$ (Boolean) variables is
a function
$f : \{0,1\}^n \to \{0,1\}$. A CNF ("conjunctive normal form") formula
is a formula of the form $C_1\wedge\dots\wedge C_m$ where
each $C_i$ is a (disjunctive) *clause*. A (disjunctive) clause
has the form form $C = z_{1}\vee\dots\vee z_{t}$ where each $z_i$ is a
*literal*. A literal is a variable or a negated variable.
So for instance, $(x_1\vee \overline{x_4}\vee x_5)\wedge
(\overline{x_1}\vee \overline{x_2}\vee x_3\vee x_4)\wedge \overline{x_4}$
is a CNF formula. Prove that every Boolean function can be represented by
a CNF formula.

**11.2 HW:** The PARITY$_n$ function of $n$ Boolean variables
is defined as
PARITY$_n(x_1,\dots,x_n) = \sum_{i=1}^n x_i \bmod 2$.
So for instance PARITY$_3(1,0,1) = 0$ and
PARITY$_4(1,0,1,1) = 1$. PARITY$_n$ is a
Boolean function, so it can be expressed by a CNF formula.
Prove: if PARITY$_n$ is expressed as a CNF formula with $m$
clauses then $m \ge 2^{n-1}$.
**(8 points)**

11.3 DO: Prove: P $\subseteq$ NP.

11.4 DO: Prove: if P = NP then NP = coNP.

11.5 DO: For a language $L\subseteq \Sigma^*$ over the finite
alphabet $\Sigma$, let $f_L : \Sigma^*\to\{0,1\}$ denote the
*characteristic function* of $L$ defined on the string
$x\in\Sigma^*$ by $f_L(x)=1$ if $x\in L$ and $f_L(x)=0$ otherwise.
Let $\cal R$ denote the class of recursive languages,
i.e., the class of languages $L$ for which $f_L$ is
a computable function (a.k.a. a recursive function, hance
the letter $\cal R$), i.e., for which membership
is decidable by an algorithm (C program, or equivalently,
a Turing machine) that halts on every input.
Prove: NP $\subseteq$ $\cal R$.

11.6 DO: Let $\cal{RE}$ denote the class of computably enumerable (a.k.a. recursively enumerable) languages. By definition, $L\in {\cal{RE}}$ if there exists a C program (or a Turing machine) that halts on input $x\in\Sigma^*$ if and only if $x\in L$. (a) Prove: ${\cal R} \neq {\cal RE}$. (b) Prove: ${\cal R} = {\cal RE} \cap co{\cal {RE}}$.

** Tenth homework set (previously erroneously labeled "ninth").
Due Thursday, Feb 20.**

10.1 DO: Study the handout on Euclid's algorithm and Repeated Squaring.

**10.2 HW:** Let $\ell_b(x)$ denote the number of digits to base $b$
of the positive integer $x$. For $r,s \ge 2$ determine
$$\lim_{x\to\infty} \frac{\ell_r(x)}{\ell_s(x)}.$$
Prove your answer. The proof should include an explicit expression
of $\ell_b(x)$ in terms of $b$ and $x$.
**(4 points)**

10.3 DO: Given an $n$-digit (in binary) positive integer $x$, determine $\lfloor\sqrt{x}\rfloor$ in polynomial time. Write your algorithm in pseudocode. Minimize the number of comparisons made by the algorithm. Show that the number of comparisons is $\sim cn$ for some constant $c$; determine $c$. Show that the overall complexity of the algorithm (number of bit operations) is $O(n^C)$; find the smallest value of $C$ assuming multiplication of two $n$-digit integers is performed by the schoolbook algorithm and thus takes $\Theta(n^2)$.

10.4 DO **(RSA)**: Let $p < q$ be $n$-digit prime numbers. Let $N=pq$
and $M=(p-1)(q-1)$. Given $N$ and $M$, compute $p$ and $q$
in polynomial time. Evaluate the exponent in your time bound
assuming schoolbook arithmetic (multiplication and division).

10.5 DO: review basic number theory, especially greatest common divisor, multiplicative inverses, Fermat's little Theorem, and the Chinese Remainder Theorem.

**10.6 HW (RSA):** Alice's public key is $(N,e)$ where
$N=pq$ is the product of two distinct $n$-digit primes
she keeps private. Her private key $f$ is an $O(n)$-digit
integer that satisfies the congruence $ef\equiv 1 \pmod M$
where $M=(p-1)(q-1)$. Alice notices that her private
key $f$ has been compromised (but not the primes $p,q$).
So she decides to keep $N$ but replace $e$ and $f$ by
some other pair of values, $e'$ and $f'$. (Of course
$e'f'\equiv 1 \pmod M$). She replaces
the pair $(N,e)$ by $(N,e')$ as her public key.
Prove that this is not a good idea. Show that if Eve
knows $N, e, e'$, and $f$ then Eve can decode in
polynomial time any message Alice receives that was
encrypted using the public key $(N,e')$. (Note that
we do not claim that Eve can determine $f'$.)
(Warning: $f$ does not need to fall in the range
$1\le f\le M-1$. All we know about $f$ is that it does
not have too many digits and it is a multiplicative inverse of $e$
modulo $M$.)
**(9 points)**

**10.7 HW:** (a) Evaluate the order of magnitude of the
number of binary digits
of $F_k$, the $k$-th Fibonacci number, in terms of the
number $n=\ell_2(k)$ of binary digits of $k$. Your answer
should be of the form $\ell_2(F_k)= \Theta(n^b c^n)$
where $b,c$ are constants. Determine $b,c$.

(b) Prove that $F_k$ cannot be computed in polynomial time
(given $k$ in binary as input).
**(5+2 points)**

10.8 DO: Given the positive integers $k$ and $m$ in binary, compute $F_k \bmod m$ in polynomial time. Evaluate the exponent in "polynomial time" as a function of the bit-lengths of $k$ and $m$, assuming schoolbook multiplication. Hint: consider the powers of the matrix $$ A = \left(\matrix{1 & 1 \\ 1 & 0}\right) .$$

10.9 DO: Let $G=(V,E,w,s)$ be a weighted directed graph with a source vertex $s$. Assume $k$ of the edges have negative weight but there are no negative cycles. Find the cost of the minimum-cost paths from $s$ to each vertex in time $O((k+1)!T)$ where $T$ is the maximum cost of executing Dijkstra on the nonnegative part of $G$ with arbitrary starting points, i.e., the cost of Dijkstra on $(V,E^*,w^*,s')$, where $E^*$ is the set of edges with non-negative weight and $w^*$ is the restriction of $w$ to $E^*$; we maximize over all choices of $s'\in V$.

**10.10 HW (min-cost paths in a DAG):**
(a) Let $G=(V,E,w,s)$ be a weighted, rooted DAG
with root $s\in V$, where $w : E \to \rrr$ is the weight function.
(Negative weights are permitted.) For each vertex $t\in V$, find
the minimum cost of reaching $t$ from $s$. Describe your algorithm in
elegant pseudocode; use the white/grey/black convention.
Assume $V=\{1,\dots,n\}$ and $G$ is topologically
sorted (if $i\to j$ is an edge then $i< j$). (This assumption is
justified by Exercise 8.3.) Your algorithm should run in linear time.
Prove that your algorithm is correct. Clearly state your loop invariant.

(b) Use part (a) to solve the *critical path* problem: given a
DAG with non-negative edge weights, find the *maximum cost* of
reaching $t$ from $s$.
**(8+1 points)**

10.11 DO **(Bellman-Ford algorithm):** Let $G=(V,E,w,s)$ be a digraph
with weighted edges and root $s\in V$.
Negative weights are permitted but no negative cycles.
For each vertex $t\in V$, find the minimum cost of reaching
$t$ from $s$. Describe your algorithm in pseudocode. The algorithm
should run in $O(mn)$ where $m$ is the number of edges and $n$ the
number of vertices. (Hint: dynamic programming, using recursion
on the additional parameter $\ell$: find the minimum cost of reaching
$t$ from $s$ in at most $\ell$ steps [using at most $\ell$ edges].)
This was one of the first
dynamic programming algorithms, from the 1950s.

10.12 DO: We are given an $r\times n$ array $L$ of zeros and ones. Let $S$ denote the set of rows of $L$. We want to set up a data structure that will answer membership queries in $S$ at the cost of $O(n^2)$ bit-operations, regardless of the value of $r$. The input to a membership query is an $n$-bit string $x$; the question is, does there exist $i$ such that $x$ is the $i$-th row of $L$. We are allowed to spend $O(rn)$ time and we can use $O(rn)$ space to build the data structure. (So, for instance, we cannot use an $r\times r$ array.) The algorithm must be deterministic. Describe in English how to build the data structure and how to answer the membership queries. Indicate why each task is solved within the required time bound.

** Ninth homework set. Due Tuesday, Feb 18.**
(Announced in class 2-13. Posted 2-18, after class.
If you forgot to submit this, please do later today (Tuesday, 2-18))

**9.1 HW (density of Fermat witnesses):**An integer $x$ is
*composite* if $x\ge 2$ and $x$ is not a prime.
Let $x\ge 2$ be an integer.
A *Fermat witness* (of the compositeness of $x$) is an
integer $w$ such that $1\le w\le x-1$, $\gcd(x,w)=1$, and
$w^{x-1}\not\equiv 1 \pmod x$. By Fermat's little theorem,
if a Fermat witness exists then $x$ is composite. Composite
numbers that do not have a Fermat witness are called
*Carmichael numbers*. They exist, but they are rare.
Assume $x$ is composite and not a Carmichael number (i.e., a
Fermat witness of the compositeness of $x$ exists).
Prove: for a random integer $y$ from the interval $1\le y\le x-1$,
the probability that either $y$ is not relatively prime to $x$
or $y$ is a Fermat witness of the compositeness of $x$
is at least 1/2. **(8 points)**

9.2 DO: Prove that an integer of the form $pq$ ($p,q$ primes) is never a Carmichael number.

9.3 DO: The smallest Carmichael number has three digits and is of the form $3pq$. Find this Carmichael number. (You don't need to prove that it is the smallest.)

** Eighth homework set. Due Tuesday, Feb 11.**
(Announced in class 2-6. Posted 2-10, 10pm)

8.1 DO: Study Depth-First Search (DFS)

**8.2 HW (strong connectedness):** Recall that a digraph $G=(V,E)$ is
*strongly connected* if $(\forall x,y\in V)(\exists x\to\dots\to y$
directed path$)$. Given a digraph $G$ (by an array of adjacency lists),
decide in linear time whether or not $G$ is strongly connected.
Do NOT use DFS. The essence of your answer should be two lines.
**(6 points)**

8.3 DO: Given a digraph, find a directed cycle if there is one; otherwise, topologically sort the vertices. Do this in linear time. Describe your solution in pseudocode. Do NOT use DFS.

8.4 DO: Suppose $(\forall n\ge 1)(T(n)\ge 0)$ and $T(n)\le T(\lfloor n/2\rfloor) + O(n)$. (a) Prove: $T(n)=O(n\log n)$. Do not assume that $n$ is a power of 2. (b) Prove the same conclusion under the weaker assumption that $T(n)\le T(\lceil n/2\rceil) + O(n)$.

8.5 DO **(nearest pair)**:
In class, we studied Shamos's divide-and-conquer algorithm to find
a nearest pair among a given set of points in the plane. Prove that
if $p_1,\dots,p_t$ are the points in the middle strip of width
$2d_0$, listed in increasing order of their $y$-coordinates,
then looking ahead by 7 steps suffices, i.e., if $|i-j|\ge 8$
then the distance between $p_i$ and $p_j$ is at least $d_0$.

8.6 CHALLENGE: Can the nearest-pair problem be solved in nearly linear time in 3D?

8.7 DO **(universal family of hash functions)**:
Prove: given $z\in\fff_p^n$, $z\neq 0$, the probability
that for a random $v\in\fff_p^n$ we have $z\cdot v=0$
is $1/p$. Here $\fff_p$ denotes the (residue classes of)
integers modulo the prime number $p$ (the "field of
order $p$") so we can take $\fff_p=\{0,1,\dots,p-1\}$
under mod$~p$ arithmetic. The dot product of
$z=(z_1,\dots,z_n)$ and $v=(v_1,\dots,v_n)$ is
$z\cdot v = \sum_{i=1}^n z_i v_i$ (again, in mod$~p$
arithmetic).

**HW statistics after 7th homework set** (26 students) (posted 2-10):

143 points: Max

140 points: Best

131 points: Upper quartile (75%)

127 points: Median

118 points: Lower quartile (25%)

** Seventh homework set. Due Thursday, Feb 6 (before class)**
(Posted 2-4 at 11am)

7.1 DO **(radix sort)**: Sort $n$ words of length $k$ alphabetically
in time $O(nk)$.
The alphabet is the integers $1,\dots, m$ where $m \le nk$.
Write your algorithm in elegant pseudocode.

7.2 DO **(radix sort with variable length words)**:
Let $w_1,\dots,w_n$ be a list of $n$ words over the alphabet
$\{1,\dots,m\}$. Let $|w|$ denote the length of the words $w$
(number of letters in $w$). Sort this list in linear time,
i.e., in time $O(N)$ where $N=\sum_{i=1}^n |w_i|$ is the length
of the input. Assume $m\le N$.

7.3 DO: Prove: for any real numbers $x,y$ we have $$|(\lceil x \rceil + \lfloor y\rfloor)- (\lfloor x\rfloor + \lceil y \rceil)| \le 1 .$$ (This was a lemma to the analysis of Batcher's sorting network.)

**7.4 HW (Batcher sort):** Draw a diagram of Batcher's sorting network
that sorts $n=8$ data. Your drawing should be elegant, well
organized. Clearly mark the various SORT subnetworks built
recursively (put a box around them).
**(6 points)**
Note: in class I only asked to draw a MERGE network.
Please draw the entire SORT network.

** Sixth homework set. Due Tuesday, Feb 4 (before class)**
(Posted 1-31 at 6:45pm, Problem 6.2 revised 2-1 at 3:30pm)

6.1 DO: Write a short and elegant pseudocode for the MERGE algorithm discussed in class.

**6.2 HW ($k$-way merge):** (Revised)
Merge $k$ sorted lists of data (real
numbers) in $O(n\log k)$ time using a heap. Here $n$ is the
total number of items on the $k$ lists combined. Describe your
algorithm in pseudocode. You do not need to describe how the
heap executes the various heap operations (EXTRACT-MIN, etc.).
**(6 points)**

**6.3 HW (UNION-FIND)** This problem is part of the analysis
of the hierarchical implementation of the UNION-FIND data
structure discussed in class. Prove that under the "deepest wins"
rule, the rooted trees constructed in each country will have
depth $\le \log_2 n$.
**(8 points)**

**6.4 HW (slow growth)** Recall from class that
$\log^*n = \min\{k \mid \TOWER_2(k)\ge n\}$ where
$\TOWER_a(0) = 1$ and $\TOWER_a(k+1) = a^{\TOWER_a(k)}$.
Prove that for all sufficiently large $n$
we have $\log^*n \le \log\log n$. State
how large is "sufficiently large."
*Hint.* First observe that this
problem is equivalent to saying that
for all sufficiently large $k$ we have
$\TOWER_2(k) \ge 2^{2^k}$. First prove
that for $k\ge 2$ we
have $\TOWER_2(k) \ge k+2$ and then bootstrap
from here. ("Bootstrapping" means using a
relatively weak result to prove a much stronger result.)
**(6 points)**

**6.5 HW (amortized analysis):**
Read the notes on amortized analysis.
Solve Problem 2 ("Queue via stacks") stated in the notes.
**(6 points)**

**6.6 HW (edit-distance):** Solve
the problem stated here.
**(6 points)**

6.7 DO **(all-ones square):**
Solve
the problem stated here.

** Fifth homework set. Due Thursday, Jan 30 (before class)**

**5.1 HW (necessity of non-negative weights in Dijkstra's algorithm) **
(posted 1-29 6:20pm, stated in class 1-23)
Construct an edge-weighted DAG (directed acyclic
graph) with exactly one negatively weigthed edge, on which Dijkstra's
algorithm fails. Use as few edges as possible. (You don't need to
prove the minimality of the number of edges used.) DRAW your DAG and describe
how the configuration changes step-by-step.
**(6 points)**

5.2 CHALLENGE. Find a bipartite graph such that under a random ordering of the vertices, the greedy coloring algorithm is likely to use a large number of colors.

5.3 DO: Let $G$ be a graph with $n$ vertices; let $\alpha(G)$ denote the idependence number of $G$ (maximum number of independent vertices). (a) Prove: $\alpha(G) \ge n/(d_{\max}+1)$ where $d_{\max}$ is the maximum degree. (b) Find an independent set of this size in linear time.

5.4 CHALLENGE: (a) Prove: $\alpha(G) \ge \sum_{i=1}^n 1/(d_i+1)$ where $d_i$ is the degree of the $i$-th vertex. (b) Find an independent set of this size (with large probability) by an efficient randomized algorithm.

**Erratum (1-28-2014).**
As pointed out by Nathan, in class I misstated the objective function
optimized by the *Viterbi algorithm.* The correct objective function
is the conditional probability $P(X \mid Y)$ where $Y$ is the given sequence
of $t$ observations and $X$ is any sequence of $t$ (hidden) states. For
any fixed $Y$ the task is to maximize $P(X\mid Y)$ over all the $S^t$
sequences $X$ of $t$ states. ($S$ is the number of states.)

** Fourth homework set. Due Tuesday, Jan 28 (before class)**

**Handout** Dynamic programming: the knapsack
problem posted 1-21 12:30pm. DO: Study the handout.

4.1 DO: The algorithm given in the handout describes how to find the maximum value $m$ under the given weight constraint. Modify the algorithm to also find the optimal set $S$ of items.

**4.2 HW:** Consider the following variant of the Knapsack problem.
Assume the values are positive integers (but the weights are positive reals).
Given a positive integer $V$, find the minimum weight of a set of items of
total value at least $V$. Find this minimum weight in $O(nV)$ steps where
$n$ is the number of items. Describe your solution in simple and elegant
pseudocode.
(Posted 1-21 12:40pm) **(8 points)**

**4.3 HW:** The Interval Sum problem.
(Posted 1-21 12:40pm) **(8 points)**

**4.4:** The Car Race problem.
DO: part (a). **HW**: parts (b) and (c).
(Posted 1-21 1pm) **(5+5 points)** (A correct solution to
(c) earns you the full 10 points (b+c).)

**Third homework set. Due Tuesday, Jan 21 (before class)**

**3.1 HW** The Greedy Matching algorithm.
Read the sheet, solve the Problem stated at the end.
(Posted 1-16 4:40pm) **(2+5+5 points)**

**3.2 HW** The Greedy Coloring algorithm.
Read the sheet, solve the Problem stated at the end.
(Posted 1-16 5:30pm) **(2+5+4+5 points)**

**3.3 HW** Sorting a heap.
(Posted 1-20 5pm. Assigned in class with detailed explanation
on 1-16.) **(8 points)**

**3.4 (Basic manipulation of the representation of a digraph)**
Our input is a directed graph (digraph)
given in *adjacency-list representation*
(i.e., by an array of adjacency lists) as explained in class.
The set of vertices is an array $A[1\dots n]$. Each vertex $i$ in the
array is the starting point of the linked list $\adj[i]$
of the out-neighbors of $i$.
We always put a link from the end of of the list to its start
and vice versa. Each register is capable of strong the name of a
vertex (a $\log_2 n)$-bit integer). One step of the algorithm
can read a register, follow a link, copy the contents of a register.
The number of vertices is $n$, the number of edges is $m$,
so the total length (number of registers and links)
of the description of the digraph is $O(n+m)$.
An algorithm of which the digraph is the input runs in
*linear time* if it takes $O(n+m)$ steps.

3.4(a) DO: Convert this representation to edge-list representation in linear time. ("Edge-list representation" is the list of vertices followed by the list of edges.)

3.4(b) DO: Convert the edge-list representation to adjacency-list representation in linear time.

**3.4(c) HW.** The "reverse" of a digraph $G$ is the digraph $G^{rev}$
which has the same vertices and every edge $(x,y)$ is replaced by its
reverse, $(y,x)$. It is trivial to construct $G^{rev}$ from $G$
is the edge-list representation (in linear time).
Build an adjacency-list
representation of $G^{rev}$ directly from an adjacency-list
representation of $G$, making a single pass
through the adjacency-list representation of $G$.
Write your solution is simple and elegant pseudocode.
Your cycle organization should be a nested pair of
**for** loops:

**for** $i=1$ **to** $n$

**for** $j\in \adj[i]$
**(3 points)**

Note: in class, this problem was listed as a "DO" exercise.
It has been upgraded to "HW" (to hand in) because the details
of your solution are essential to the solution of Problem 3.4(d).

**3.4(d) HW.** We say that an adjacency-list representation is
*monotone* if for each vertex $i$ the list of outneighbors
of $i$ is given in increasing order. Given an adjacency-list
representation of $G$, create the monotone adjacency-list representation
in linear time. The description of your algorithm should be no more
than three lines with reference to 3.4(c).
**(3 points)**

(Posted 1-20, 6pm. Assigned in class. This completes the problem set due Tuesday, Jan 21.)

**Handout** Binary Search to find item in
sorted array posted Jan 15. DO: Study it, compare it with your solution.

**Handout** on The Method of Reverse Inequalities
posted Jan 15. DO: Study it, answer the Questions,
solve the Exercises stated in the handout (by Jan 16).

Second homework set posted Jan 9, due Jan 16. Clarification to problem 2.5 added on Jan 15 at 8:30pm.

First homework set posted Jan 9, due Jan 14