$\newcommand{\nnn}{\mathbb N}$ $\newcommand{\zzz}{\mathbb Z}$ $\newcommand{\rrr}{\mathbb R}$ $\newcommand{\ccc}{\mathbb C}$ $\newcommand{\fff}{\mathbb F}$ $\newcommand{\eee}{\mathrm e}$ $\newcommand{\calA}{\mathcal A}$ $\newcommand{\calB}{\mathcal B}$ $\newcommand{\calC}{\mathcal C}$ $\newcommand{\calD}{\mathcal D}$ $\newcommand{\calP}{\mathcal P}$ $\newcommand{\OPT}{\mathsf{OPT}}$ $\DeclareMathOperator{\RE}{\mathcal{RE}}$ $\DeclareMathOperator{\RRR}{\mathcal{R}}$ $\newcommand{\xbar}{\overline{x}}$ $\newcommand{\ybar}{\overline{y}}$ $\newcommand{\vbar}{\overline{v}}$ $\newcommand{\Gbar}{\overline{G}}$ $\newcommand{\Lbar}{\overline{L}}$ $\newcommand{\vf}{\varphi}$ $\newcommand{\vt}{\vartheta}$ $\DeclareMathOperator{\Div}{\mathrm Div}$ $\DeclareMathOperator{\lcm}{\mathrm lcm}$ $\DeclareMathOperator{\var}{\mathrm Var}$ $\DeclareMathOperator{\cov}{\mathrm Cov}$ $\DeclareMathOperator{\period}{\mathrm per}$ $\DeclareMathOperator{\rank}{\mathrm rk}$ $\DeclareMathOperator{\trace}{\mathrm Tr}$ $\DeclareMathOperator{\span}{\mathrm Span}$ $\DeclareMathOperator{\sgn}{\mathrm sgn}$ $\DeclareMathOperator{\inv}{\mathrm Inv}$ $\DeclareMathOperator{\diam}{\mathrm diam}$ $\DeclareMathOperator{\Adj}{\mathrm Adj}$ $\DeclareMathOperator{\val}{\mathrm val}$ $\DeclareMathOperator{\3col}{\mathrm 3-COL}$ $\DeclareMathOperator{\PPP}{\mathrm P}$ $\DeclareMathOperator{\PPP}{\mathrm P}$ $\DeclareMathOperator{\NP}{\mathrm{NP}}$ $\DeclareMathOperator{\NPC}{\mathrm NPC}$ $\DeclareMathOperator{\coNP}{\mathrm coNP}$ $\DeclareMathOperator{\SAT}{\mathrm SAT}$ $\DeclareMathOperator{\COL}{\mathrm COL}$

CMSC 27230: Honors Theory of Algorithms -- Winter 2020

Homework and Material Covered






The notation "CLRS Chap. 4.1" refers to Chapter 4.1 of the Cormen et al. text.

The notation "LN 4.1.15" refers to the instructor's Discrete Mathematics Lecture Notes, Chapter 4, problem 4.1.15.

The notation "ASY 2.21" refers to problem 2.21 in the Asymptotic Equality and Inequality handout.

"DO" problems are strongly recommended to check your understanding of the concepts; do not hand them in.

Problems labeled "HW" must be handed in at the beginning of the class the next Monday unless expressly stated otherwise.

BONUS PROBLEMS are required only if you are looking for a grade of A or A-. Their point value tends to understate their difficulty, so solve the ordinary HW problems first. For a grade of A you need to earn at least 60% of the Bonus points (in addition to most of the ordinary HW points). For a grade of A- you need to earn at least 30% of the Bonus points (in addition to most of the ordinary HW points).

CHALLENGE PROBLEMS have no point value (but they earn you the instructor's attention) and no deadline (except they expire when discussed in class). Make sure you hand them in on a separate sheet; state your name, note that it is a challenge problem solution, and indicate the problem being solved. One further request: if you hand in a solution to a challenge problem, send email to the instructor giving the problem number and a brief indication of the problem. Such email creates an easy record and helps avoid mistakes in my records.

REWARD PROBLEMS are challenge problems that you should NOT hand in. The solution is its own reward. No deadline. Of course you may discuss your thoughts with each other, the TA, and the instructor.

If you suspect an error, please notify the instructor by email. If you are the first to report a serious error, you may earn bonus points. Merely stating in your solution that the exercise is wrong and giving a counterexample will not earn you the credit; report the error by email (along with a possible fix if you see one). Please also report if an exercise seems too easy for the number of points given; this may again be a sign of an error, and the straightforward solution will not earn you the credit. Apply your critical thinking to whatever the instructor says or writes.



Homework set #10 (due Wednesday, March 11 before class)

10.1 THEOREM (Agrawal-Kayal-Saxena (AKS) 2002). PRIMES $\in \PPP$. In other words, given a positive integer $x$ we can decide in polynomial time whether $x$ is a prime number.

10.2 HW (7 points) Recall the definition of the factoring decision problem, corresponding to the language FACT-DEC (Ex. 8.66). Prove: FACT-DEC $\in \NP\cap\coNP$. Use AKS.

10.3 CHALLENGE PROBLEM (no due date, no point value) Prove   FACT-DEC $\in\NP\cap\coNP$ without using AKS. Hint: Use the fact that there exists a primitive root $\bmod p$. (Look it up.)

10.4 DEF   Let $x=(x_1,\dots,x_n)$ and $y=(y_1,\dots,y_n)$ be real vectors. We say that $x\le y$ if $(\forall i)(x_i\le y_i)$.

10.5 DEF   The Linear Programming problem (LP) is the optimization problem $$ \max \{f(x) \mid Ax\le b \} $$ where $A$ is a real $k\times n$ matrix, $b=(b_1,\dots,b_k)^T$ is a column vector of length $k$ (the $T$ indicates transpose), i.e., $b$ is a $k\times 1$ matrix, and $f(x)=c_1x_1+\dots+c_nx_n$ is a linear objective function where $c=(c_1,\dots,c_n)$ is a real vector. The input is $(A,b,c)$; the output is the optimum.

10.5 DEF   The LP feasibility problem is the question whether there exists a real solution to the system $Ax\le b$ of inequalities. ($k$ inequalities in $n$ unknowns.) This decision problem is known to be in $\PPP$ for integer inputs (all entries of $A$ and $b$ are integers).
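For concreteness, here is a minimal sketch of feeding an LP instance $(A,b,c)$ to an off-the-shelf solver. It assumes SciPy is available; the particular matrix and vectors are made up for illustration. The same call also answers the LP feasibility question defined above: the solver reports whether the system $Ax\le b$ has any real solution at all.

```python
# Minimal LP sketch (assumes SciPy).  linprog minimizes, so we negate c to maximize,
# and we lift the default nonnegativity bounds since the LP above has none.
import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, 2.0],
              [3.0, 1.0]])              # k x n constraint matrix
b = np.array([4.0, 6.0])                # right-hand side (length k)
c = np.array([1.0, 1.0])                # objective coefficients

res = linprog(-c, A_ub=A, b_ub=b, bounds=[(None, None)] * len(c), method="highs")
if res.success:
    print("feasible; optimum =", -res.fun, "at x =", res.x)
else:
    print("infeasible or unbounded")    # in particular, this settles feasibility
```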

10.6 DEF   The integer linear programming (ILP) problem is the same as LP except we are only interested in integer solutions. The $(0,1)$-ILP problem is also the same but we are only interested in $\{0,1\}$ solutions (the value of each variable is 0 or 1). The corresponding feasibility problems are defined analogously.

10.7 HW (7 points) Show that the $(0,1)$-ILP feasibility problem is NP-complete by Karp-reduction from 3-COL (3-colorability).

10.12 REVIEW   Recall that $\tau(G)$ denotes the size of a minimum cover of a graph. A factor-2 approximation of the minimum cover problem requires finding a cover $C$ such that $|C|\le 2\tau(G)$. In class we have shown how to find such an approximation in polynomial time by writing the min cover problem as a $(0,1)$-ILP, solving its LP-relaxation (finding an optimal rational solution to the same set of constraints), and rounding.
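To make the reviewed scheme concrete, here is a sketch of the unweighted LP-relaxation-and-rounding procedure described above, assuming SciPy's LP solver; the small example graph is made up. Every edge has an endpoint whose fractional value is $\ge 1/2$, so the rounded set is a cover, and its size is at most twice the LP optimum, hence at most $2\tau(G)$.

```python
# Sketch of the in-class factor-2 approximation for minimum vertex cover:
# relax the (0,1)-ILP  min sum_v x_v  s.t.  x_u + x_v >= 1 for every edge {u,v}
# to 0 <= x_v <= 1, solve the LP, and keep every vertex with x_v >= 1/2.
# (Assumes SciPy; the example graph is made up.)
import numpy as np
from scipy.optimize import linprog

n = 5
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]     # the 5-cycle C_5, tau = 3

A_ub = np.zeros((len(edges), n))                     # x_u + x_v >= 1  <=>  -x_u - x_v <= -1
for row, (u, v) in enumerate(edges):
    A_ub[row, u] = A_ub[row, v] = -1.0
b_ub = -np.ones(len(edges))

res = linprog(np.ones(n), A_ub=A_ub, b_ub=b_ub, bounds=[(0, 1)] * n, method="highs")
cover = [v for v in range(n) if res.x[v] >= 0.5]     # the rounding step
print("LP optimum:", res.fun, " rounded cover:", cover)
```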

10.13 HW (7 points) Find, in polynomial time, a factor-2 approximation to the weighted minimum cover problem. The input is a graph $G=(V,E)$ and a function $w:V\to \rrr^+$ which assigns positive weights to the vertices. We seek to approximate the optimum

$$ \OPT=\min \left\{\sum_{i\in C} w_i \mid C \text{ is a vertex cover }\right\}$$ Prove that we can find a factor-2 approximation to $\OPT$, i.e., we can find a cover $C\subseteq V$ such that $\sum_{i\in C} w_i \le 2\OPT$, in polynomial time.

10.16 DO   (a) In a graph, the edges are pairs of vertices. In a 3-uniform hypergraph, the edges are triples of vertices. A cover is defined the same way as for graphs: it is a subset of the set of vertices that has at least one vertex from each edge. Give a factor-3 approximation to the minimum cover problem for 3-uniform hypergraphs in polynomial time. Use the LP/rounding approach.   (b) State the weighted version and solve it within a factor of 3 in polynomial time.

10.17 DO   It was stated in class that we cannot approximate $\alpha(G)$ (the independence number) within a factor of 100 in polynomial time, unless $\PPP = \NP$. We know that computing $\tau(G)$ exactly is essentially the same as computing $\alpha(G)$ exactly (because $\alpha(G)+\tau(G)=n$). How is it possible that by computing $\tau(G)$ approximately (within a factor of 2), we cannot infer a good approximation to $\alpha(G)$ ?



Homework set #9 (due Mon, March 9 before class except where otherwise stated)

9.1 DEF   The decision version of the MAXIMUM INDEPENDENT SET problem for graphs corresponds to the language IND-DEC $= \{(G,k) \mid G$ is a graph, $k\ge 1$ is an integer, and $G$ has an independent set of size $k\}$.

9.2 DO   Prove: IND-DEC $\in\NP$.
Solution:   Input: a pair $(G,k)$ where $G$ is a graph and $k\ge 1$ is an integer   Witness: an independent set of size $k$ in $G$.
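To connect this with the "judge" terminology of 8.80 below, here is a sketch of a polynomial-time verifier for IND-DEC; the concrete encoding of the instance (a vertex list plus an edge list) is an assumption made for this example.

```python
# Sketch of a polynomial-time "judge" for IND-DEC: given the instance (G, k) and a
# purported witness W (a set of vertices), check that W is an independent set of size k.
# The encoding of G as (vertex list, edge list) is an assumption of this example.
def ind_dec_judge(vertices, edges, k, witness):
    W = set(witness)
    if len(W) != k or not W <= set(vertices):
        return False
    return all(not (u in W and v in W) for (u, v) in edges)   # no edge inside W

# usage: the 4-cycle with witness {0, 2}
print(ind_dec_judge([0, 1, 2, 3], [(0, 1), (1, 2), (2, 3), (3, 0)], 2, [0, 2]))   # True
```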

9.3 THEOREM   IND-DEC is NP-complete.

9.4 PROOF PLAN: by Karp-reduction from 3-SAT. Given a 3-CNF formula $\vf$, we need to construct, in polynomial time, an input $(G_{\vf},k_{\vf})$ to IND-DEC such that
     $\vf$ is satisfiable if and only if $G_{\vf}$ has an independent set of size $k_{\vf}$.

9.5 DO   Check that this proof plan corresponds to the definition of Karp-reduction.

9.6 CONSTRUCTION.   Let $\vf = C_1\wedge \dots \wedge C_m$ where each $C_i$ is a 3-clause (disjunction of 3 literals).
We define $k_{\vf} :=m$. Now we construct the graph $G_{\vf}$.
To each occurrence of a literal in $\vf$ we assign a vertex in $G_{\vf}$. So $G_{\vf}$ will have $3m$ vertices. If $x_j$ is a literal in $C_i$, we create a vertex called $v_{i,j}$. If $\xbar_j$ is a literal in $C_i$, we create a vertex called $\vbar_{i,j}$. We shall say that the 3 vertices corresponding to the literals in $C_i$ form the column $V_i$ in $V$ (so $V$ is arranged in $m$ columns, each column having 3 vertices).

Having defined the vertices of $G_{\vf}$, we now define adjacency in $G_{\vf}$. We make the three vertices in column $V_i$ into a triangle, so for instance if $C_i = x_2 \vee \xbar_5 \vee x_7$ then we make the vertices $v_{i,2}$, $\vbar_{i,5}$, and $v_{i,7}$ pairwise adjacent. For $i\neq j$, if vertices $v_{i,\ell}$ and $\vbar_{j,\ell}$ exist, we make them adjacent. So we make a pair of vertices from different columns adjacent if their corresponding literals are each others' negation.
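For concreteness, here is a sketch of CONSTRUCTION 9.6 in code. The encoding of a 3-CNF formula as a list of integer triples, with $j$ standing for the literal $x_j$ and $-j$ for $\xbar_j$, is an assumption made for this illustration.

```python
# Sketch of CONSTRUCTION 9.6.  Each clause is a triple of nonzero integers; j encodes
# the literal x_j and -j encodes its negation (an encoding chosen for this example).
# The vertices of G_phi are the occurrences (i, literal): literal in clause C_i.
def ind_set_instance(clauses):
    vertices = [(i, lit) for i, clause in enumerate(clauses) for lit in clause]
    edges = set()
    for i, clause in enumerate(clauses):
        for a in range(3):                 # column V_i becomes a triangle
            for b in range(a + 1, 3):
                edges.add(frozenset({(i, clause[a]), (i, clause[b])}))
    for u in vertices:                     # across columns: complementary literals
        for v in vertices:
            if u[0] != v[0] and u[1] == -v[1]:
                edges.add(frozenset({u, v}))
    return vertices, edges, len(clauses)   # k_phi = m

# usage: phi = (x1 v ~x2 v x3) & (~x1 v x2 v x3)
V, E, k = ind_set_instance([(1, -2, 3), (-1, 2, 3)])
print(len(V), len(E), k)                   # 6 vertices, 8 edges, k = 2
```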

9.7 DO   Prove:   $\alpha(G_{\vf}) \le m$ (where $\alpha(G)$ denotes the independence number of the graph $G$, i.e., the maximum size of independent sets in $G$).

The next two exercises verify that we indeed constructed a Karp reduction.

9.8 DO   Prove: If $\alpha(G_{\vf})=m$ then $\vf$ is satisfiable.

9.9 DO   Prove: If $\vf$ is satisfiable then $\alpha(G_{\vf})=m$.

9.10 DO   Check that the two preceding exercises suffice to prove that CONSTRUCTION 9.6 is indeed a Karp reduction from 3-SAT to IND-DEC.

9.13 REVIEW the definition of the 3-DIMENSIONAL MATCHING problem discussed in class. Denote the corresponding language 3DM. We stated but did not prove that 3DM is NP-complete.

9.14 DEF   The SUBSET-SUM problem (SSS) takes as input a list $a_1,\dots,a_n,b$ of positive integers and asks whether there exists a subset $I\subseteq [n]$ such that $\sum_{i\in I} a_i = b$.

9.15 REVIEW   the Karp-reduction 3DM $\prec_{\text{Karp}}$ SSS. This proves that SSS is NP-complete.

9.16 HW (3+8 points)   (a) State the decision version of the integral Knapsack problem (both the weights and the values are integers). Call this problem KNAP-DEC.
(b)   Use the NP-completeness of SUBSET-SUM to prove that KNAP-DEC is NP-complete.

9.20 REVIEW from class.   Given $\epsilon > 0$, solve the KNAPSACK PROBLEM approximately with $\le \epsilon$ error, using $O(n^3/\epsilon)$ steps. Here $\epsilon$ is a given positive real; both the weights and the values are real. The steps are: comparison of reals, arithmetic operations with reals, rounding of reals, comparison and arithmetic of integers, moving/copying integers. Solving a maximization problem within $\epsilon$ means finding a solution that satisfies the constraints exactly and whose value $s$ satisfies $s \ge (1-\epsilon)\OPT$ where $\OPT$ is the optimal value.   Hint. Do scaling and rounding for the values but not for the weights (because the weight limit must be observed exactly). Use the value version of the Dynamic Programming solution (Ex. 3.8).

9.26 STUDY the "Branch-and-Bound" handout. Solve its DO exercises.

9.28 HW (6 points) The Branch-and-Bound solution of the Maximum Independent Set problem in the handout leads to the recurrence $$ T(n) \le T(n-1)+T(n-4)+O(n^2) $$ where $T(n) \ge 0$. Prove that this recurrence implies $T(n) = O(\psi^n)$ where $\psi\approx 1.38$ is the positive solution of the equation $x^4-x^3-1=0$.   (The LaTeX code of the Greek letter $\psi$ is $\tt \backslash psi$.)

9.32 HW (6 points) Prove:   $\NP\subseteq \RRR$.



Homework set #8 (due Mon, March 2 before class except where otherwise stated)

8.1 DEF   A DAG (directed acyclic graph) is a digraph without directed cycles.

8.2 DO   (a) Prove that a DAG with $n$ vertices has at most $\binom{n}{2}$ edges.   (b) For every $n$, find a DAG with $\binom{n}{2}$ edges.

8.4 DEF   A topological sort of a digraph $G=(V,E)$ is a numbering of the vertices as $V=\{v_1,\dots,v_n\}$ such that if $(v_i,v_j)\in E$ then $i < j$.

8.5 DO   Prove: If a digraph $G$ admits a topological sort then $G$ is a DAG.

8.6 DO   Prove the converse: every DAG has a topological sort.

8.7 Bonus (7 points, due Wed, March 4) Give a linear-time algorithm that takes a digraph as input (given by adjacency lists) and returns either a directed cycle or a topological sort. Do not use DFS (depth-first-search).

8.15 Bonus (8 points) Prove: PARITY$_n$ can be computed by a depth-3 Boolean circuit of size $O(n\cdot 2^{\sqrt{n}})$.

8.17 DO (Claude Shannon 1950: non-constructive proof of existence)   Prove that for every $n$ there exists a Boolean function $f$ with $n$ variables such that every Boolean circuit that computes $f$ has at least $2^{n/2}$ gates.

8.18 DO   In fact, most Boolean functions require at least $2^{n/2}$ gates. Estimate the probability that a random Boolean function can be computed by fewer than $2^{n/2}$ gates.

*     *     *

8.21 TERMINOLOGY   Let $\Omega$ be a set. Sometimes we refer to $\Omega$ as the universe. Recall that a predicate on $\Omega$ is a function $f:\Omega\to \{0,1\}$. Let $A\subseteq\Omega$. The characteristic function of $A$ relative to $\Omega$ is the predicate $f_A:\Omega\to\{0,1\}$ defined by $$ f_A(x) = \begin{cases} 1 &\text{ if }& x\in A \\ 0 &\text{ if }& x \in \Omega\setminus A \end{cases} $$ Recall that the correspondence $A \mapsto f_A$ establishes a bijection between the subsets of $\Omega$ and the predicates on $\Omega$. Deciding membership in $A$ amounts to evaluating the characteristic function of $A$. Using this correspondence, we shall interchangeably refer to subsets of our universe and the membership problems for the subsets, i.e., the predicates on the universe.

8.22 DEF   The universe to which we most frequently apply the correspondence between predicates and subsets is the set $\Omega = \Sigma^*$, the set of finite strings (words) over a finite set $\Sigma$ called the alphabet. $\Sigma^*$ includes the empty string $\Lambda$. A language over $\Sigma$ is a subset $L\subseteq \Sigma^*$. A decision problem over the alphabet $\Sigma$ asks to evaluate a predicate on $\Sigma^*$. This is the same as the membership problem for the corresponding language. As mentioned in the preceding item, languages over $\Sigma$ are in one-to-one correspondence with decision problems over $\Sigma$ (predicates on $\Sigma^*$). We shall use the terminology of decision problems and languages interchangeably.

8.23 EXAMPLES   (a)   $\{0,1\}^*$ encodes the nonnegative integers (by writing them in binary and omitting the 1 at the beginning - so the empty string encodes 0). PRIMES will refer to the set of strings that describe the prime numbers; it will also refer to the problem of primality (deciding whether a given number is prime), which is the membership problem in this set.
(b)   Consider a natural encoding of graphs over a finite alphabet $\Sigma$. Not all strings need to encode graphs, but it should be easy to decide whether a given string encodes a graph. Let 3-COL $\subset \Sigma^*$ denote the set of (codes corresponding to) 3-colorable graphs. 3-COL will also refer to the problem of 3-colorability.

8.24 NOTATION   Let $\calA$ be a code in your favorite universal programming language such as Python or C. Let $\Sigma, \Pi$ be finite alphabets; we refer to $\Sigma$ as the "input alphabet" and $\Pi$ as the "output alphabet." The symbol "$\infty$" should not be a member of either alphabet. If we run $\calA$ on input $x\in\Sigma^*$ then either the program terminates (halts) in a finite number of steps with output $y\in\Pi^*$, in which case we write $\calA(x)=y$, or the program never halts, in which case we write $\calA(x)=\infty$. For $x\in \Sigma^*$ we write $\calA[x]$ to denote the program that takes no input and runs exactly as $\calA$ runs on input $x$.

8.26 DEF   We say that a code $\calB$ is a halting program if $\calB$ does not take an input and halts in a finite number of steps. HALTING denotes the set of all halting programs. The HALTING problem is the membership problem in this set; it asks, given a Python code that does not take an input, does it halt (terminate)?

8.27 DEF   Let $\Sigma, \Pi$ be finite alphabets. We say that a function $f:\Sigma^*\to\Pi^*$ is computable (or recursive) if there exists a Python code $\calA$ that given input $x\in\Sigma^*$ runs and terminates (halts) in a finite number of steps and returns the output $f(x)$, so $(\forall x\in\Sigma^*)(\calA(x)=f(x))$. (So computable functions are the same as recursive functions.)

8.28 DO   The concept of a computable function does not depend on which universal language you choose (Python, C, Turing machine transition tables, etc.). Is this a theorem or a definition?

8.29 DEF   A set (language) $L\subseteq\Sigma^*$ is computable (recursive) if its characteristic function is computable. The set of computable sets is denoted $\RRR$. Non-recursive languages are called undecidable (no algorithm can decide membership in such a language).

8.30 DEF   A set (language) $L\subseteq\Sigma^*$ is computably enumerable (recursively enumerable) if there exists a Python code $\calA$ such that $(\forall x\in\Sigma^*)(x\in L \iff \calA(x)\neq\infty)$, i.e., $(\forall x\in\Sigma^*)(x\in L \iff \calA[x] \text{ is a halting program })$.

8.32 TERMINOLOGY: class vs. set.   We say that $\RRR$ and $\RE$ are classes of languages. This means the same as "sets of languages," so we use the term "class" as a synonym of "set." But there is a difference in usage: "class" does not refer to every set, only to subsets of something huge; in this case, we are talking about the subsets of the set of all languages.

8.33 DO     $\RRR \subseteq \RE$

8.34 DO   (a) 3-COL $\in \RRR$   (b) HALTING $\in \RE$.

8.37 THEOREM   The HALTING problem is undecidable. In other words, HALTING $\notin \RRR$.
Proof by "diagonalization" -- a reincarnation of Georg Cantor's method with which he proved that there is no bijection between the integers and the real numbers (itself a reincarnation of the ancient paradox "all Cretans lie").
Proof by contradiction. Assume HALTING $\in\RRR$ and let $\calA$ be a program that decides HALTING. Note that HALTING is a language over an alphabet $\Sigma$ which contains all the symbols required to describe a Python program as a string. Let $\calP$ denote the set of all syntactically correct Python programs. So $\calP\subseteq\Sigma^*$. We shall construct a program $\calB\in\calP$ with the following property: $$(\forall x\in\calP)(\calB(x)\neq\infty \iff x(x)=\infty)$$ In other words, $\calB$ halts on input $x$ if and only if $x$ (a Python program) does not halt on input $x$.
First we observe that this property is absurd: taking $x=\calB$ we get that $\calB(\calB)\neq\infty \iff \calB(\calB)=\infty$. This contradiction proves the Theorem.
Now we need to construct $\calB$. On input $x$, we apply $\calA$ to $x[x]$ (recall that $\calA$ always halts). If $\calA$ accepts $x[x]$ (i.e., the program $x[x]$ halts, so $x(x)\neq\infty$), then we send $\calB$ into an infinite loop, so $\calB(x)=\infty$. If $\calA$ rejects $x[x]$ (so $x(x)=\infty$), then we halt $\calB$, so $\calB(x)\neq\infty$. This completes the proof.
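The self-reference in this argument can be mirrored in code. In the sketch below, halts stands for the assumed decider $\calA$; since the theorem says no such decider exists, it appears only as a stub, and B is the program $\calB$ built from it.

```python
# Informal sketch of the diagonal program B.  The function `halts` plays the role of
# the hypothetical decider A: given the source of a no-input program, it would return
# True iff that program halts.  No such function can exist (that is the theorem), so
# it is only a stub marking the assumption.
def halts(program_source: str) -> bool:
    raise NotImplementedError("the decider A is assumed, not real")

def B(x: str) -> None:
    # x is the source of a Python program; x_on_x stands for the no-input program
    # x[x] (x run on its own source), written down as a string -- schematic only.
    x_on_x = f"run({x!r}, input_data={x!r})"
    if halts(x_on_x):        # A says x[x] halts ...
        while True:          # ... so B loops forever on input x
            pass
    return                   # A says x[x] does not halt, so B halts on x

# Feeding B its own source yields the contradiction: B(B) halts iff B(B) does not halt.
```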

8.40 NOTATION (set of complements)   For a language $L\subseteq\Sigma^*$ we write $\Lbar=\Sigma^*\setminus L$ for the complement of $L$ (with respect to the alphabet $\Sigma$; we usually omit mentioning $\Sigma$ which should be clear from the context). Let $\calC$ be a set of languages. Then co$\calC$ denotes the set of complements of the languages in $\calC$. So $L\in\text{co}\calC$ if and only if $\Lbar\in\calC$.
WARNING.   $\text{co}\calC$ is not the complement of $\calC$ but the set of complements of the languages in $\calC$.

8.41 DO   Let $\calC$ and $\calD$ be classes of languages. Prove: if $\calC \subseteq \calD$ then $\text{co}\calC \subseteq \text{co}\calD$.

8.43 HW (8 points, due Wed, March 4)   Prove:   $\RRR = \RE\cap\text{ co}\RE$. In other words, if a language $L$ and its complement are computably enumerable then $L$ is computable.

8.45 DO   Prove: a language $L\subseteq\Sigma^*$ belongs to $\RE$ if and only if either $L$ is empty or there exists a computable function $f:\{0,1\}^*\to\Sigma^*$ such that $L$ is the range of $f$, i.e., $L=\{f(x) \mid x\in\{0,1\}^*\}$.

*     *     *

8.50 DEF (Relativized computation)   An oracle $\calA$ is a device (black box) that magically computes a function $x\mapsto \calA(x)$. The oracle has an input tape, an output tape, and an activation switch. When activated, the oracle reads its input tape (containing $x$), prints the output $\calA(x)$, and resets the activation switch to "OFF." A computation relative to the oracle is, in addition to normal computation, allowed to write, from time to time, on the oracle's input tape, activate the oracle, and read the oracle's output. The operation of the oracle is free (but the computation incurs the cost of computing the inputs to the oracle and reading the outputs).

8.51 DEF   The meaning of relativized computation is that we reduce our computational task to another one (the one solved by the oracle). If we have an algorithm to compute the oracle function, we can replace the oracle calls by calls to a subroutine. We say that a function $f_1 : \Sigma_1^*\to \Pi_1^*$ is Turing-reducible to the function $f_2 : \Sigma_2^*\to \Pi_2^*$ if $f_1$ is computable relative to an $f_2$-oracle (an oracle that computes $f_2$).

8.52 DO   Assume $f_1$ is Turing-reducible to $f_2$. Prove:   (a)   If $f_2\in \RRR$ then $f_1\in\RRR$.   (b)   If $f_1$ is undecidable then $f_2$ is undecidable.   (c)   True or false: if $f_2$ is undecidable then $f_1$ is undecidable.  

8.53 Bonus (5+7 points, due Friday, March 6)   (a) Prove:   $\RE \neq \text{co}\RE$.
(b)   Prove that the following statement is false.
Assume the language $L_1$ is Turing-reducible to the language $L_2$. If $L_2\in \RE$ then $L_1\in\RE$.

8.54 DEF   Let $L_1\subseteq \Sigma_1^*$ and $L_2\subseteq \Sigma_2^*$ be languages. We say that a computable function $f:\Sigma_1^*\to\Sigma_2^*$ is a many-one reduction from $L_1$ to $L_2$ if $(\forall x\in\Sigma_1^*)(x\in L_1 \iff f(x)\in L_2)$. We say that $L_1$ is many-one reducible to $L_2$ if there exists such a computable function $f$; this circumstance is denoted $L_1 \prec_{\text{m-1}} L_2$.

8.55 DO   Let $L_1, L_2$ be languages. Assume $L_1 \prec_{\text{m-1}} L_2$. Prove:   (a) $L_1$ is Turing-reducible to $L_2$.   (b)   If $L_2\in \RRR$ then $L_1\in \RRR$.   (c)   If $L_2\in \RE$ then $L_1\in \RE$.   Contrast this last statement with 8.53 (b).

*     *     *

8.57 DEF   A relativized computation is polynomial time if the number of steps is polynomially bounded in terms of the bit-length of its input. The steps include the oracle calls. In particular, the length of the output of each oracle call must be polynomially bounded in terms of the length of the original input to the algorithm. (Otherwise the algorithm could not read the oracle's output.)

8.58 DEF   Let $f_i :\Sigma_i^*\to \Pi_i^*$ be functions for $i=1,2$. We say that $f_1$ is Cook-reducible to $f_2$ if $f_1$ can be computed in polynomial time relative to $f_2$, i.e., with reference to an oracle that computes $f_2$. We denote this circumstance by $f_1 \prec_{\text{Cook}} f_2$. Cook-reducibility is also called polynomial-time Turing-reducibility.

8.60 DO   If a function $f$ is Cook-reducible to a polynomial-time computable function $g$ then $f$ is polynomial-time computable.

8.62 DO (Optimization vs. decision)   Consider the integral KNAPSACK problem (review the handout). The input consists of a list of $n$ weights, $(w_1,\dots,w_n)$, a list of $n$ values, $(v_1,\dots,v_n)$, and a weight limit $W$. (All these parameters are positive integers.) We wish to find $$ \max\left\{\sum_{i\in I} v_i \;\middle|\; I\subseteq [n],\ \sum_{i\in I} w_i\le W\right\}.$$ The corresponding decision problem sets a target value $u\ge 0$ and asks if that value can be reached. We express this decision problem as the language $$ \text{KNAP-DEC} =\left\{(w_1,\dots,w_n, v_1,\dots,v_n, W, u) \;\middle|\; \left(\exists I\subseteq [n]\right)\left(\sum_{i\in I} w_i\le W \text{ and } \sum_{i\in I} v_i\ge u\right) \right\}.$$ Prove:   KNAPSACK $\prec_{\text{Cook}}$ KNAP-DEC. Let $t$ denote the number of oracle calls made by the algorithm. Express $t$ in terms of the input parameters. Prove that $t=O(N)$ where $N$ is the bit-length of the input.

8.64 DO   Prove:   KNAP-DEC $\prec_{\text{Cook}}$ KNAPSACK.

8.66 HW (8 points) The FACTORING problem takes as input a positive integer and asks to find its prime factorization. E.g., if the input is $360$ then the output is $2\cdot 2\cdot 2\cdot 3\cdot 3\cdot 5$. The decision version of this problem corresponds to the factoring language defined as the set FACT-DEC $=\{(x,y)\mid x\ge y\ge 2 \text{ and } (\exists d)(2\le d\le y)(d \text{ divides } x)\}$. Prove:   FACTORING $\prec_{\text{Cook}}$ FACT-DEC. If $x$ is an $n$-bit integer, show that your algorithm makes $O(n^c)$ oracle calls for some constant $c$. Make $c$ as small as possible.   WARNING: The definition of FACT-DEC was originally incorrect and was updated 2-29 at 10pm.

8.67 DO   Prove:   FACT-DEC $\prec_{\text{Cook}}$ FACTORING.

8.68 Bonus (8 points)   Assume we are given a 3-colorability oracle. (The input to the oracle is a graph, and the output is YES/NO depending on whether the graph is 3-colorable.)
We are given a graph $G=(V,E)$. The oracle tells us that $G$ is 3-colorable. Use the oracle to find a 3-coloring of $G$ in polynomial time.   Our job is to construct (adaptively) a sequence of a polynomial number of inputs to the oracle and use the answers to construct the coloring.

8.70 HW (7 points)  Prove that the relation of Cook reducibility is transitive. Specifically, let $f_i :\Sigma_i^*\to \Pi_i^*$ for $i=1, 2, 3$ (where the $\Sigma_i$ and $\Pi_i$ are finite alphabets). Assume that $f_1$ is Cook-reducible to $f_2$ in time $C_1n^{d_1}$ and $f_2$ is Cook-reducible to $f_3$ in time $C_2n^{d_2}$. Prove that then $f_1$ is Cook-reducible to $f_3$ in time $C_3n^{d_3}$ for some values $C_3$ and $d_3$ that depend only on $C_1,C_2,d_1,d_2$. Find the smallest values of $C_3$ and $d_3$ inferable from the given information.

*     *     *

8.75 DEF   The language class $\PPP$ is the set of polynomial-time computable languages, i.e., the languages $L$ such that the membership problem in $L$ can be solved in polynomial time.   WARNING: $\PPP$ does not include all polynomial-time solvable problems, only the polynomial-time solvable decision problems.

8.76 EXAMPLES. The following languages are known to belong to $\PPP$: connected graphs, bipartite graphs, planar graphs, bipartite graphs with a perfect matching, DAGs. As of 2002 we know that PRIMES $\in \PPP$ (Agrawal-Kayal-Saxena algorithm (AKS algorithm)).

8.77 DO   Prove:   $\PPP = \text{co}\PPP$.

8.80 DEF   (The class $\NP$) NP is a class of languages. Informally, a language $L$ belongs to NP if membership in $L$ has efficiently verifiable witnesses. So the "yes" answer to the decision problem (membership in $L$) has a "short proof." The "no" answer may or may not have a short proof. The validity of a witness is decided by a "judge" (a polynomial-time algorithm). In the formal definition below, $L_1$ is the judge, $w$ is the witness. $|x|$ denotes the length of the string $x$.
Formal definition. Let $L\subseteq\Sigma^*$. The language $L$ belongs to NP if
$(\exists L_1\in\PPP)(\exists C,\Sigma_1)(\forall x\in\Sigma^*) (x\in L \iff (\exists w\in\Sigma_1^*)((x,w)\in L_1 \wedge |w|\le |x|^C))$.

8.81 EXAMPLES.   The decision problems corresponding to the "puzzles" we have considered belong to NP:

8.82 HW (7 points, due Wed, March 4) Prove:   $\PPP\subseteq\NP$. You need to give an NP-definition of each language in P. Give an NP-definition with the shortest witnesses. It suffices, for each $L\in \PPP$, to state the input and the witness, so your solution should be very short (one line).

8.83 Bonus (8 points, due Friday, March 6) In a test, students were asked to reproduce the formal definition of NP. Kyle gave an exact copy of the definition given in 8.80 except he omitted the statement $|w|\le |x|^C$ (so his definition would permit long witnesses). Let us call the language class defined by Kyle's definition $\NP^*$. Determine $\NP^*$. Justify your answer.

*     *     *

8.86 DEF   (Karp-reducibility) Let $L_1\subseteq \Sigma_1^*$ and $L_2\subseteq \Sigma_2^*$ be languages. We say that a function $f:\Sigma_1^*\to\Sigma_2^*$ is a Karp-reduction from $L_1$ to $L_2$ if
    (a) $f$ is polynomial-time computable;
    (b) $(\forall x\in\Sigma_1^*)(x\in L_1 \iff f(x)\in L_2)$.
We say that $L_1$ is Karp-reducible to $L_2$ if there exists such a function $f$. We denote this circumstance by $L_1 \prec_{\text{Karp}} L_2$.
Karp-reducibility is also called "polynomial-time many-one reducibility." DO: compare the definition of Karp-reducibility with the definition of many-one reducibility (8.54).

8.87 DEF   We say that the language $L$ is $\NP$-complete if
    (a) $L\in \NP$
    (b) $(\forall M\in\NP)(M \prec_{\text{Karp}} L)$.
The class of NP-complete languages is denoted NPC.

8.88 COMMENT   In this well-defined sense, the NP-complete languages (decision problems) are the hardest problems in NP.

8.89 DO   All NP-complete problems are equally hard in the sense that any two NP-complete problems are mutually Karp-reducible.

8.90 DO   (a) If $(\exists L\in\NPC)(L\in\PPP)$ then $\NP=\PPP$.
(b) Prove: If we can decide 3-colorability in polynomial time then we can factor integers in polynomial time. Use the fact that 3-COL $\in\NPC$.

8.91 HW (7 points, due Fri, March 6)   Prove:   3-SAT $\prec_{\text{Karp}}$ HALTING.

8.93 QUESTION.   Do there exist NP-complete problems? Do there exist natural NP-complete problems? It turns out that most problems of the type "given a puzzle, is it feasible (does it have a solution)" are NP-complete. 3-colorability is such a puzzle.

8.94 THEOREM (Cook 1971, Levin 1973) SAT is NP-complete.

8.95 DO   Let $M\in\NP$ and $L\in \NPC$ be languages. Prove:   If $L \prec_{\text {Karp}} M$ then $M\in\NPC$.

8.96 DO   A language $L$ is NP-complete if and only if

8.100 Bonus (8 points, due Wed, March 4)   Prove:   SAT $\prec_{\text{Karp}}$ 3-SAT.   (Recall:   SAT denotes the set of satisfiable Boolean circuits, and 3-SAT the set of satisfiable 3-CNF formulas.)

8.101 THEOREM (Karp, 1972) 3-SAT $\prec_{\text{Karp}}$ 3-COL.   This proves that 3-COL is NP-complete.

8.102 In his seminal 1972 paper, Karp also showed that the following problems are NP-complete (this list is incomplete):

Subsequently, hundreds of problems of interest to operations research, machine learning, etc., were shown to be NP-complete. See the book "Computers and Intractability" by Garey and Johnson.

8.104 HW (6 points, due Wed, March 4) Let COVER-DEC $=\{(G,k) \mid G \text{ is a graph, } k \text{ is a positive integer, and } G \text{ has a vertex cover of size }\le k\}$.
Use the NP-completeness of CLIQ-DEC to prove that COVER-DEC is NP-complete.

8.109 DO   Prove:   $\PPP\subseteq \NP\cap \coNP$.

8.110 HW (8 points, due Fri, March 6)   The following two are famous conjectures.
1.   $\PPP\neq\NP$
2.   $\NP \neq \coNP$
Which of these is stronger (implies the other)? Prove your answer.



Homework set #7 (due Mon, Feb 24 before class except where otherwise stated)

7.1 REVIEW the "Dijkstra's algorithm" handout.

7.2 STUDY the "Loop invariants" handout.

7.3 DO   Solve the exercises in the "Loop invariant" handout. In particular, produce a complete proof of correctness of Dijkstra's algorithm.

7.4 STUDY the "Euclid's algorithm" handout. Understand the definition of "greatest common divisor." (It is not what you learned in school.) Understand the loop invariant in Euclid's algorithm. Use the loop invariant to prove correctness and thereby prove the existence of a greatest common divisor for every pair of integers.

7.4 STUDY the "Repeated squaring" handout. Understand the loop invariant in this algorithm and how to use it to prove correctness of the algorithm.

7.9 HW (6 points) Solve the problem stated in the "Edit-distance" handout.

*     *     *

7.10 DEF   Let $G=(V,E)$ be a graph. An independent set in $G$ is a subset $A\subseteq V$ such that no two vertices in $A$ are adjacent. The independence number of $G$ is the maximum size of its independent sets; it is denoted by $\alpha(G)$.

7.11 DEF   A subset $B\subseteq V$ is a clique if every pair of vertices in $B$ is adjacent. The clique number of $G$ is the maximum size of its cliques; it is denoted by $\omega(G)$.

7.12 DO   Prove:   $\omega(G)=\alpha(\Gbar)$.

7.13 DO   Prove: (a) $\alpha(C_n)=\lfloor n/2 \rfloor$ $(n\ge 3)$   (b) $\omega(C_n)=2$ if $n\ge 4$ and $\omega(C_3)=3$.

7.14 DEF   A vertex cover of $G$ is a subset $T\subseteq V$ such that every edge includes at least one vertex from $T$. In other words, $T$ "hits" every edge. The covering number of $G$ is the minimum size of its vertex covers; it is denoted by $\tau(G)$. (The LaTeX code of the Greek letter $\tau$ is $\tt \backslash tau$.)

7.15 HW (3 points, due Wed, Feb 26)   Prove:  $\alpha(G)+\tau(G)=n$

7.16 HW (4 points, due Wed, Feb 26)   Find a graph $G$ and a minimal vertex cover $T$ (meaning that no proper subset of $T$ is a vertex cover) such that $|T| > 100\tau(G)$. Make your graph as small as possible.

7.17 DO   Prove:   for all graphs $G$ we have $\nu(G)\le \tau(G)$ where $\nu(G)$ is the matching number (maximum size of a matching).

7.18 HW (5 points, due Wed, Feb 26)   Prove:   for all graphs $G$ we have $\tau(G)\le 2\nu(G)$.

7.19 Bonus (9 points, due Wed, Feb 26) Prove: If $G$ is a bipartite graph then $\nu(G)=\tau(G)$.   Use the Max-Flow-Min-Cut Theorem.

7.20 DEF   A Boolean variable takes values 0 or 1. A Boolean function in $n$ variables is a function $f(x_1,\dots,x_n)$ of $n$ Boolean variables that takes values 0 or 1, so we can write $f : \{0,1\}^n \to \{0,1\}$.

7.21 DO   Count the Boolean functions in $n$ variables.   The answer is $2^{2^n}$. Why?

7.22 DEF   A literal is a Boolean variable $x_i$ or its negation $\xbar_i$. A disjunctive clause is an OR of literals, e.g., $x_3 \vee \xbar_7 \vee \xbar_8 \vee x_{10}$. A CNF (Conjunctive Normal Form) is an AND of disjunctive clauses: $C_1 \wedge \dots \wedge C_m$ where the $C_i$ are disjunctive clauses. Analogously, a conjunctive clause is an AND of literals, and a DNF (Disjunctive Normal Form) is an OR of conjunctive clauses.

7.23 DO (De Morgan's laws)   If $f$ and $g$ are Boolean functions then (a)   $\overline{f \vee g}={\overline f} \wedge {\overline g}$   (b)   $\overline{f \wedge g}={\overline f} \vee {\overline g}$

7.24 DO   Prove: every Boolean function can be written as a CNF and also as a DNF.

7.25 DEF (Parity function)   PARITY$_n(x_1,\dots,x_n)=\left(\sum_{i=1}^n x_i \bmod 2\right)$ where the $x_i$ are Boolean variables. Another notation for this Boolean function is $x_1 \oplus \dots \oplus x_n$.

7.26 DO   Find a CNF and a DNF for $x\oplus y$.
Answer (verify!):    $x\oplus y = (x \vee y)\wedge(\xbar \vee \ybar) = (x \wedge \ybar)\vee(\xbar \wedge y)$.

7.27 Bonus (5 points, due Wed, Feb 26) Prove: if we express PARITY$_n$ as a CNF then the expression will have at least $2^{n-1}$ disjunctive clauses.

No more items due this week -- no problems due Friday, Feb 28. Problems assigned in class but not yet posted will be due Monday, March 2.



Homework set #6 (due Mon, Feb 17 before class except where otherwise stated)

6.1 DO   Let $f$ be an $s\to t$ flow in a flow network and $(A,B)$ an $s\to t$ cut. Prove: $\val(f)\le c(A,B)$ where $\val(f)$ denotes the value of the flow and $c(A,B)$ the capacity of the cut.

6.2 DO   Review the Max-Flow-Min-Cut Theorem (Ford-Fulkerson, 1956):
$$ \max_f \val(f) = \min_{A,B} c(A,B) $$ where the maximum is taken over all $s\to t$ flows $f$ and the minimum is taken over all $s\to t$ cuts $(A,B)$.
Note that the fact that $\max_f \val(f)\le \min_{A,B} c(A,B)$ is immediate from the preceding exercise, so the essence of the result is the statement that there exist a flow $f$ and a cut $(A,B)$ such that $\val(f) = c(A,B)$.

6.5 HW (8 points) Use the Ford-Fulkerson algorithm to find a maximum matching in a bipartite graph in $O(mn)$ time in the unit cost model (arithmetic with $O(\log n)$-digit numbers incurs unit cost).
So your input is a bipartite graph $G$ given in adjacency list representation. The output should be a maximum matching in $G$, i.e., a maximum possible number of pairwise disjoint edges. What you need to do is reduce this question to a max-flow problem by constructing a flow network $H$ in which the maximum flows correspond to the maximum matchings in $G$.

6.7 HW (6 points) Let $f:\nnn\to \nnn$ be a monotone non-decreasing function (i.e., a function such that $x\le y \implies f(x)\le f(y)$) satisfying the recurrent inequality $$ f(n) \le f\left(\left\lfloor\frac{n}{5}\right\rfloor\right) + f\left(\left\lfloor\frac{7n}{10}\right\rfloor\right) + O(n). $$ Prove:   $f(n) = O(n).$

6.8 DO   Review SELECTION by rank using a linear number of comparisons. The input is a list $L$ of $n$ items (real numbers) and an integer $1\le r\le n$. The task is to find the item of rank $r$ in $L$, i.e., the $r$-th smallest among the items on the list. The only information we can get about the items on the list is pairwise comparisons. Use the preceding exercise. Do not ignore rounding. In using the preceding exercise, you will need to justify the rounding used there. Give a detailed account of what goes into the $O(n)$ term; part of it is to deal with rounding.

6.9 DO   We are given a list of $n$ items (real numbers) where $n$ is odd. We are asked to decide whether the first item on the list is the median, using comparisons only. Prove: we need at least $n-1$ comparisons in the worst case.

6.10 HW (2+(0+3+5+7+2) points) Recall that a spanning subgraph of a graph $G=(V,E)$ is a subgraph $H=(W,F)$ such that $V=W$ (same set of vertices) and $F\subseteq E$.
(a) Count the spanning subgraphs of $G$. Your answer should be a very simple formula in terms of the basic parameters of $G$. State your answer, do not prove.
(b) Given a graph $G$ with $m$ edges, we wish to find a spanning bipartite subgraph with at least $m/2$ edges. We use the following algorithm. Initially we color each vertex red or blue, arbitrarily (in any way). We say that vertex $v$ is bad if $v$ has more neighbors of its own color than of the opposite color. We say that the edge $e=\{a,b\}$ is bad if $a$ and $b$ have the same color; $e$ is good otherwise.
(b1) Observe that the set of good edges forms a bipartite spanning subgraph. More formally, let $G=(V,E)$ be the given graph and let $A\subseteq E$ be the set of good edges. Then $(V,A)$ is a spanning bipartite subgraph of $G$.
(b2) Prove: if there are no bad vertices then there are at least $m/2$ good edges.

Now here is the algorithm. While there exists a bad vertex, pick a bad vertex and recolor it. End(while).
Item (b2) shows that if this algorithm ever terminates, we found the desired subgraph.
(b3) Prove that the following statement is false: "The number of bad vertices goes down in every round." In fact, it is possible that in some round, the number of bad vertices increases by 100. Give an example where this happens. Your example should be as small as possible (minimum number of edges). State the number of edges in your example. You do not need to prove minimality.
(b4) Prove that the algorithm terminates in at most $m$ rounds.
(b5) Prove that every graph $G$ has a bipartite spanning subgraph with at least $m/2$ edges.

*     *     *

6.15 STUDY the following material from the instructor's Fall 2019 course " Introduction to Mathematical Reasoning via Discrete Mathematics" (MR). The numbers below refer to the "Homework, material covered" section.

6.16 DO   Prove the Fundamental Theorem of Equivalence Relations (MR 7.12-7.15).

6.17 HW (5+2 points, due Wednesday, Feb 19) (a) Let $x$ be an integer. Prove: $\gcd(3x+8,5x+1)$ is either 1 or 37. (Proof: one line.)
(b) Find the smallest positive integer $x$ such that the gcd is 37. Describe how you found it. No proof required.

6.18 HW (7 points, due Wednesday, Feb 19) Prove: $(\forall a)(a^{13}\equiv a \pmod{91})$.   Give a simple proof; avoid numerical calculations. You may use the exercises and theorems in the sections of MR listed above. (State what you use.) [Typo in the equation fixed 2-16, 10:40pm]

6.19 HW (7 points, due Wednesday, Feb 19) Prove: Euclid's algorithm computes the gcd of two $n$-bit integers in $O(n^3)$ bit-operations. Therefore Euclid's algorithm runs in polynomial time.   To prove this result, solve Exercise 2.3 in the "Euclid's algorithm" handout. You may assume Exercise 2.4 of the handout without proof.

6.22 STUDY the "Repeated squaring" handout.

6.23 DO   Consider the following computational task. Input: Nonnegative integers $a,b$. Output: the power $a^b$.   Prove: this task cannot be solved in polynomial time.   (Hint: the output is too long)

6.24 DO (Modular exponentiation)   Prove: Given $a,b,m$ positive integers, the quantity $(a^b \bmod{m})$ (the remainder of $a^b$ when divided by $m$) can be computed in polynomial time.
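For concreteness, a sketch of the repeated-squaring computation behind this exercise is below; in Python, the built-in pow(a, b, m) computes the same quantity.

```python
# Sketch of modular exponentiation by repeated squaring: O(log b) multiplications,
# each involving numbers smaller than m, so the whole computation is polynomial in
# the bit-length of the input.  (Python's built-in pow(a, b, m) does the same.)
def mod_pow(a: int, b: int, m: int) -> int:
    result = 1
    a %= m
    while b > 0:
        if b & 1:                     # current binary digit of b is 1
            result = (result * a) % m
        a = (a * a) % m               # square
        b >>= 1
    return result

print(mod_pow(7, 128, 13), pow(7, 128, 13))   # both print 3
```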

6.25 DO   Let $M$ denote the $2\times 2$ matrix $$M=\left(\matrix{0 & 1 \\ 1 & 1}\right) .$$ Determine the matrix $M^k$. Prove your answer. The entries of the matrix will come from a familiar sequence.   Hint. Experiment with small values of $k$. Observe a pattern, make a conjecture. Prove your conjecture by induction.

6.26 DO   Let $F_k$ denote the $k$-th Fibonacci number. ($F_0=0, F_1=1, F_k = F_{k-1}+F_{k-2}$). Prove: $\displaystyle{F_k \sim \frac{\phi^k}{\sqrt{5}}}$ where $\displaystyle{\phi = \frac{1+\sqrt{5}}{2}\approx 1.618}$ is the golden ratio.

6.27 Bonus (5+2 points, due Wednesday, Feb 19) Let $k$ be a positive integer (given in binary). (a) Determine asymptotically the number of bits of the Fibonacci number $F_k$.   (b) Prove: $F_k$ cannot be computed in polynomial time.

6.28 Bonus (7 points, due Wednesday, Feb 19) Let $k$ and $m$ be positive integers (given in binary). Compute $(F_k \bmod m)$ in polynomial time. Give a specific upper bound on the running time of your algorithm in the form $O(n^C)$ where $n$ is the bit-length of the input and $C$ is a constant. Prove your bound. Find the smallest value of $C$ for which you can prove such a bound.



Homework set #5 (due Mon, Feb 10 before class except where otherwise stated)

5.1 DO   Recall that Dijkstra's algorithm solves the single-source min-weight paths problem for the case that all weights are non-negative. Show that Dijkstra's algorithm fails when negative weights are permitted. Find a counterexample that is acyclic (has no directed cycles). Make your counterexample as small as possible.

HW 5.2 (4+4+4+6 points) We consider flow networks with positive integer capacities. We call such a network nontrivial if there exists a directed $s\to\dots\to t$ path. We call an edge critical if decreasing the capacity of this edge results in a decrease in the maximum flow. We call an edge a bottleneck edge if increasing its capacity results in an increase in the maximum flow.
For each statement below, decide whether the statement is true for all nontrivial flow networks with positive integer capacities. If true, prove; if false, provide a counterexample.
(a) a critical edge always exists
(b) a bottleneck edge always exists
(c) every critical edge is a bottleneck edge
(d) every bottleneck edge is a critical edge.

Give the smallest possible counterexamples (minimum possible number of edges). (You do not have to prove minimality.) State your counterexamples in adjacency list form with associated capacities (so each entry of the adjacency list $\Adj[u]$ will be a pair $(v,w)$ where $v$ is a vertex and $w$ is the capacity of the edge $(u,v)$). Clearly indicate the source and the target nodes (the vertices $s$ and $t$). Discuss why your example is indeed a counterexample. Accompany your solution with a (hand-drawn) picture of the counterexample.

5.5 Bonus (10 points) Let $\calA$ be an algorithm that takes as input a heap-sorted list of data and returns the data in a sorted list. $\calA$ is a comparison-based algorithm: the only thing it can do with the data is pick two of them and compare them. Let $t(n)$ be the number of comparisons performed by $\calA$ in the worst case, i.e., the maximum number of comparisons performed by $\calA$ on inputs of size $n$ ($n$ items). Prove:   $t(n)\gtrsim n \log_2 n$.
WARNING: this is NOT a question about Heapsort. The algorithm has random access to the data, it is allowed to compare any two of the items. (The algorithm may ignore the heap or use some of the comparison information encoded in the heap.) Bookkeeping is free; this includes copying, comparing, and doing arithmetic with addresses, recording the outcomes of previous comparisons, etc. All computation is also free; this includes making the decision what pair of items to compare next, based on the outcome of previous comparisons.

5.6 DEF   Let $f: \nnn\to\rrr$ be a function. We say that $f$ is polynomially bounded if there exists a constant $c$ such that $f(n) = O(n^c)$.

5.7 HW (5 points) Let $f: \nnn\to\rrr$. Assume $f(n)\ge 1$ for all sufficiently large $n$. Prove: $f$ is polynomially bounded if and only if $\ln f(n) = O(\ln n)$.

5.8 HW (5 points)   Find a function $f$ that is not polynomially bounded such that for every $\epsilon > 0$ the function $f$ satisfies $\displaystyle{f(n) < \eee^{n^{\epsilon}}}$ for all sufficiently large $n$.   [WARNING: the phrase "for all sufficiently large $n$" was added Feb 9 at 10pm]

5.9a DEF   We say that an algorithm runs in polynomial time if the number of steps it takes on an input of bit-length $n$ is polynomially bounded as a function of $n$.

5.9b DO   Let $x$ be a positive integer. Prove that the number of bits (binary digits) of $x$ is $\lceil \log_2 (x+1) \rceil$. Prove that this quantity is asymptotically equal to $\log_2 x$ (as $x\to\infty$).

5.10 HW (3+10 points) Consider the network flow problem with integer capacities. Let our input digraph have $n$ vertices, $m$ edges, and let $C_{\max}$ be the maximum capacity: $C_{\max}=\max_{e\in E}c_e$. We have shown that the Ford-Fulkerson algorithm takes at most $n\cdot C_{\max}$ rounds and each round takes linear time.
(a) Does it follow from this information that the F-F algorithm does not run in polynomial time? (Careful, the answer is "no" -- why?)
(b) Prove that F-F in fact does not run in polynomial time in the worst case. -- What you need to show is an infinite sequence of instances of the execution of the F-F algorithm such that the running time on those instances is not polynomially bounded. An "instance" is a flow network (the input); describe and attach a (hand) drawing. An "execution" needs to specify your choice of the augmenting path in case there are several $s\to t$ paths in the residual digraph $G_f$. Estimate the bit-length of the input and the running time for each of your instances and show that the latter is not polynomially bounded as a function of the former.

5.11 HW (6+9B points) (NOTE: please check the parenthetical explanations added to the first paragraph on 2-10, 1am. If these explanations will result in a change of your solution, please let me know by Monday 3pm and email me the updated solution by Tuesday 10am.)
We have shown that the scaling version of the Ford-Fulkerson algorithm makes at most $2m(1+\log_2 C)$ rounds where $C=\sum_{e\in E}c_e$ is the total capacity. (The outer loop makes at most $\lceil \log_2 C\rceil$ rounds, the inner loop $\le 2m$ rounds for each value of the scaling parameter $\Delta$.) Each round of the inner loop takes $O(m\log_2 C)$ time. (It requires $O(m)$ arithmetic operations with $O(\log C)$-digit numbers.) So the running time is $O(m^2 (\log C)^2)$. (Warning: in class, the complexity was stated as $O(m^2 \log C)$, which is the number of arithmetic operations, but we need to consider the bit-complexity of those operations, which adds the $\log C$ factor.) Let $N$ denote the bit-length of the input.
(a) Show that the algorithm runs in time $O(N^4)$. Do not forget to state a lower bound on $N$ in terms of the variables used in the upper bound on time.
(b) (Bonus) Let $C_{\max}$ denote the maximum capacity. Assume at least half the edges $e\in E$ have capacity $c_e \ge C_{\max}/10$. Prove that the running time is $O(N^2 (\log N)^2)$.

5.12 DO   Solve the First Quiz problems.



Homework set #4 (due Mon, Feb 3 before class except where otherwise stated)

Remember that several problems in HW set #3 are also due Feb 3.
WARNING. It turns out that some of the assigned problems were discussed in the discussion session. Those problems have not been canceled but their point value has been reduced.

QUIZ Thu Feb 6   Ry-276   4:30-5:00 (30 min during discussion session)

4.1 DO Let $m(h)$ denote the maximum number of nodes of a balanced binary tree of height $h$. (a)   Prove: $m(h)= 2^{h+1}-1$.
(b) Prove: if a balanced binary tree has $n$ nodes and its height is $h$ then $2^h\le n < 2^{h+1}$ and therefore $h = \lfloor \log_2 n\rfloor$.

4.2 DO Review the array implementation of a balanced binary tree link structure.

4.3 DO Study the Heap data structure.

4.4 DO Write a pseudocode for the bubble-up and bubble-down operations in a heap.

4.5 DO Write a pseudocode for the HEAPIFY algorithm that arranges $n$ data in a heap.
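One way such pseudocode could look, written here as Python over a 0-indexed array (the class may index from 1), is sketched below: bubble-down repairs the heap property at a node whose two subtrees are already heaps, and HEAPIFY applies it to the internal nodes bottom-up, which is what the $O(n)$ bound of 4.7 rests on.

```python
# One possible array-based min-heap sketch (0-indexed; the class may use 1-based indexing).
def bubble_down(a, i, n):
    while True:
        left, right, smallest = 2 * i + 1, 2 * i + 2, i
        if left < n and a[left] < a[smallest]:
            smallest = left
        if right < n and a[right] < a[smallest]:
            smallest = right
        if smallest == i:                       # heap property holds at i
            return
        a[i], a[smallest] = a[smallest], a[i]   # swap with the smaller child
        i = smallest

def heapify(a):
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):         # internal nodes, bottom-up
        bubble_down(a, i, n)

data = [9, 4, 7, 1, 3, 8]
heapify(data)
print(data[0])                                  # the minimum is now at the root: 1
```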

4.6 DO Prove:   $\displaystyle{\sum_{k=0}^{\infty}\frac{k}{2^k}=}2$.
For $|q| < 1$ compute the sum   $\displaystyle{\sum_{k=0}^{\infty}kq^k}.$   Give a simple closed-form expression.
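One standard route to the closed form (the term-by-term differentiation of the geometric series, valid for $|q|<1$, needs to be justified): $$\sum_{k=0}^{\infty} q^k=\frac{1}{1-q} \quad\Longrightarrow\quad \sum_{k=1}^{\infty} kq^{k-1}=\frac{1}{(1-q)^2} \quad\Longrightarrow\quad \sum_{k=0}^{\infty} kq^k=\frac{q}{(1-q)^2},$$ and substituting $q=1/2$ gives the value $2$.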

4.7 DO Use the preceding exercise to prove that Heapify uses $O(n)$ comparisons.

4.8 DO Describe the HEAPSORT algorithm in elegant pseudocode.

4.9 DO   A PRIORITY QUEUE is a dynamic data structure that serves the following instructions: CREATE EMPTY QUEUE, INSERT, CHANGE-KEY, EXTRACT-MIN. Each data is a pair (node,key) where "key" is a real number that can be updated.
(a) Review the Heap implementation of the PRIORITY QUEUE.
(b) Write pseudocodes for the Heap implementation of the INSERT, CHANGE-KEY, EXTRACT-MIN operations.

4.10 STUDY the "Dijkstra's algorithm" handout.

4.11 DO   Dijkstra's algorithm uses a Priority queue and therefore the complexity of the algorithm depends on the specific implementation of the Priority queue. Show that with the Heap implementation of the Priority queue, Dijkstra's algorithm runs in time $O((n+m)\log n)$.
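For concreteness, here is a compact sketch of Dijkstra's algorithm with a binary heap, using Python's heapq. Instead of an explicit CHANGE-KEY it pushes a new (distance, vertex) pair and skips stale entries when they are popped; this is a common variant and may differ from the handout's. Each edge is scanned at most once and causes at most one push, which gives the $O((n+m)\log n)$ bound.

```python
# Sketch of Dijkstra with a binary heap (lazy-deletion variant of CHANGE-KEY).
import heapq

def dijkstra(adj, s):
    # adj[u] is a list of (v, w) pairs with w >= 0 (nonnegative weights)
    dist = {s: 0}
    heap = [(0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                               # stale heap entry, skip
        for v, w in adj[u]:
            if d + w < dist.get(v, float("inf")):  # a shorter path to v was found
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist

adj = {0: [(1, 2), (2, 5)], 1: [(2, 1)], 2: []}
print(dijkstra(adj, 0))                            # {0: 0, 1: 2, 2: 3}
```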

4.12 HW ($k$-way merge) (12 points) Given $k$ sorted lists of data (real numbers), merge them into a single sorted list using comparisons. Your algorithm should cost $O(n \log k)$ (including bookkeeping), where $n$ denotes the total number of items on the $k$ lists combined, and there is unit cost associated with each elementary operation with real numbers (comparison, copying, deleting) and each link update. Describe your algorithm in clean pseudocode. In the description you may refer to high level data structure commands implemented in class. Indicate why the algorithm will have the complexity (cost) stated. Comment. Copying real numbers can be replaced by link updates, so the only operation with real numbers that we need is comparison.

*     *     *

In the rest of this section we consider graphs (by the definition used in this class, this means undirected graphs).

4.13 STUDY the "Greedy coloring" handout. Below we refer to this handout as "GRCOL".   The handout begins with the definition of the chromatic number of a graph, denoted $\chi(G)$. We say that $G$ is $k$-colorable if $k$ colors suffice for a legal coloring, i.e., if $\chi(G)\le k$. A graph is bipartite if it is 2-colorable.

4.14 DO   Recall that $C_n$ denotes the cycle of length $n$ ($n\ge 3$) and $K_n$ the clique of size $n$ ($n\ge 0$). Let $G$ be a graph with $n$ vertices. Prove: (a) $\chi(G)\le n$   (b) $\chi(G)=n \iff G\cong K_n$   (c) $\chi(C_n)=2$ if $n$ is even and $\chi(C_n)=3$ if $n$ is odd.

4.15 DO   If $H\subseteq G$ (subgraph) then $\chi(H)\le \chi(G)$.

4.16 DEF   $G$ is bipartite if and only if $G$ contains no odd cycles. -- This gives a necessary and sufficient condition of bipartiteness. One of these two is obvious - which one?

4.17 DO   A graph is "triangle-free" if it contains no $K_3$. (a) Find the smallest triangle-free graph that is not 2-colorable ($\chi(G)\ge 3$).   (b) Find the smallest $K_4$-free graph that is not 3-colorable ($\chi(G)\ge 4$).

4.18 HW (6 points) Find a triangle-free graph that is not 3-colorable. The smallest such graph has 11 vertices and can be placed in the plane so that it has a 5-fold rotational symmetry: by rotating the plane by $72^{\circ}$, we get the same picture. (Edges of the picture will cross. Other graphs with 5-fold rotational symmetry include the pentagon ($C_5$), the 5-clique ($K_5$), the 6-clique ($K_6$), and the Petersen graph.) Make a nice drawing of your graph (by hand). You do not need to prove that your graph is triangle-free, but you have to prove that it is not 3-colorable. You do not need to prove that this is the smallest graph with the required properties.

4.18 DO   GRCOL Problem (a) (greedy coloring uses no more than $1 + \deg_{\max}$ colors).

4.19 HW (6 points) GRCOL (b) ("Greedy coloring is terrible")

4.20 Bonus (5 points) GRCOL (c) ("Greedy coloring can be optimal")

4.21 Bonus (5 points) GRCOL (d) (linear-time implementation of greedy coloring)

4.22 STUDY the "Greedy matching" (GRMATCH) handout.

4.23 HW (2+2 points) Determine $\nu(C_n)$ and $\nu(K_n)$. Give the answers as simple formulas; no proof needed.

4.24 HW (3 points) Find a graph $G$ with 100 edges and $\nu(G)=1$.

4.25 HW (4+5 points) GRMATCH (a1) $+$ (a2).

4.26 Bonus (3 points) GRMATCH (b).



Homework set #3 (due Mon, Jan 27 before class except where otherwise stated)

3.1 DO Study the BFS (Breadth-First Search) handout. Below we refer to exercises stated in that handout.
Solve all exercises stated in the BFS handout. (Some of them are assigned as HW problems below.)

3.2 HW (5 points) BFS Ex.4(c)

3.3 HW (7 points) BFS Ex.6

3.4 HW (7 points) BFS Ex.7

3.5 HW (3+3 points) BFS Ex.8

3.6 Bonus (8 points) BFS Ex.10 (parent tree)

3.7 HW (8 points) Handout:   All-ones square.   Note: this is a dynamic programming problem. Write your solution in elegant pseudocode. Half the credit goes for the "brain" of the algorithm: a simple and accurate definition of the meaning of the entries of the array you will fill incrementally.

3.8 Bonus (Integer Valued Knapsack) (14 points) Consider the Knapsack problem with integer values (rather than integer weights, as in class - now the weights and the weight limit are reals). Let $V$ denote the sum of all values. Solve the knapsack problem at cost $O(nV)$. (Same warning as for the preceding problem.)

3.9 Bonus (3 points) Prove for every $n\ge 0$ the inequality   $\displaystyle{n! \ge \left(\frac{n}{\eee}\right)^n}$.
Hint.   Use the power series expansion of the $\eee^x$ function:
$$\displaystyle{\eee^x =\sum_{k=0}^{\infty} \frac{x^k}{k!}}$$ Note:   Every empty product is defined to have value 1, including $0! = 0^0 = 1$.

3.10 HW (3+3+3 points)   (a) Prove: for every $n\ge 1$   $$ n(\log_2 n - 1.45) \le \log_2 (n!) \le n\cdot \log_2 n$$ Use the preceding exercise.   (Note: $\log_2 \eee \approx 1.4427$)
(b) Prove:   $\log_2 (n!) \sim n\cdot \log_2 n$
(c) Prove: Any comparison-based algorithm that sorts $n$ items requires $\gtrsim n\log_2 n$ comparisons. Write no more than four lines to justify this lower bound.

3.11 Bonus (3 points) due Monday, Feb 3   Prove that for all $0\le x \le 1/2$ we have $$ 4^x \ge \frac {1}{1-x}\ .$$ (Note that for $x=0$ and for $x=1/2$ the two sides are equal.) We shall use this inequality in the analysis of MERGE-SORT.
Hint.   Assume $0 < x \le 1/2$. We need to show that $2x\ln 2 \ge -\ln(1-x)$ (why?). Use the power series expansion
$$-\ln(1-x) = \sum_{k=1}^{\infty}\frac{x^k}{k}$$ and use the fact that $\ln 2 = - \ln(1/2).$

3.12 Bonus (4 points) due Monday, Feb 3   Prove: For any integer $k\ge 0$ we have $$ (k+1)\log_2(2k+1) \ge (k+1)\log_2 (k+1) + k \ .$$ Use the preceding exercise. Your proof should be no more than ten lines.

3.13 HW ((3+3)+4 points) (Analysis of MERGE-SORT) due Monday, Feb 3   Let $f(n)$ denote the number of comparisons required by the MERGE-SORT algorithm to sort $n$ items. The Divide-and-Conquer approach of the MERGE-SORT algorithm leads to the recurrence inequality \begin{equation} f(n) \le f\left(\left\lfloor\frac{n}{2}\right\rfloor\right)+ f\left(\left\lceil\frac{n}{2}\right\rceil\right)+ (n-1) \end{equation} for all $n\ge 2$, with the initial value $f(1)=0$. Prove that any function $f$ satisfying these conditions must satisfy $f(n)\le n \log_2 n$ for all $n\ge 1$. Do NOT ignore rounding. Use the method of reverse inequalities:
(a) Prove that the function $g(n):=n\cdot\log_2 n$ satisfies $g(1)=0$ and $$ g(n) \ge g\left(\left\lfloor\frac{n}{2}\right\rfloor\right)+ g\left(\left\lceil\frac{n}{2}\right\rceil\right)+ (n-1) \ $$ (the reverse of the inequality that holds for $f$).
(b) Use this fact to prove by induction on $n$ that $f(n)\le g(n)$ for all $n$.
Prove (a) separately for (a1) even and (a2) odd values of $n$. Hint for the hard case (a2). Let $n=2k+1$ ($k\ge 0$). So you need to prove $(2k+1)\log_2(2k+1) \ge k\log_2 k+(k+1)\log_2(k+1)+2k$. Prove the following two inequalities:
(a21)    $k\log_2(2k+1) \ge k\log_2 k + k$
(a22)    $(k+1)\log_2(2k+1) \ge (k+1)\log_2(k+1) + k$
Then just add these two inequalities.
For the difficult case, (a22), refer to the preceding exercise.
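As a hedge against miscounting, here is a small Python sketch (my own, not part of the assignment) that counts the comparisons MERGE-SORT actually performs and checks them against the bound $n\log_2 n$; the recurrence above is visible in the return statement.

    import math

    def merge_sort(a):
        # returns (sorted copy of a, number of comparisons used)
        n = len(a)
        if n <= 1:
            return list(a), 0
        left, cl = merge_sort(a[:n // 2])           # floor(n/2) items
        right, cr = merge_sort(a[n // 2:])          # ceil(n/2) items
        merged, i, j, c = [], 0, 0, 0
        while i < len(left) and j < len(right):
            c += 1                                  # one comparison per iteration, so c <= n - 1
            if left[i] <= right[j]:
                merged.append(left[i]); i += 1
            else:
                merged.append(right[j]); j += 1
        merged += left[i:] + right[j:]
        return merged, cl + cr + c                  # f(n) <= f(floor(n/2)) + f(ceil(n/2)) + (n - 1)

    for n in range(1, 300):
        _, f = merge_sort(list(range(n, 0, -1)))    # reversed input; the bound holds for every input
        assert f <= n * math.log2(n)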

3.14 HW (1+1+4 points) (a)   Let $a_n$ and $b_n$ be sequences. Prove: if $a_n=O(b_n)$ then $2a_n = O(b_n)$.
(b)   Let $a_n$, $b_n$, and $c_n$ be sequences of positive numbers. Assume $a_n=O(b_n)$ and $b_n\le c_n$. Prove: $a_n=O(c_n)$. Point out why we needed these sequences to be positive.
(c)   What is wrong with the following "proof" of the absurd statement that $2^n = O(n)$ ?
"Proof" by induction on $n$. True for $n=1$ since $2 = O(1)$. Assume true for $n$, let us prove it for $n+1$. Let $a_n = 2^n$, so we know $a_n=O(n)$ by the Inductive Hypothesis. But $2^{n+1}=2\cdot 2^n = 2a_n$, so by part (a) we get $2^{n+1}=O(n)$. Since $n \le n+1$, it follows by part (b) that $2^{n+1} = O(n+1)$.

Go to top


Homework set #2 (due Wed, Jan 22 before class except where otherwise stated)

Note that a number of problems assigned in Problem set #1 are also due Jan 22.

2.1 DO   (a) Prove:   $\ln x = o(x)$ as $x\to\infty$. (Here $x$ approaches infinity across all real numbers.) Hint: L'Hospital's rule.
(b) Prove:   $\ln x = o(\sqrt{x})$ as $x\to\infty$.   Do NOT use L'Hospital's rule in your proof. Instead, use part (a) with a simple change of variables. (This method, where we use a weaker result to prove a stronger result, is referred to as "bootstrapping.")
(c) Prove:   $(\forall \epsilon > 0)(\ln x = o(x^{\epsilon})$ as $x\to\infty)$.   Same instructions as in part (b): no L'Hospital, just simple bootstrapping.
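Presumably the change of variables the hint has in mind is the following (a sketch for self-checking only). With $y=\sqrt{x}$ (so $y\to\infty$ as $x\to\infty$), $$\frac{\ln x}{\sqrt{x}} = \frac{2\ln y}{y} \to 0$$ by part (a), which proves (b). For part (c), substitute $x=y^{1/\epsilon}$ to get $\displaystyle{\frac{\ln x}{x^{\epsilon}} = \frac{(1/\epsilon)\ln y}{y} \to 0}$.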

2.2 DO   Prove:   $\eee^x -1 \sim x$ as $x\to 0$. (Either use L'Hospital's rule, or, better yet, use just the definition of the derivative.)

2.3 DO   Study the handout on the Method of Reverse Inequalities. Answer the questions stated in that handout.

2.4 HW (7 points) Assume the function $f : \nnn\to \rrr$ satisfies the inequality $$ f(n) \le f(n-1) + f(n-2) + O(n^8) .$$ (What this means is that there exists a constant $C$ such that $f(n) \le f(n-1) + f(n-2) + C\cdot n^8$ holds for all sufficiently large $n$.) Prove:   $f(n) = O(\phi^n)$ where $\phi = (1+\sqrt{5})/2$ denotes the Golden Ratio.

2.5 HW (7 points) Assume the function $g : \nnn\to \rrr$ satisfies the inequality $$ g(n) \le 2g(\lfloor n/2\rfloor) + O(n) .$$ Prove:   $g(n) = O(n \log n)$.

2.6 DO   STUDY the basic concepts of Graph Theory (vertices, edges, the adjacency relation, degree, subgraphs, complement of a graph, paths, cycles, cliques, independent sets, bipartite graphs, coloring, chromatic number) from LN and other online sources such as the instructor's Honors Graph Theory course (Spring 2019).

2.7 HW (6 points)   Let $P_n$ denote the path of length $n-1$. (This graph has $n$ vertices, hence the subscript.) An independent set is a subset of the vertices such that no two vertices in the subset are adjacent.   Let $h(n)$ denote the number of independent sets in $P_n$. Find $h(n)$. Your answer should be a very simple formula in terms of a familiar sequence. State and prove your answer. (Note that the empty set is independent.) (To check that you understand the question correctly, here are the first few values of $h$:   $h(0)=1$, $h(1)=2$, $h(2)=3$. Your answer should be consistent with these values. What is $h(4)$ ?)

2.8 HW (7 points) Let $G$ be a digraph (directed graph). Let $G^-$ denote the same digraph with all edges reversed. Assume $G$ is given in an adjacency list representation. Show that an adjacency list representation of $G^-$ can be found in linear time, i.e., time $O(m+n)$, where $n$ is the number of vertices and $m$ is the number of edges. (The vertices are labeled $1,2,\dots,n$.) Here "time" means the number of steps, and we use the convention that it takes just one step to copy a number in the range $1,\dots,n$ or to add/subtract two such numbers.   Write a simple and elegant pseudocode. Elegance counts. Do not use commands or notation specific to programming languages; just use the for and while loop instructions and the ${\mathrm{APPEND}}(L,t)$ command where $L$ is a list and $t$ is an item; the output of this command is the list $L$ with item $t$ added at the end. You can also create an empty list. You can also use English phrases in a pseudocode as long as they are unambiguous.

2.9 HW (7 points) We call an adjacency list representation $A$ of a digraph monotone if for each vertex $i$, the list $A[i]$ is increasing: $A[i][1]\le A[i][2] \le \dots$ Given an adjacency list representation of a digraph $G$, create a monotone adjacency list representation of $G$ in linear time ($O(m+n)$). Your pseudocode should be very simple, just a few lines, with reference to pseudocodes constructed in prior work (in class or homework).   (Updated Jan 19 11:20pm. The previous version erroneously referred to "adjacency array representation".)

2.10 HW (7 points)   A digraph is strongly connected if each vertex is accessible from each vertex. Given a digraph in an adjacency list representation, decide whether it is strongly connected, in linear time. Your pseudocode should be very simple, just a few lines, with reference to pseudocodes constructed in prior work (in class or homework).

2.11 HW (3+5+5B+5B points)   (a) Let $a_0, a_1,\dots$ be an infinite sequence of real numbers $a_i > 1$. Prove: if $a_{n+1}=O(a_n)$ then $\ln a_n = O(n)$.   (b) Find an infinite sequence of real numbers $a_i > 1$ such that $\ln a_n = O(n)$ but the $a_{n+1}=O(a_n)$ relation does not hold.   (c) (Bonus) Let $a_n = \eee^{\sqrt{n}}$. Prove:   $a_{n+1}\sim a_n$.   (d) (Bonus) Let $a_0, a_1,\dots$ be an infinite sequence of real numbers $a_i > 1$. Prove: if $a_{n+1}\sim a_n$ then $\ln a_n = o(n)$ (little-oh).

Go to top


Homework set #1 (due Mon, Jan 13 before class except where otherwise stated)

1.1 Please answer the questionnaire on the course home page (by email! do not hand in)

1.2 STUDY (due Wed, Jan 15): LN Chap 2 (asymptotic notation); replace Chap. 2.2 by ASY (see above).

1.3 (Due: continuous) STUDY each HANDOUT related to class material. In the HANDOUTS, most algorithms are described in "pseudocode." STUDY from these samples how to write pseudocode. Pseudocodes give exact loop structure (for, while) but may use English statements as instructions. The goal is clear logic, conciseness, elegance.

1.4 STUDY the following online materials (click the "Texts" and "Handouts" tabs on the banner) (due Wed, Jan 15):

1.5 DO: solve the problems in ASY, Section 2 (due Wed, Jan 15)

1.6 HW (7 points) (asymptotic equality vs little-oh notation) due Wed, Jan 15.   Find two sequences $a_n$ and $b_n$ of positive real numbers such that $a_n \sim b_n$ but $a_n^n = o(b_n^n)$. Prove that your answer is correct. The proof should be very short and elegant. Elegance counts.

1.7 HW (6+6 points) (taking the log of an asymptotic equality?) due Wed, Jan 22.   ASY 1.10

1.8 Bonus (14 points) (solving an asymptotic inequality) due Wed, Jan 22.   ASY 2.20

1.9 (12 coins) (a) DO: Given 12 coins, one of which is fake, we want to find the fake coin and also decide whether it is heavier or lighter. Show that this can be done with 3 measurements on a balance. (A balance has two trays. You can put any number of coins on each tray. The outcome of a measurement is "Left heavy," "Right heavy" or "Equal.") (b) REWARD PROBLEM: Find a non-adaptive algorithm for this problem, i.e., specify your measurements (subsets of coins to put in each tray) in advance (your choice of coins must not depend on the outcomes of previous measurements). Show that 3 measurements still suffice.

1.10 HW (5+10 points) (13/14 coins)
UPDATE (posted Jan 14, 6pm): Part (a) extended to Wed, Jan 22 because I was late with the posting. You may submit your solution either on Wed, Jan 15 or Wed, Jan 22. (Do not submit it on Fri, Jan 17, because I will not be there. My colleague Lorenzo Orecchia will teach the class.)

Given n coins, one of which is fake, we want to find the fake coin and also decide whether it is heavier or lighter, using measurements on a balance. (a) Prove that 3 measurements are not enough if n = 14.   Make your solution short and elegant. Elegance counts. (b) Prove that 3 measurements are not enough if n = 13.   (A correct solution to part (b) does not earn you the points for part (a); you need to give a very short direct solution to (a).)

1.11 DO (Find min) due Wed, Jan 15   Given n real numbers ("weights"), $x_1,\dots,x_n$, we want to find one that is smallest. We use the comparison model. In this model, we do not see the numbers (they come in closed boxes), we can only compare them (with some physical experiment). Comparisons are binary: given two indices $i$ and $j$ $(1\le i < j \le n)$, the comparison answers the question "is $x_i\le x_j$"? (Comparisons cost one unit each, bookkeeping is free.) Prove: (a) $n - 1$ comparisons suffice.   (b) $n - 2$ comparisons do not suffice.
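Part (a) is the familiar single-pass tournament; a minimal Python sketch, counting comparisons explicitly since only comparisons are charged in this model (part (b), the lower bound, is the interesting half and is left to you):

    def find_min_index(x):
        # one comparison per remaining element: n - 1 comparisons in total
        comparisons = 0
        best = 0                            # index of the current minimum
        for i in range(1, len(x)):
            comparisons += 1
            if x[i] < x[best]:
                best = i
        return best, comparisons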

1.12 HW (6 points) (Find min and max), due Wed, Jan 15.   Given n real numbers, we want to find both a smallest and a largest among them. We use the comparison model. Prove: 3n/2 comparisons suffice to find both.   (Note that this is rather surprising; how can determining the min reduce the cost of determining the max?)

1.13 DO (communication complexity) Review the model of "communication complexity" from your class notes. The randomized protocol we studied for the "equality" problem is called the Rabin--Simon--Yao protocol. Work out the analysis.
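As a memory aid, here is a sketch of the fingerprinting idea in Python (my own rendering; the class version and its parameter choices may differ). Alice sends a random prime $p$ and $X \bmod p$; Bob compares with $Y \bmod p$. The protocol can err only when $X\neq Y$ yet $p$ divides $X-Y$; bounding the probability of that event (number of prime divisors of $X-Y$ versus the number of primes in the sampling range) is the analysis to work out.

    import random

    def is_prime(m):                        # trial division; adequate for this illustration
        if m < 2:
            return False
        d = 2
        while d * d <= m:
            if m % d == 0:
                return False
            d += 1
        return True

    def random_prime(upper):                # rejection sampling from {2, ..., upper}
        while True:
            p = random.randint(2, upper)
            if is_prime(p):
                return p

    def equality_protocol(X, Y, N):
        # X, Y: N-bit integers held by Alice and Bob respectively.
        # The range N*N is one illustrative choice; it governs the error probability.
        p = random_prime(N * N)
        if X % p != Y % p:                  # Alice sends (p, X mod p): O(log N) bits
            return "definitely different"
        return "probably equal"             # wrong only if p happens to divide X - Y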

1.14 HW (communication complexity)  
UPDATED Jan 12, 6pm. The following errors were corrected: the word "digit" should have been "bit" (binary digit), so $X$ and $Y$ are $N$-bit integers, $p$ is a $k$-bit prime number, and $X[i]$ denotes the $i$-th bit of $X$. Moreover, the intended bound was $(k+1)(1+\lceil\log_2 N\rceil)$ bits of communication, rather than $k(1+\lceil\log_2 N\rceil)$ bits.   Thank you to the students who warned me about these errors. So here is the updated wording.

1.14 (a) (14 points) due Wed, Jan 15   Alice is given an $N$-bit integer $X$ and Bob is given an $N$-bit integer $Y$ (initial zeros are permitted). Having executed the Rabin--Simon--Yao protocol, Alice and Bob have shared knowledge of a $k$-bit prime number $p$ (where the specific value of $k$ is irrelevant for this problem) such that $X \not\equiv Y\pmod p$. This proves that $X\neq Y$, but now the little green men also demand that Alice and Bob determine a position $i$ ($1\le i\le N$) such that $X[i]\neq Y[i]$ (where $X[i]$ denotes the $i$-th bit of $X$, and similarly for $Y$). Prove that Alice and Bob can accomplish this task by an additional $(k+1)(1+\lceil\log_2 N\rceil)$ bits of deterministic communication (back and forth) between Alice and Bob. (No coins flipped.) (Here $\lceil x\rceil$ denotes the rounded-up value of $x$.   For instance, $\lceil 4.1\rceil = 5$ and $\lceil 4\rceil = 4$ and $\lceil -4.1\rceil = -4$.)   Give a clear description of the protocol, and a clean analysis. You may refer to a relevant handout.  Actually, fewer than $(k+1)(1+\log_2 N)$ bits suffice, no need for the "ceiling" sign (another typo).

1.14 (b) (Bonus problem, 5 points) due Wed, Jan 22   Prove: there exists a constant $C$ such that if $k\ge C$ then Alice and Bob need fewer than $(k+1)((\log_2 N) - 100)$ bits of communication in problem (a).

1.14 (c) (Bonus problem, 6 points) due Wed, Jan 22   Now we justify some of the original typos. Assume $X$ and $Y$ are $N$-digit integers (decimal!) and $p$ is a $k$-bit prime (binary!). Prove: if $k\ge 8$ then Alice and Bob need fewer than $(k+1)\log_2 N$ bits of communication to find the position of a (decimal) digit where $X$ and $Y$ differ. (Note: Alice and Bob send bit-strings to each other, they cannot send decimal digits. As before, we count the bits communicated. -- Note: there is no "ceiling" sign in this result, this is not a typo.) Indicate where you are using the assumption that $k\ge 8$.

1.15 DO (Strassen's matrix multiplication) due Wed, Jan 22.   If we multiply two $n\times n$ matrices according to the textbook definition, we need to perform $\Theta(n^3)$ arithmetic operations. In 1969, Volker Strassen discovered that this was not optimal: he reduced the multiplication of two $n\times n$ matrices to the multiplication of 7 pairs of $(n/2)\times(n/2)$ matrices. The cost of the reduction is $O(n^2)$ (it involves computing certain linear combinations of matrices of these dimensions and bookkeeping).   (a) Study the exact details of this recursive step, e.g., from the Wikipedia entry "Strassen's algorithm."   (b) State the recurrence for the complexity $T(n)$ of this algorithm.   (c) Prove that $T(n) = O(n^{\alpha})$ where $\alpha = \log_2 7 \approx 2.807355$. (Assume $n$ is a power of 2.)
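A compact Python/NumPy sketch of the recursive step, following the standard formulas (as on the Wikipedia page; $n$ is assumed to be a power of 2). Seven recursive products of half-size blocks plus $O(n^2)$ additions give the recurrence $T(n) = 7\,T(n/2) + O(n^2)$.

    import numpy as np

    def strassen(A, B):
        n = A.shape[0]                          # n assumed to be a power of 2
        if n == 1:
            return A * B
        h = n // 2
        A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
        B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
        M1 = strassen(A11 + A22, B11 + B22)     # 7 half-size products ...
        M2 = strassen(A21 + A22, B11)
        M3 = strassen(A11, B12 - B22)
        M4 = strassen(A22, B21 - B11)
        M5 = strassen(A11 + A12, B22)
        M6 = strassen(A21 - A11, B11 + B12)
        M7 = strassen(A12 - A22, B21 + B22)
        C11 = M1 + M4 - M5 + M7                 # ... recombined with O(n^2) additions
        C12 = M3 + M5
        C21 = M2 + M4
        C22 = M1 - M2 + M3 + M6
        return np.block([[C11, C12], [C21, C22]])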

1.16 REWARD PROBLEM (coins in a line, from Peter Winkler's "Mathematical Puzzles: A Connoisseur's Collection"): Coins of various values lie in a line; Alice and Bob alternately remove a coin from either end of the line (Alice moves first), and each player keeps the coins they remove. Prove: (a) if n is even, Alice can always collect at least half the value. (b) This is not the case for odd n. Give a counterexample for every odd n > 1. (c) Suppose there are only two kinds of coins, of value 100 and 101 cents. Prove that a random sequence of such coins is almost surely a counterexample when n is odd.

1.17 Bonus (14 points) (Evenly splitting the coins) due Wed, Jan 22.   Suppose we have 2n coins, some of which are fake. All the fake coins have equal weight, and they are lighter than the true coins (which also have equal weight). Using O(log n) measurements on a balance, divide the coins into two sets of n coins of nearly equal weight. (A measurement consists of putting a subset of the coins in the left tray and another (disjoint) subset in the right tray. Each measurement has three possible outcomes: left-heavy, the trays balance (have equal weight), right-heavy.) "Nearly equal" means if the number of fake coins is even, they must be evenly split between the two sets; if their number is odd, one set must have exactly one more fake coin than the other set. Write your algorithm in elegant pseudocode. You may refer to a relevant handout.

Go to top


View the instructor's class material from previous years

Return to the Department of Computer Science home page

Course home

Go to top