## Papers

**Circuit complexity, proof complexity, and polynomial identity testing**.

IEEE Symposium on Foundations of Computer Science (FOCS), to appear October 2014.

Also available as ECCC Technical Report TR14-052, April 2014 (ECCC) and arXiv:1404.3820 [cs.CC, cs.LO, math.LO] (arXiv)

Abstract BibTeX

@inproceedings{GrochowPitassiPIT, AUTHOR = {Grochow, Joshua A. and Pitassi, Toniann}, TITLE = {Circuit complexity, proof complexity, and polynomial identity testing}, YEAR = {2014}, BOOKTITLE = {55th Annual IEEE Symposium on Foundations of Computer Science (FOCS)}, NOTE = {To appear. Also available as arXiv:1404.3820 [cs.CC] and ECCC Technical Report TR14-052}, }

We introduce a new and very natural algebraic proof system, which has tight connections to (algebraic) circuit complexity. In particular, we show that any super-polynomial lower bound on any Boolean tautology in our proof system implies that the permanent does not have polynomial-size algebraic circuits (VNP is not equal to VP). As a corollary to the proof, we also show that **super-polynomial lower bounds on the number of lines in Polynomial Calculus proofs (as opposed to the usual measure of number of monomials) imply the Permanent versus Determinant Conjecture**. Note that, prior to our work, there was no proof system for which lower bounds on an arbitrary tautology implied *any* computational lower bound.

Our proof system helps clarify the relationships between previous algebraic proof systems, and begins to shed light on why proof complexity lower bounds for various proof systems have been so much harder than lower bounds on the corresponding circuit classes. In doing so, we highlight the importance of polynomial identity testing (PIT) for understanding proof complexity.

More specifically, we introduce certain propositional axioms satisfied by any Boolean circuit computing PIT. (The existence of efficient proofs for our PIT axioms appears to be somewhere in between the major conjecture that PIT is in P and the known result that PIT is in P/poly.) We use these PIT axioms to **shed light on AC ^{0}[p]-Frege lower bounds, which have been open for nearly 30 years**, with no satisfactory explanation as to their apparent difficulty. We show that either:

- Proving super-polynomial lower bounds on AC
^{0}[p]-Frege implies VNP_{GF(p)}does not have polynomial-size circuits of depth d—a notoriously open question for any d≥4—thus explaining the difficulty of lower bounds on AC^{0}[p]-Frege, or - AC
^{0}[p]-Frege cannot efficiently prove the depth d PIT axioms, and hence we have a lower bound on AC^{0}[p]-Frege.

Finally, using the algebraic structure of our proof system, **we propose a novel way to extend techniques from algebraic circuit complexity to prove lower bounds in proof complexity**. Although we have not yet succeeded in proving such lower bounds, this proposal should be contrasted with the difficulty of extending AC^{0}[p] circuit lower bounds to AC^{0}[p]-Frege lower bounds.

**Algorithms for group isomorphism via group extensions and cohomology**.

*IEEE Conference on Computational Complexity (CCC)*, June 2014.

Also available as arXiv:1309.1776 [cs.DS] (arXiv) and ECCC Technical Report TR13-123, September 2013. (ECCC)

Abstract BibTeX

@inproceedings{GrochowQiaoGpIso, TITLE = {Algorithms for group isomorphism via group extensions and cohomology}, AUTHOR = {Grochow, Joshua A. and Qiao, Youming}, BOOKTITLE = {IEEE Conference on Computational Complexity (CCC14)}, YEAR = {2014}, NOTE = {Also available as arXiv:1309.1776 [cs.DS] and ECCC Technical Report TR13-123}, }

The isomorphism problem for groups given by their multiplication tables (GpI) has long been known to be solvable in n^{O(log n)} time, but only recently has there been significant progress towards polynomial time. For example, Babai *et al.* (ICALP 2012) gave a polynomial-time algorithm for groups with no abelian normal subgroups. Thus, at present it is crucial to understand groups with abelian normal subgroups to develop n^{o(log n)}-time algorithms.

Towards this goal we advocate a strategy via the extension theory of groups, which describes how a normal subgroup N is related to the quotient group G/N via G. This strategy "splits" GpI into two subproblems: one regarding group actions on other groups, and one regarding group cohomology. The solution of these problems is essentially necessary and sufficient to solve GpI. Most previous works on GpI naturally align with this strategy, and it thus helps explain in a unified way the recent polynomial-time algorithms for other group classes. In particular, most results in the Cayley table model have focused on the group action aspect, despite the general necessity of cohomology, for example for p-groups of class 2—believed to be the hardest case of GpI.

To make progress on the group cohomology aspect of GpI, we consider *central-radical groups*, proposed in Babai *et al.* (SODA 2011): the class of groups such that G mod its center has no abelian normal subgroups. Recall that Babai *et al.* (ICALP 2012) consider the class of groups G such that G itself has no abelian normal subgroups. Following the above strategy, we solve GpI in n^{O(log log n)} time for central-radical groups, and in polynomial time for several prominent sub-classes of central-radical groups. We also achieve an n^{O(log log n)}-time bound for groups whose solvable normal subgroups are elementary abelian but not necessarily central. **As far as we are aware, this is the first time that a nontrivial algorithm with worst-case guarantees has tackled both aspects of GpI—actions and cohomology—simultaneously.**

**Prior to our work, nothing better than the trivial n ^{O(log n)}-time algorithm was known,** even for groups with a central radical of constant size, such as Z(G)=Z

_{2}. To develop these algorithms we utilize several mathematical results on the detailed structure of cohomology classes, as well as algorithmic results for code equivalence, coset intersection and cyclicity testing of modules over finite-dimensional associative algebras. We also suggest several promising directions for future work.

**Rotor-routing and spanning trees on planar graphs**.

*International Mathematics Research Notices*, 2014. (doi)

Also available as arXiv:1308.2677 [math.CO] August 2013. (arXiv)

Abstract BibTeX

@article{ChanChurchGrochowRotor, AUTHOR = {Chan, Melody and Church, Thomas and Grochow, Joshua A.}, TITLE = {Rotor-routing and spanning trees on planar graphs}, JOURNAL = {Int. Math Res. Not.}, FJOURNAL = {International Mathematics Research Notices}, YEAR = {2014}, NOTE = {Also available as arXiv:1308.2677 [math.CO]}, DOI = {10.1093/imrn/rnu025}, }

^{0}(G) of a finite graph G is a discrete analogue of the Jacobian of a Riemann surface which was rediscovered several times in the contexts of arithmetic geometry, self-organized criticality, random walks, and algorithms. Given a ribbon graph G, Holroyd

*et al.*used the "rotor-routing" model to define a free and transitive action of Pic

^{0}(G) on the set of spanning trees of G. However, their construction depends

*a priori*on a choice of basepoint vertex. Ellenberg asked whether this action does in fact depend on the choice of basepoint. We answer this question by proving that the action of Pic

^{0}(G) is independent of the basepoint if and only if G is a planar ribbon graph.

**Unifying known lower bounds via geometric complexity theory**.

J. A. Grochow

*IEEE Conference on Computational Complexity (CCC)*, June 2014.

Invited to a special issue of the journal

*Computational Complexity*, to appear 2015.

Also available as arXiv:1304.6333 [cs.CC] April 2013. (arXiv)

Abstract BibTeX

@inproceedings{GrochowGCTUnity, TITLE = {Unifying known lower bounds via geometric complexity theory}, AUTHOR = {Grochow, Joshua A.}, BOOKTITLE = {IEEE Conference on Computational Complexity (CCC14)}, YEAR = {2014}, NOTE = {Invited to a special issue of the journal \emph{Computational Complexity}, to appear 2015. Also available as arXiv:1304.6333 [cs.CC]}, }

^{0}[p], multilinear formula and circuit size lower bounds (Raz

*et al.*), the degree bound (Strassen, Baur–Strassen), the connected components technique (Ben-Or), depth 3 arithmetic circuit lower bounds over finite fields (Grigoriev–Karpinski), lower bounds on permanent versus determinant (Mignon–Ressayre, Landsberg–Manivel–Ressayre), lower bounds on matrix multiplication (Bürgisser–Ikenmeyer) (these last two were already known to fit into GCT), the chasms at depth 3 and 4 (Gupta–Kayal–Kamath–Saptharishi; Agrawal–Vinay; Koiran), matrix rigidity (Valiant) and others. That is, the original proofs, with what is often just a little extra work, already provide representation-theoretic obstructions in the sense of GCT for their respective lower bounds. This enables us to expose a new viewpoint on GCT, whereby it is a natural unification of known results and broad generalization of known techniques. It also shows that the framework of GCT is at least as powerful as known methods, and gives many new proofs-of-concept that GCT can indeed provide significant asymptotic lower bounds. This new viewpoint also opens up the possibility of fruitful two-way interactions between previous results and the new methods of GCT; we provide several concrete suggestions of such interactions. For example, the representation-theoretic viewpoint of GCT naturally provides new properties to consider in the search for new lower bounds.

**Matrix Lie algebra isomorphism**. (Previously: Lie algebra conjugacy. More accurately: Matrix isomorphism of matrix Lie algebras.)

J. A. Grochow

*IEEE Conference on Computational Complexity (CCC)*, June 2012. (doi)

Also available as arXiv:1112.2012 [cs.CC, cs.DS, cs.SC, math.RT] (arXiv) and ECCC Technical Report TR11-168 (ECCC)

See my Ph.D. thesis for a more complete version.

Short Abstract Detailed Abstract BibTeX

@inproceedings{GrochowLieAlgebraIso, AUTHOR = {Grochow, Joshua A.}, TITLE = {Matrix {Lie} algebra isomorphism}, BOOKTITLE = {IEEE Conference on Computational Complexity (CCC12)}, YEAR = {2012}, PAGES = {203--213}, NOTE = {Also available as arXiv:1112.2012 [cs.CC] and ECCC Technical Report TR11-168.}, DOI = {10.1109/CCC.2012.34}, }

*Comm. ACM*, 2012, and references therein). A matrix Lie algebra is a set L of matrices that is closed under linear combinations and the operation [A,B] = AB - BA. Two matrix Lie algebras L, L' are matrix isomorphic if there is an invertible matrix M such that conjugating every matrix in L by M yields the set L'. We show that certain cases of MatIsoLie—for the wide and widely studied classes of

*semisimple*and

*abelian*Lie algebras—are equivalent to graph isomorphism and linear code equivalence, respectively. On the other hand, we give polynomial-time algorithms for other cases of MatIsoLie, which allow us to mostly derandomize a recent result of Kayal on affine equivalence of polynomials.

*Comm. ACM*, 2012, and references therein). A matrix Lie algebra is a set L of matrices such that A,B ∈ L implies AB - BA ∈ L. Two matrix Lie algebras are conjugate if there is an invertible matrix M such that L

_{1}= ML

_{2}M

^{-1}. We show that certain cases of Lie algebra conjugacy are equivalent to graph isomorphism. On the other hand, we give polynomial-time algorithms for other cases of Lie algebra conjugacy, which allow us to mostly derandomize a recent result of Kayal on affine equivalence of polynomials. Affine equivalence is related to many complexity problems such as factoring integers, graph isomorphism, matrix multiplication, and permanent versus determinant. Specifically, we show:

- Abelian Lie algebra conjugacy is as hard as graph isomorphism. A Lie algebra is abelian if all of its matrices commute pairwise.
- Abelian diagonalizable Lie algebra conjugacy of n × n matrices can be solved in poly(n) time when the Lie algebras have dimension O(1). The dimension of a Lie algebra is the maximum number of linearly independent matrices it contains. A Lie algebra L is diagonalizable if there is a single matrix M such that for every A in L, MAM
^{-1}is diagonal. - Semisimple Lie algebra conjugacy is equivalent to graph isomorphism. A Lie algebra is semisimple if it is a direct sum of simple Lie algebras.
- Semisimple Lie algebra conjugacy of n × n matrices can be solved in polynomial time when the Lie algebras consist of only O(log n) simple direct summands.
- Conjugacy of completely reducible Lie algebras—that is, a direct sum of an abelian diagonalizable and a semisimple Lie algebra—can be solved in polynomial time when the abelian part has dimension O(1) and the semisimple part has O(log n) simple direct summands.

**Report on "Mathematical Aspects of P vs. NP and its Variants"**.

arXiv:1203.2888 [cs.CC, math.AG, math.NT, math.RT] (arxiv)

Abstract BibTeX

@misc{GrochowRusekReport, AUTHOR = {Grochow, Joshua A. and Rusek, Korben}, TITLE = {Report on ``{Mathematical} {Aspects} of {P} vs. {NP} and its {Variants}''}, YEAR = {2012}, HOWPUBLISHED = {arXiv:1203.2888 [cs.CC]}, NOTE = {Workshop held at {Brown--ICERM} in August, 2011, organizers: {Saugata} {Basu}, {J.} {M.} {Landsberg,} and {J.} {Maurice} {Rojas}}, }

*Comm. ACM*, 2012, and references therein), and number theory and other ideas in the Blum-Shub-Smale model.

**Complexity classes of equivalence problems revisited**.

*Information and Computation*209(4):748-763, 2011. (doi)

Also available as arXiv:0907.4775v2 [cs.CC], 2009. (arXiv)

Originally my master's thesis. See my my Ph.D. thesis for the latest updates.

Abstract BibTeX

@article{FortnowGrochowPEq, AUTHOR = {Fortnow, Lance and Grochow, Joshua A.}, TITLE = {Complexity classes of equivalence problems revisited}, JOURNAL = {Inform. and Comput.}, FJOURNAL = {Information and Computation}, VOLUME = {209}, YEAR = {2011}, NUMBER = {4}, PAGES = {748--763}, ISSN = {0890-5401}, NOTE = {Also available as arXiv:0907.4775 [cs.CC]}, }

*canonical form*for the equivalence relation of set equality. Other canonical forms arise in graph isomorphism algorithms. To determine if two graphs are cospectral (have the same eigenvalues), we compute their characteristic polynomials and see if they are equal; the characteristic polynomial is a

*complete invariant*for cospectrality. Finally, an equivalence relation may be decidable in P without either a complete invariant or canonical form. Blass and Gurevich (

*SIAM J. Comput.*, 1984) ask whether these conditions on equivalence relations—having an FP canonical form, having an FP complete invariant, and being in P—are distinct. They showed that this question requires non-relativizing techniques to resolve. We extend their results, and give new connections to probabilistic and quantum computation.

**Code equivalence and group isomorphism**.

*ACM-SIAM Symposium on Discrete Algorithms (SODA)*, 2011. (pdf)

Abstract BibTeX

@inproceedings {BabaiCodenottiGrochowQiaoSODA11, AUTHOR = {Babai, L{\'a}szl{\'o} and Codenotti, Paolo and Grochow, Joshua A. and Qiao, Youming}, TITLE = {Code equivalence and group isomorphism}, BOOKTITLE = {Proceedings of the {Twenty-Second} {Annual} {ACM--SIAM} {Symposium} on {Discrete} {Algorithms} ({SODA11})}, PAGES = {1395--1408}, PUBLISHER = {SIAM}, ADDRESS = {Philadelphia, PA}, YEAR = {2011}, }

The isomorphism problem for groups given by their multiplication tables has long been known to be solvable in time n^{log n + O(1)}. The decades-old quest for a polynomial-time algorithm has focused on the very difficult case of class-2 nilpotent groups (groups whose quotient by their center is abelian), with little success. In this paper we consider the opposite end of the spectrum and initiate a more hopeful program to find a polynomial-time algorithm for *semisimple groups*, defined as groups without abelian normal subgroups. First, we prove that the isomorphism problem for this class can be solved in time n^{O(log log n)}. We then identify certain bottlenecks to polynomial-time solvability and give a polynomial-time solution to a rich subclass, namely the semisimple groups where each minimal normal subgroup has a bounded number of simple factors. We relate the results to the filtration of groups introduced by Babai and Beals (1999).

One of our tools is an algorithm for equivalence of (not necessarily linear) codes in simply-exponential time in the length of the code, obtained by modifying Luks's algorithm for hypergraph isomorphism in simply-exponential time in the number of vertices (FOCS 1999).

We comment on the complexity of the closely related problem of permutational isomorphism of permutation groups.

**Genomic analysis reveals a tight link between transcription factor dynamics and regulatory network architecture**.

*Molecular Systems Biology*5:294, 2009. (pdf) (doi)

Abstract BibTeX

@article{jothiBalajiEtAlMSB2009, AUTHOR = {Jothi, Raja and Balaji, S. and Wuster, Arthur and Grochow, Joshua A. and Gsponer, J\"{o}rg and Przytycka, Teresa M. and Aravind, L. and Babu, M. Madan}, TITLE = {Genomic analysis reveals a tight link between transcription factor dynamics and regulatory network architecture}, JOURNAL = {Mol. Syst. Biol.}, FJOURNAL = {Molecular Systems Biology}, VOLUME = {5}, NUMBER = {294}, YEAR = {2009}, PUBLISHER = {EMBO and Nature Publishing Group}, DOI = {10.1038/msb.2009.52}, }

**Network motif discovery using subgraph enumeration and symmetry-breaking**.

In

*RECOMB 2007*, Lecture Notes in Computer Science 4453, pp. 92-106. Springer-Verlag, 2007. (pdf) (doi)

See my master's thesis for a more complete version.

Abstract BibTeX

@inproceedings{GrochowKellisRECOMB2007, AUTHOR = {Grochow, Joshua A. and Kellis, Manolis}, TITLE = {Network motif discovery using subgraph enumeration and symmetry-breaking}, BOOKTITLE = {Research in Computational Molecular Biology (RECOMB07)}, SERIES = {Lecture Notes in Computer Science}, VOLUME = {4453}, YEAR = {2007}, PAGES = {92--106}, PUBLISHER = {Springer-Verlag}, ISBN = {978-3-540-71680-8}, ISSN = {0302-9743}, DOI = {10.1007/978-3-540-71681-5_7}, }

The study of biological networks and network motifs can yield significant new insights into systems biology. Previous methods of discovering network motifs—network-centric subgraph enumeration and sampling—have been limited to motifs of 6 to 8 nodes, revealing only the smallest network components. New methods are necessary to identify larger network sub-structures and functional motifs.

Here we present a novel algorithm for discovering large network motifs that achieves these goals, based on a novel symmetry-breaking technique, which eliminates repeated isomorphism testing, leading to an exponential speed-up over previous methods. This technique is made possible by reversing the traditional network-based search at the heart of the algorithm to a motif-based search, which also eliminates the need to store all motifs of a given size and enables parallelization and scaling. Additionally, our method enables us to study the clustering properties of discovered motifs, revealing even larger network elements.

We apply this algorithm to the protein-protein interaction network and transcription regulatory network of *S. cerevisiae*, and discover several large network motifs, which were previously inaccessible to existing methods, including a 29-node cluster of 15-node motifs corresponding to the key transcription machinery of *S. cerevisiae*.

## Theses

**Symmetry and equivalence relations in classical and geometric complexity theory**.

J. A. Grochow

Doctoral dissertation, U. Chicago, 2012. Advisors: Prof. Ketan Mulmuley and Prof. Lance Fortnow (pdf)

Informal Summary Abstract BibTeX

@phdthesis{grochowPhD, AUTHOR = {Grochow, Joshua A.}, TITLE = {Symmetry and equivalence relations in classical and geometric complexity theory}, YEAR = {2012}, SCHOOL = {University of Chicago}, ADDRESS = {Chicago, IL}, }

This thesis studies some of the ways in which symmetries and equivalence relations arise in classical and geometric complexity theory. The Geometric Complexity Theory Program is aimed at resolving central questions in complexity such as P versus NP using techniques from algebraic geometry and representation theory. The equivalence relations we study are mostly algebraic in nature and we heavily use algebraic techniques to reason about the computational properties of these problems. We first provide a tutorial and survey on Geometric Complexity Theory to provide perspective and motivate the other problems we study.

One equivalence relation we study is matrix isomorphism of matrix Lie algebras, which is a problem that arises naturally in Geometric Complexity Theory. For certain cases of matrix isomorphism of Lie algebras we provide polynomial-time algorithms, and for other cases we show that the problem is as hard as graph isomorphism. To our knowledge, this is the first time graph isomorphism has appeared in connection with any lower bounds program.

Finally, we study algorithms for equivalence relations more generally (joint work with Lance Fortnow). Two techniques are often employed for algorithmically deciding equivalence relations: 1) finding a complete set of easily computable invariants, or 2) finding an algorithm which will compute a canonical form for each equivalence class. Some equivalence relations in the literature have been solved efficiently by other means as well. We ask whether these three conditions—having an efficient solution, having an efficiently computable complete invariant, and having an efficiently computable canonical form—are equivalent. We show that this question requires non-relativizing techniques to resolve, and provide new connections between this question and factoring integers, probabilistic algorithms, and quantum computation.

**The complexity of equivalence relations**.

J. A. Grochow.

Master's thesis, U. Chicago, 2008. Advisor: Prof. László Babai (pdf)

Journal version above.

Abstract BibTeX

@mastersthesis{GrochowEquivalence2008, AUTHOR = {Grochow, Joshua A.}, TITLE = {The complexity of equivalence relations}, SCHOOL = {University of Chicago}, YEAR = {2008}, MONTH = {December}, }

To determine if two given lists of numbers are the same set, we would sort both lists and see if we get the same result. The sorted list is a *canonical form* for the equivalence relation of set equality. Other canonical forms for equivalences arise in graph isomorphism and its variants, and the equality of permutation groups given by generators. To determine if two given graphs are cospectral, however, we compute their characteristic polynomials and see if they are the same; the characteristic polynomial is a *complete invariant* for the equivalence relation of cospectrality. This is weaker than a canonical form, and it is not known whether a canonical form for cospectrality exists. Note that it is a priori possible for an equivalence relation to be decidable in polynomial time without either a complete invariant or canonical form.

Blass and Gurevich (*SIAM J. Comput.*, 1984) ask whether these conditions on equivalence relations—having an FP canonical form, having an FP complete invariant, and simply being in P—are in fact different. They showed that this question requires non-relativizing techniques to resolve. Here we extend their results using generic oracles, and give new connections to probabilistic and quantum computation.

We denote the class of equivalence problems in P by PEq, the class of problems with complete FP invariants Ker, and the class with FP canonical forms CF; CF ⊆ Ker ⊆ PEq, and we ask whether these inclusions are proper. If x ~ y implies |y| ≤ poly(|x|), we say that ~ is polynomially bounded; we denote the corresponding classes of equivalence relation CF_{p}, Ker_{p}, and PEq_{p}. Our main results are:

- If CF=PEq then NP=UP=RP and thus PH = BPP;
- If CF = Ker then NP = UP, PH = ZPP
^{NP}, integers can be factored in probabilistic polynomial time, and deterministic collision-free hash functions do not exist; - If Ker=PEq then UP ⊆ BQP;
- There is an oracle relative to which CF ≠ Ker ≠ PEq; and
- There is an oracle relative to which CF
_{p}= Ker_{p}and Ker ≠ PEq.

**On the structure and evolution of protein interaction networks**.

J. A. Grochow.

Master's thesis, M. I. T., 2006. Advisor: Prof. Manolis Kellis (pdf)

One chapter was published in RECOMB 2007

(This thesis won the Charles and Jennifer Johnson Thesis Award.)

Abstract BibTeX

@mastersthesis{GrochowNetworks2006, AUTHOR = {Grochow, Joshua A.}, TITLE = {On the structure and evolution of protein interaction networks}, SCHOOL = {Massachusetts Institute of Technology}, YEAR = {2006}, MONTH = {August}, }

The study of protein interactions from the networks point of view has yielded new insights into systems biology. In particular, "network motifs" become apparent as a useful and systematic tool for describing and exploring networks. Finding motifs has involved either exact counting (e.g. Milo *et al.*, *Science*, 2002) or subgraph sampling (e.g. Kashtan *et al.*, *Bioinf.* 2004 and Middendorf *et al.*, *PNAS* 2005). In this thesis we develop an algorithm to count all instances of a particular subgraph, which can be used to query whether a given subgraph is a significant motif. This method can be used to perform exact counting of network motifs faster and with less memory than previous methods, and can also be combined with subgraph sampling to find larger motifs than ever before—we have found motifs with up to 15 nodes and explored subgraphs up to 20 nodes. Unlike previous methods, this method can also be used to explore motif clustering and can be combined with network alignment techniques (e.g. Graemlin or pathBLAST).

We also present new methods of estimating parameters for models of biological network growth, and present a new model based on these parameters and underlying binding domains.

Finally, we propose an experiment to explore the effect of the whole genome duplication on the protein-protein interaction network of *S. cerevisiae*, allowing us to distinguish between cases of subfunctionalization and neofunctionalization.