|
Matt Knepley's HomePage
|
|
I have been a member of the PETSc Team since 1997. PETSc is an open-source software package designed for the solution of partial differential equations and used in diverse applications from surgery to environmental remediation. PETSc is the most complete, most flexible, most powerful, and easiest-to-use platform for solving the nonlinear algebraic equations arising from the discretization of partial differential equations. PETSc is designed to allow engineers and scientists to perform large-scale numerical simulations of physical phenomena rapidly and efficiently. Representative simulations to date include fluid flow for aircraft, ship, and automobile design; blood flow simulation for medical device design; porous media flow for oil reservoir simulation for energy development and groundwater contamination modeling; materials properties; economic modeling; structural mechanics for construction design; combustion modeling; and nuclear fission and fusion for energy development. These simulations allow the effects of design decisions to be evaluated and compared, including cost benefits, safety concerns, and environmental impact. PETSc has enabled simulations by industry, universities, and research laboratories, from desktop systems to the most powerful supercomputers, and for critical applications affecting the nation's energy, environment, and health. A testament to the widespread recognition of this package is the fact that PETSc has had over 400 downloads per month during the past three years. A Google search for "PETSc" on January 6, 2009, showed 199,000 pages referencing PETSc. In addition, a search of Google books found 662 books that reference PETSc. PETSc won the R & D 100 Award from in 2009. PETSc solvers were cited as one of the Top Ten Computational Science Accomplishments of the U.S. Department of Energy (DOE) in 2008. Two HPC hardware companies, Cray and SiCortex, provide and support PETSc with their products. Several commercial simulation packages, including FIDAP 8.5, TigerCAD, and RF3P, use PETSc for their algebraic solvers. Also, products from the companies Tech-X and Actel use PETSc. Microsoft has distributed previous versions of PETSc with its HPC software suite, the Microsoft Cluster CD. PETSc has been used by Boeing for computational fluid dynamics simulations and Shell for solving inverse problems for oil reservoir management. CFDResearch in Huntsville, Alabama, uses PETSc in its fluid simulations. PETSc solvers have also been used at the U.S. Army Engineering Research and Development Center and by the South Florida Water Management District modeling the Everglades. Xuemin Tu, Barry Smith, Jed Brown, and I are working on approaches to nonlinear preconditioning. The most important aspect here is composability. This is exactly what makes KSP/PC so useful. Complex, optimal solvers can be built from simple pieces. We would like to carry over this design to the nonlinear problem. As a first step, we have recently published the linear story, and we will soon put out the nonlinear side. You can see our work in PETSc if you choose a nonlinear preconditioner |
|
I collaborate with Ridgway Scott, Andy Terrel, Peter Brune, and Robert Kirby on automation of finite element discretization, as well as better integration of mesh abstraction with discretization and solver routines. I am also involved with the FEniCS project at Simula Research, working in particular with Hans Petter Langtangen and Anders Logg. The overarching theme for this work is that automation in scientific computing, exemplified by compilers, should be raised to a higher level of abstraction. By confining ourselves to a well understood domain, in this case finite element methods, we can make the symbolic problem tractable. A prime example of this program is the FEniCS Form Compiler (FFC) which I use to generate GPU integration code in this paper. For several years now, we have been exploring unstructured geometric coarsening, and its application to unstructured multigrid methods. Peter has taken the lead on this work. Our recent paper shows simple applications of these ideas, but we would like to extend this to incorporate AMG ideas and also to larger and more complex problems. This work rests on the Sieve API for unstructured mesh representation and manipulation, detailed in this paper. Here is another paper by Anders Logg discussing his implementation of these ideas in FEniCS. Sieve has also been used in PyLith as the basis for FEM assembly and provided the functionality for inserting cohesive cells into a mesh with specified fault vertices. I have completed a rewrite of Sieve in C, which is now presented by the PETSc DMPlex class. There are a complete set of FEM tests in SNES ex62. In order to run the tests, you must configure with # For finite element supportand you can run the tests using the new Python build system python2.7 ./config/builder2.py check src/snes/examples/tutorials/ex62.cThe test suite exhibits all the traditional preconditioners for the iso-viscous Stokes problem using PC FieldSplit, as well as matrix-free application of the operator combined with user-defined preconditioning. We considered using homotopy methods for solution of small blocks of the nonlinear algebraic problems, perhaps as part of FAS. However, no homotopy solver has yet proved competitive to Newton for systems arising from FEM discretiztions. Currently, we cannot benefit from the fact that we need only find a single physical solution while retaining the robustness of the homotopy method. |
|
The combination of low power consumption, large memory bandwidth, high flop rate, and low cost make GPUs an ideal platform for desktop scientific computing. Moreover, it is likely that a large fraction of future supercomputers will also employ accelerators (as 3 of the top ten machines in the world already do). In the last 4 decades of scientific computing, the most successful strategy for accomodating new hardware is the production of high quality libraries, such as MPI and PETSc. We propose to develop not only parallel algebraic solvers, but discretization and multiphysics libraries, suitable for hybrid machines with significant accelerator components. Our strategy is to use a combination of low-level library routines and code generation techniques to produce implementations for our higher level API. We use the Thrust and Cusp libraries from NVIDIA to provide linear algebra, shown here, and more basic operations such as sorting and keyed reductions used in this paper. For operations that a more variable, such as those that depend on the weak form, we use code generation. The key point is to circumscribe the domain so that the generation process becomes tractable. In generating FEM integration routines, we produce not only different block sizes, but different organization of memory writes and computation. With Andy Terrel, Rob Kirby, and Andreas Klöckner, I am exploring code transformation tools which will enable us to automatically optimize more general calculations. In particular, we are using Ignition and Loo.py to explore quadrature-based methods for FEM residual and Jacobian evaluation. In our upcoming paper, we show examples for simple forms. |
|
I collaborate with Felipe Cruz on the PetFMM code for simulation using the fast multipole method. The main idea is that a very simple serial implementation of FMM can be combined with optimization techniques from graph partitioning to produce an efficient and scalable parallel FMM. PetFMM completely reuses the serial code, and also uses templating to optimize the implementation. It templates over the structure used to describe source and targets, the neighborhood stencil, the expansion order, and the dimension. We are able to guide the optimization routine with good theoretical estimates of the communication and computation complexity. In fact, this paper will show that optimal communication complexity is achievable with PetFMM. I also work with Rio Yokota and Andreas Klöckner on a version of FMM which will autogenerate expansions given the kernel function. We believe this is the step that most severely limits the wide adoption of the method. The implementation uses sympy for symbolic manipulation and code generation, and PyCUDA for kernel execution and organization. I also collaborate with Rio Yokota and Lorena Barba on the PetRBF code for parallel, radial basis function interpolation. |
|
I am co-author of the PyLith code (description) with Brad Aagaard and Charles Williams. We have developed open-source finite-element software for 2-D and 3-D dynamic and quasi-static modeling of crustal deformation. This software, PyLith (current release is 1.6) can be used for quasi-static viscoelastic modeling or dynamic spontaneous rupture and/or ground-motion modeling. Unstructured and structured finite-element discretizations provide support for spatial scales ranging from tens of meters to hundreds of kilometers with temporal scales in dynamic problems ranging from milliseconds to minutes and temporal scales in quasi-static problems ranging from minutes to thousands of years. PyLith development is part of the NSF funded Computational Infrastructure for Geodynamics (CIG) and the software runs on a wide variety of platforms (laptops, workstations, and Beowulf clusters). Binaries (Linux, Darwin, and Windows systems) and source code are available from geodynamics.org. PyLith uses a suite of general, parallel, graph data structures called Sieve for storing and manipulating finite-element meshes. This permits use of a variety of 2-D and 3-D cell types including triangles, quadrilaterals, hexahedra, and tetrahedra. Current features include prescribed fault ruptures with multiple earthquakes and aseismic creep, spontaneous fault ruptures with a variety of fault constitutive models, time-dependent Dirichlet and Neumann boundary conditions, absorbing boundary conditions, time-dependent point forces, and gravitational body forces. PyLith supports infinitesimal and small strain formulations for linear elastic rheologies, linear and generalized Maxwell viscoelastic rheologies, power-law viscoelastic rheologies, and Drucker-Prager elastoplastic rheologies. Current development focuses on coupling quasi-static and dynamic simulations to resolve multi-scale deformation across the entire seismic cycle and coupling elasticity to heat and/or fluid flow. |
|
I am working with Richard Katz and Marc Spiegelman on magma dynamics problems, including the mid-ocean ridge spreading shown on the left. Magmatic plate boundaries (mid ocean ridges, subduction zones) and intra-plate hot spots are among the most important and complex dynamic and chemical systems in the solid earth. While volumetrically small, these regions have global consequences for mantle flow and plate tectonics through the spontaneous localization of weak plate boundaries. Magmatism in these regions also provides the dominant mechanism for geochemical fractionation, mixing and sampling of the Earth's mantle. Many of the observational constraints on both plate-boundary processes and global mantle dynamics come from regional studies of lavas from mid-ocean ridges, subduction zones and oceanic hot spots and form focus areas for many initiatives such as GeoPRISMS, EarthScope and core EAR and MGG Geophysics programs. Optimal preconditioners for many multiphysics problems, such as variable-viscosity Stokes and thermal convection, arise by combining existing optimal preconditioners for individual blocks of the Jacobian, associated with a given physics. For instance, a classic Stokes preconditioner applies multigrid to the momentum balance and scaling to the mass balance, combining them multiplicatively, using one block elimination step. The PETSc composable solvers allow a user to define different preconditioners for each equation block, and also different combination methods, inclduing Schur complements, between sets of equations, and it can all be specified dynamically on the command line without any code changes. Thus, the same code that runs full LU to verify a solution can also scale optimally on the largest machines available. Moreover, we can choose to split at various levels. Multigrid could be used for the overall Stokes system, and then splitting can be used for each smoother (so-called segregated KKT smoothers), inverting the classical scheme above. Finally, we can use split schemes to precondition the fully-coupled problem. |
|
Electrostatic interactions play an essential role in the structure and function of biomolecules (proteins, DNA, cell membranes, etc.). One of the most challenging aspects for understanding these interactions is the fact that biologically active molecules are almost always in solution---that is, they are surrounded by water molecules and dissolved ions. These solvent molecules add many thousands or even millions more degrees of freedom to any theoretical study, many of which are of only secondary importance for investigations of interest. Model the solvent using a discontinuous dielectric coefficient, and solve the Poisson equation directly using a boundary-integral formulation and the boundary-element method, BEM, to determine the induced charge distribution on the molecular surface which accounts for the change in polarization charge across the dielectric boundary. I am collaborating with Jay Bardhan to develop the BIBEE approximation algorithm for electrostatics problems with discontinuous dielectric coefficient in complex geometries. BIBEE approximates the operator arising from a boundary integral discretization of the Poisson equation, giving rigorous upper bounds and effective lower bounds for the solvation energy. |
|
I am collaborating with Dirk Gillespie and Robert Eisenberg at Rush University Medical Center on the modeling of biological ion channels using classical Density Functional Theory of fluids. Since its inception 30 years ago, classical density functional theory (DFT) of fluids has developed into a fast and accurate theoretical tool to understand the fundamental physics of inhomogeneous fluids. To determine the structure of a fluid, DFT minimizes a free energy functional of the fluid density by solving the Euler-Lagrange equations for the inhomogeneous density profiles of all the particle species When solving on a computer, the density can be discretized (called free minimization) or a parameterized function form such as Guassians can be assumed (called parameterized minimization). This approach has been used to model freezing, electrolytes, colloids, and charged polymers in confining geometries and at liquid-vapor interfaces. DFT is different from direct particle simulations where the trajectories of many particles are followed over long times to compute averaged quantities of interest (e.g., density profiles). DFT computes these ensemble-averaged quantities directly. However, developing an accurate DFT is difficult and not straightforward. In fact, new, more accurate DFTs are still being developed for such fundamental systems as hard-sphere fluids, electrolytes, and polymers. Our group has already applied one dimensional DFT to biological problems involving ion channel permeation, successfully matching and predicting experimental data. We have extended this to 3D using efficient numerical schemes for functional evaluation and nonlinear system solution. |
|
I am collaborating with David Ketcheson, Aron Ahmadia and Kyle Mandli on nonlinear wave mechanics and the development of the PyClaw framework, of which PetClaw is the parallel extension. The PyClaw design is based upon a loosely-coupled model for simulation software, in which computational operations are wrapped in a higher level language, here Python, and spliced together. The glue in Python is numpy, a lightweight library which allows no-copy access across languages to multidimensionl arrays. With this, ClawPack could be joined to PETSc efficiently using only Python code. It took 100 lines of Python to parallelize PyClaw, subclassing only three classes. |
|
I am collaborating with Blaise Bourdin to produce a scalable code to simulate fracture mechanics. In the area of brittle fracture, there has seen tremendous progress over the recent years, but some issues remain problematic. The most widely accepted theories, based on Griffith's criterion, are limited to the propagation of an isolated, pre-existing crack along a given path. Extending Griffith's theory into a global minimization principle, while preserving its essence, the concept of energy restitution in between surface and bulk terms, G. Francfort and J.-J. Marigo proposed a new formulation for the brittle fracture problem, capable of predicting the creation of new cracks, their path, and their interactions, in two and three space dimensions. The essence of this model is the global minimization over all crack sets \(K\) and all admissible displacement
fields \(u\) of an energy similar to
|
Matthew G. Knepley
Computation Institute
University of Chicago
Searle Chemistry Laboratory
5735 S. Ellis Avenue
Chicago, IL 60637
Work: (773) 41-EULER
Cell: (773) 450-3622