Tri Huynh, Michael Maire, Matthew R. Walter
University of Chicago | TTI-Chicago
We introduce a novel architecture that integrates a large addressable memory space into the core functionality of a deep neural network. Our design distributes both memory addressing operations and storage capacity over many network layers. Distinct from strategies that connect neural networks to external memory banks, our approach co-locates memory with computation throughout the network structure. Mirroring recent architectural innovations in convolutional networks, we organize memory into a multiresolution hierarchy, whose internal connectivity enables learning of dynamic information routing strategies and data-dependent read/write operations. This multigrid spatial layout permits parameter-efficient scaling of memory size, allowing us to experiment with memories substantially larger than those in prior work. We demonstrate this capability on synthetic exploration and mapping tasks, where the network learns to self-organize and retain long-term memory over trajectories spanning thousands of time steps. On tasks decoupled from any notion of spatial geometry, such as sorting or associative recall, our design functions as a truly generic memory and yields results competitive with those of the recently proposed Differentiable Neural Computer.


Multigrid Memory Architecture
Multigrid memory architecture. Top Left: A multigrid convolutional layer [Ke et al., 2017] transforms input pyramid X, containing activation tensors {x0, x1, x2}, into output pyramid Y via learned filter sets that act across the concatenated representations of neighboring spatial scales. Top Right: We design an analogous variant of the convolutional LSTM [Xingjian et al., 2015], in which X and Y are indexed by time and encapsulate LSTM internals, e.g. memory cells (c), hidden states (h), and outputs (o). Bottom: Connecting many such layers, both in sequence and across time, yields a multigrid mesh capable of routing input a_t into a much larger memory space, updating a distributed memory representation, and providing multiple read-out pathways (e.g. z_{0,t} or z_{3,t}).
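The cross-scale exchange in a multigrid convolutional layer can be sketched in a few lines of numpy. This is a minimal illustration, not the paper's implementation: it assumes 1x1 convolutions (the paper uses larger learned filters), nearest-neighbor resampling between scales, and hypothetical function names (`resize`, `multigrid_conv`).

```python
import numpy as np

def resize(x, h, w):
    """Nearest-neighbor resize of a (C, H, W) tensor to (C, h, w)."""
    C, H, W = x.shape
    rows = np.arange(h) * H // h
    cols = np.arange(w) * W // w
    return x[:, rows][:, :, cols]

def multigrid_conv(pyramid, weights):
    """One multigrid convolutional layer (1x1 convs stand in for the
    learned filter sets in the paper).

    pyramid: list of (C_i, H_i, W_i) tensors, coarse to fine.
    weights: one (C_out, C_in_i) matrix per scale, where C_in_i counts
             the channels of the scale itself plus its resized
             coarser/finer neighbors after concatenation.
    """
    out = []
    for i, x in enumerate(pyramid):
        C, H, W = x.shape
        parts = [x]
        if i > 0:                      # coarser neighbor, upsampled
            parts.append(resize(pyramid[i - 1], H, W))
        if i < len(pyramid) - 1:       # finer neighbor, downsampled
            parts.append(resize(pyramid[i + 1], H, W))
        cat = np.concatenate(parts, axis=0)            # (C_in_i, H, W)
        y = np.einsum('oc,chw->ohw', weights[i], cat)  # 1x1 convolution
        out.append(y)
    return out
```

Because every scale reads from its neighbors at each layer, stacking such layers lets information hop across the full resolution hierarchy in a logarithmic number of steps, which is what enables the learned routing described above.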
Multigrid Memory Interfaces
Memory interfaces. Left: Multiple readers (red, orange) and a single writer (blue) simultaneously manipulate a multigrid memory. Readers are multigrid CNNs; each convolutional layer views the hidden state of the corresponding grid in memory by concatenating it as an additional input. Right: Distinct encoder (green) and decoder (purple) networks, each structured as a deep multigrid memory mesh, cooperate to perform a sequence-to-sequence task. We initialize the memory pyramid (LSTM internals) of each decoder layer by copying it from the corresponding encoder layer.
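The two interfaces above can likewise be sketched in numpy. This is an illustrative sketch under simplifying assumptions (1x1 convolutions, hypothetical names `read` and `init_decoder_from_encoder`, and a plain dict-of-pyramids stand-in for the LSTM internals), not the paper's actual code.

```python
import numpy as np

def read(reader_acts, memory_hidden, weights):
    """Reader interface: each reader layer views the hidden state of the
    corresponding memory grid by channel-wise concatenation, then applies
    its own learned filter (a 1x1 conv here for brevity).

    reader_acts, memory_hidden: lists of (C, H, W) tensors, one per grid.
    weights: one (C_out, C_reader + C_memory) matrix per grid.
    """
    out = []
    for x, h, w in zip(reader_acts, memory_hidden, weights):
        cat = np.concatenate([x, h], axis=0)
        out.append(np.einsum('oc,chw->ohw', w, cat))
    return out

def init_decoder_from_encoder(encoder_state):
    """Sequence-to-sequence interface: initialize each decoder layer's
    memory pyramid (LSTM internals, e.g. cells 'c' and hidden states 'h')
    by copying it from the corresponding encoder layer."""
    return [{k: [g.copy() for g in pyramid] for k, pyramid in layer.items()}
            for layer in encoder_state]
```

Copying (rather than sharing) the encoder state keeps the decoder free to overwrite its memory during generation without disturbing the encoder's record of the input sequence.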


Interpretation Instruction
Mapping & Localization in Random Walk
Joint Exploration, Mapping & Localization
Mapping & Localization in Spiral Motion, with 3×3 Queries
Mapping & Localization in Spiral Motion, with 3×3 and 9×9 Queries