John Reppy's Publications

Standard ML

3CPS: The Design of an Environment-focussed Intermediate Representation.
Benjamin Quiring, John Reppy, and Olin Shivers. In 33nd International Symposia on Implementation and Application of Functional Languages (IFL 2021), September 2021. Paper accepted for the conference post-proceedings.
[ bib ]

We describe the design of 3CPS, a compiler intermediate representation (IR) we have developed for use in compiling call-by-value functional languages such as SML, OCaml, Scheme, and Lisp. The language is a low-level form designed in tandem with a matching suite of static analyses. It reflects our belief that the core task of an optimising compiler for a functional language is to reason about the environment structure of the program. Our IR is distinguished by the presence of extent annotations, added to all variables (and verified by static analysis). These annotations are defined in terms of the semantics of the IR, but they directly tell the compiler what machine resources are needed to implement the environment structure of each annotated variable.

A New Backend for Standard ML of New Jersey.
Kavon Farvardin and John Reppy. In 32nd International Symposia on Implementation and Application of Functional Languages (IFL 2020), September 2020. Awarded the Peter Landin Prize for best paper.
[ bib | DOI | .pdf ]

This paper describes the design and implementation of a new backend for the Standard ML of New Jersey (SML/NJ) system that is based on the LLVM compiler infrastructure. We first describe the history and design of the current backend, which is based on the MLRisc framework. While MLRisc has many similarities to LLVM, it provides a lower-level, policy agnostic, approach to code generation that enables customization of the code generator for non-standard runtime models (i.e., register pinning, calling conventions, etc). In particular, SML/NJ uses a stackless runtime model based on continuation-passing style with heap-allocated continuation closures. This feature, and others, pose challenges to building a backend using LLVM. We describe these challenges and how we address them in our backend.

From Folklore to Fact: Comparing Implementations of Stacks and Continuations.
Kavon Farvardin and John Reppy. In Proceedings of the SIGPLAN 2020 Conference on Programming Language Design and Implementation, pages 75--90, New York, NY, June 2020. ACM. Awarded Distinguished Paper.
[ bib | DOI | .pdf ]

The efficient implementation of function calls and non-local control transfers is a critical part of modern language implementations and is important in the implementation of everything from recursion, higher-order functions, concurrency and coroutines, to task-based parallelism. In a compiler, these features can be supported by a variety of mechanisms, including call stacks, segmented stacks, and heap-allocated continuation closures.
An implementor of a high-level language with advanced control features might ask the question “what is the best choice for my implementation?” Unfortunately, the current literature does not provide much guidance, since previous studies suffer from various flaws in methodology and are outdated for modern hardware. In the absence of recent, well-normalized measurements and a holistic overview of their implementation specifics, the path of least resistance when choosing a strategy is to trust folklore, but the folklore is also suspect.
This paper attempts to remedy this situation by providing an “apples-to-apples” comparison of six different approaches to implementing call stacks and continuations. This comparison uses the same source language, compiler pipeline, LLVM-backend, and runtime system, with the only differences being those required by the differences in implementation strategy. We compare the implementation challenges of the different approaches, their sequential performance, and their suitability to support advanced control mechanisms, including supporting heavily threaded code. In addition to the comparison of implementation strategies, the paper's contributions also include a number of useful implementation techniques that we discovered along the way.

The History of Standard ML.
David MacQueen, Robert Harper, and John Reppy. Proceedings of the ACM on Programming Languages, 4(HOPL), June 2020.
[ bib | DOI | .pdf ]

The ML family of strict functional languages, which includes F#, OCaml, and Standard ML, evolved from the Meta Language of the LCF theorem proving system developed by Robin Milner and his research group at the University of Edinburgh in the 1970s. This paper focuses on the history of Standard ML, which plays a central rôle in this family of languages, as it was the first to include the complete set of features that we now associate with the name “ML” (i.e., polymorphic type inference, datatypes with pattern matching, modules, exceptions, and mutable state).
Standard ML, and the ML family of languages, have had enormous influence on the world of programming language design and theory. ML is the foremost exemplar of a functional programming language with strict evaluation (call-by-value) and static typing. The use of parametric polymorphism in its type system, together with the automatic inference of such types, has influenced a wide variety of modern languages (where polymorphism is often referred to as generics). It has popularized the idea of datatypes with associated case analysis by pattern matching. The module system of Standard ML extends the notion of type-level parameterization to large-scale programming with the notion of parametric modules, or functors.
Standard ML also set a precedent by being a language whose design included a formal definition with an associated metatheory of mathematical proofs (such as soundness of the type system). A formal definition was one of the explicit goals from the beginning of the project. While some previous languages had rigorous definitions, these definitions were not integral to the design process, and the formal part was limited to the language syntax and possibly dynamic semantics or static semantics, but not both.
The paper covers the early history of ML, the subsequent efforts to define a standard ML language, and the development of its major features and its formal definition. We also review the impact that the language had on programming-language research.

Compiling Successor ML Pattern Guards.
John Reppy and Mona Zahir. In Proceedings of the 2019 ACM SIGPLAN ML Family Workshop, August 2019.
[ bib | .pdf ]

Successor ML is a collection of proposed language extensions to Standard ML. A number of these extensions address pattern matching; including adding richer record patterns, or-patterns, and pattern guards. Pattern guards in Successor ML are more general than those found in other languages, which raises some interesting implementation issues.
This paper describes the approach to pattern guards that we are developing as part of an effort to add Successor ML features to the Standard ML of New Jersey system. We present our approach in a way that is applicable to either backtracking or decision-tree implementations of pattern matching.

Compiling with Continuations and LLVM.
Kavon Farvardin and John Reppy. In Kenichi Asai and Mark Shinwell, editors, Proceedings ML Family Workshop / OCaml Users and Developers workshops (September 22-23, 2016), Volume 285 of Electronic Proceedings in Theoretical Computer Science, pages 131--142. Open Publishing Association, 2018.
[ bib | DOI | .pdf ]

LLVM is an infrastructure for code generation and low-level optimizations, which has been gaining popularity as a backend for both research and industrial compilers, including many compilers for functional languages. While LLVM provides a relatively easy path to high-quality native code, its design is based on a traditional runtime model which is not well suited to alternative compilation strategies used in high-level language compilers, such as the use of heap-allocated continuation closures.
This paper describes a new LLVM-based backend that supports heap-allocated continuation closures, which enables constant-time callcc and very-lightweight multithreading. The backend has been implemented in the Parallel ML compiler, which is part of the Manticore system, but the results should be useful for other compilers, such as Standard ML of New Jersey, that use heap-allocated continuation closures.

Compiling with continuations and LLVM.
Kavon Farvardin and John Reppy. In Proceedings of the 2016 ACM SIGPLAN Workshop on ML, September 2016.
[ bib | .pdf ]

This paper describes a new LLVM-based backend for the Parallel ML compiler (part of the Manticore system). This backend is novel in that it supports heap-allocated first-class continuations (a first for LLVM), which, in turn enables language features, such as callcc, lightweight concurrency mechanisms, and PML's parallelism features.

SML3d: 3D Graphics for Standard ML.
John Reppy. In Proceedings of the 2014 ACM SIGPLAN Workshop on ML, September 2014.
[ bib | .pdf ]

The SML3d system is a collection of libraries designed to support real-time 3D graphics programming in Standard ML (SML). This paper gives an overview of the system and briefly highlights some of the more interesting aspects of its design and implementation.

A Declarative API for Particle Systems.
Pavel Krajcevski and John Reppy. In Proceedings of the Thirteenth International Symposium on Practical Aspects of Declarative Languages (PADL 2011), Volume 6539 of Lecture Notes in Computer Science, pages 130--144, New York, NY, January 2011. Springer-Verlag.
[ bib | .pdf ]

Recent trends in computer-graphics APIs and hardware have made it practical to use high-level functional languages for real-time graphics applications. Thus we have the opportunity to develop new approaches to computer graphics that take advantage of the high-level features of functional languages. This paper describes one such project that uses the techniques of functional programming to define and implement a combinator library for particle systems. Particle systems are a popular technique for rendering fuzzy phenomena, such as fire, smoke, and explosions. Using our combinators, a programmer can provide a declarative specification of how a particle system behaves. This specification includes rules for how particles are created, how they evolve, and how they are rendered. Our library translates these declarative specifications into a low-level intermediate language that can be compiled to run on the GPU or interpreted by the CPU.

Calling Variadic Functions from a Strongly-typed Language.
Matthias Blume, Mike Rainey, and John Reppy. In Proceedings of the 2008 ACM SIGPLAN Workshop on ML, pages 47--58, September 2008.
[ bib | .pdf ]

The importance of providing a mechanism to call C functions from high-level languages has been understood for many years and, these days, almost all statically-typed high-level-language implementations provide such a mechanism. One glaring omission, however, has been support for calling variadic C functions, such as printf. Variadic functions have been ignored because it is not obvious how to give static types to them and because it is not clear how to generate calling sequence when the arguments to the function may not be known until runtime. In this paper, we address this longstanding omission with an extension to the NLFFI foreign-interface framework used by Standard ML of New Jersey (SML/NJ) and the MLton SML compiler. We describe two different ways of typing variadic functions in NLFFI and an implementation technique based on the idea of using state machines to describe calling conventions. Our implementation is easily retargeted to new architectures and ABIs, and can also be easily added to any HOT language implementation that supports calling C functions.

Type-sensitive control-flow analysis.
John Reppy. In Proceedings of the 2006 ACM SIGPLAN Workshop on ML, pages 74--83, September 2006.
[ bib | DOI | .pdf ]

Higher-order typed languages, such as ML, provide strong support for data and type abstraction. While such abstraction is often viewed as costing performance, there are situations where it may provide opportunities for more aggressive program optimization. Specifically, we can exploit the fact that type abstraction guarantees representation independence, which allows the compiler to specialize data representations. This paper describes a first step in supporting such optimizations; namely a control-flow analysis that uses the program's type information to compute more precise results. We present our algorithm as an extension of Serrano's version of 0-CFA and we show that it respects types. We also discuss applications of the analysis with examples of optimizations enabled by the analysis that would not be possible normal CFA.

The Standard ML Basis Library.
Emden R. Gansner and John H. Reppy, editors. Cambridge University Press, 2004.
[ bib | http ]

Concurrent Programming in ML.
John H. Reppy. Cambridge University Press, Cambridge, England, 1999.
[ bib ]

AML: Attribute Grammars in ML.
S.G. Efremidis, K.A. Mughal, L. Søraas, and John Reppy. Nordic Journal of Computing, 4(1), 1997.
[ bib ]

Unrolling lists.
Zhong Shao, John Reppy, and Andrew Appel. In ACM Conference on Lisp and Functional Programming, pages 185--195, June 1994.
[ bib ]

A portable and optimizing back end for the SML/NJ compiler.
Lal George, Florent Guillame, and John Reppy. In Fifth International Conference on Compiler Construction, pages 83--97, April 1994.
[ bib ]

A High-performance Garbage Collector for Standard ML.
John H. Reppy. Technical memo, AT&T Bell Laboratories, December 1993.
[ bib | .pdf ]

Abstract Value Constructors: Symbolic Constants for Standard ML.
William E. Aitken and John H. Reppy. Technical Report TR 92-1290, Department of Computer Science, Cornell University, June 1992. A shorter version appears in the proceedings of the “ACM SIGPLAN Workshop on ML and its Applications,” 1992.
[ bib | .pdf ]

Standard ML (SML) has been used to implement a wide variety of large systems, such as compilers, theorem provers, graphics libraries, and even operating systems. While SML provides a convenient, high-level notation for programming large applications, it does have certain deficiencies. One such deficiency is the lack of a general mechanism for assigning symbolic names to constant values. In this paper, we present a simple extension of SML that corrects this deficiency in a way that fits naturally with the semantics of SML. Our proposal is a generalization of SML's datatype constructors: we introduce constants that generalize nullary datatype constructors (like nil), and templates that generalize non-nullary datatype constructors (like ::). Constants are identifiers bound to fixed values, and templates are identifiers bound to structured values with labeled holes. Templates are useful because they allow users to treat the representation of structured data abstractly without having to give up pattern matching.

Attribute grammars in ML.
S.G. Efremidis, K.A. Mughal, and John Reppy. In ACM SIGPLAN Workshop on ML and its Aplications, June 1992.
[ bib ]

Higher-order Concurrency.
John H. Reppy. PhD thesis, Cornell University, 1992. Available as Computer Science Technical Report 92-1285.
[ bib ]

CML: A higher-order concurrent language.
John H. Reppy. In Proceedings of the SIGPLAN 1991 Conference on Programming Language Design and Implementation, pages 293--305, New York, NY, June 1991. ACM.
[ bib | .pdf ]

Asynchronous signals in Standard ML.
John H. Reppy. Technical Report TR 90-1144, Department of Computer Science, Cornell University, Ithaca, NY, August 1990.
[ bib | .pdf ]

This file was generated by bibtex2html 1.98.

Last updated on February 22, 2022
Comments to John Reppy.