next_inactive up previous


Against Nonmonotonic Logic
(DRAFT)

Michael J. O'Donnell1

8 April, 1993

Abstract:

The monotonicity property of inference in conventional formal systems of logic ( ${\bf S}\vdash A$ and ${\bf S}\subseteq{\bf T}$ implies ${\bf T}\vdash A$) appears to prohibit certain crucial steps in practical intuitive reasoning. For example, reasoning by default (using assertions that are assumed to hold until explicitly contradicted) appears to be impossible in monotonic systems. Many crucial steps in practical reasoning certainly require nonmonotonic behaviors in inference procedures. But, I argue that it is highly misleading to describe these nonmonotonic behaviors in terms of nonmonotonic inference relations $\vdash $. Rather, inference procedures should use metatheoretic operators, with conventionally monotonic inference relations, to support nonmonotonic behavior.

Defeasible Reasoning Requires Nonmonotonicity

Most formal systems for logical reasoning are essentially definitions of a relation $\vdash $, called logical inference. When ${\bf T}$ is a set of propositions expressed in the language used by such a formal system, and $A$ is another such proposition, ${\bf T}\vdash A$ means that the formal system allows $A$ to be inferred from hypotheses in ${\bf T}$. In the vast majority of formal systems studied by logicians, the relation $\vdash $ is required to be monotonic in its left argument.

Definition 1   A logical inference relation $\vdash $ is monotonic if and only if, for all sets of propositions ${\bf S}$ and ${\bf T}$, and for all propositions $A$, if ${\bf S}\vdash A$ and ${\bf S}\subseteq{\bf T}$, then ${\bf T}\vdash A$.

Certain practical considerations appear to argue in favor of formal systems with nonmonotonic logical inference relations. I will sketch these practical considerations, but argue nonetheless that they are best accommodated by logical inference relations that are conventionally monotonic.

Practical reasoning often involves defeasible [#!defeasible!#] (retractable) steps. In particular, there are often default assumptions [#!default!#] about typical cases, that are accepted until specific reasons to reject them are found. The example of concluding that a given bird flies until learning that the bird is a penguin has achieved chestnut status.

Example 1 (Penguins [#!penguins!#])   Consider a formal language with formulae expressing at least the following six propositions

\begin{displaymath}
\begin{array}{rccl}
\{&\mathord{\mbox{\it Typical birds fl...
...}},&\mathord{\mbox{\it George does not fly\/}}&\}
\end{array} \end{displaymath}

Given knowledge of the set of three propositions

\begin{displaymath}{\bf T}_1=\{\mathord{\mbox{\it Typical birds fly\/}},\;\matho...
...uins do not fly\/}},\;\mathord{\mbox{\it George is a bird\/}}\}\end{displaymath}

it seems quite rational to suppose that George is in fact a typical bird, and conclude that $\mathord{\mbox{\it George flies\/}}$. That relation between a set of propositions and another proposition is often expressed as

\begin{displaymath}{\bf T}_1\vdash \mathord{\mbox{\it George flies\/}}\end{displaymath}

But, given the additional knowledge that $\mathord{\mbox{\it George is a penguin\/}}$, it seems irrational to conclude that $\mathord{\mbox{\it George flies\/}}$. Letting

\begin{displaymath}{\bf T}_2={\bf T}_1\cup\{\mathord{\mbox{\it George is a penguin\/}}\}\end{displaymath}

this relation is expressed as

\begin{displaymath}{\bf T}_2\not\:\vdash \mathord{\mbox{\it George flies\/}}\end{displaymath}

But

\begin{displaymath}{\bf T}_1\subseteq{\bf T}_2\end{displaymath}

so the appropriate formal system of logic in which to perform practical reasoning about birds, penguins, and ability to fly seems to be nonmonotonic.

The desire to have a reasoning procedure that assumes typical conditions until finding evidence of a special case seems to prohibit monotonicity in the penguin example above. But, a closer analysis reveals more than one formal site for the nonmonotonicity of the procedure.

Two Formal Sites for Nonmonotonic Behavior

I believe that the penguin example, and many similar examples, provide a strong case for nonmonotonic behavior in a reasoning procedure. That is, there are certainly cases in practice where the discovery of new knowledge requires a retraction of old assertions. But, the explication of this nonmonotonic behavior can be attached to at least two different sites in a formal system of reasoning: the inference relation $\vdash $, and the formulae themselves. I will argue for the latter choice, but first I will reject some particular forms for doing so, in spite of their initial appeal.

In principle, nonmonotonic behavior of procedures using a monotonic inference relation may be explained by weakening the conclusions of inference. For example, instead of ${\bf T}_1\vdash \mathord{\mbox{\it George flies\/}}$, we might have

\begin{displaymath}{\bf T}_1\vdash \mathord{\mbox{\it George is a typical
case\/}}\Rightarrow \mathord{\mbox{\it George flies\/}}\end{displaymath}

or

\begin{displaymath}{\bf T}_1\vdash \mathord{\mbox{\it Special
case\/}}\lor\mathord{\mbox{\it George flies\/}}\end{displaymath}

Nonmonotonic behavior is produced by introducing a large or even infinite set of propositions, such as $\mathord{\mbox{\it George is a typical case\/}}$, as default assumptions, then removing them as soon as counterevidence appears. The extra clauses in conclusions serve as switches to enable or disable the real conclusions, such as $\mathord{\mbox{\it George flies\/}}$, by respectively including or excluding the default assumptions.

The basic idea of weakening conclusions to maintain monotonicity may be refined [#!weakening!#], but I believe that most reasoners rightly reject this complication of the form of final conclusions. Since the normal mode of assertion of a proposition $A$ in everyday life implicitly means something like, ``As well as can be determined, $A$ holds,'' it seems inappropriate to introduce explicit formal disclaimers into every proposition. The best known formal systems for mathematical reasoning (the Classical First-Order Predicate Calculus heads the list, but the same holds for higher-order classical logics, set theories, and formalizations of intuitionistic or constructive reasoning) all treat the formal assertion of a proposition $A$ as meaning something like, ``$A$ is absolutely true, with no possibility of error.'' But, nowhere in popular usage, not even in mathematical journals, is that formal notion actually practiced. If it were, Russell's antinomy would have been a catastrophe rather than a stimulus to further research. So, I reject the weakening of conclusions by explicit formal disclaimers, since it forces a detailed accounting for assumptions that are made uniformly and universally in the practice of language, and thereby obscures the natural structure of practical assertions of propositions.

Rather than explaining nonmonotonic behavior by weakening the statements of the conclusions of inference, we need to strengthen the hypotheses. An obvious possibility is to add the same sorts of propositions that are used to weaken the conclusions, yielding

\begin{displaymath}{\bf T}_1\cup \{\mathord{\mbox{\it George is a typical case\/}}\}\vdash \mathord{\mbox{\it George flies\/}}\end{displaymath}

or the more transparent

\begin{displaymath}{\bf T}_1\cup \{\mathord{\mbox{\it George is not a
flightless bird\/}}\}\vdash \mathord{\mbox{\it George flies\/}}\end{displaymath}

The trouble is that assertions of typicality, such as $\mathord{\mbox{\it George is a typical case\/}}$, do not represent direct intuitive judgements, but rather are themselves the results of sophisticated epistemological reasoning. Instead of begging the question of how one decides to assume $\mathord{\mbox{\it George is a typical case\/}}$, we should reduce that judgement to concepts that can be referred more directly to observations and intuition. The use of $\mathord{\mbox{\it George is not a flightless bird\/}}$ avoids the explicit reference to typicality, but even more obviously begs the question of how we establish the hypothesis that $\mathord{\mbox{\it George is not a flightless bird\/}}$, which is essentially as hard as the final conclusion $\mathord{\mbox{\it George flies\/}}$.

Another way of augmenting the hypotheses is to use a formal operator $\mathord{\mbox{\it Search\/}}$, where for propositions $A$ and sets ${\bf T}$ of propositions, $\mathord{\mbox{\it Search\/}}(A,{\bf T})$ means that a reasonably diligent search for known propositions relevant to $A$ yields only ${\bf T}$. Then, the inference that $\mathord{\mbox{\it George flies\/}}$ is represented by

\begin{displaymath}\mathord{\mbox{\it Search\/}}(\mathord{\mbox{\it George flies\/}},{\bf T}_1)\vdash \mathord{\mbox{\it George flies\/}}\end{displaymath}

while

\begin{displaymath}{\bf T}_1\not\:\vdash \mathord{\mbox{\it George flies\/}}\end{displaymath}

This approach captures the idea that the information in ${\bf T}_1$ alone does not allow us to conclude rationally that $\mathord{\mbox{\it George flies\/}}$; rather, the fact that a diligent search for knowledge bearing on $\mathord{\mbox{\it George flies\/}}$ came up with nothing more than ${\bf T}_1$ is required to support the conclusion. $\mathord{\mbox{\it Search\/}}$ is unusual in that it asserts something about the very inference procedure that is considering propositions built up from $\mathord{\mbox{\it Search\/}}$, so the operator is metatheoretic and reflective [#!reflection!#], in some ways analogous to the negation as failure used in certain implementations of Prolog [#!negation-as-failure!#].

$\mathord{\mbox{\it Search\/}}$ can trigger nonmonotonic behavior, since the addition of new information to the knowledge base, or even the discovery of more information that was already in it, can change $\mathord{\mbox{\it Search\/}}(A,{\bf T})$ from true to false. In the penguin example, once we discover the proposition $\mathord{\mbox{\it George is a penguin\/}}$ among our knowledge, the metatheoretic proposition $\mathord{\mbox{\it Search\/}}(\mathord{\mbox{\it George flies\/}},{\bf T}_1)$ becomes false, and the conclusion $\mathord{\mbox{\it George flies\/}}$ can no longer be drawn, because the hypothesis required to infer it does not hold. $\mathord{\mbox{\it Search\/}}(\mathord{\mbox{\it George flies\/}},{\bf T}_2)$ becomes true at this point, but it does not support the inference that $\mathord{\mbox{\it George flies\/}}$.

Further refinement of these ideas may show how to avoid the annoying repetition of $\mathord{\mbox{\it George flies\/}}$ on both sides of $\vdash $, or may reveal that search keys have a different type than propositions, with some useful relations determining which search keys are relevant to inferring which propositions. One natural operator to consider is the consistency operator $\mathord{\mbox{\it Consistent\/}}$ [#!consistent!#]. When ${\bf T}$ is a set of propositions, $\mathord{\mbox{\it Consistent\/}}({\bf T})$ asserts that ${\bf T}$ is internally consistent (noncontradictory). Using $\mathord{\mbox{\it Consistent\/}}$, we might approach the penguin problem by allowing

\begin{displaymath}{\bf T}_1\cup \{\mathord{\mbox{\it Consistent\/}}({\bf T}_1\c...
...George flies\/}}\})\}\vdash \mathord{\mbox{\it George flies\/}}\end{displaymath}

In a reasonable system,

\begin{displaymath}\mathord{\mbox{\it Consistent\/}}({\bf T}_1\cup \{\mathord{\mbox{\it George flies\/}}\})\end{displaymath}

holds, but

\begin{displaymath}\mathord{\mbox{\it Consistent\/}}({\bf T}_2\cup \{\mathord{\mbox{\it George flies\/}}\})\end{displaymath}

may not hold. In spite of the attractive conceptual simplicity of reasoning using $\mathord{\mbox{\it Consistent\/}}$, the huge cost of testing consistency, and the lack of apparent flexibility for dealing with conflicting defaults, suggest that we need an operator that refers somehow to the results of the limited searches for knowledge carried out by practical inference procedures, rather than the more abstract notion of consistency in a logical system. On the other hand, some sort of resource-limited test of consistency might be the right basis for introducing nonmonotonicity. Much further thought is required to determine whether expressions describing resource limitations need to be included as arguments to the operator, or whether they may be fixed or quantified out in a standard way.

In the next two sections, I argue that the use of a nonmonotonic logical inference relation $\vdash $ is highly misleading, and not the appropriate way to support nonmonotonic behavior in a procedure for reasoning. Rather, I argue that nonmonotonic behavior should be supported by including, in the hypotheses of a logical inference, an explicit formal representation of the metatheoretic fact that certain propositions contain all of the knowledge relevant to a given problem discovered by a reasonably diligent search. I use the form $\mathord{\mbox{\it Search\/}}(A,{\bf T})$, where $A$ is a proposition and ${\bf T}$ is a finite set of propositions, to represent such metatheoretic assertions. I do not believe that this form is exactly the right one in practice, but it has enough of the essential elements to serve for the current comparative argument.

Two Interpretations of the Logical Inference Relation

The crux of my argument depends on choosing between two interpretations of the logical inference relation denoted by $\vdash $, one of which allows the relation to be nonmonotonic, and the other of which demands monotonicity. Roughly, ${\bf T}\vdash A$ means that, given knowledge of the propositions in ${\bf T}$, it is rational to infer $A$ as well. Two different ways of making this rough idea more precise yield radically different results for practical reasoning.

Definition 2   Let $A$ be a proposition, and ${\bf T}$ be a set of propositions.

\begin{displaymath}{\bf T}\mathrel{\vdash _L}A\end{displaymath}

means that, whenever an agent knows each of the propositions in ${\bf T}$, the agent may rationally conclude that $A$ holds as well (the subscript $L$ stands mnemonically for ``local'').

\begin{displaymath}{\bf T}\mathrel{\vdash _G}A\end{displaymath}

means that, whenever an agent knows each of the propositions in ${\bf T}$, and the agent knows that ${\bf T}$ includes at least all the results of some reasonably diligent search for all available information relevant to judging the correctness of $A$, then the agent may rationally conclude that $A$ holds as well (the subscript $G$ stands for ``global'').

The only distinction between $\mathrel{\vdash _L}$ and $\mathrel{\vdash _G}$ relevant to my discussion is in the different modes of asserting ${\bf T}$--as merely some set of known propositions in one case, or a set containing all known relevant propositions in the other case. The use of propositions rather than formulae, and sets ${\bf T}$ rather than some other structures of propositions or formulae, makes no difference to the present discussion. By ``knowledge'' I mean reasonably reliable information that is relatively easily accessible to an agent. In particular, I do not assume that all knowledge is true. Some readers may wish to substitute ``rational and informed belief'' for ``knowledge.''

Since knowing each of the formulae in a set ${\bf T}$ implies knowing each of the formulae in every subset of ${\bf T}$, $\mathrel{\vdash _L}$ must be monotonic. $\mathrel{\vdash _G}$ is typically nonmonotonic, since as we make the set ${\bf T}$ larger, the information that ${\bf T}$ contains all available relevant knowledge becomes weaker, and therefore supports fewer inferences. In principle, logicians may choose to study any interesting relation between propositions and sets of propositions that they please. But, by focusing attention on an infelicitously chosen relation, and referring to it as ``logical inference,'' we may mislead those who are trying to apply formalisms to practical problems. An intuitively well grounded focus of attention may clarify thinking. In the next section, I analyze the concrete operations performed by a reasoning agent, and argue that $\mathrel{\vdash _L}$ is a useful relation for studying such agents, while $\mathrel{\vdash _G}$ is not.


The Case for Monotonic Logical Inference, with Nonmonotonic Procedures

The choice of an inference relation such as $\mathrel{\vdash _L}$ or $\mathrel{\vdash _G}$ as the basis for studying practical reasoning should depend on the following properties of a typical useful reasoning agent.

  1. The agent has access to a huge set of propositions representing knowledge about the world, through some combination of queries to an internal database, observations of the world, and receipt of communications from other agents.
  2. Access to available knowledge takes time, and consumes valuable resources, in a manner that is generally monotonic in some reasonable measure of the size of the set of propositions accessed.
  3. The agent must do essentially all of its reasoning from small subsets of propositions selected from the available knowledge.
  4. The agent must often interleave inference and access to knowledge--that is, an inference from knowledge accessed so far will often be required in order to determine how to search for further useful knowledge.
From these properties of a reasoning agent, it follows that the individual steps by which inferred propositions are added to current knowledge are of the form Property 4 requires $\mathrel{\vdash _L}$ here, instead of $\mathrel{\vdash _G}$, since some inferences must be made before a reasonably diligent search is completed. $\mathrel{\vdash _G}$ essentially forces a reasoner into a two-phase procedure of search followed by inference.

In principle, propositions might be stratified hierarchically, so that whenever a judgement about the truth of $A$ is required for a reasonably diligent search for all information relevant to $B$, then no judgement about $B$ is required for a reasonably diligent search for all information relevant to $A$. Then $\mathrel{\vdash _G}$ might conceivably work, if we require ${\bf T}$ to include all the propositions found by the latest search. That is, the two-phase procedure using $\mathrel{\vdash _G}$ may be repeated to simulate interleaving of search and inference. But, experience doesn't suggest such a natural stratification, and it seems pointlessly expensive to require completion of a reasonably diligent search in the large number of cases where the inference does not require it semantically.

Certainly, $\mathrel{\vdash _G}$ is an interesting relation, and it is normally a stronger inference tool (not a logically stronger relation), in the sense that there are normally ${\bf T}$ and $A$ such that ${\bf T}\mathrel{\vdash _G}A$, but ${\bf T}\mathrel{\not\:\vdash _L}A$, while the reverse will not happen with conventional modes of assertion. So, in those cases where an appropriate search has been completed, and all of its results included in ${\bf T}$, an agent should be allowed to add $A$ such that ${\bf T}\mathrel{\vdash _G}A$. But, the agent must notice the fact that a diligent search is complete (this is necessary even if all inference steps use $\mathrel{\vdash _G}$, since the agent must decide when to stop gathering members of ${\bf T}$, and start applying $\mathrel{\vdash _G}$), so it seems natural to encode the act of noticing completion of the search as a metatheoretical proposition added to current knowledge. This leads to the encoding of

\begin{displaymath}{\bf T}\mathrel{\vdash _G}A\end{displaymath}

in a form such as

\begin{displaymath}\mathord{\mbox{\it Search\/}}(A,{\bf T})\mathrel{\vdash _L}A\end{displaymath}

Certainly, the syntactic form for representing the proposition $\mathord{\mbox{\it Search\/}}(A,{\bf T})$ can be refined to be compatible with some useful data structure for noticing search completion, so this encoding need not reduce the efficiency of a reasoning procedure originally based directly on $\mathrel{\vdash _G}$.

So, it appears that the information carried by $\mathrel{\vdash _L}$ is essential in the design of practical reasoning procedures, and that the information in $\mathrel{\vdash _G}$ may be encoded into $\mathrel{\vdash _L}$ by adding appropriate metatheoretic operators, which construct assertions that certain sets of propositions contain all of the results of certain searches, to our formal language. On the other hand, I doubt that $\mathrel{\vdash _L}$ can be defined efficiently and naturally from $\mathrel{\vdash _G}$. An obvious candidate definition is

A naive application of this definition leads to an outrageously expensive implementation of $\mathrel{\vdash _L}$. It seems unlikely that there is an efficient general implementation. Furthermore, it is not even clear that the definition is extensionally correct in practically interesting formal systems of logic. The only if part probably holds in most reasonable systems. Monotonicity of $\mathrel{\vdash _L}$ requires that ${\bf T}'\mathrel{\vdash _L}A$. When ${\bf T}'\mathrel{\vdash _L}A$ we normally expect that ${\bf T}'\mathrel{\vdash _G}A$ as well, since the in the second case the hypothesis carries the information given directly by propositions in ${\bf T}'$ as well as the information that ${\bf T}'$ contains all results of some appropriate search. But, the if part seems to depend in a delicate way on the range of quantification of extended sets ${\bf T}'$ of propositions, which varies depending on the generality of the formal language in use.

I do not claim that the particular syntactic form of $\mathord{\mbox{\it Search\/}}(A,{\bf T})$ is the right one for practical reasoning. But, this form illustrates some of the crucial properties of the right solution.

  1. Inferences using default reasoning must have metatheoretic hypotheses, since the basis for using a default assumption is the inability to discover a counterargument, rather than some direct observation about the world outside of a reasoning agent.
  2. Some sort of search parameter must be involved in such metatheoretic hypotheses, since all practical searches are guided by some description of a goal, and the results of search depend critically on this description. The $A$ in $\mathord{\mbox{\it Search\/}}(A,{\bf T})$ is a very naive presentation of this search parameter.
  3. Some sort of description of the results of search must be involved in such metatheoretic hypotheses, since it is the absence of a counterindication in those results that enables the default assumption. The ${\bf T}$ is a very naive presentation of the results of the search.
  4. Although the extensional contents of the knowledge base available to a reasoning agent determines the truth or falsehood of such metatheoretic hypotheses, the contents of that knowledge base must not be mentioned explicitly in the formal presentation of the hypotheses, since it is too big. The whole point of a formal logical language is to represent the information actually manipulated by a reasoning agent, which should normally be much smaller than the world about which the agent is reasoning. For comparative discussion of the results of reasoning from different knowledge bases, some description of the knowledge base may be treated as an additional parameter to the logical inference relation $\vdash $, but it must be understood that the reasoning agent has no direct access to it, only to the formal propositions that result from searching it. In particular, useful definitions of $\vdash $ must not depend on qualities of the knowledge base other than those derived by the search procedures used by the reasoning agent. If several different knowledge bases are available simultaneously to a single agent, the name of a knowledge base to be searched may be modelled as part of the search parameter mentioned in item 2, but the contents of the knowledge base must not be an explicit parameter.

The Hard Problem Remains

Of course, arguments of the sort advanced above do not solve any hard technical problems. The hard problem to solve in automating practical reasoning is the problem of reacting to updates (changes in knowledge) [#!update-problem!#]. That problem is essentially the same at a technical level, whether we use a monotonic inference relation $\mathrel{\vdash _L}$ or a nonmonotonic $\mathrel{\vdash _G}$. In conventional monotonic systems, without metatheoretic operators, the addition of new knowledge leads only to the addition of new inferences. The hard part of the update problem is how to handle retraction of previously accepted knowledge. There is no clear consensus even on the proper form in which to express retractions.

Both nonmonotonic inference relations, such as $\mathrel{\vdash _G}$, and metatheoretic operators, such as $\mathord{\mbox{\it Search\/}}$, seem to add a complication to the problem of changing knowledge, since they both lead to cases where added knowledge requires conclusions to be retracted. I conjecture, though, that once the basic problem of managing retraction of knowledge is solved, the extra interaction of additions and retractions introduced by nonmonotonicity will become much clearer. I propose that, by representing nonmonotonic reasoning behavior through metatheoretic operators, such as $\mathord{\mbox{\it Search\/}}$, with the monotonic inference relation $\mathrel{\vdash _L}$, we can improve the flexibility and transparency by which future breakthroughs in the update problem are applied to a wide range of interesting domains of reasoning.

While the 4 properties of useful reasoning agents mentioned in Section 4 support the use of formal systems that are conventional to the extent of being monotonic, they by no means support the particular choices of classical or intuitionistic reasoning. In fact, an additional property of reasoning agents argues strongly against such choices:

  1. Some of the knowledge provided to the agent is wrong, and possibly even contradictory.
Both classical and intuitionistic logic trivialize in the presence of contradictory information--everything follows logically from any contradiction, and the mere fact that classically or intuitionistically ${\bf T}\vdash A$ does not make it safe for a reasoning agent to conclude $A$ given knowledge of each proposition in ${\bf T}$. Even in the absence of contradiction, classical and intuitionistic logic seem to be too sensitive to errors, due to their conceptual foundation on the assertion of absolute truth rather than of rational belief based on fallible information, although I can find no rigorous discussion of this sensitivity in the literature.

An obvious fix is to have the agent test the consistency of ${\bf T}$ before applying logical inference. Unfortunately, consistency checking ranges from intractably expensive to fundamentally impossible, so the time and resource for consistency checking would dominate that for real reasoning steps. Rather, property 5 requires the use of paraconsistent formal systems [#!paraconsistent!#], such as relevance logics [#!relevance!#] instead of classical or intuitionistic formal systems. Further research is required to illuminate the choice of a particular paraconsistent formal system for a particular reasoning agent. The right paraconsistent formal system can reduce the harm done by errors and contradictions in the hypotheses used by a reasoning agent, allowing useful reasoning to go on during the typically long time between the introduction of an error or contradiction and its detection. But, they still do not address the hard problem of retracting information when an error is finally detected.

About this document ...

Against Nonmonotonic Logic
(DRAFT)

This document was generated using the LaTeX2HTML translator Version 99.2beta6 (1.42)

Copyright © 1993, 1994, 1995, 1996, Nikos Drakos, Computer Based Learning Unit, University of Leeds.
Copyright © 1997, 1998, 1999, Ross Moore, Mathematics Department, Macquarie University, Sydney.

The command line arguments were:
latex2html -split 0 attack.tex

The translation was initiated by Mike O'Donnell on 2001-12-05


Footnotes

... O'Donnell1
Supported in part by NSF grant no. CCR-9016905

next_inactive up previous
Mike O'Donnell 2001-12-05