Axiom introduced by English philosopher and mathematician Bertrand Russell (1872-1970) in connection with the ramified theory of types. It says that any higher-order property or proposition can be reduced to an equivalent first-order one.
The ramified theory caused difficulties for defining real numbers (using Dedekind sections) and for the process known as mathematical induction (roughly: if a property belongs to the first term in a series, and to the successor of any term to which it belongs, then it belongs to them all).
Russell introduced the axiom to deal with these problems, but it was widely felt to be unfounded, and was later dispensed with by Frank P. Ramsey (1903-1930) in Chapter 1 of his Foundations of Mathematics (1931).
B Russell, ‘Mathematical Logic as Based on the Theory of Types’, American Mathematical Monthly (1908); reprinted in R C Marsh, ed., Logic and Knowledge (1956) and in J van Heijenoort, ed., From Frege to Godel (1967)
With Russell’s discovery (1901, 1902) of a paradox in Gottlob Frege’s 1879 Begriffsschrift and Frege’s acknowledgment of the same (1902), Russell tentatively introduced his solution as “Appendix B: Doctrine of Types” in his 1903 The Principles of Mathematics. This contradiction can be stated as “the class of all classes that do not contain themselves as elements”. At the end of this appendix Russell asserts that his “doctrine” would solve the immediate problem posed by Frege, but “there is at least one closely analogous contradiction which is probably not soluble by this doctrine. The totality of all logical objects, or of all propositions, involves, it would seem a fundamental logical difficulty. What the complete solution of the difficulty may be, I have not succeeded in discovering; but as it affects the very foundations of reasoning…”
By the time of his 1908 Mathematical logic as based on the theory of types Russell had studied “the contradictions” (among them the Epimenides paradox, the Burali-Forti paradox, and Richard’s paradox) and concluded that “In all the contradictions there is a common characteristic, which we may describe as self-reference or reflexiveness”.
In 1903, Russell defined predicative functions as those whose order is one more than the highest-order function occurring in the expression of the function. While these were fine for the situation, impredicative functions had to be disallowed:
A function whose argument is an individual and whose value is always a first-order proposition will be called a first-order function. A function involving a first-order function or proposition as apparent variable will be called a second-order function, and so on. A function of one variable which is of the order next above that of its argument will be called a predicative function; the same name will be given to a function of several variables [etc].
He repeats this definition in a slightly different way later in the paper (together with a subtle prohibition that they would express more clearly in 1913):
A predicative function of x is one whose values are propositions of the type next above that of x, if x is an individual or a proposition, or that of values of x if x is a function. It may be described as one in which the apparent variables, if any, are all of the same type as x or of lower type; and a variable is of lower type than x if it can significantly occur as argument to x, or as argument to an argument to x, and so forth. [emphasis added]
This usage carries over to Alfred North Whitehead and Russell’s 1913 Principia Mathematica wherein the authors devote an entire subsection of their Chapter II: “The Theory of Logical Types” to subchapter I. The Vicious-Circle Principle: “We will define a function of one variable as predicative when it is of the next order above that of its argument, i.e. of the lowest order compatible with its having that argument. . . A function of several arguments is predicative if there is one of its arguments such that, when the other arguments have values assigned to them, we obtain a predicative function of the one undetermined argument.”
They again propose the definition of a predicative function as one that does not violate The Theory of Logical Types. Indeed the authors assert such violations are “incapable [to achieve]” and “impossible”:
We are thus led to the conclusion, both from the vicious-circle principle and from direct inspection, that the functions to which a given object a can be an argument are incapable of being arguments to each other, and that they have no term in common with the functions to which they can be arguments. We are thus led to construct a hierarchy.
The authors stress the word impossible:
if we are not mistaken, that not only is it impossible for a function φz^ to have itself or anything derived from it as argument, but that, if ψz^ is another function such there are arguments a with which both “φa” and “ψa” are significant, then ψz^ and anything derived from it cannot significantly be argument to φz^.
Russell’s axiom of reducibility
The axiom of reducibility states that any truth function (i.e. propositional function) can be expressed by a formally equivalent predicative truth function. It made its first appearance in Bertrand Russell’s (1908) Mathematical logic as based on the theory of types, but only after some five years of trial and error. In his words:
Thus a predicative function of an individual is a first-order function; and for higher types of arguments, predicative functions take the place that first-order functions take in respect of individuals. We assume then, that every function is equivalent, for all its values, to some predicative function of the same argument. This assumption seems to be the essence of the usual assumption of classes [modern sets] . . . we will call this assumption the axiom of classes, or the axiom of reducibility.
For relations (functions of two variables such as “For all x and for all y, those values for which f(x,y) is true” i.e. ∀x∀y: f(x,y)), Russell assumed an axiom of relations, or [the same] axiom of reducibility.
In 1903, he proposed a possible process of evaluating such a 2-place function by comparing the process to double integration: One after another, plug into x definite values am (i.e. the particular aj is “a constant” or a parameter held constant), then evaluate f(am,yn) across all the n instances of possible yn. For all yn evaluate f(a1, yn), then for all yn evaluate f(a2, yn), etc until all the x = am are exhausted). This would create an m by n matrix of values: TRUE or UNKNOWN. (In this exposition, the use of indices is a modern convenience.)
In 1908, Russell made no mention of this matrix of x, y values that render a two-place function (e.g. relation) TRUE, but by 1913 he has introduced a matrix-like concept into “function”. In *12 of Principia Mathematica (1913) he defines “a matrix” as “any function, of however many variables, which does not involve any apparent variables. Then any possible function other than a matrix is derived from a matrix by means of generalisation, i.e. by considering the proposition which asserts that the function in question is true with all possible values or with some values of one of the arguments, the other argument or arguments remaining undetermined”. For example, if one asserts that “∀y: f(x, y) is true”, then x is the apparent variable because it is unspecified.
Russell now defines a matrix of “individuals” as a first-order matrix, and he follows a similar process to define a second-order matrix, etc. Finally, he introduces the definition of a predicative function:
A function is said to be predicative when it is a matrix. It will be observed that, in a hierarchy in which all the variables are individuals or matrices, a matrix is the same thing as an elementary function [cf. 1913:127, meaning: the function contains no apparent variables]. ¶ “Matrix” or “predicative function” is a primitive idea.
From this reasoning, he then uses the same wording to propose the same axioms of reducibility as he did in his 1908.
As an aside, Russell in his 1903 considered, and then rejected, “a temptation to regard a relation as definable in extension as a class of couples”, i.e. the modern set-theoretic notion of ordered pair. An intuitive version of this notion appeared in Frege’s (1879) Begriffsschrift (translated in van Heijenoort 1967:23); Russell’s 1903 followed closely the work of Frege (cf. Russell 1903:505ff). Russell worried that “it is necessary to give sense to the couple, to distinguish the referent from the relatum: thus a couple becomes essentially distinct from a class of two terms, and must itself be introduced as a primitive idea. It would seem, viewing the idea philosophically, that sense can only be derived from some relational proposition . . . it seems therefore more correct to take an intensional view of relations, and to identify them rather with class-concepts than with classes”. As shown below, Norbert Wiener (1914) reduced the notion of relation to class by his definition of an ordered pair.