# Principle of indifference (16TH CENTURY)

The principle of indifference is a fundamental principle of statistical theory: unless there is a reason for believing otherwise, each possible event should be regarded as equally likely.

In this crude form, the principle leads to paradoxes because we can group the alternatives in different ways: the next flower I meet might be blue or red, so its being blue has a probability of one-half; but it might also be blue or crimson or scarlet, so the probability of blue is only one-third.
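The dependence on grouping can be made concrete with a minimal sketch. The helper name `indifference` is hypothetical and simply spreads probability evenly over whatever alternatives it is handed, which is exactly the crude form of the principle described above.

```python
from fractions import Fraction

def indifference(alternatives):
    """Crude principle of indifference: equal probability for each listed alternative."""
    p = Fraction(1, len(alternatives))
    return {name: p for name in alternatives}

# Two ways of carving up the same situation.
coarse = indifference(["blue", "red"])
fine = indifference(["blue", "crimson", "scarlet"])

print(coarse["blue"])  # 1/2
print(fine["blue"])    # 1/3
```

The same event, "blue", receives two different probabilities purely because the alternatives were listed differently.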

Evidently we require not mere absence of knowledge of reasons favoring one alternative over another, but knowledge of the absence of such reasons. But this may be hard to achieve, even in apparently symmetrical cases like the outcomes of throwing a die; for example, what do we do about the possibility of its standing on edge, or the fact that the paint on the ‘six’ side will be heavier than on the ‘one’ side?

Also see: propensity theory of probability

Source:
W C Kneale, Probability and Induction (1949), p.31, 34

## Examples

The textbook examples for the application of the principle of indifference are coins, dice, and cards.

In a macroscopic system, at least, it must be assumed that the physical laws that govern the system are not known well enough to predict the outcome. As observed some centuries ago by John Arbuthnot (in the preface of Of the Laws of Chance, 1692),

It is impossible for a Die, with such determin’d force and direction, not to fall on such determin’d side, only I don’t know the force and direction which makes it fall on such determin’d side, and therefore I call it Chance, which is nothing but the want of art….

Given enough time and resources, there is no fundamental reason to suppose that suitably precise measurements could not be made, which would enable the prediction of the outcome of coins, dice, and cards with high accuracy: Persi Diaconis’s work with coin-flipping machines is a practical example of this.

### Coins

A symmetric coin has two sides, arbitrarily labeled heads (many coins have the head of a person portrayed on one side) and tails. Assuming that the coin must land on one side or the other, the outcomes of a coin toss are mutually exclusive, exhaustive, and interchangeable. According to the principle of indifference, we assign each of the possible outcomes a probability of 1/2.

It is implicit in this analysis that the forces acting on the coin are not known with any precision. If the momentum imparted to the coin as it is launched were known with sufficient accuracy, the flight of the coin could be predicted according to the laws of mechanics. Thus the uncertainty in the outcome of a coin toss is derived (for the most part) from the uncertainty with respect to initial conditions. This point is discussed at greater length in the article on coin flipping.
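As an illustration of how the outcome follows from the initial conditions, here is a deliberately simplified toy model, not a description of any published analysis. It assumes the coin starts heads up, is launched straight up with speed `v_up`, spins at a constant rate `omega` about a horizontal axis, and is caught at launch height without bouncing. The names `coin_outcome` and `G` are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)

def coin_outcome(v_up, omega):
    """Toy model: outcome of a toss from launch speed v_up (m/s) and spin omega (rad/s).

    Assumes a heads-up start, no air resistance, no bounce, and a catch at launch height.
    """
    flight_time = 2.0 * v_up / G   # time to go up and come back down
    angle = omega * flight_time    # total rotation during the flight
    # The heads face points upward when the cosine of the rotation angle is positive.
    return "heads" if math.cos(angle) > 0 else "tails"

# Nearby spin rates at a fixed launch speed: the outcome is fully determined,
# but it can change under a small change in the initial conditions.
for omega in (200.0, 202.0, 204.0, 206.0, 208.0):
    print(omega, coin_outcome(2.0, omega))
```

Within this sketch there is no randomness at all; the 1/2 assignment reflects only ignorance of `v_up` and `omega`.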

### Dice

A symmetric die has n faces, arbitrarily labeled from 1 to n. An ordinary cubical die has n = 6 faces, although symmetric dice with other numbers of faces can be constructed; see Dice. We assume that the die will land with one face or another upward, and that there are no other possible outcomes. Applying the principle of indifference, we assign each of the possible outcomes a probability of 1/n. As with coins, it is assumed that the initial conditions of throwing the die are not known with enough precision to predict the outcome according to the laws of mechanics. Dice are typically thrown so as to bounce on a table or other surface(s). This interaction makes prediction of the outcome much more difficult.

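The assumption of symmetry is crucial here. Suppose that we are asked to bet for or against the outcome “6”. We might reason that there are two relevant outcomes here, “6” or “not 6”, and that these are mutually exclusive and exhaustive. This suggests assigning the probability 1/2 to each of the two outcomes. The suggestion is a fallacy, because the two outcomes are not interchangeable: “not 6” covers five of the six symmetric faces, so for an ordinary die its probability is 5/6, against 1/6 for “6”.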
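The contrast between the two-way split and the six symmetric faces can be checked with a short sketch. It assumes an ideal six-sided die; the simulation uses a fixed seed purely so that the run is repeatable, and all variable names are illustrative.

```python
import random
from fractions import Fraction

faces = [1, 2, 3, 4, 5, 6]
p_face = Fraction(1, len(faces))  # indifference over the symmetric faces

# Probabilities of the two betting outcomes, built from the face probabilities.
p_six = sum(p_face for f in faces if f == 6)
p_not_six = sum(p_face for f in faces if f != 6)
print(p_six, p_not_six)  # 1/6 5/6

# A quick simulation of an ideal die as a sanity check.
rng = random.Random(0)
rolls = [rng.randint(1, 6) for _ in range(60_000)]
freq_six = rolls.count(6) / len(rolls)
print(round(freq_six, 3))
```

The simulated frequency of “6” should sit close to 1/6 rather than 1/2, since indifference applies to the interchangeable faces and not to the lopsided pair “6” and “not 6”.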

### Cards

A standard deck contains 52 cards, each given a unique label in an arbitrary fashion, i.e. arbitrarily ordered. We draw a card from the deck; applying the principle of indifference, we assign each of the possible outcomes a probability of 1/52.

This example, more than the others, shows the difficulty of actually applying the principle of indifference in real situations. What we really mean by the phrase “arbitrarily ordered” is simply that we don’t have any information that would lead us to favor a particular card. In actual practice, this is rarely the case: a new deck of cards is certainly not in arbitrary order, and neither is a deck immediately after a hand of cards. In practice, we therefore shuffle the cards; this does not destroy the information we have, but instead (hopefully) renders our information practically unusable, although it is still usable in principle. In fact, some expert blackjack players can track aces through the deck; for them, the condition for applying the principle of indifference is not satisfied.

## Application to continuous variables

Applying the principle of indifference incorrectly can easily lead to nonsensical results, especially in the case of multivariate, continuous variables. A typical case of misuse is the following example:

• Suppose there is a cube hidden in a box. A label on the box says the cube has a side length between 3 and 5 cm.
• We don’t know the actual side length, but we might assume that all values are equally likely and simply pick the mid-value of 4 cm.
• The information on the label allows us to calculate that the surface area of the cube is between 54 and 150 cm². We don’t know the actual surface area, but we might assume that all values are equally likely and simply pick the mid-value of 102 cm².
• The information on the label allows us to calculate that the volume of the cube is between 27 and 125 cm³. We don’t know the actual volume, but we might assume that all values are equally likely and simply pick the mid-value of 76 cm³.
• However, we have now reached the impossible conclusion that the cube has a side length of 4 cm, a surface area of 102 cm², and a volume of 76 cm³!

In this example, mutually contradictory estimates of the length, surface area, and volume of the cube arise because we have assumed three mutually contradictory distributions for these parameters: a uniform distribution for any one of the variables implies a non-uniform distribution for the other two. In general, the principle of indifference does not indicate which variable (e.g. in this case, length, surface area, or volume) is to have a uniform epistemic probability distribution.
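The inconsistency in the cube example can be reproduced with a few lines of arithmetic. This is only a sketch; the helper names `area` and `volume` are illustrative.

```python
lo, hi = 3.0, 5.0  # bounds on the side length in cm, from the label

def area(side):
    """Surface area of a cube in cm^2."""
    return 6 * side ** 2

def volume(side):
    """Volume of a cube in cm^3."""
    return side ** 3

# Mid-values obtained by treating each quantity as uniformly distributed.
mid_length = (lo + hi) / 2                   # 4.0
mid_area = (area(lo) + area(hi)) / 2         # 102.0
mid_volume = (volume(lo) + volume(hi)) / 2   # 76.0

# What a cube with a 4 cm side actually has.
print(area(mid_length), volume(mid_length))  # 96.0 64.0

# Side lengths implied by the other two mid-values.
side_from_area = (mid_area / 6) ** 0.5       # about 4.12 cm
side_from_volume = mid_volume ** (1 / 3)     # about 4.24 cm
print(round(side_from_area, 2), round(side_from_volume, 2))
```

The three mid-values correspond to three different cubes, which is the contradiction in numerical form.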

Another classic example of this kind of misuse is the Bertrand paradox. Edwin T. Jaynes introduced the principle of transformation groups, which can yield an epistemic probability distribution for this problem. This generalises the principle of indifference by saying that one is indifferent between equivalent problems rather than indifferent between propositions. This still reduces to the ordinary principle of indifference when one considers a permutation of the labels as generating equivalent problems (i.e. using the permutation transformation group).

To apply this to the above box example, we have three random variables related by geometric equations. If we have no reason to favour one trio of values over another, then our prior probabilities must be related by the rule for changing variables in continuous distributions. Let L be the length and V be the volume. Then we must have

$$f_{L}(L)=\left|\frac{\partial V}{\partial L}\right|f_{V}(V)=3L^{2}f_{V}(L^{3}),$$

where $f_{L}$ and $f_{V}$ are the probability density functions (pdf) of the stated variables. This equation has a general solution: $f(L)=K/L$, where K is a normalization constant, determined by the range of L, in this case equal to:

$$K^{-1}=\int_{3}^{5}\frac{dL}{L}=\log\left(\frac{5}{3}\right).$$

To put this “to the test”, we ask for the probability that the length is less than 4 cm. This probability is:

$$\Pr(L<4)=\int_{3}^{4}\frac{dL}{L\log\left(\frac{5}{3}\right)}=\frac{\log\left(\frac{4}{3}\right)}{\log\left(\frac{5}{3}\right)}\approx 0.56.$$

For the volume, this should be equal to the probability that the volume is less than 4³ = 64. The pdf of the volume is

$$f\left(V^{1/3}\right)\frac{1}{3}V^{-2/3}=\frac{1}{3V\log\left(\frac{5}{3}\right)}.$$

The probability that the volume is less than 64 is then

$$\Pr(V<64)=\int_{27}^{64}\frac{dV}{3V\log\left(\frac{5}{3}\right)}=\frac{\log\left(\frac{64}{27}\right)}{3\log\left(\frac{5}{3}\right)}=\frac{3\log\left(\frac{4}{3}\right)}{3\log\left(\frac{5}{3}\right)}=\frac{\log\left(\frac{4}{3}\right)}{\log\left(\frac{5}{3}\right)}\approx 0.56.$$

Thus we have achieved invariance with respect to volume and length. One can also show the same invariance with respect to the surface area being less than 6(4²) = 96. However, note that this probability assignment is not necessarily a “correct” one, for the exact distribution of lengths, volumes, or surface areas will depend on how the “experiment” is conducted.
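The invariance, including the surface-area case, can be checked numerically. The sketch below assumes that each of the three quantities carries a density proportional to its own reciprocal over its own range, which is what the change-of-variables rule yields for the 1/L prior. The function name `prob_below` is hypothetical.

```python
import math

LO, HI = 3.0, 5.0  # bounds on the side length in cm

def prob_below(x, lo, hi):
    """P(X < x) when the density of X is proportional to 1/X on [lo, hi]."""
    return math.log(x / lo) / math.log(hi / lo)

# The same event expressed in the three variables.
p_length = prob_below(4.0, LO, HI)                            # L < 4
p_area = prob_below(6 * 4.0 ** 2, 6 * LO ** 2, 6 * HI ** 2)   # A < 96
p_volume = prob_below(4.0 ** 3, LO ** 3, HI ** 3)             # V < 64

print(round(p_length, 2), round(p_area, 2), round(p_volume, 2))  # 0.56 0.56 0.56
```

All three descriptions of the event receive the same probability, in contrast with the three separate uniform assumptions of the cube example.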

The fundamental hypothesis of statistical physics, that any two microstates of a system with the same total energy are equally probable at equilibrium, is in a sense an example of the principle of indifference. However, when the microstates are described by continuous variables (such as positions and momenta), an additional physical basis is needed in order to explain under which parameterization the probability density will be uniform. Liouville’s theorem justifies the use of canonically conjugate variables, such as positions and their conjugate momenta.

The wine/water paradox illustrates the same dilemma with linked variables: it is unclear which of them should be given the uniform distribution.