Using multiplicative notation, we could have written
- $P(A \cap B) = P(A|B) P(B)$
instead.
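For a concrete illustration, take, say, $\Omega = \{1, \ldots, 6\}$ with the uniform distribution (a fair die), $A = \{2\}$ and $B = \{2, 4, 6\}$. Then
- $P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{1/6}{1/2} = \frac{1}{3}$,
and the multiplicative form is confirmed by $P(A|B) P(B) = \frac{1}{3} \cdot \frac{1}{2} = \frac{1}{6} = P(A \cap B)$.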
This definition is intuitive, since the following lemmata are satisfied:
Lemma 3.2:
Let $(\Omega, \mathcal{F}, P)$ be a probability space and $B \in \mathcal{F}$ with $P(B) > 0$. Then
- $P(\Omega|B) = 1$.
Lemma 3.3:
Let $(\Omega, \mathcal{F}, P)$ be a probability space and $B \in \mathcal{F}$ with $P(B) > 0$. If $A_1, A_2, \ldots \in \mathcal{F}$ are pairwise disjoint, then
- $P\left( \biguplus_{n \in \mathbb{N}} A_n \middle| B \right) = \sum_{n \in \mathbb{N}} P(A_n|B)$.
Each lemma follows directly from the definition and the axioms holding for $P$ (definition 2.1).
From these lemmata, we obtain that for each $B \in \mathcal{F}$ with $P(B) > 0$, $(\Omega, \mathcal{F}, P(\cdot|B))$ satisfies the defining axioms of a probability space (definition 2.1).
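As a brief illustration of the resulting probability space, consider again the fair die with $B = \{2, 4, 6\}$: conditioning on $B$ yields
- $P(\{2\}|B) = P(\{4\}|B) = P(\{6\}|B) = \frac{1/6}{1/2} = \frac{1}{3}$,
i.e. the uniform distribution on $B$; these conditional probabilities are nonnegative and sum to $1$, in accordance with the axioms of definition 2.1.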
With this definition, we have the following theorem:
Theorem 3.4:
Let $(\Omega, \mathcal{F}, P)$ be a probability space and let $A_1, \ldots, A_n \in \mathcal{F}$ with $P(A_1 \cap \cdots \cap A_{n-1}) > 0$. Then
- $P(A_1 \cap \cdots \cap A_n) = P(A_1) P(A_2|A_1) P(A_3|A_1 \cap A_2) \cdots P(A_n|A_1 \cap \cdots \cap A_{n-1})$.
Proof:
From the definition, we have
- $P(A \cap B) = P(A|B) P(B)$
for all $A, B \in \mathcal{F}$ with $P(B) > 0$. Thus, as $\mathcal{F}$ is an algebra and hence closed under finite intersections, we obtain by induction:
- $P(A_1 \cap \cdots \cap A_n) = P(A_n|A_1 \cap \cdots \cap A_{n-1}) P(A_1 \cap \cdots \cap A_{n-1}) = \cdots = P(A_1) \prod_{k=2}^{n} P(A_k|A_1 \cap \cdots \cap A_{k-1})$.
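For a concrete example, consider drawing twice without replacement from a deck of 52 cards, and let $A_1$ be the event that the first card is an ace and $A_2$ the event that the second card is an ace. Then
- $P(A_1 \cap A_2) = P(A_1) P(A_2|A_1) = \frac{4}{52} \cdot \frac{3}{51} = \frac{1}{221}$.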
Theorem 3.5 (Theorem of the total probability):
Let $(\Omega, \mathcal{F}, P)$ be a probability space, and assume
- $\Omega = \biguplus_{n \in \mathbb{N}} B_n$
(note that by using the $\biguplus$-notation, we assume that the union is disjoint), where the $B_n$ are all contained within $\mathcal{F}$ and satisfy $P(B_n) > 0$. Then for all $A \in \mathcal{F}$
- $P(A) = \sum_{n \in \mathbb{N}} P(A|B_n) P(B_n)$.
Proof:
- $P(A) = P(A \cap \Omega) = P\left(A \cap \biguplus_{n \in \mathbb{N}} B_n\right) = P\left(\biguplus_{n \in \mathbb{N}} (A \cap B_n)\right) = \sum_{n \in \mathbb{N}} P(A \cap B_n) = \sum_{n \in \mathbb{N}} P(A|B_n) P(B_n)$,
where we used that the sets $A \cap B_n$ are all disjoint, the distributive law of the algebra, the $\sigma$-additivity of $P$ and the multiplicative form of the definition of conditional probability.
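As an illustration, take, say, a box containing a fair coin and a two-headed coin; one of the two is picked uniformly at random and flipped. With $B_1$ = 'the fair coin was picked', $B_2$ = 'the two-headed coin was picked' and $A$ = 'the flip shows heads', the theorem (in its finite version) gives
- $P(A) = P(A|B_1) P(B_1) + P(A|B_2) P(B_2) = \frac{1}{2} \cdot \frac{1}{2} + 1 \cdot \frac{1}{2} = \frac{3}{4}$.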
Theorem 3.6 (Retarded Bayes' theorem):
Let $(\Omega, \mathcal{F}, P)$ be a probability space and $A, B \in \mathcal{F}$ with $P(A), P(B) > 0$. Then
- $P(B|A) = \frac{P(A|B) P(B)}{P(A)}$.
Proof:
- $P(B|A) = \frac{P(A \cap B)}{P(A)} = \frac{P(A \cap B)}{P(B)} \cdot \frac{P(B)}{P(A)} = \frac{P(A|B) P(B)}{P(A)}$.
This formula may look somewhat abstract, but it actually has a nice geometrical meaning. Suppose we are given two sets $A, B \in \mathcal{F}$, already know $P(A|B)$, $P(A)$ and $P(B)$, and want to compute $P(B|A)$. The situation is depicted in the following picture:
We know the ratio of the size of $A \cap B$ to that of $B$, but what we actually want to know is how $A \cap B$ compares to $A$. Hence, we change the 'comparand' by multiplying with $P(B)$, the old reference magnitude, and dividing by $P(A)$, the new reference magnitude.
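For a quick numerical check with the fair die from above, now take $A = \{2, 4, 6\}$ and $B = \{1, 2, 3, 4\}$, so that $P(A) = \frac{1}{2}$, $P(B) = \frac{2}{3}$ and $P(A|B) = \frac{2}{4} = \frac{1}{2}$. The theorem yields
- $P(B|A) = \frac{P(A|B) P(B)}{P(A)} = \frac{\frac{1}{2} \cdot \frac{2}{3}}{\frac{1}{2}} = \frac{2}{3}$,
which agrees with the direct computation $P(B|A) = \frac{P(A \cap B)}{P(A)} = \frac{2/6}{1/2} = \frac{2}{3}$.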
Theorem 3.7 (Bayes' theorem):
Let $(\Omega, \mathcal{F}, P)$ be a probability space, and assume
- $\Omega = \biguplus_{n \in \mathbb{N}} B_n$,
where the $B_n$ are all in $\mathcal{F}$ and satisfy $P(B_n) > 0$. Then for all $k \in \mathbb{N}$ and all $A \in \mathcal{F}$ with $P(A) > 0$
- $P(B_k|A) = \frac{P(A|B_k) P(B_k)}{\sum_{n \in \mathbb{N}} P(A|B_n) P(B_n)}$.
Proof:
From the basic version of the theorem (theorem 3.6), we obtain
- $P(B_k|A) = \frac{P(A|B_k) P(B_k)}{P(A)}$.
Using the formula of total probability (theorem 3.5) for the denominator, we obtain
- $P(B_k|A) = \frac{P(A|B_k) P(B_k)}{\sum_{n \in \mathbb{N}} P(A|B_n) P(B_n)}$.
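To illustrate, continue the two-coin example from above: if the flip shows heads, the probability that the fair coin was picked is
- $P(B_1|A) = \frac{P(A|B_1) P(B_1)}{P(A|B_1) P(B_1) + P(A|B_2) P(B_2)} = \frac{\frac{1}{2} \cdot \frac{1}{2}}{\frac{1}{2} \cdot \frac{1}{2} + 1 \cdot \frac{1}{2}} = \frac{1}{3}$.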