Using multiplicative notation (writing $AB$ for $A \cap B$), we could have written
$$P(A \mid B) = \frac{P(AB)}{P(B)}$$
instead.
This definition is intuitive, since the following lemmata are satisfied:
Lemma 3.2:
Let $B \in \mathcal{F}$ with $P(B) > 0$. Then $P(\Omega \mid B) = 1$.
Lemma 3.3:
Let $B \in \mathcal{F}$ with $P(B) > 0$, and let $A_1, A_2, \ldots \in \mathcal{F}$ be pairwise disjoint. Then
$$P\!\left(\biguplus_{n \in \mathbb{N}} A_n \,\middle|\, B\right) = \sum_{n \in \mathbb{N}} P(A_n \mid B).$$
Each lemma follows directly from the definition and the axioms holding for $P$ (definition 2.1).
From these lemmata, we obtain that for each $B \in \mathcal{F}$ with $P(B) > 0$, the triple $(\Omega, \mathcal{F}, P(\cdot \mid B))$ satisfies the defining axioms of a probability space (definition 2.1).
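As a simple illustration (with numbers chosen here purely for concreteness): roll a fair die, so $\Omega = \{1, \ldots, 6\}$ with the uniform measure, and let $B = \{4, 5, 6\}$ and $A = \{2, 4, 6\}$. Then $P(B) = \tfrac{1}{2}$ and $P(A \cap B) = P(\{4, 6\}) = \tfrac{1}{3}$, hence
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{1/3}{1/2} = \frac{2}{3},$$
and one checks just as easily that $P(\cdot \mid B)$ assigns total mass $1$ to $\Omega$ and is additive over disjoint events, as the lemmata assert.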
With this definition, we have the following theorem:
Theorem 3.4:
Let $(\Omega, \mathcal{F}, P)$ be a probability space and let $A_1, \ldots, A_n \in \mathcal{F}$ with $P(A_1 \cap \cdots \cap A_{n-1}) > 0$. Then
$$P(A_1 \cap \cdots \cap A_n) = P(A_1)\, P(A_2 \mid A_1)\, P(A_3 \mid A_1 \cap A_2) \cdots P(A_n \mid A_1 \cap \cdots \cap A_{n-1}).$$
Proof:
From the definition, we have
$$P(A \cap B) = P(A \mid B)\, P(B)$$
for all $A, B \in \mathcal{F}$ with $P(B) > 0$. Thus, as $\mathcal{F}$ is an algebra (and hence closed under finite intersections), we obtain by induction:
$$P(A_1 \cap \cdots \cap A_n) = P(A_n \mid A_1 \cap \cdots \cap A_{n-1})\, P(A_1 \cap \cdots \cap A_{n-1}) = \cdots = P(A_1) \prod_{k=2}^{n} P(A_k \mid A_1 \cap \cdots \cap A_{k-1}).$$
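As an illustration of this multiplication rule, consider drawing two cards in sequence from a standard deck of $52$ cards without replacement, and let $A_1$ and $A_2$ be the events that the first and the second card are aces. Then
$$P(A_1 \cap A_2) = P(A_1)\, P(A_2 \mid A_1) = \frac{4}{52} \cdot \frac{3}{51} = \frac{1}{221}.$$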
Theorem 3.5 (Theorem of total probability):
Let $(\Omega, \mathcal{F}, P)$ be a probability space, and assume
$$\Omega = \biguplus_{n \in \mathbb{N}} B_n$$
(note that by using the $\biguplus$-notation, we assume that the union is disjoint), where the $B_n$ are all contained within $\mathcal{F}$ and satisfy $P(B_n) > 0$. Then for all $A \in \mathcal{F}$,
$$P(A) = \sum_{n \in \mathbb{N}} P(A \mid B_n)\, P(B_n).$$
Proof:
$$P(A) = P(A \cap \Omega) = P\!\left(A \cap \biguplus_{n \in \mathbb{N}} B_n\right) = P\!\left(\biguplus_{n \in \mathbb{N}} (A \cap B_n)\right) = \sum_{n \in \mathbb{N}} P(A \cap B_n) = \sum_{n \in \mathbb{N}} P(A \mid B_n)\, P(B_n),$$
where we used that the sets $A \cap B_n$ are all disjoint, the distributive law of the algebra of sets, the $\sigma$-additivity of $P$ and the definition of conditional probability.
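As an illustration with made-up numbers: an urn is chosen by a fair coin flip, so $P(B_1) = P(B_2) = \tfrac{1}{2}$, where urn $1$ contains two red and one blue ball, urn $2$ contains one red and three blue balls, and a ball is drawn from the chosen urn. With $A$ the event of drawing a red ball (and the partition consisting of just $B_1$ and $B_2$), the theorem gives
$$P(A) = P(A \mid B_1)\, P(B_1) + P(A \mid B_2)\, P(B_2) = \frac{2}{3} \cdot \frac{1}{2} + \frac{1}{4} \cdot \frac{1}{2} = \frac{11}{24}.$$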
Theorem 3.6 (Retarded Bayes' theorem):
Let $(\Omega, \mathcal{F}, P)$ be a probability space and $A, B \in \mathcal{F}$ with $P(A) > 0$ and $P(B) > 0$. Then
$$P(B \mid A) = \frac{P(A \mid B)\, P(B)}{P(A)}.$$
Proof:
$$P(B \mid A) = \frac{P(A \cap B)}{P(A)} = \frac{P(A \cap B)}{P(B)} \cdot \frac{P(B)}{P(A)} = \frac{P(A \mid B)\, P(B)}{P(A)}.$$
This formula may look somewhat abstract, but it actually has a nice geometrical meaning. Suppose we are given two sets $A, B \in \mathcal{F}$, already know $P(A \mid B)$, $P(A)$ and $P(B)$, and want to compute $P(B \mid A)$. The situation is depicted in the following picture:
We know the ratio of the size of $A \cap B$ to the size of $B$, but what we actually want to know is how $A \cap B$ compares to $A$. Hence, we change the reference set by multiplying with $P(B)$, the old reference magnitude, and dividing by $P(A)$, the new reference magnitude.
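To make this concrete with some illustrative numbers: if $P(A \mid B) = \tfrac{1}{2}$, $P(B) = \tfrac{1}{5}$ and $P(A) = \tfrac{1}{4}$, then $A \cap B$ takes up half of $B$, so $P(A \cap B) = \tfrac{1}{10}$, and rescaling to the new reference set $A$ gives
$$P(B \mid A) = \frac{P(A \mid B)\, P(B)}{P(A)} = \frac{\tfrac{1}{2} \cdot \tfrac{1}{5}}{\tfrac{1}{4}} = \frac{2}{5}.$$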
Theorem 3.7 (Bayes' theorem):
Let $(\Omega, \mathcal{F}, P)$ be a probability space, and assume
$$\Omega = \biguplus_{n \in \mathbb{N}} B_n,$$
where the $B_n$ are all in $\mathcal{F}$ and satisfy $P(B_n) > 0$. Then for all $A \in \mathcal{F}$ with $P(A) > 0$ and all $k \in \mathbb{N}$,
$$P(B_k \mid A) = \frac{P(A \mid B_k)\, P(B_k)}{\sum_{n \in \mathbb{N}} P(A \mid B_n)\, P(B_n)}.$$
Proof:
From the basic version of the theorem, we obtain
$$P(B_k \mid A) = \frac{P(A \mid B_k)\, P(B_k)}{P(A)}.$$
Using the formula of total probability to rewrite $P(A)$, we obtain
$$P(B_k \mid A) = \frac{P(A \mid B_k)\, P(B_k)}{\sum_{n \in \mathbb{N}} P(A \mid B_n)\, P(B_n)}.$$
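As an illustration with made-up numbers: partition $\Omega$ into $B_1$ ('condition present', $P(B_1) = 0.01$) and $B_2$ ('condition absent', $P(B_2) = 0.99$), and let $A$ be the event of a positive test result, with $P(A \mid B_1) = 0.99$ and $P(A \mid B_2) = 0.05$. Bayes' theorem then gives
$$P(B_1 \mid A) = \frac{0.99 \cdot 0.01}{0.99 \cdot 0.01 + 0.05 \cdot 0.99} = \frac{0.0099}{0.0594} = \frac{1}{6} \approx 0.17,$$
so even after a positive result the condition remains fairly unlikely, because the contribution of $B_2$ dominates the denominator.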