Hypergeometric

[Plots of the probability mass function and cumulative distribution function]

Parameters: N ∈ {1, 2, …} — population size; m ∈ {0, 1, …, N} — number of successes in the population; n ∈ {1, 2, …, N} — number of draws
Support: x ∈ {max(0, n+m−N), …, min(m, n)}
PMF: f(x) = C(m, x) C(N−m, n−x) / C(N, n)
CDF: expressible in terms of the generalized hypergeometric function
Mean: nm/N
Variance: nm(N−n)(N−m) / (N²(N−1))
The hypergeometric distribution describes the number of successes in a sequence of n draws without replacement from a population of size N containing m total successes.
Its probability mass function is:
![{\displaystyle f(x)={{{m \choose x}{{N-m} \choose {n-x}}} \over {N \choose n}}{\text{ for all }}x\in [0,n]}](https://wikimedia.org/api/rest_v1/media/math/render/svg/a6c330687b40587ed4a2d75eb490552603789497)
Technically, the support of the distribution is only x∈[max(0, n+m−N), min(m, n)]. In situations where this range is not [0, n], the formula above still yields f(x) = 0 for the excluded values, since for $k>0$, ${a \choose a+k}=0$: one of the binomial coefficients in the numerator then has its lower index exceeding its upper index.
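This vanishing behaviour is easy to confirm numerically. The sketch below (a hypothetical helper, not part of the article) evaluates the pmf with Python's `math.comb`, which already returns 0 when the lower index exceeds the upper one:

```python
from math import comb

def hypergeom_pmf(x, n, m, N):
    """Hypergeometric pmf f(x; n, m, N) = C(m,x) C(N-m,n-x) / C(N,n).

    math.comb(a, b) returns 0 when b > a, so values of x inside [0, n]
    but outside [max(0, n+m-N), min(m, n)] automatically give 0."""
    if x < 0 or x > n:  # comb raises for negative arguments
        return 0.0
    return comb(m, x) * comb(N - m, n - x) / comb(N, n)

# With N=10, m=7, n=5 the support is [max(0, 2), min(7, 5)] = [2, 5]:
print(hypergeom_pmf(1, 5, 7, 10))  # 0.0 -- x=1 is below the support
print(hypergeom_pmf(3, 5, 7, 10))  # 5/12, inside the support
```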
We first check that f(x) is a valid pmf. This requires that it is non-negative everywhere and that its total sum equals 1. The first condition is obvious from the formula. For the second condition we start with Vandermonde's identity
![{\displaystyle \sum _{x=0}^{n}{a \choose x}{b \choose n-x}={a+b \choose n}}](https://wikimedia.org/api/rest_v1/media/math/render/svg/d0c3b2026689d94dcbea89390d5e8b4c10cb0ab7)
![{\displaystyle \sum _{x=0}^{n}{{a \choose x}{b \choose n-x} \over {a+b \choose n}}=1}](https://wikimedia.org/api/rest_v1/media/math/render/svg/85b553c4add8771d5caa50fedd733ae19f2f40cd)
We now see that with a = m and b = N − m, the condition is satisfied.
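Because `math.comb` works with exact integers, this normalization can be checked exactly for a small example (the parameter values below are illustrative, not from the text):

```python
from math import comb

N, m, n = 20, 8, 6
# Vandermonde's identity with a = m, b = N - m: the numerators sum to C(N, n).
total = sum(comb(m, x) * comb(N - m, n - x) for x in range(n + 1))
print(total == comb(N, n))  # True, so the pmf sums to exactly 1
```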
We derive the mean as follows:
![{\displaystyle \operatorname {E} [X]=\sum _{x=0}^{n}x\cdot f(x;n,m,N)=\sum _{x=0}^{n}x\cdot {{{m \choose x}{{N-m} \choose {n-x}}} \over {N \choose n}}}](https://wikimedia.org/api/rest_v1/media/math/render/svg/ee678a691b4e5c3cacc31c0093f39e4fb4128531)
![{\displaystyle \operatorname {E} [X]=0\cdot {{{m \choose 0}{{N-m} \choose {n-0}}} \over {N \choose n}}+\sum _{x=1}^{n}x\cdot {{{m \choose x}{{N-m} \choose {n-x}}} \over {N \choose n}}}](https://wikimedia.org/api/rest_v1/media/math/render/svg/39432db25cf68f4997af8b9cad284188a847e7a0)
We use the identity ${N \choose n} = {N \over n}{N-1 \choose n-1}$ in the denominator.
![{\displaystyle \operatorname {E} [X]=0+\sum _{x=1}^{n}x\cdot {{{m \choose x}{{N-m} \choose {n-x}}} \over {{N \over n}{{N-1} \choose {n-1}}}}}](https://wikimedia.org/api/rest_v1/media/math/render/svg/dc72b2fc87517df1421353858acfa5abfc9a1d09)
![{\displaystyle \operatorname {E} [X]={n \over N}\sum _{x=1}^{n}x\cdot {{{m \choose x}{{N-m} \choose {n-x}}} \over {{N-1} \choose {n-1}}}}](https://wikimedia.org/api/rest_v1/media/math/render/svg/2090566b0ae8ab1d2eba5cdd3ad10991d1e62f8b)
Next we use the identity $x{m \choose x} = m{m-1 \choose x-1}$ in the first binomial of the numerator.
![{\displaystyle \operatorname {E} [X]={n \over N}\sum _{x=1}^{n}{m{{m-1 \choose x-1}{{N-m} \choose {n-x}}} \over {{N-1} \choose {n-1}}}}](https://wikimedia.org/api/rest_v1/media/math/render/svg/3a0006dea506dff720378b04156b5ffdd03e9873)
Next, for the variables inside the sum we define corresponding primed variables that are one less: N′ = N−1, m′ = m−1, x′ = x−1, and n′ = n−1.
![{\displaystyle \operatorname {E} [X]={mn \over N}\sum _{x'=0}^{n'}{{{m' \choose x'}{{N'-m'} \choose {n'-x'}}} \over {{N'} \choose {n'}}}}](https://wikimedia.org/api/rest_v1/media/math/render/svg/5e1cd1722f894cc5a6f955a9b731d503a3cfe82b)
![{\displaystyle \operatorname {E} [X]={mn \over N}\sum _{x'=0}^{n'}f(x';n',m',N')}](https://wikimedia.org/api/rest_v1/media/math/render/svg/440dbb68d37e1dd929891ffc060edf32baf0f25b)
Now we see that the sum is the total sum of a hypergeometric pmf with the modified parameters, which equals 1. Therefore
![{\displaystyle \operatorname {E} [X]={nm \over N}}](https://wikimedia.org/api/rest_v1/media/math/render/svg/6f646539677956727e04993ee125d5c96eb3b2c9)
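The closed form for the mean can be spot-checked against a direct summation of x·f(x) (a sketch with assumed example values):

```python
from math import comb

def hypergeom_mean_direct(n, m, N):
    # E[X] = sum over x of x * f(x; n, m, N)
    return sum(x * comb(m, x) * comb(N - m, n - x) for x in range(n + 1)) / comb(N, n)

N, m, n = 15, 6, 5
print(hypergeom_mean_direct(n, m, N))  # 2.0
print(n * m / N)                       # 2.0, matching nm/N
```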
We first determine E[X²].
![{\displaystyle \operatorname {E} [X^{2}]=\sum _{x=0}^{n}f(x;n,m,N)\cdot x^{2}=\sum _{x=0}^{n}{{{m \choose x}{{N-m} \choose {n-x}}} \over {N \choose n}}\cdot x^{2}}](https://wikimedia.org/api/rest_v1/media/math/render/svg/2c57e1b3542512836927eb966f4b00fac3f6577b)
![{\displaystyle \operatorname {E} [X^{2}]={{{m \choose 0}{{N-m} \choose {n-0}}} \over {N \choose n}}\cdot 0^{2}+\sum _{x=1}^{n}{{{m \choose x}{{N-m} \choose {n-x}}} \over {N \choose n}}\cdot x^{2}}](https://wikimedia.org/api/rest_v1/media/math/render/svg/945e5cce59894c9a6af230a7085439ee45c02aee)
![{\displaystyle \operatorname {E} [X^{2}]=0+\sum _{x=1}^{n}{{m{m-1 \choose x-1}{{N-m} \choose {n-x}}} \over {{N \over n}{{N-1} \choose {n-1}}}}\cdot x}](https://wikimedia.org/api/rest_v1/media/math/render/svg/cf2626751e1653818a91ded798fa6ceddb341e3d)
![{\displaystyle \operatorname {E} [X^{2}]={mn \over N}\sum _{x=1}^{n}{{{m-1 \choose x-1}{{N-m} \choose {n-x}}} \over {{N-1} \choose {n-1}}}\cdot x}](https://wikimedia.org/api/rest_v1/media/math/render/svg/3de1f2629e002b99b8cb83436e299c7e66bcc360)
We use the same variable substitution as when deriving the mean.
![{\displaystyle \operatorname {E} [X^{2}]={mn \over N}\sum _{x'=0}^{n'}{{{m' \choose x'}{{N'-m'} \choose {n'-x'}}} \over {{N'} \choose {n'}}}(x'+1)}](https://wikimedia.org/api/rest_v1/media/math/render/svg/d2d212f2d913e29e5a6bfba51a16dabc1b053b7f)
![{\displaystyle \operatorname {E} [X^{2}]={mn \over N}\left[\sum _{x'=0}^{n'}{{{m' \choose x'}{{N'-m'} \choose {n'-x'}}} \over {{N'} \choose {n'}}}x'+\sum _{x'=0}^{n'}{{{m' \choose x'}{{N'-m'} \choose {n'-x'}}} \over {{N'} \choose {n'}}}\right]}](https://wikimedia.org/api/rest_v1/media/math/render/svg/716677fe3389d264e4c693d47a1b2504f33e6276)
The first sum is the expected value of a hypergeometric random variable with parameters (n′, m′, N′). The second sum is the total sum of that random variable's pmf.
![{\displaystyle \operatorname {E} [X^{2}]={mn \over N}\left[{n'm' \over N'}+1\right]}](https://wikimedia.org/api/rest_v1/media/math/render/svg/757a9e4a2da301b9482bd1523263ec26b5d700da)
![{\displaystyle \operatorname {E} [X^{2}]={mn \over N}\left[{(n-1)(m-1) \over (N-1)}+1\right]={mn \over N}\left[{{(n-1)(m-1)+(N-1)} \over (N-1)}\right]}](https://wikimedia.org/api/rest_v1/media/math/render/svg/31e91eee15b9c70d335de29d7ecea6c9b764598c)
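As with the mean, this expression for E[X²] can be verified numerically (a sketch with assumed parameter values):

```python
from math import comb

def ex2_direct(n, m, N):
    # E[X^2] = sum over x of x^2 * f(x; n, m, N)
    return sum(x**2 * comb(m, x) * comb(N - m, n - x) for x in range(n + 1)) / comb(N, n)

def ex2_closed(n, m, N):
    # E[X^2] = (mn/N) * [((n-1)(m-1) + (N-1)) / (N-1)]
    return (m * n / N) * ((n - 1) * (m - 1) + (N - 1)) / (N - 1)

N, m, n = 15, 6, 5
print(ex2_direct(n, m, N), ex2_closed(n, m, N))  # both ~ 34/7
```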
We then solve for the variance
![{\displaystyle \operatorname {Var} (X)=\operatorname {E} [X^{2}]-(\operatorname {E} [X])^{2}}](https://wikimedia.org/api/rest_v1/media/math/render/svg/cd5a922df13bdee788c0f06474fe002a42c25d8a)
![{\displaystyle \operatorname {Var} (X)={mn \over N}\left[{{(n-1)(m-1)+(N-1)} \over (N-1)}\right]-\left({mn \over N}\right)^{2}}](https://wikimedia.org/api/rest_v1/media/math/render/svg/56995fabfd1f82e08f69ae429457ac1c2db7d222)
![{\displaystyle \operatorname {Var} (X)={Nmn \over N^{2}}\left[{{(n-1)(m-1)+(N-1)} \over (N-1)}\right]-{(N-1)(mn)^{2} \over (N-1)N^{2}}}](https://wikimedia.org/api/rest_v1/media/math/render/svg/af036c806a95a3b386c81675d1a20f547d48985f)
![{\displaystyle \operatorname {Var} (X)={nm(N-n)(N-m) \over N^{2}(N-1)}}](https://wikimedia.org/api/rest_v1/media/math/render/svg/cd87c497412e5eb71427ac0ec3a97c8dca514d50)
or, equivalently,
![{\displaystyle \operatorname {Var} (X)={nm \over N}\left(1-{n \over N}\right)\left(1-{m-1 \over N-1}\right)}](https://wikimedia.org/api/rest_v1/media/math/render/svg/f56db334ef5939957b73d87529db0872bc6747c3)
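Both closed forms agree with the variance computed directly from the pmf; the check below uses assumed example values:

```python
from math import comb

N, m, n = 15, 6, 5
pmf = [comb(m, x) * comb(N - m, n - x) / comb(N, n) for x in range(n + 1)]
mean = sum(x * p for x, p in enumerate(pmf))
var_direct = sum((x - mean) ** 2 * p for x, p in enumerate(pmf))

var_a = n * m * (N - n) * (N - m) / (N**2 * (N - 1))         # first form
var_b = (n * m / N) * (1 - n / N) * (1 - (m - 1) / (N - 1))  # second form
print(var_direct, var_a, var_b)  # all ~ 6/7
```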