Module 5: Multivariate Normal Distribution
A variable X follows a discrete probability distribution if the possible values of X are either:
- A finite set
- A countable infinite sequence
p_X(x_i) = P(X = x_i) is called the probability mass function (PMF)
- p_X(x_i) >= 0, as it is a probability
- The sum of the PMF over all possible values of X equals 1
Recall that in a Discrete Probability Distribution:
In a Continuous Probability Distribution:
Moment Generating Function
Moments, such as E(X) and V(X), can also be calculated using the Moment Generating Function (MGF): M_X(t) = E(e^(tX))
The rth moment of X, E(X^r), can be obtained by differentiating M_X(t) r times with respect to t and setting t = 0
- M_X(0) = 1
- M'_X(0) = E(X)
- M''_X(0) = E(X^2), so V(X) = M''_X(0) - (M'_X(0))^2
- In general, M_X^(r)(0) = E(X^r), where M_X^(r) is the rth derivative of M_X
Uniqueness: if X and Y are two random variables and M_X(t) = M_Y(t) for all |t| < h, for some positive number h, then X and Y have the same distribution
Note: the MGF does not exist for all distributions (E(e^(tX)) may be infinite)
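The moment relations above can be checked numerically. The sketch below is only an illustration: it assumes a fair six-sided die as the example distribution (not from the notes) and approximates the derivatives of the MGF at t = 0 with central differences, then compares them with moments computed directly from the PMF.

```python
import math

# Hypothetical example distribution: a fair six-sided die.
values = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6


def mgf(t):
    """M_X(t) = E(e^(tX)) for the discrete distribution above."""
    return sum(p * math.exp(t * x) for x, p in zip(values, probs))


def first_derivative(f, t, h=1e-4):
    """Central-difference approximation of f'(t)."""
    return (f(t + h) - f(t - h)) / (2 * h)


def second_derivative(f, t, h=1e-4):
    """Central-difference approximation of f''(t)."""
    return (f(t + h) - 2 * f(t) + f(t - h)) / (h * h)


mean_from_mgf = first_derivative(mgf, 0.0)            # approximates E(X)
second_moment_from_mgf = second_derivative(mgf, 0.0)  # approximates E(X^2)
var_from_mgf = second_moment_from_mgf - mean_from_mgf ** 2

# Direct calculation from the PMF, for comparison.
mean_direct = sum(p * x for x, p in zip(values, probs))
var_direct = sum(p * x * x for x, p in zip(values, probs)) - mean_direct ** 2

print(mgf(0.0))                     # should be very close to 1
print(mean_from_mgf, mean_direct)   # both approximate E(X)
print(var_from_mgf, var_direct)     # both approximate V(X)
```

The finite-difference step h is a tuning choice; a symbolic derivative would be exact, but the numerical version keeps the example dependency-free.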
Binomial Distribution
X ~ Binomial(n, p), p ∈ [0, 1]
X = the number of successes in n trials when the probability of success in each trial is p.
We can think of X as the sum of n independent Bernoulli(p) random variables X_1, ..., X_n, with the same p for every X_i
- PMF = P(X = x) = C(n, x) p^x (1-p)^(n-x) for x = 0, 1, ..., n, where C(n, x) = n! / (x! (n-x)!)
- Expected value = E(X) = np
- Variance = V(X) = np(1-p)
- MGF = M_X(t) = (p e^t + (1-p))^n
- Two discrete random variables X and Y are independent if P(X = x & Y = y) = P(X = x)*P(Y = y) for all x and y
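As a quick sanity check of the Binomial formulas, the following sketch builds the PMF from the standard binomial-coefficient form and recovers the mean and variance by direct summation. The values n = 10 and p = 0.3 are illustrative assumptions only.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p), using the binomial coefficient."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


n, p = 10, 0.3  # illustrative values, not from the notes
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

total = sum(pmf)                                                    # should be 1
mean = sum(x * px for x, px in enumerate(pmf))                      # compare with n*p
variance = sum(x * x * px for x, px in enumerate(pmf)) - mean ** 2  # compare with n*p*(1-p)

print(total, mean, variance)
```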
Poisson Distribution
X ~ Poisson(λ), λ > 0
X = the number of occurrences of an event of interest in a fixed interval of time or space; λ is the expected number of occurrences.
Poisson as an approximation of the Binomial Distribution
- If X ~ Binomial(n, p) with n -> infinity and p -> 0 such that np stays constant, then the distribution of X converges to Poisson(np)
- Often used when analyzing rare diseases
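The approximation can be illustrated by holding np fixed and letting n grow. This is a minimal sketch that assumes the standard Poisson PMF, e^(-λ) λ^x / x!, and an illustrative λ = 2; it reports the largest pointwise gap between the two PMFs for several values of n.

```python
from math import comb, exp, factorial


def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p); zero outside 0..n."""
    if x > n:
        return 0.0
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam), standard form."""
    return exp(-lam) * lam ** x / factorial(x)


def max_abs_gap(n, lam, x_max=20):
    """Largest |Binomial(n, lam/n) - Poisson(lam)| PMF difference for x = 0..x_max."""
    p = lam / n
    return max(
        abs(binomial_pmf(x, n, p) - poisson_pmf(x, lam))
        for x in range(x_max + 1)
    )


lam = 2.0  # illustrative value of np, held constant
gaps = {n: max_abs_gap(n, lam) for n in (10, 100, 1000, 10000)}

for n, gap in gaps.items():
    print(n, gap)  # the gap shrinks as n grows
```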
Geometric Distribution
X ~ Geometric(p), p ∈ (0, 1]
If Y_1, Y_2, Y_3, ... is a sequence of independent Bernoulli(p) random variables, then the number of failures before the first success, X, follows a Geometric distribution.
- PMF = P(X = x) = p(1-p)^x for x = 0, 1, 2, ...
- Expected value = E(X) = (1-p)/p
- Variance = V(X) = (1-p)/p^2
- MGF = M_X(t) = p / (1 - (1-p)e^t), valid when (1-p)e^t < 1
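The Geometric formulas above can be checked by summing the series directly. The sketch below assumes p = 0.25 and t = 0.1 purely for illustration, and truncates the infinite sums at 2000 terms, where the neglected tail is negligible.

```python
from math import exp

p = 0.25      # illustrative success probability
terms = 2000  # truncation point for the infinite series

# PMF for the number of failures before the first success.
pmf = [p * (1 - p) ** x for x in range(terms)]

total = sum(pmf)
mean = sum(x * px for x, px in enumerate(pmf))
variance = sum(x * x * px for x, px in enumerate(pmf)) - mean ** 2

# The MGF series converges only when (1 - p) * e^t < 1.
t = 0.1
mgf_series = sum(exp(t * x) * px for x, px in enumerate(pmf))
mgf_closed = p / (1 - (1 - p) * exp(t))

print(total)                   # close to 1
print(mean, (1 - p) / p)       # series vs. closed form
print(variance, (1 - p) / p ** 2)
print(mgf_series, mgf_closed)
```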
Hypergeometric Distribution
X ~ Hypergeometric(N, K, n)
Suppose a finite population of size N contains two mutually exclusive kinds of events: K success events and N-K failure events. If n events are randomly chosen without replacement, then X is the number of success events chosen.
- PMF = P(X = x) = C(K, x) C(N-K, n-x) / C(N, n) for max(0, n-(N-K)) <= x <= min(n, K), where C(a, b) = a! / (b! (a-b)!)
- Expected value = E(X) = nK / N
- Variance = V(X) = ((nK) / N) * ((N-K) / N) * ((N - n) / (N - 1))
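To confirm the mean and variance formulas exactly, the sketch below uses rational arithmetic with the standard Hypergeometric PMF, C(K, x) C(N-K, n-x) / C(N, n). The population values N = 50, K = 12, n = 10 are illustrative assumptions.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(x, N, K, n):
    """P(X = x): x successes among n draws without replacement."""
    return Fraction(comb(K, x) * comb(N - K, n - x), comb(N, n))


N, K, n = 50, 12, 10  # illustrative population, successes, sample size

# x cannot exceed n or K, and n - x cannot exceed the N - K failures.
support = range(max(0, n - (N - K)), min(n, K) + 1)
pmf = {x: hypergeom_pmf(x, N, K, n) for x in support}

total = sum(pmf.values())
mean = sum(x * px for x, px in pmf.items())
variance = sum(x * x * px for x, px in pmf.items()) - mean ** 2

# Closed forms from the notes, as exact fractions.
mean_formula = Fraction(n * K, N)
variance_formula = Fraction(n * K, N) * Fraction(N - K, N) * Fraction(N - n, N - 1)

print(total, mean, variance)
print(mean_formula, variance_formula)
```

Using `Fraction` avoids floating-point rounding, so the comparison with the closed forms is an exact equality rather than a tolerance check.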
More Important Distributions
Uniform Distribution
X ~ Uniform(a, b), a < b
We use this distribution when we have no idea how the data is distributed, so every value between a and b is treated as equally likely.
Log-Normal Distribution
X ~ Lognormal(μ, σ^2), -infinity < μ < infinity, σ > 0
Gamma Distribution
X ~ Gamma(α, λ), α > 0, λ > 0 (α is the shape parameter and λ is the rate parameter)
Exponential Distribution
A special case of the Gamma Distribution (α = 1)
X ~ Exponential(λ), λ > 0
Chi-Square Distribution
Special case of the Gamma Distribution (α = k/2, λ = 1/2)
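The two special cases can be verified pointwise. The densities below are the standard textbook forms with λ as a rate parameter; they are written out here as assumptions, since the notes do not list them. The evaluation points and parameter values are arbitrary.

```python
from math import exp, gamma


def gamma_pdf(x, alpha, lam):
    """Gamma density with shape alpha and rate lam (assumed parameterization)."""
    return lam ** alpha * x ** (alpha - 1) * exp(-lam * x) / gamma(alpha)


def exponential_pdf(x, lam):
    """Exponential density with rate lam."""
    return lam * exp(-lam * x)


def chi_square_pdf(x, k):
    """Chi-square density with k degrees of freedom."""
    return x ** (k / 2 - 1) * exp(-x / 2) / (2 ** (k / 2) * gamma(k / 2))


xs = [0.5, 1.0, 2.5, 7.0]  # arbitrary evaluation points
lam, k = 1.5, 4            # illustrative parameters

# Exponential(lam) should match Gamma(alpha = 1, lam).
exp_gap = max(abs(gamma_pdf(x, 1.0, lam) - exponential_pdf(x, lam)) for x in xs)

# Chi-square(k) should match Gamma(alpha = k/2, lam = 1/2).
chi_gap = max(abs(gamma_pdf(x, k / 2, 0.5) - chi_square_pdf(x, k)) for x in xs)

print(exp_gap, chi_gap)  # both should be at floating-point noise level
```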
Properties of Gamma & Chi-Square Distribution
Distribution of the sum of independent Gamma random variables.
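The notes name this property without stating it. The sketch below assumes the standard result that independent Gamma(α_i, λ) variables with a common rate λ sum to a Gamma(α_1 + ... + α_n, λ) variable, and checks by simulation that the sample mean and variance of the sum agree with α_total/λ and α_total/λ^2. The parameters, seed, and sample size are illustrative. Note that Python's `random.gammavariate` takes a scale parameter, so the rate λ is passed as 1/λ.

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

lam = 2.0            # common rate parameter (illustrative)
alphas = [2.0, 3.0]  # shape parameters of the independent summands
n_samples = 50_000

# random.gammavariate(alpha, beta) uses beta as a SCALE, i.e. beta = 1 / rate.
sums = [
    sum(random.gammavariate(a, 1 / lam) for a in alphas)
    for _ in range(n_samples)
]

sample_mean = sum(sums) / n_samples
sample_var = sum((s - sample_mean) ** 2 for s in sums) / (n_samples - 1)

alpha_total = sum(alphas)
print(sample_mean, alpha_total / lam)      # simulated vs. Gamma(alpha_total, lam) mean
print(sample_var, alpha_total / lam ** 2)  # simulated vs. Gamma(alpha_total, lam) variance
```

Matching the first two moments is consistent with the stated result but does not prove it; a full check would compare the whole distribution.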
Bivariate Normal Distribution
Function of a Discrete Random Variable
Suppose X is a discrete random variable and Y is a function of X: Y = g(X)
Then Y is also a random variable: P(Y = y) = P(g(X) = y)
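In practice, P(g(X) = y) is found by adding up P(X = x) over every x that g maps to y. The following sketch illustrates this with a hypothetical X that is uniform on {-2, -1, 0, 1, 2} and g(x) = x^2; both choices are assumptions for the example.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical PMF of X: uniform on {-2, -1, 0, 1, 2}.
pmf_x = {x: Fraction(1, 5) for x in (-2, -1, 0, 1, 2)}


def g(x):
    """Example transformation Y = g(X) = X^2 (not one-to-one)."""
    return x * x


def pmf_of_function(pmf_x, g):
    """P(Y = y) = sum of P(X = x) over all x with g(x) = y."""
    pmf_y = defaultdict(Fraction)  # missing keys start at Fraction(0)
    for x, px in pmf_x.items():
        pmf_y[g(x)] += px
    return dict(pmf_y)


pmf_y = pmf_of_function(pmf_x, g)
print(pmf_y)  # x = -2 and x = 2 both contribute to y = 4, and so on
```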
Function of a Continuous Random Variable
Using the same equation as above but assuming the variables are continuous random variables:
If g is one-to-one (strictly increasing or decreasing), then g has an inverse g^(-1). In that case:
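For a one-to-one g, the standard change-of-variables result is f_Y(y) = f_X(g^(-1)(y)) * |d/dy g^(-1)(y)|; it is stated here as an assumption about what the notes intend. The sketch applies it to X ~ Normal(μ, σ^2) with Y = e^X (which gives a Log-Normal variable) and checks that integrating the resulting density over an interval matches the probability computed directly from X. The parameters and interval are illustrative.

```python
from math import erf, exp, log, pi, sqrt

mu, sigma = 0.0, 0.5  # illustrative parameters of X ~ Normal(mu, sigma^2)


def normal_pdf(x):
    """Density of X."""
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))


def normal_cdf(x):
    """CDF of X, via the error function."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))


def density_of_y(y):
    """f_Y(y) = f_X(g^(-1)(y)) * |d/dy g^(-1)(y)| with g(x) = e^x, g^(-1)(y) = ln y."""
    return normal_pdf(log(y)) * abs(1.0 / y)


def integrate(f, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


a, b = 0.5, 2.0
prob_from_density = integrate(density_of_y, a, b)         # P(a < Y < b) via f_Y
prob_from_x = normal_cdf(log(b)) - normal_cdf(log(a))     # P(ln a < X < ln b)

print(prob_from_density, prob_from_x)  # the two should agree closely
```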
Covariance and Correlation
Correlation indicates how strong the linear relationship between two variables is: ρ = corr(X, Y) = cov(X, Y) / (σ_X σ_Y), where σ_X and σ_Y are the standard deviations of X and Y
A positive correlation has ρ > 0 and a negative correlation has ρ < 0
Covariance provides information about how the variables vary together:
cov(X, Y) = E[(X - E(X))(Y - E(Y))]
This is also equivalent to:
cov(X, Y) = E(XY) - E(X)*E(Y)
Thus if X and Y are independent, then E(XY) = E(X)*E(Y) and:
cov(X, Y) = corr(X, Y) = 0
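These relations can be checked on a small joint PMF. The sketch below uses two hypothetical joint distributions on {0, 1} x {0, 1}: one where X and Y tend to agree, and one built as a product of marginals so that X and Y are independent. It computes covariance both from the definition and from the shortcut formula, and takes correlation as cov(X, Y) divided by the product of the standard deviations.

```python
from fractions import Fraction
from math import sqrt

# Hypothetical joint PMFs P(X = x, Y = y), keyed by (x, y).
dependent = {
    (0, 0): Fraction(4, 10), (0, 1): Fraction(1, 10),
    (1, 0): Fraction(1, 10), (1, 1): Fraction(4, 10),
}
independent = {(x, y): Fraction(1, 4) for x in (0, 1) for y in (0, 1)}


def expect(joint, h):
    """E[h(X, Y)] under the joint PMF."""
    return sum(p * h(x, y) for (x, y), p in joint.items())


def covariance(joint):
    """cov(X, Y) from the definition, cross-checked against the shortcut."""
    ex = expect(joint, lambda x, y: x)
    ey = expect(joint, lambda x, y: y)
    by_definition = expect(joint, lambda x, y: (x - ex) * (y - ey))
    by_shortcut = expect(joint, lambda x, y: x * y) - ex * ey
    assert by_definition == by_shortcut  # exact, thanks to Fraction arithmetic
    return by_definition


def correlation(joint):
    """corr(X, Y) = cov(X, Y) / (sd(X) * sd(Y))."""
    ex = expect(joint, lambda x, y: x)
    ey = expect(joint, lambda x, y: y)
    var_x = expect(joint, lambda x, y: x * x) - ex ** 2
    var_y = expect(joint, lambda x, y: y * y) - ey ** 2
    return float(covariance(joint)) / sqrt(var_x * var_y)


print(covariance(dependent), correlation(dependent))      # positive relationship
print(covariance(independent), correlation(independent))  # both zero
```

Zero covariance follows from independence, but the converse does not hold in general: uncorrelated variables can still be dependent.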
Conditional Expectation of X given Y = y, denoted E(X | Y = y):
Conditional variance can be defined similarly (use the conditional PMF or PDF)
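For discrete variables, one way to sketch this is to build the conditional PMF P(X = x | Y = y) = P(X = x, Y = y) / P(Y = y) and then take the mean and variance under it. The joint PMF below is hypothetical, and the formulas are the standard discrete definitions, assumed here because the notes do not spell them out.

```python
from fractions import Fraction

# Hypothetical joint PMF P(X = x, Y = y), keyed by (x, y).
joint = {
    (0, 0): Fraction(1, 8), (1, 0): Fraction(1, 8), (2, 0): Fraction(2, 8),
    (0, 1): Fraction(2, 8), (1, 1): Fraction(1, 8), (2, 1): Fraction(1, 8),
}


def conditional_pmf(joint, y):
    """P(X = x | Y = y) = P(X = x, Y = y) / P(Y = y)."""
    p_y = sum(p for (_, y_val), p in joint.items() if y_val == y)
    return {x: p / p_y for (x, y_val), p in joint.items() if y_val == y}


def conditional_mean(joint, y):
    """E(X | Y = y): mean of X under the conditional PMF."""
    return sum(x * p for x, p in conditional_pmf(joint, y).items())


def conditional_variance(joint, y):
    """V(X | Y = y) = E(X^2 | Y = y) - (E(X | Y = y))^2."""
    cond = conditional_pmf(joint, y)
    mean = sum(x * p for x, p in cond.items())
    return sum(x * x * p for x, p in cond.items()) - mean ** 2


for y in (0, 1):
    print(y, conditional_mean(joint, y), conditional_variance(joint, y))
```

The continuous case follows the same pattern, with the conditional PDF in place of the conditional PMF and integrals in place of sums.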